Towards a Fully-Parallelizable Algebraic Multigrid

Robert L. Falgout * and Van Emden Henson**

rfalgout@llnl.gov

vhenson@nps.navy.mil



*Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, P.O. Box 808, L-316, Livermore, California 94551
**Naval Postgraduate School, Department of Mathematics, Code Ma/Hv, 1411 Cunningham Road, Rm. 341, Monterey, CA 93943-5216


Abstract

Algebraic multigrid (AMG) is currently undergoing something of a renaissance. Developed in the 1970s, AMG has attracted greatly increased interest in recent times, largely because of its apparent applicability to large, sparse problems discretized on unstructured grids. AMG offers the potential of true multigrid performance on such problems, which are extremely difficult to solve efficiently using other methods.

AMG works by using the algebraic properties of the operator matrix to select some of the equations as "coarse-grid" equations. As in geometric multigrid, the error is smoothed on the full (fine-grid) problem by relaxation; the residual is then transferred to the coarse equations, where the error can be computed at greatly reduced cost. The error is then transferred back to the fine grid, where it is used to correct the original approximation. Recursive application of this technique leads to a multigrid algorithm.
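The two-grid correction scheme just described can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: it uses weighted Jacobi as the smoother, an interpolation matrix P supplied by the caller, the Galerkin coarse operator P^T A P, and an exact coarse solve. The function name and the 1D Poisson example are ours, chosen only to make the sketch runnable.

```python
import numpy as np

def two_grid_cycle(A, b, x, P, nu=2, omega=2.0/3.0):
    """One two-grid correction cycle: pre-smooth, coarse-grid correct, post-smooth.

    A : fine-grid operator, b : right-hand side, x : current approximation,
    P : interpolation (coarse -> fine); restriction is taken as P^T.
    """
    D = np.diag(A)
    # Pre-smooth the error with nu sweeps of weighted Jacobi relaxation.
    for _ in range(nu):
        x = x + omega * (b - A @ x) / D
    # Transfer the residual to the coarse grid and solve for the error there,
    # using the Galerkin coarse-grid operator P^T A P.
    r = b - A @ x
    Ac = P.T @ A @ P
    ec = np.linalg.solve(Ac, P.T @ r)
    # Transfer the coarse error back and correct the fine-grid approximation.
    x = x + P @ ec
    # Post-smooth.
    for _ in range(nu):
        x = x + omega * (b - A @ x) / D
    return x

# Illustrative example: 1D Poisson on 15 interior points, 7 coarse points,
# with standard linear interpolation.
n_f, n_c = 15, 7
A = 2 * np.eye(n_f) - np.eye(n_f, k=1) - np.eye(n_f, k=-1)
P = np.zeros((n_f, n_c))
for j in range(n_c):
    P[2 * j + 1, j] = 1.0   # coarse point injects directly
    P[2 * j, j] = 0.5       # fine neighbors get averaged values
    P[2 * j + 2, j] = 0.5
b = np.zeros(n_f)
x0 = np.random.default_rng(0).random(n_f)
x1 = two_grid_cycle(A, b, x0, P)
```

Replacing the exact coarse solve with a recursive call on (Ac, P.T @ r) yields the full multigrid V-cycle described above.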

AMG thus consists of two distinct phases. First, a "setup" phase occurs, during which the selection of coarse- and fine-grid equations is made. The intergrid transfer operators are constructed in this phase. Next, the "solution" phase occurs, in which the multigrid cycling takes place.

Many of the problems of current interest are extremely large, with systems having millions of equations and unknowns. Efficient solution of such problems, therefore, will demand the application of parallel processing technology. Multigrid lends itself well to parallelization, and as a result, the solution phase is easily parallelized, using the same techniques as geometric multigrid. However, the fundamental algorithms of the setup phase are inherently serial in nature, and efforts to increase their efficiency through parallelization have not been terribly successful.

This paper describes new algorithms for parallel implementation of the setup phase. Much of the setup phase is essentially a graph coloring problem, and the algorithms are thus closely related to the question of parallel implementation of graph coloring algorithms. Discussion of the algorithms centers on the mapping of the original equations to the processors, the simultaneous coloring of the individual graphs within processors, and the need for communication to resolve conflicting assignments.
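The shape of the difficulty can be illustrated with a small simulation, which is not taken from the paper: each "processor" greedily colors its own partition of the graph while ignoring off-processor neighbors, and a second pass stands in for the communication step that detects and resolves conflicting assignments across partition boundaries. All names here are ours, and the conflict rule (the higher-numbered vertex recolors) is one simple choice among many.

```python
def partitioned_coloring(adj, partition):
    """Simulated distributed greedy coloring with boundary conflict resolution.

    adj       : dict mapping each vertex to the set of its neighbors.
    partition : dict mapping each vertex to its owning 'processor'.
    """
    color = {}
    # Phase 1: each processor colors its subgraph independently, seeing only
    # the colors of neighbors it owns (no communication).
    for p in set(partition.values()):
        for v in sorted(u for u in adj if partition[u] == p):
            used = {color[w] for w in adj[v]
                    if partition[w] == p and w in color}
            c = 0
            while c in used:
                c += 1
            color[v] = c
    # Phase 2 (stand-in for communication): sweep the vertices; wherever a
    # cross-boundary neighbor ended up with the same color, the higher-numbered
    # vertex recolors, now avoiding the colors of *all* its neighbors.
    for v in sorted(adj):
        if any(color[w] == color[v] for w in adj[v]):
            used = {color[w] for w in adj[v]}
            c = 0
            while c in used:
                c += 1
            color[v] = c
    return color

# Illustrative example: a path graph split across two 'processors'; the
# boundary edge (2, 3) produces a conflict that phase 2 must resolve.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
adj = {v: set() for v in range(6)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)
partition = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
coloring = partitioned_coloring(adj, partition)
```

The point of the sketch is that phase 1 is embarrassingly parallel but phase 2 requires communication, and a poor mapping of equations to processors inflates the number of boundary conflicts that must be resolved.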