Parallel U-Cycle Multigrid Method

Dexuan Xie

Courant Institute of Mathematical Sciences, New York University,
251 Mercer Street, New York, NY 10012

L. Ridgway Scott

Department of Mathematics, University of Houston, Houston, TX 77204


A simple way to avoid idle processors in implementing the multigrid method based on a sequence of nested grids, \Omega_j for j = 1, 2,..., l with \Omega_1 being the coarsest grid, on a parallel computer is to select a proper grid \Omega_j with 1 < j < l as the new coarsest grid. The aim of this paper is to show that such an approach is an efficient way to implement the multigrid method on a large scale, multiprocessor computer. In particular, the variant of the V-cycle generated by this approach, which is called the U-cycle, is studied in this paper. It is shown that the convergence rate \rho(j) of the U-cycle with grid \Omega_j being the coarsest grid is a decreasing function of j, and the coarsest grid equations of the U-cycle can be solved approximately without increasing the total number of U-cycle iterations. Then, a parallel U-cycle is defined by using domain partitioning techniques, which can be implemented on a MIMD multiprocessor computer without any idle processors. An analysis of the time complexity of the parallel U-cycle shows that the parallel U-cycle is fully scalable, and can have super-linear speedup in comparison to the original V-cycle. Further, the performance of the parallel U-cycle in the memory-constrained case is discussed. Numerical results are presented showing that the U-cycle can have a better performance than the V-cycle on a sequential computer while the parallel U-cycle has a high efficiency on a large scale, MIMD multiprocessor computer. Experiments are presented for both the Intel Paragon and the IBM SP2.