PETScKrylovSolver sometimes crashes in parallel

Bug #1036992 reported by Benjamin Kehlet
Affects: DOLFIN
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned

Bug Description

When I run the unit test test/unit/la/python/KrylovSolver.py with three processes it crashes frequently with the following error message:

Warning -- row partitioning does not line up! Partitioning incomplete!
[2]PETSC ERROR: MatHYPRE_IJMatrixCreate() line 76 in src/dm/da/utils/mhyp.c
[2]PETSC ERROR: PCSetUp_HYPRE() line 112 in src/ksp/pc/impls/hypre/hypre.c
[2]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c
[2]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c
[2]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c
[2]PETSC ERROR: PCDestroy_HYPRE() line 204 in src/ksp/pc/impls/hypre/hypre.c
[2]PETSC ERROR: PCDestroy() line 83 in src/ksp/pc/interface/precon.c
[2]PETSC ERROR: KSPDestroy() line 695 in src/ksp/ksp/interface/itfunc.c

or with this message:

Inconsistent partitioning -- HYPRE_IJVectorCreate
[2]PETSC ERROR: VecHYPRE_IJVectorCreate() line 19 in src/vec/vec/impls/hypre/vhyp.c
[2]PETSC ERROR: PCSetUp_HYPRE() line 118 in src/ksp/pc/impls/hypre/hypre.c
[2]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c
[2]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c
[2]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c
[2]PETSC ERROR: PCDestroy_HYPRE() line 201 in src/ksp/pc/impls/hypre/hypre.c
[2]PETSC ERROR: PCDestroy() line 83 in src/ksp/pc/interface/precon.c
[2]PETSC ERROR: KSPDestroy() line 695 in src/ksp/ksp/interface/itfunc.c
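For reference, both messages appear to come from Hypre's check that the row ranges owned by the MPI ranks form one contiguous sequence with no gaps or overlaps. The sketch below is a plain-Python illustration of that kind of check, not Hypre's actual code. The function name and the example ranges are hypothetical; 1089 rows is what a P1 space on a 32 x 32 unit square would give.

```python
def partitions_line_up(ranges):
    """Return True if inclusive (first_row, last_row) ranges, ordered by
    MPI rank, are contiguous: each rank starts right after the previous one.
    Hypothetical helper for illustration only."""
    for (_, last), (next_first, _) in zip(ranges, ranges[1:]):
        if last + 1 != next_first:
            return False
    return True


# Made-up ownership ranges for 3 processes and 1089 rows.
consistent = [(0, 362), (363, 725), (726, 1088)]
inconsistent = [(0, 362), (370, 725), (726, 1088)]  # gap after rank 0

print(partitions_line_up(consistent))
print(partitions_line_up(inconsistent))
```

If the ranges reported by each process for the matrix and the vectors ever looked like the second list, that would point at the data handed to Hypre rather than at Hypre itself.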

Note that this happens only in about 10% of the runs, so there seems to be a race condition somewhere.
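A rough idea of how many runs it takes to see a failure this rare can be worked out as below. This is only a sketch: it assumes runs are independent and that the failure rate really is the approximate 10% quoted above, and the function names are made up for illustration.

```python
import math


def prob_all_pass(p_fail, n_runs):
    """Probability that n_runs independent runs all succeed."""
    return (1.0 - p_fail) ** n_runs


def runs_needed(p_fail, confidence):
    """Smallest number of runs that shows at least one failure
    with the given probability (e.g. 0.95)."""
    return int(math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail)))


print("P(no failure in 50 runs)  = %.4g" % prob_all_pass(0.1, 50))
print("P(no failure in 100 runs) = %.4g" % prob_all_pass(0.1, 100))
print("runs for 95%% chance of a failure: %d" % runs_needed(0.1, 0.95))
```

Under those assumptions a few dozen runs should be enough to trigger the crash, so a much longer clean run on another machine suggests the rate differs between setups.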

The following code (which is essentially a part of the unit test) reproduces the bug on my computer (Ubuntu 12.04 with PETSc installed through apt-get) when run with

mpirun -np 3 python KrylovSolver.py

from dolfin import *

# Assemble system
mesh = UnitSquare(32, 32)

V = FunctionSpace(mesh, 'CG', 1)
bc = DirichletBC(V, Constant(0.0), lambda x, on_boundary: on_boundary)
u = TrialFunction(V); v = TestFunction(V);
u1 = Function(V)

# Forms
a, L = inner(grad(u), grad(v))*dx, Constant(1.0)*v*dx
a_L = inner(grad(u1), grad(v))*dx

# Assemble linear algebra objects
A = assemble(a)
b = assemble(L)
bc.apply(A, b)

# Get solution vector
tmp = Function(V)
x = tmp.vector()

# Down-cast to the PETSc vector type
x_petsc = down_cast(x)

# With PETScPreconditioner interface
solver = PETScKrylovSolver("gmres", PETScPreconditioner("amg"))
solver.solve(A, x_petsc, down_cast(b))

Garth Wells (garth-wells) wrote :

I think that this is a bug in the Ubuntu PETSc package.

Johannes Ring (johannr) wrote :

I could not reproduce this in Ubuntu 12.04 with the libpetsc3.1-dev package.

Benjamin Kehlet (benjamik) wrote :

Garth: Very likely.
Johannes: Strange, we got the same problem on Anders's laptop. Did you run it with "-np 3" and many times?

Johannes Ring (johannr) wrote :

Depends on what you mean by many times. I ran it 50 times in a for loop:

  for i in `seq 1 50`; do mpirun -np 3 python KrylovSolver.py; done

I did this twice and I didn't get any error. However, I tried it again now, and now I got the same error as you reported.

(I didn't get the error on Debian unstable with the libpetsc3.2-dev package.)
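The shell loop above only shows errors as they scroll past. A small harness along the following lines could count them instead. It is a sketch, not part of DOLFIN: the helper names are hypothetical, and treating either a non-zero exit code or the text "PETSC ERROR" in the output as a failure is an assumption about how the crash shows up.

```python
import subprocess
import sys


def mpirun_command(script, nprocs=3):
    """Build the mpirun command line used in this report."""
    return ["mpirun", "-np", str(nprocs), sys.executable, script]


def count_failures(cmd, n_runs):
    """Run cmd n_runs times and count the runs that look like failures."""
    failures = 0
    for _ in range(n_runs):
        result = subprocess.run(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        output = result.stdout.decode(errors="replace")
        # Assumption: a crash gives a non-zero exit code or a PETSc trace.
        if result.returncode != 0 or "PETSC ERROR" in output:
            failures += 1
    return failures


# Example use (requires mpirun and the test script to be present):
#   n = 50
#   failed = count_failures(mpirun_command("KrylovSolver.py"), n)
#   print("%d of %d runs failed" % (failed, n))
```

Counting failures this way makes it easier to compare the failure rate between PETSc versions and machines.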

Anders Logg (logg) wrote : Re: [Bug 1036992] Re: PETScKrylovSolver sometimes crashes in parallel

On Wed, Aug 15, 2012 at 08:21:59AM -0000, Garth Wells wrote:
> I think that this is a bug in the Ubuntu PETSc package.

No, I have the same problem on my laptop with a newly built PETSc 3.2-p6.

--
Anders

Johannes Ring (johannr) wrote :

I tried again with a default Dorsal setup (no PETSc Ubuntu package) on Ubuntu 12.04. This time I ran the test program for over 30 minutes in a while loop and I didn't receive any errors.

Garth Wells (garth-wells) wrote :

On 15 August 2012 12:42, Anders Logg <email address hidden> wrote:
> On Wed, Aug 15, 2012 at 08:21:59AM -0000, Garth Wells wrote:
>> I think that this is a bug in the Ubuntu PETSc package.
>
> No, I have the same problem on my laptop with a newly built PETSc
> 3.2-p6.
>

Try with an up-to-date PETSc (v.3.3).

I have seen this problem before, but not for a long time. I don't use
the Ubuntu PETSc package, nor an Ubuntu MPI package (which have had
bugs in the past).

Garth


Garth Wells (garth-wells) wrote :

Not inclined to dig into the inner workings of Hypre, and I haven't seen this error for a long time, so marking as 'won't fix'.

Changed in dolfin:
status: New → Won't Fix