- Package:
- src:petsc
- Source:
- petsc
- Submitter:
- Henning
- Date:
- 2021-06-07 11:03:03 UTC
- Severity:
- normal
- Tags:
- Blocked By:
-
Bug Title
961108 0 openmpi: providing 64-bit MPI normal stable testing unstable about 6 years ago
989497 1 superlu-dist: provide 64-bit build normal stable testing unstable about 5 years ago
961976 0 mumps: provide 64-bit build normal stable testing unstable about 6 years ago
989550 5 suitesparse: enhance 64-bit support in suitesparse normal stable testing unstable about 3 years ago
Dear Maintainer,
* What led up to the situation?
Applying the library using matrix dimensions higher than 46340.
* What exactly did you do (or not do) that was effective (or
ineffective)?
Setting the dimension to 54455
* What was the outcome of this action?
Got a report that the product of the dimension is exceeding the limit,
I should compile the PETSc libraries with the --with-64-bit-indices option.
I assume that signed 4-byte integers are used to describe the indices
and the product of the indices.
* What outcome did you expect instead?
Just a normal diagonalization. Compiling the library with this option
solved the situation.
Nice bug. There are ramifications for upgrading all pointers to 64 bit. Probably we don't want to do that without being explicit about it. 64 bit pointers should be enabled right through the computational library stack, or there will be a mismatch at some point causing failure. This ties in with developments of the BLAS packages. These now have 64 bit variants (libblas64-dev etc). BLAS is at the bottom of the stack, 64 bit pointers need to be activated each step of the way upwards. Need to check 64-bit PETSc is consistent with scalapack and MUMPS, for instance.
Hi Gard, bringing this question over to petsc bug#953116.
I was assuming we'd carry the two builds, "petsc-dev" and "petsc64-dev", at least in the medium term. This would follow what's in place with BLAS. It's also the practice at Cray, which offers both cray-petsc and cray-petsc-64 modules.
But it's a good question to consider. Certainly. I can anticipate it might be quite disruptive if the standard package just jumps to 64 bit. I imagine that would break things.
One question to consider is why petsc doesn't just use 64 bits in the first place on 64 bit systems. I was under the impression that a 32 bit build actually runs faster on a 64 bit system, in the sense of getting twice as much done per clock cycle, and that you only need the 64 bit build if you actually need that much address space (i.e. if your mesh carries that many degrees of freedom, DOFs). I guess we should clear up whether that's true or not. It would be regrettable to drop 32 bit if it means performance on smaller jobs is diminished.
That could be a helpful tool. We could include it in the -dev packages.
Drew
Hi, the Debian project is discussing whether we should start providing a 64 bit build of PETSc (which means we'd have to upgrade our entire computational library stack, starting from BLAS and going through MPI, MUMPS, etc).
A default PETSc build uses 32 bit addressing to index vectors and matrices. 64 bit addressing can be switched on by configuring with --with-64-bit-indices=1, allowing much larger systems to be handled.
My question for petsc-maint is: is there a reason why 64 bit indexing is not already activated by default on 64-bit systems? Certainly C pointers and type int would already be 64 bit on these systems.
Is it a question of performance? Is 32 bit indexing executed faster (in the sense of 2 operations per clock cycle), such that 64-bit addressing is accompanied by a drop in performance? In that case we'd only want to use 64-bit PETSc if the system being modelled is large enough to actually need it. Or is there a different reason that 64 bit indexing is not switched on by default?
Drew
Drew Parsons <dparsons@debian.org> writes:
You don't need to change BLAS or MPI.
Umm, x86-64 Linux is LP64, so int is 32-bit. ILP64 is relatively exotic these days.
Sparse iterative solvers are entirely limited by memory bandwidth; sizeof(double) + sizeof(int64_t) = 16 incurs a performance hit relative to 12 for int32_t. It has nothing to do with clock cycles for instructions, just memory bandwidth (and usage, but that is less often an issue).
It's just about performance, as above. There are two situations in which 64-bit is needed. Historically (supercomputing with thinner nodes), it has been that you're solving problems with more than 2B dofs. In today's age of fat nodes, it also happens that a matrix on a single MPI rank has more than 2B nonzeros. This is especially common when using direct solvers. We'd like to address the latter case by only promoting the row offsets (thereby avoiding the memory hit of promoting column indices): https://gitlab.com/petsc/petsc/-/issues/333
I wonder if you are aware of any static analysis tools that can flag implicit conversions of this sort:
int64_t n = ...; for (int32_t i=0; i<n; i++) { ... }
There is -fsanitize=signed-integer-overflow (which generates a runtime error message), but that requires data to cause overflow at every possible location.
I see, the PETSc API allows for PetscBLASInt and PetscMPIInt distinct from PetscInt. That gives us more flexibility. (In any case, the Debian BLAS maintainer is already providing blas64 packages. We've started discussions about MPI.) But what about MUMPS? Would MUMPS need to be built with 64 bit support to work with 64-bit PETSc? (The MUMPS docs indicate that its 64 bit support needs 64-bit versions of BLAS, SCOTCH, METIS and MPI.)
Oh, OK. I had assumed int was 64 bit on x86-64. Thanks for the correction.
Thanks Jed. That's good justification for us to keep our current 32-bit build then, and provide a separate 64-bit build alongside it.
An interesting extra challenge. I'll ask the Debian gcc team and the Science team if they have ideas about this.
Drew
In MUMPS's manual, it is called full 64-bit. Out of the same memory bandwidth concern, MUMPS also supports selective 64-bit, in the sense that it only uses int64_t for selected variables. One can still use it with 32-bit BLAS, MPI etc. We support selective 64-bit MUMPS starting from petsc-3.13.0.
If I remember correctly, the 'full 64-bit' mode relies on the Fortran compiler option '-i8', which is basically equivalent to ILP64, and this mode only works with ILP64 MPI, BLAS etc. from Intel-MPI/MKL. We haven't tried using MUMPS in this mode with PETSc.
Satish
Note: OpenBLAS supports 64-bit indices. MKL has a bunch of packages built as ILP64 [MPICH/OpenMPI, as far as I know, are LP64]. The primary reason PETSc defaults to 32-bit indices is that this is the compiler default on LP64 systems. If Debian is building an ILP64 system [with compilers defaulting to 64-bit integers], that would mean all packages would be ILP64 [obviously most packages are not tested in this mode, so they might break].
Thanks Junchao. Sounds like we can get started on providing 64-bit MUMPS and PETSc without needing to wait for MPI then. That's good timing with 3.13. Drew
If I understand correctly, the Debian systems are LP64 (so gcc defaults to int=int32_t). Our user who started these discussions with Bug#953116 reports that --with-64-bit-indices is working fine for his local build. But he may not have tested using MUMPS in 64-bit PETSc. This will be the interesting test. I'll start with the 64-bit build of MUMPS and see how tests hold up. Drew
Hi Jed, Thomas Schiex from Debian Science has replied to this question, suggesting clang-static-analyzer or lgtm:
For open source projects, a few online static analyzers are available and usable for free. This kind of integer type mismatch will be caught by most of them. Possibly clang-static-analyzer will do the job. Otherwise, an easy one is lgtm, for example. See https://lgtm.com/ (I have no link with them except as an open source software developer using their services for free). There are other tools (mostly geared towards security) available for free for open source software, but I just forgot their names. Any web search tool should help you here.
Thomas
Drew Parsons <dparsons@debian.org> writes:
I had tried this first, but I think it requires significant work to implement.
This looks interesting, but it isn't obvious how to implement this sort of check in their language. They have a bunch of examples, but they seem simpler.
... The PETSc mumps tests seem to be robust with respect to 64 bit. (64 bit MUMPS in the form of -DPORD_INTSIZE64, not all-integer -DINTSIZE64) That is, 32 bit PETSc passes its tests with 64 bit (PORD) MUMPS and 64 bit PETSc passes its tests with 32 bit MUMPS. The test in question that's passing is src/snes/tutorials/ex19, run with 'make runex19_fieldsplit_mumps' Perhaps it's not stress-testing 64 bit conditions. Drew
Could you provide more details, e.g., the error stack trace?
Hi Junchao, PETSc's mumps test runs fine, there is no error to trace as
such, just a diff with the reference output.
With 32-bit PETSc and 64-bit [PORD] MUMPS,
$ mpirun -n 2 ./ex19 -pc_type fieldsplit -pc_fieldsplit_block_size 4
-pc_fieldsplit_type SCHUR -pc_fieldsplit_0_fields 0,1,2
-pc_fieldsplit_1_fields 3 -fieldsplit_0_pc_type lu -fieldsplit_1_pc_type
lu -snes_monitor_short -ksp_monitor_short
-fieldsplit_0_pc_factor_mat_solver_type mumps
-fieldsplit_1_pc_factor_mat_solver_type mumps
returns the result:
lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
0 SNES Function norm 0.239155
0 KSP Residual norm 0.235858
1 KSP Residual norm < 1.e-11
1 SNES Function norm 6.81968e-05
0 KSP Residual norm 2.30906e-05
1 KSP Residual norm < 1.e-11
2 SNES Function norm < 1.e-11
Number of SNES iterations = 2
where output/ex19_fieldsplit_5.out has
lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
0 SNES Function norm 0.239155
0 KSP Residual norm 0.239155
1 KSP Residual norm < 1.e-11
1 SNES Function norm 6.81968e-05
0 KSP Residual norm 6.81968e-05
1 KSP Residual norm < 1.e-11
2 SNES Function norm < 1.e-11
Number of SNES iterations = 2
So the diff in this case is
$ make runex19_fieldsplit_mumps
3c3
< 0 KSP Residual norm 0.239155
---
> 0 KSP Residual norm 0.235858
6c6
< 0 KSP Residual norm 6.81968e-05
---
> 0 KSP Residual norm 2.30906e-05
clone 961185 -1
clone 953116 -2
thanks
MUMPS can enable 64-bit ordering (PORD) and 64-bit PETSc can use that, while other integers remain default 32 bit (in BLAS, MPI etc). This is the fast&easy® 64-bit build. Full 64-bit requires 64-bit MPI. The cloned versions of these bugs will continue to track progress towards 64-bit MPI (and other packages).