#325936 lam4-dev: simple example fails on Alpha

Package:
lam4-dev
Source:
lam
Description:
Development of parallel programs using LAM
Submitter:
Thimo Neubauer
Date:
2010-01-21 03:51:03 UTC
Severity:
important
#325936#5
Date:
2005-08-31 21:37:06 UTC
I've tried the following trivial program:

#include <mpi.h>

int
main (int argc, char **argv)
{
  MPI_Init (&argc, &argv);
  MPI_Finalize ();
}

Compile it:

riff /tmp> mpic++.lam lamtest.cc -o lamtest
/usr/lib/lam/include/mpi2cxx/functions_inln.h: In function 'void PMPI::Pcontrol(int, ...)':
/usr/lib/lam/include/mpi2cxx/functions_inln.h:249: warning: cannot pass objects of non-POD type 'struct va_list' through '...'; call will abort at runtime

OK, that's a pretty weird warning, but IIRC older LAM versions emitted
it as well.
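
For reference, the warning text itself says that a va_list object is
being handed to a "..." parameter. The header's source is not shown
here, so the following is only a minimal sketch of the usual way to
avoid that pattern: capture the arguments once with va_start and pass
the va_list on as a named parameter. Both function names are
hypothetical and not part of LAM.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical helper (not LAM code): takes the already captured
// argument list as a named parameter, the way vprintf() does.
std::string vformat_message(const char *fmt, va_list ap)
{
    assert(fmt != nullptr);
    char buf[256];
    // vsnprintf truncates safely if the result does not fit in buf.
    std::vsnprintf(buf, sizeof buf, fmt, ap);
    return std::string(buf);
}

// Hypothetical variadic front end: captures its own arguments and
// forwards the va_list by name, never through another "...".
std::string format_message(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    std::string out = vformat_message(fmt, ap);
    va_end(ap);
    return out;
}
```

Calling format_message("%d-%s", 42, "ok") goes through the v-variant
and never puts a va_list into a variadic slot.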

Running without a lamd fails as expected:

riff /tmp> ./lamtest
-----------------------------------------------------------------------------

It seems that there is no lamd running on the host riff.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for MPI programs to run
(the MPI program tired to invoke the "MPI_Init" function).

Please run the "lamboot" command the start the LAM/MPI runtime
environment.  See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
Exit 215

So let's do it correctly:

riff /tmp> lamboot

LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University

riff /tmp> ./lam
lam-thimo@riff/  lamtest*
riff /tmp> ./lamtest
[hangs]

Calling "lamnodes" in another shell hangs too; it does not hang if the
test program hasn't been started.

I've attached a full strace of lamtest. If I read it correctly, the
program opens a socket,

chdir("/tmp/lam-thimo@riff")            = 0
socket(PF_FILE, SOCK_STREAM, 0)         = 3
connect(3, {sa_family=AF_FILE, path="lam-kernel-socket"}, 19) = 0
chdir("/tmp")

communicates for quite a while until

writev(3, [{"\r\0\0@F\270W?\0\0\0\0\17\0\0@\0\0\0\0\0\0\0\0\0\0\0\0"..., 72}, {ptrace: umoven: Input/output error

and after that file descriptor 3 seems to be broken.
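
To check the connection step in isolation, something like the
following sketch could be used. It only mirrors the socket() and
connect() calls from the trace; the function name is hypothetical, and
it does not speak the LAM protocol, so it cannot reproduce the writev
failure itself. The address length is computed as the family field
plus the path without its trailing NUL, which would be consistent with
the 19 in the trace if the family field is 2 bytes.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

// Hypothetical probe (not LAM code): connect to the Unix domain socket
// at `path`. Returns the connected descriptor, or -1 with errno set.
int connect_unix_socket(const std::string &path)
{
    // An empty path would not name any socket file.
    assert(!path.empty());

    sockaddr_un addr;
    std::memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    if (path.size() >= sizeof addr.sun_path) {
        errno = ENAMETOOLONG;
        return -1;
    }
    std::memcpy(addr.sun_path, path.c_str(), path.size() + 1);

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    // Family field plus the path bytes, without the trailing NUL.
    socklen_t len = static_cast<socklen_t>(
        offsetof(sockaddr_un, sun_path) + path.size());
    if (connect(fd, reinterpret_cast<sockaddr *>(&addr), len) < 0) {
        int saved = errno;   // close() may overwrite errno
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Run from /tmp/lam-thimo@riff, connect_unix_socket("lam-kernel-socket")
should return a valid descriptor while lamd is up; the caller is
responsible for closing it.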

Cheers
   Thimo

#325936#10
Date:
2005-08-31 21:56:00 UTC
Oops... forgot the attachment...

Cheers
  Thimo

#325936#15
Date:
2010-01-21 03:26:02 UTC
Hi,

On Lenny, LAM seems to work OK on Alpha.  The simple MPI program runs as
expected, as does lamnodes.

Cheers,
Craig

maint@nevrast:~$ cat lamtest.cc
#include <mpi.h>

int
main (int argc, char **argv)
{
   MPI_Init (&argc, &argv);
   MPI_Finalize ();
}
maint@nevrast:~$ mpic++.lam lamtest.cc -o lamtest
In file included from /usr/lib/lam/include/mpicxx.h:152,
                  from /usr/lib/lam/include/mpi.h:1128,
                  from lamtest.cc:1:
/usr/lib/lam/include/mpi2cxx/functions_inln.h: In function 'void PMPI::Pcontrol(int, ...)':
/usr/lib/lam/include/mpi2cxx/functions_inln.h:249: warning: cannot pass objects of non-POD type 'struct va_list' through '...'; call will abort at runtime
maint@nevrast:~$ lamboot

LAM 7.1.2/MPI 2 C++/ROMIO - Indiana University

maint@nevrast:~$ lamnodes
n0	localhost:1:origin,this_node
maint@nevrast:~$ ./lamtest
maint@nevrast:~$ echo $?
0