posted Saturday, February 17, 2007 2:53 PM by dongarra

• FT-MPI

FT-MPI is a full implementation of the MPI 1.2 specification that provides
process-level fault tolerance at the MPI API level. FT-MPI has been developed
within the HARNESS (Heterogeneous Adaptive Reconfigurable Networked SyStem)
project with the goal of giving the end user a communication library
with an MPI API that benefits from the fault tolerance already
found in the HARNESS system.

Current Status

 Currently, FT-MPI has been compiled under Cygwin, Windows Subsystem
 for UNIX Applications (SUA), and native Windows. The daemons cannot
 yet be started automatically, because the only supported method (SSH)
 is not natively available in the Windows environment. Once the daemons
 are started manually, however, we have been able to spawn as many
 applications as necessary. Because the daemons are started manually,
 security is provided by the Windows user log-on. At present only
 BSD-like TCP is supported, i.e. communication through read and write;
 there is no support yet for calling WinSock2 functions directly.
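
 To illustrate what "BSD-like TCP using read and write" usually means in
 practice, here is a minimal sketch in C of the loops that move a whole
 buffer through a descriptor with plain read() and write(). The helper
 names write_all and read_all are hypothetical and are not taken from the
 FT-MPI sources; the sketch assumes a POSIX-style environment such as
 Cygwin or SUA.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from FT-MPI): write exactly len bytes to fd,
 * retrying after short writes and EINTR. Returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper (not from FT-MPI): read exactly len bytes from fd.
 * Returns 0 on success, -1 on error or if the peer closes early. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n == 0)
            return -1;             /* end of stream before len bytes arrived */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

 Both helpers take any descriptor, so they can be tried on a pipe or a
 connected TCP socket. The loops matter because a single read or write
 on a stream may transfer fewer bytes than requested.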

Future Work

 Most of our future work will focus on Open MPI. We plan to
 tighten the security for starting applications and to provide
 full support for the XML format used by the Windows batch
 scheduler, memory and processor affinity, support for the Windows
 registry, and completely dynamic MPI libraries and internal modules.
 Moreover, we know that performance can be improved by at
 least another 20%.

Additional information about FT-MPI can be found on the website:
http://icl.cs.utk.edu/ftmpi/.

• Open MPI

 Open MPI is an open source implementation of both the MPI-1 and MPI-2
 documents and combines technologies and resources from several other
 projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI) in order to build
 the best MPI library available.

Current Status

 Like FT-MPI, Open MPI has been compiled under Cygwin,
 Windows Subsystem for UNIX Applications (SUA), and native Windows.
 Native Windows is the most used and most tested build path.
 We provide solution and project files for Visual C Express
 that build Open MPI as either a static or a dynamic library.
 Support for C++, Fortran 77, and Fortran 90 is built automatically.
 We can also start daemons locally using Windows functionality
 (spawn and/or CreateProcess), and we can start jobs on the cluster with CCS
 (using submit). So far the only available communication framework
 runs on top of WinSock2, but work on Direct Socket is in progress. The
 Visual C compiler (VC) is used as the backend for mpicc, which lets us
 compile user applications in a normal environment.
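
 As a rough illustration of a wrapper compiler with VC as its backend, the
 sketch below builds a cl command line from the arguments given to the
 wrapper. It is not Open MPI's actual mpicc: the function name
 build_cl_command, the install paths, and the library name libmpi.lib are
 all assumptions made for the example. Only the cl options /I, /link, and
 /LIBPATH are real compiler and linker switches.

```c
#include <stdio.h>

/* Hypothetical install layout; these are not Open MPI's real defaults. */
#define OMPI_INCLUDE_DIR "C:\\ompi\\include"
#define OMPI_LIB_DIR     "C:\\ompi\\lib"
#define OMPI_LIBRARY     "libmpi.lib"

/* Hypothetical helper: build
 *   cl /I"<include dir>" <user args...> /link /LIBPATH:"<lib dir>" <library>
 * into out. argv[0] is the wrapper's own name and is skipped.
 * Returns 0 on success, -1 if out is too small. */
static int build_cl_command(int argc, char **argv, char *out, size_t outlen)
{
    size_t used;
    int n = snprintf(out, outlen, "cl /I\"%s\"", OMPI_INCLUDE_DIR);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    used = (size_t)n;

    /* Pass the user's source files and flags through unchanged. */
    for (int i = 1; i < argc; i++) {
        n = snprintf(out + used, outlen - used, " %s", argv[i]);
        if (n < 0 || (size_t)n >= outlen - used)
            return -1;
        used += (size_t)n;
    }

    /* Everything after /link is handed to the linker. */
    n = snprintf(out + used, outlen - used, " /link /LIBPATH:\"%s\" %s",
                 OMPI_LIB_DIR, OMPI_LIBRARY);
    if (n < 0 || (size_t)n >= outlen - used)
        return -1;
    return 0;
}
```

 A real wrapper would then run the resulting string, for example through
 CreateProcess, and would quote arguments that contain spaces. The sketch
 leaves both out to stay short.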

 Integration with the parallel debugger is in progress, but the lack of
 comprehensive documentation makes this task difficult. We face the same problem
 in accessing the high-performance socket interface: the sparse documentation
 available on MSDN or the Web does not provide enough insight for a smooth transition.

 Performance results compared with Microsoft MPI show that Open MPI
 was roughly 10% faster over both shared memory and TCP. No application
 benchmark has yet been run to compare the two MPI implementations further.

Future Work

 Once support for Direct Socket is complete, we will benchmark again;
 we expect a larger performance gap between the two MPI libraries.
 We still need to define the behavior of MPI when a failure occurs
 at the process level.

For more information about Open MPI, visit the website at
http://icl.cs.utk.edu/open-mpi/.