Skip to content
README 10.7 KiB
Newer Older
Rui Machado's avatar
Rui Machado committed
******************************************************************************
				  GPI-2
			  http://www.gpi-site.com
Rui Machado's avatar
Rui Machado committed
		  	      Version: 1.3.0
			  Copyright (C) 2013-2019
Rui Machado's avatar
Rui Machado committed
			     Fraunhofer ITWM
Rui Machado's avatar
Rui Machado committed
******************************************************************************

1. INTRODUCTION
===============

GPI-2 is the second generation of GPI (www.gpi-site.com). GPI-2
implements the GASPI specification (www.gaspi.de), an API
specification which originates from the ideas and concepts of GPI.

GPI-2 is an API for asynchronous communication. It provides a
flexible, scalable and fault tolerant interface for parallel
applications.


2. INSTALLATION
===============

Requirements:
-------------

The current version of GPI-2 has the following requirements.

Software:
Rui Machado's avatar
Rui Machado committed
- libibverbs v1.1.6 (Verbs library from OFED) if running on Infiniband.
- ssh server running on compute nodes (requiring no password).
Rui Machado's avatar
Rui Machado committed
- gawk (GNU Awk) and sed utilities.
Rui Machado's avatar
Rui Machado committed
- Infiniband/RoCE device or Ethernet device.
To install GPI-2, it needs to be configured using the script `./configure`.
The available options and the relevant environment variables are printed
by `./configure --help`.
Rui Machado's avatar
Rui Machado committed

By default the production, debugging and statistic libraries are compiled
for infiniband. Fortran modules are also build by default, as well as the
html documentation, if Fortran compilers and doxygen, respectively, are
available. Support for MPI (mixed mode) is also available if it is requested
and the libraries (compilers) are found. Thus for example, the command line:
Rui Machado's avatar
Rui Machado committed

`CC=mpiicc FC=mpiifort ./configure --prefix=$HOME/local --with-infiniband=/usr --with-mpi --with-slurm`

will do the configuration using MPI-Intel compilers, the path to infiniband `/usr`
and `slurm` as batch system, using as target directory `$HOME/local`.
The command line:
`./configure --with-ethernet`
will configure the (environment defined) CC and FC compilers for a TCP connection,
using as target directory `/usr/local`.

As it is usual, after configuration, the command line:

`make -j$NPROC`

will compile in parallel the libraries, the Fortran modules, the tests, the documentation and
the tutorial code.

After succesful completion, the user can define the working hosts in `tests/machines`
and run the predefined tests by:

`make check`

Finally

`make install`

will install the `gaspi_*` scripts (according to the batch system) in
the EPREFIX/bin directory, the libraries in EPREFIX/lib, and the headers and Fortran modules in
EPREFIX/include.
Rui Machado's avatar
Rui Machado committed

Rui Machado's avatar
Rui Machado committed
Infiniband support (default)
----------------------------

Rui Machado's avatar
Rui Machado committed
GPI-2 requires the libibverbs from the OFED stack. You can pass the
path of your OFED installation to the install script using the option
Rui Machado's avatar
Rui Machado committed
(--with-infiniband) for that, in case the install script is not able to find it:
Rui Machado's avatar
Rui Machado committed

Rui Machado's avatar
Rui Machado committed
     ./install.sh --with-infiniband<=full_path_to_ofed>
Rui Machado's avatar
Rui Machado committed

You can see the install script usage:
Rui Machado's avatar
Rui Machado committed
    ./install.sh -h

Ethernet support
----------------

To install GPI-2 on a system without Infiniband, you can use the
Rui Machado's avatar
Rui Machado committed
option --with-ethernet. With this option standard TCP sockets are
used.
Rui Machado's avatar
Rui Machado committed

Rui Machado's avatar
Rui Machado committed
Such support is primarily targetted at the development of GPI-2
applications without the need to access a system with Infiniband, with
less focus on performance.
Rui Machado's avatar
Rui Machado committed

MPI Interoperability
--------------------
Rui Machado's avatar
Rui Machado committed

If you plan to use GPI-2 with MPI to, for instance, start an
incremental port of a large application or to use some libraries that
require MPI, you can enable MPI interoperability with the option:
Rui Machado's avatar
Rui Machado committed

     ./install.sh --with-mpi<=path_to_mpi_installation>

If you don't provide a path to your MPI installation and mpirun is in
your PATH, the installation script will take that MPI installation.
Rui Machado's avatar
Rui Machado committed

This will enable you to use GPI-2 with MPI where the only constraint
is that MPI_Init() must be invoked before gaspi_proc_init(). For this
MPI+GPI2 mixed mode, it is assumed that you start the application with
mpirun (or mpiexec, etc.).
Rui Machado's avatar
Rui Machado committed

Rui Machado's avatar
Rui Machado committed
Note that this option will build GPI-2 with MPI, requiring you to link
the MPI library to your GPI-2 application (even if you are not using
MPI). If you are interested in using GPI-2 only, do not build GPI-2
with this option.

Rui Machado's avatar
Rui Machado committed
3. BUILDING GPI-2
=================

Rui Machado's avatar
Rui Machado committed
You can build GPI-2 on your own. There are the following make targets:
Rui Machado's avatar
Rui Machado committed

all	  - Build everything
gpi	  - Build the GPI-2 library (including debug version)
Rui Machado's avatar
Rui Machado committed
fortran	  - Build Fortran 90 bindings
Rui Machado's avatar
Rui Machado committed
mic	  - Build the GPI-2 library for the MIC
tests	  - Build provided tests
docs	  - Generate documentation (requires doxygen)
clean	  - Clean-up

4. BUILDING GPI-2 APPLICATIONS
==============================

GPI-2 provides two libraries: libGPI2.a and libGPI2-dbg.a.

The libGPI2.a aims at high-performance and is to be used in production
whereas the libGPI2-dbg.a provides a debug version, with extra
parameter checking and debug messages and is to be used to debug and
during development.


5. RUNNING GPI-2 APPLICATIONS
=============================

The gaspi_run utility is used to start and run GPI-2
applications. A machine file with the hostnames of nodes where the
application will run, must be provided.

For example, to start 1 process per node (on 4 nodes), the machine
file looks like:

node01
node02
node03
node04

Similarly, to start 2 processes per node (on 4 nodes):

node01
node01
node02
node02
node03
node03
node04
node04

The gaspi_run utility is invoked as follows:

	gaspi_run -m <machinefile> [OPTIONS] <path GASPI program>

Rui Machado's avatar
Rui Machado committed
IMPORTANT: The path to the program must exist on all nodes where the
program should be started.
Rui Machado's avatar
Rui Machado committed
The gaspi_run utility has the following further options [OPTIONS]:

  -b <binary file> Use a different binary for first node (master).
     	     	   The master (first entry in the machine file) is
		   started with a different application than the rest
		   of the nodes (workers).
Rui Machado's avatar
Rui Machado committed
  -N               Enable NUMA for processes on same node. With this
  		   option it is only possible to start the same number
		   of processes as NUMA nodes present on the system.
		   The processes running on same node will be set with
		   affinity to the proper NUMA node.
Rui Machado's avatar
Rui Machado committed
  -n <procs>       Start as many <procs> from machine file.
     		   This option is used to start less processes than
		   those listed in the machine file.
Rui Machado's avatar
Rui Machado committed
  -d               Run with GDB (debugger) on master node. With this
  		   option, GDB is started in the master node, to allow
		   debugging the application.
Rui Machado's avatar
Rui Machado committed

  -p               Ping hosts before starting the binary to make sure
  		   they are available.

Rui Machado's avatar
Rui Machado committed
  -h               Show help.


5. THE GASPI_LOGGER
===================

Rui Machado's avatar
Rui Machado committed
The gaspi_logger utility is used to view and separate the output from
all nodes when the function gaspi_printf is called. The gaspi_logger
is started, on another session, on the master node. The output of the
application, when using gaspi_printf, will be redirected to the
Rui Machado's avatar
Rui Machado committed
gaspi_logger. Other I/O routines (e.g. printf) will not.

Rui Machado's avatar
Rui Machado committed
A further separation of output (useful for debugging) can be achieved
by using the routine gaspi_printf_to which sends the output to the
gaspi_logger started on a particular node. For example,

  gaspi_printf_to(1, "Hello 1\n");

will display the string "Hello 1" in the gaspi_logger started on rank
Rui Machado's avatar
Rui Machado committed
1.
Rui Machado's avatar
Rui Machado committed


6. GPI-2 WITH CO-PROCESSOR (INTEL XEON PHI)
==========================================

GPI-2 can be used with the Intel Xeon Phi (MIC) where the MIC is used
as one node (running possibly more than one GPI-2 process).

To use GPI-2 on the MIC, you need to compile it using the Intel
compiler with the -mmic option. The Makefile includes a build target
mic for that end but this build target is not used by the installation
script. After successful compilation, GPI-2 for the MIC can be found
under lib64/mic. If you are having problems with the compilation, make
Rui Machado's avatar
Rui Machado committed
sure the OFED_PATH and OFED_MIC_PATH in src/make.inc is setup correctly.
Rui Machado's avatar
Rui Machado committed

Assuming that the MIC(s) is set up properly, GPI-2 requires:

- ssh connectivity (requiring no password)

- grouping of local host and MIC(s) by using the tool micctrl

  micctrl --initdefaults
  micctrl --addbridge=br0 --type=internal --ip=172.31.1.254
  micctrl --network=static --bridge=br0 --ip=172.31.1.1

 where for instance, mic0: 172.31.1.1, mic1: 172.31.1.2 etc.

The MIC must be visible to the whole cluster network. For instance,
if your cluster network is 192.168.1.0 on eth0 and you want to map
mic0 to 192.168.1.100 do:

iptables -t nat -A PREROUTING -d 192.168.1.100 -i eth0 -j DNAT --to-destination 172.31.1.1
iptables -t nat -A POSTROUTING -s 172.31.1.1 -o eth0 -j SNAT --to-source 192.168.1.100

After setup, the MIC hostname can simply be added to the machinefile
and be used as another host.

Rui Machado's avatar
Rui Machado committed
8. TROUBLESHOOTING AND KNOWN ISSUES
===================================
Rui Machado's avatar
Rui Machado committed

Rui Machado's avatar
Rui Machado committed
If you're having troubles when building GPI-2 with support for
Infiniband, make sure you have the OFED stack correctly installed and
running. You can change the OFED path in the definitions file make.inc
in the source directory (src) to the fix path in your system.
Rui Machado's avatar
Rui Machado committed

When installing GPI-2 with MPI mixed-mode support (using the option
--with-mpi<=path_to_mpi_installation>) and the installation is failing
when trying to build the tests due to missing libraries, you can
provide extra paths through available variables and be able to find
the missing paths. These are: GPI2_EXTRA_CFLAGS, GPI2_EXTRA_LIBS_PATH,
GPI2_EXTRA_LIBS.

For example:
   GPI2_EXTRA_LIBS=-lmpl ./install.sh --with-mpi -p /opt/GPI2


Environment variables
---------------------

You might have some trouble when your application requires some
dynamically set environment setting (e.g. the LD_LIBRARY_PATH), for
instance, through the module system of your jobs batch
system. Currently, neither the gaspi_run or the GPI-2 library take
care of such environment settings. To this situation there are 2
workarounds:

i) you set the required environment variables in your shell
initialization file (e.g. ~/.bashrc).

ii) you create an executable shell script which sets the required
environment variables and then starts the application. Then you can
use gaspi_run to start the application, providing the shell script as
the application to execute.

    gaspi_run -m machinefile ./my_wrapper_script.sh

where my_wrapper_script.sh contains:

      #!/bin/sh

      LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<path_to_my_lib>

      <path_to_my_application>/my_application <my_app_args>

      exit $?

If you're running in MPI mixed-mode, starting your application with
mpirun/mpiexec, this should not be an issue.


Lena Oden's avatar
Lena Oden committed
9. UP COMING FEATURES
Rui Machado's avatar
Rui Machado committed
=====================

GPI-2 is on-going work and more features are still to come. Here are
some that are in our roadmap:

- support to add spare nodes (fault tolerance)
- better debugging possibilities


Rui Machado's avatar
Rui Machado committed
10. MORE INFORMATION
====================
Rui Machado's avatar
Rui Machado committed

For more information, check the GPI-2 website ( www.gpi-site.com ) and
don't forget to subscribe to the GPI-2 mailing list. You subscribe it
at https://listserv.itwm.fraunhofer.de/mailman/listinfo/gpi2-users