1. Installation of OmpSs-2¶
The first step is choosing a directory where you will install OmpSs-2.
In this document this directory will be referred to as the INSTALLATION_PREFIX
directory.
We recommend setting an environment variable INSTALLATION_PREFIX
to the desired installation directory.
For instance:
$ export INSTALLATION_PREFIX=$HOME/installation/ompss-2
1.1. Installation of ovni (optional)¶
The ovni project implements a fast instrumentation library that records small events during the execution of programs to later investigate how the execution happened using Paraver.
The Nanos6 runtime system can be configured with ovni to extract execution traces. Building Nanos6 with ovni is optional but highly recommended. This page is just a quick summary of the ovni installation. For more detailed information, check the ovni GitHub repository and the ovni online documentation.
Get an ovni release tarball from https://github.com/bsc-pm/ovni.
Unpack the tarball and enter the newly created directory:
$ tar -xvf ovni-xxx.tar.gz
$ cd ovni-xxx
Export a target installation directory for ovni:
$ export OVNI_PREFIX=$HOME/installation/ovni
Create and move to a build directory:
$ mkdir build
$ cd build
Configure it with CMake:
$ cmake .. -DCMAKE_INSTALL_PREFIX=$OVNI_PREFIX
Build and install:
$ make
$ make install
Note
ovni may need other packages as dependencies (e.g., an MPI installation) and accepts several options through the CMake command. Please check the ovni online documentation for more details.
1.2. Installation of Nanos6¶
Nanos6 is a runtime that implements the OmpSs-2 parallel programming model, developed by the Programming Models group at the Barcelona Supercomputing Center.
Nanos6 can be obtained from the GitHub public repository or by contacting us at pm-tools@bsc.es.
1.2.1. Build requirements¶
To install Nanos6 the following tools and libraries must be installed:
1.2.2. Optional libraries and tools¶
In addition to the build requirements, the following libraries and tools enable additional features:
Extrae to generate execution traces for offline performance analysis with Paraver
CUDA to enable CUDA tasks
PGI or NVIDIA HPC-SDK to enable OpenACC tasks
PAPI to generate real-time statistics of hardware counters
PQOS to generate real-time statistics of hardware counters
DLB to enable dynamic management and sharing of computing resources
jemalloc to use jemalloc as the default memory allocator, providing better performance than the default glibc implementation. The jemalloc software must be compiled with --enable-stats and --with-jemalloc-prefix=nanos6_je_ to link with the runtime
ovni (>= 1.5.0) to generate execution traces for performance analysis with Paraver
1.2.3. Build procedure¶
Nanos6 uses the standard GNU automake and libtool toolchain. When cloning from a repository, the build environment must be prepared with the command below. When the code is distributed through a tarball, that command is usually not needed:
$ ./autogen.sh
Note
The autogen.sh script was recently introduced. Please use this script instead of the autoreconf command; otherwise the autotools configuration will fail.
Then execute the following commands:
$ ./configure --prefix=$INSTALLATION_PREFIX ...other options...
$ make
$ make install
where $INSTALLATION_PREFIX
is the directory into which to install Nanos6.
The configure script accepts the following options:
--with-nanos6-clang=prefix to specify the prefix of the LLVM/Clang installation that supports OmpSs-2
--with-nanos6-mercurium=prefix to specify the prefix of the Mercurium installation
--with-boost=prefix to specify the prefix of the Boost installation
--with-papi=prefix to specify the prefix of the PAPI installation
--with-libnuma=prefix to specify the prefix of the numactl installation
--with-extrae=prefix to specify the prefix of the Extrae installation
--with-dlb=prefix to specify the prefix of the DLB installation
--with-pqos=prefix to specify the prefix of the PQoS installation
--with-cuda[=prefix] to enable support for CUDA tasks
--enable-openacc to enable support for OpenACC tasks; requires PGI compilers
--with-pgi=prefix to specify the prefix of the PGI compilers installation, in case they are not in $PATH
--enable-monitoring to enable monitoring and predictions of task/CPU/thread statistics
--enable-chrono-arch to enable an architecture-based timer for the monitoring infrastructure
--with-ovni=prefix to specify the prefix of the ovni installation and enable the ovni instrumentation
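As an illustration, a configure invocation that enables the ovni instrumentation and CUDA tasks could look like the following sketch (the $OVNI_PREFIX variable is the one exported earlier; adjust the options to your needs):

```shell
$ ./configure --prefix=$INSTALLATION_PREFIX \
    --with-ovni=$OVNI_PREFIX \
    --with-cuda
$ make
$ make install
```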
The hwloc dependency is mandatory and, by default, an internal hwloc is embedded into the Nanos6 library. This behavior can be modified through the --with-hwloc option, which can take the following values:
--with-hwloc=embedded: The hwloc is built and embedded into the Nanos6 library as an internal module. This is useful when user programs may have third-party software (e.g., MPI libraries) that depends on a different hwloc version and may conflict with the one used by Nanos6. In this way, the hwloc library is internal and is only used by Nanos6. This is the default behavior if the option is not present or no value is provided. See Embedding software dependencies for more information
--with-hwloc=pkgconfig: The hwloc is an external installation and Nanos6 should discover it through the pkg-config tool. Make sure to set PKG_CONFIG_PATH if hwloc is installed in a non-standard directory
--with-hwloc=<prefix>: A prefix of an external hwloc installation
The jemalloc dependency is optional but highly recommended. This allocator significantly improves the performance of the Nanos6 runtime by optimizing memory allocations. By default, an internal jemalloc is embedded into the Nanos6 library. This behavior can be modified through the --with-jemalloc option, which can take the following values:
--with-jemalloc=embedded: The jemalloc is built and embedded into Nanos6 as an internal library. The build process installs the jemalloc headers and libraries in $INSTALLATION_PREFIX/deps/nanos6/jemalloc and dynamically links the runtime against the jemalloc library. This is the default behavior if the option is not provided. See Embedding software dependencies for more information
--with-jemalloc=<prefix>: A prefix of an external jemalloc installation configured with the --enable-stats and --with-jemalloc-prefix=nanos6_je_ options
--with-jemalloc=no or --without-jemalloc: Disable the jemalloc allocator (not recommended)
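If you choose an external jemalloc instead, it must be built with those exact options. A minimal sketch, assuming a jemalloc source tree and a JEMALLOC_PREFIX variable of your own choosing:

```shell
# JEMALLOC_PREFIX is a placeholder for your jemalloc installation directory.
$ export JEMALLOC_PREFIX=$HOME/installation/jemalloc
$ cd jemalloc-<version>
$ ./configure --prefix=$JEMALLOC_PREFIX \
    --enable-stats \
    --with-jemalloc-prefix=nanos6_je_
$ make
$ make install
```

Afterwards, pass --with-jemalloc=$JEMALLOC_PREFIX to the Nanos6 configure script.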
The location of an external hwloc can be retrieved through pkg-config when specifying --with-hwloc=pkgconfig. If it is installed in a non-standard location, pkg-config can be told where to find it through the PKG_CONFIG_PATH environment variable. For instance:
$ export PKG_CONFIG_PATH=/apps/HWLOC/2.0.0/INTEL/lib/pkgconfig:$PKG_CONFIG_PATH
The --with-cuda flag is needed to enable CUDA tasks. The location of CUDA can be retrieved automatically if it is in standard system locations (/usr/lib, /usr/include, etc.) or through pkg-config. Alternatively, for non-standard installation paths, it can be specified using the optional =prefix part of the parameter.
The --enable-openacc flag is needed to enable OpenACC tasks. The location of the PGI compilers is retrieved from the $PATH variable if it is not specified through the --with-pgi parameter.
Optionally, if you specified a valid LLVM/Clang or Mercurium installation, you can execute the Nanos6 tests by running:
$ make check
1.2.3.1. Embedding software dependencies¶
As mentioned above, some software dependencies may be embedded into Nanos6. This is the case for hwloc and jemalloc, which are embedded by default. The sources of these embedded dependencies are taken from the deps sub-directory of the repository, which contains default hwloc and jemalloc source tarballs. These tarballs are automatically extracted into deps/hwloc and deps/jemalloc by the autogen.sh script.

These are the source packages that are then built when choosing --with-hwloc=embedded or --with-jemalloc=embedded. You may change the embedded software version by placing the desired tarball inside the deps folder and re-running autogen.sh with the option --embed-<SOFTWARE> <VERSION>. Currently, <SOFTWARE> can be hwloc or jemalloc, and <VERSION> should be the desired version number of that software. For instance, a valid option could be:
$ ./autogen.sh --embed-jemalloc 5.3.0
For the moment, the tarballs must follow the format deps/<SOFTWARE>-<VERSION>.tar.gz.
1.3. Installation of nOS-V¶
The nOS-V runtime is a lightweight tasking library with co-execution capabilities, designed to be used as the core of other, more complex parallel runtimes, and it implements the nOS-V tasking API.
nOS-V can be obtained from the GitHub public repository or by contacting us at pm-tools@bsc.es.
1.3.1. Build requirements¶
The following software is required to build and install nOS-V:
automake, autoconf, libtool, make and a C11 compiler
1.3.2. Optional libraries and tools¶
The following software is required to enable additional optional features:
1.3.3. Build procedure¶
nOS-V uses the standard GNU automake and libtool toolchain. When cloning from a repository, the build environment must be prepared with the command below. When the code is distributed through a tarball, that command is usually not needed:
$ autoreconf -fiv
Then execute the following commands:
$ ./configure --prefix=$INSTALLATION_PREFIX ...other options...
$ make
$ make install
where $INSTALLATION_PREFIX
is the directory into which to install nOS-V.
The configure script accepts the following options:
--with-ovni=prefix to specify the prefix of the ovni installation and enable the ovni instrumentation
--with-libnuma=prefix to specify the prefix of the libnuma installation
--with-papi=prefix to specify the prefix of the PAPI installation and enable monitoring features
CACHELINE_WIDTH=width to specify the length in bytes of a cache line in the target architecture
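For example, a nOS-V configuration that enables the ovni instrumentation and PAPI monitoring might look like this sketch (the prefixes are placeholders, and 64 bytes is merely a common cache-line width on x86-64; check your target architecture):

```shell
$ ./configure --prefix=$INSTALLATION_PREFIX \
    --with-ovni=$OVNI_PREFIX \
    --with-papi=/usr \
    CACHELINE_WIDTH=64
$ make
$ make install
```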
After building, nOS-V tests can be executed using:
$ make check
1.4. Installation of NODES¶
The NODES (nOS-V based OmpSs-2 DEpendency System) runtime is a library designed to work on top of the nOS-V runtime. It includes most of the functionalities from its predecessor, Nanos6, whilst leaving the interaction with the system to nOS-V. NODES implements the OmpSs-2 parallel programming model, developed by the Programming Models group at the Barcelona Supercomputing Center.
NODES can be obtained from the GitHub public repository or by contacting us at pm-tools@bsc.es.
1.4.1. Build requirements¶
The following software is required to build and install NODES:
1.4.2. Optional libraries and tools¶
The following software is required to enable additional optional features:
1.4.3. Build procedure¶
NODES uses the standard GNU automake and libtool toolchain. When cloning from a repository, the build environment must be prepared with the command below. When the code is distributed through a tarball, that command is usually not needed:
$ autoreconf -fiv
Then execute the following commands:
$ ./configure --prefix=$INSTALLATION_PREFIX ...other options...
$ make
$ make install
where $INSTALLATION_PREFIX
is the directory into which to install NODES.
The configure script accepts the following options:
--with-nosv=prefix to specify the prefix of the nOS-V installation
--with-boost=prefix to specify the prefix of the Boost installation
--with-ovni=prefix to specify the prefix of the ovni installation and enable the ovni instrumentation
--with-nodes-clang=prefix to specify the prefix of the LLVM/Clang installation that supports OmpSs-2
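For example, assuming nOS-V was installed into a hypothetical $NOSV_PREFIX directory, a NODES configuration could be sketched as:

```shell
# NOSV_PREFIX is a placeholder for your nOS-V installation directory.
$ ./configure --prefix=$INSTALLATION_PREFIX \
    --with-nosv=$NOSV_PREFIX \
    --with-nodes-clang=$INSTALLATION_PREFIX \
    --with-ovni=$OVNI_PREFIX
$ make
$ make install
```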
Optionally, passing a valid LLVM/Clang installation when configuring enables executing the NODES tests by running:
$ make check
1.5. Installation of LLVM-based compiler¶
The LLVM website lists the build requirements of LLVM.
You should be able to compile and install the LLVM-based compiler with the following commands:
$ cmake -S llvm -B build \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=$INSTALLATION_PREFIX \
-DLLVM_ENABLE_PROJECTS=clang \
-DLLVM_INSTALL_TOOLCHAIN_ONLY=ON \
-DCLANG_DEFAULT_OMPSS2_RUNTIME=libnanos6 \
-DCLANG_DEFAULT_NANOS6_HOME=$INSTALLATION_PREFIX \
-DCLANG_DEFAULT_NODES_HOME=$INSTALLATION_PREFIX
$ cd build
$ make
$ make install
LLVM/Clang can compile OmpSs-2 code and generate binaries for both the Nanos6 and NODES runtime systems. Support for OmpSs-2 and these runtime systems is always enabled. Deciding whether an OmpSs-2 binary targets a specific runtime is done when compiling the user application with clang.
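For instance, the runtime can be selected per compilation. The sketch below assumes the OmpSs-2 LLVM/Clang accepts the -fompss-2 flag with an optional runtime value matching the libnanos6/libnodes names described below; check the LLVM-based compiler documentation for the exact flags:

```shell
# Build an OmpSs-2 application for the default runtime:
$ clang -fompss-2 -O2 app.c -o app
# Build the same application targeting the NODES runtime:
$ clang -fompss-2=libnodes -O2 app.c -o app
```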
During the building stage, LLVM/Clang only needs to know where to find the runtime libraries. The following CMake options are available to specify the path to the runtime libraries and which is the default runtime for OmpSs-2 binaries:
CLANG_DEFAULT_NANOS6_HOME to specify the default path to the Nanos6 installation. If not provided, the default path is the installation prefix of LLVM/Clang. This Nanos6 path can be overridden when compiling OmpSs-2 applications by defining the NANOS6_HOME environment variable
CLANG_DEFAULT_NODES_HOME to specify the default path to the NODES installation. If not provided, the default path is the installation prefix of LLVM/Clang. This NODES path can be overridden when compiling OmpSs-2 applications by defining the NODES_HOME environment variable
CLANG_DEFAULT_OMPSS2_RUNTIME to specify which runtime library is targeted by default if the user does not specify a runtime when compiling an application. The accepted values are libnanos6 (default) and libnodes
More details about customizing the LLVM build can be found on the LLVM website.
1.6. Installation of Mercurium legacy compiler¶
Important
Mercurium is the OmpSs-2 legacy compiler and is no longer supported. You do not have to install it to use OmpSs-2; we recommend using the LLVM-based compiler instead. Check Installation of LLVM-based compiler and LLVM-based compiler.
You can find the build requirements, the configuration flags and the instructions to build Mercurium in the following link: https://github.com/bsc-pm/mcxx
You should be able to compile and install Mercurium with the following commands:
$ autoreconf -fiv
$ ./configure --prefix=$INSTALLATION_PREFIX --enable-ompss-2 --with-nanos6=$INSTALLATION_PREFIX
$ make
$ make install