1. Installation of OmpSs-2

The first step is choosing a directory where you will install OmpSs-2. In this document, this directory is referred to as the INSTALLATION_PREFIX directory. We recommend setting an environment variable INSTALLATION_PREFIX to the desired installation directory. For instance:

$ export INSTALLATION_PREFIX=$HOME/installation/ompss-2
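If you want to prepare the prefix up front, the directory can be created and checked before any build step (a minimal sketch using the example path above):

```shell
# Create the installation prefix chosen above and verify it exists.
export INSTALLATION_PREFIX=$HOME/installation/ompss-2
mkdir -p "$INSTALLATION_PREFIX"
test -d "$INSTALLATION_PREFIX" && echo "prefix ready"
```

The rest of this document assumes INSTALLATION_PREFIX is exported in the shell where you run the configure and install commands.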

1.1. Installation of ovni (optional)

The ovni project implements a fast instrumentation library that records small events during program execution so that the execution can later be investigated with Paraver.

The Nanos6 runtime system can be configured with ovni to extract execution traces. Building Nanos6 with ovni is optional but highly recommended. This page is only a quick summary of the ovni installation. For more detailed information, check the ovni GitHub repository and the ovni online documentation.

  1. Get an ovni release tarball from https://github.com/bsc-pm/ovni.

  2. Unpack the tarball and enter the newly created directory:

    $ tar -xvf ovni-xxx.tar.gz
    $ cd ovni-xxx
    
  3. Export a target installation directory for ovni:

    $ export OVNI_PREFIX=$HOME/installation/ovni
    
  4. Create and move to a build directory:

    $ mkdir build
    $ cd build
    
  5. Configure it with CMake:

    $ cmake .. -DCMAKE_INSTALL_PREFIX=$OVNI_PREFIX
    
  6. Build and install:

    $ make
    $ make install
    

Note

ovni may need other packages as dependencies (e.g., an MPI installation) and supports several options in the CMake command. Please check the ovni online documentation for more details.

1.2. Installation of Nanos6

Nanos6 is a runtime that implements the OmpSs-2 parallel programming model, developed by the Programming Models group at the Barcelona Supercomputing Center.

Nanos6 can be obtained from the GitHub public repository or by contacting us at pm-tools@bsc.es.

1.2.1. Build requirements

To install Nanos6 the following tools and libraries must be installed:

  1. automake, autoconf, libtool, pkg-config, make and a C and C++ compiler

  2. boost >= 1.59

  3. hwloc

  4. numactl

1.2.2. Optional libraries and tools

In addition to the build requirements, the following libraries and tools enable additional features:

  1. Extrae to generate execution traces for offline performance analysis with Paraver

  2. CUDA to enable CUDA tasks

  3. PGI or NVIDIA HPC-SDK to enable OpenACC tasks

  4. PAPI to generate real-time statistics of hardware counters

  5. PQOS to generate real-time statistics of hardware counters

  6. DLB to enable dynamic management and sharing of computing resources

  7. jemalloc to use jemalloc as the default memory allocator, providing better performance than the default glibc implementation. Jemalloc software must be compiled with --enable-stats and --with-jemalloc-prefix=nanos6_je_ to link with the runtime

  8. ovni (>= 1.5.0) to generate execution traces for performance analysis with Paraver

1.2.3. Build procedure

Nanos6 uses the standard GNU automake and libtool toolchain. When cloning from the repository, the build environment must be prepared with the command below. When the code is distributed as a tarball, this command is usually not needed.

$ ./autogen.sh

Note

The autogen.sh script was recently introduced. Please use this script instead of the autoreconf command; otherwise the autotools configuration will fail.

Then execute the following commands:

$ ./configure --prefix=$INSTALLATION_PREFIX ...other options...
$ make
$ make install

where $INSTALLATION_PREFIX is the directory into which to install Nanos6.

The configure script accepts the following options:

  1. --with-nanos6-clang=prefix to specify the prefix of the LLVM/Clang installation that supports OmpSs-2

  2. --with-nanos6-mercurium=prefix to specify the prefix of the Mercurium installation

  3. --with-boost=prefix to specify the prefix of the Boost installation

  4. --with-papi=prefix to specify the prefix of the PAPI installation

  5. --with-libnuma=prefix to specify the prefix of the numactl installation

  6. --with-extrae=prefix to specify the prefix of the extrae installation

  7. --with-dlb=prefix to specify the prefix of the DLB installation

  8. --with-pqos=prefix to specify the prefix of the PQoS installation

  9. --with-cuda[=prefix] to enable support for CUDA tasks

  10. --enable-openacc to enable support for OpenACC tasks; requires PGI compilers

  11. --with-pgi=prefix to specify the prefix of the PGI compilers installation, in case they are not in $PATH

  12. --enable-monitoring to enable monitoring and predictions of task/CPU/thread statistics

  13. --enable-chrono-arch to enable an architecture-based timer for the monitoring infrastructure

  14. --with-ovni=prefix to specify the prefix of the ovni installation and enable the ovni instrumentation

The hwloc dependency is mandatory and, by default, an internal hwloc is built and embedded into the Nanos6 library. This behavior can be modified through the --with-hwloc option, which can take the following values:

  • --with-hwloc=embedded: hwloc is built and embedded into the Nanos6 library as an internal module. This is useful when user programs link third-party software (e.g., MPI libraries) that depends on a different hwloc version, which could conflict with the one used by Nanos6. In this way, the hwloc library is internal and only used by Nanos6. This is the default behavior if the option is absent or no value is provided. See Embedding software dependencies for more information

  • --with-hwloc=pkgconfig: hwloc is an external installation that Nanos6 should discover through the pkg-config tool. Make sure to set the PKG_CONFIG_PATH environment variable if hwloc is installed in a non-standard directory

  • --with-hwloc=<prefix>: A prefix of an external hwloc installation

The jemalloc dependency is optional but highly recommended. This allocator significantly improves the performance of the Nanos6 runtime by optimizing memory allocations. By default, an internal jemalloc is embedded into the Nanos6 library. This behavior can be modified through the --with-jemalloc option, which can take the following values:

  • --with-jemalloc=embedded: The jemalloc is built and embedded into Nanos6 as an internal library. The building process installs the jemalloc headers and libraries in $INSTALLATION_PREFIX/deps/nanos6/jemalloc and dynamically links our runtime against the jemalloc library. This is the default behavior if the option is not provided. See Embedding software dependencies for more information

  • --with-jemalloc=<prefix>: A prefix of an external jemalloc installation configured with the --enable-stats and --with-jemalloc-prefix=nanos6_je_ options

  • --with-jemalloc=no or --without-jemalloc: Disable the jemalloc allocator (not recommended)

The location of an external hwloc can be retrieved through pkg-config when specifying --with-hwloc=pkgconfig. If it is installed in a non-standard location, pkg-config can be told where to find it through the PKG_CONFIG_PATH environment variable. For instance:

$ export PKG_CONFIG_PATH=/apps/HWLOC/2.0.0/INTEL/lib/pkgconfig:$PKG_CONFIG_PATH

The --with-cuda flag is needed to enable CUDA tasks. The location of CUDA can be detected automatically if it is in a standard system location (/usr/lib, /usr/include, etc.) or discoverable through pkg-config. Alternatively, for non-standard installation paths, it can be specified using the optional =prefix part of the flag.

The --enable-openacc flag is needed to enable OpenACC tasks. The PGI compilers are located through the $PATH variable unless their prefix is specified with the --with-pgi parameter.
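Putting some of these options together, a full Nanos6 configure invocation enabling ovni instrumentation and an external hwloc discovered through pkg-config might look as follows (a sketch, not a definitive command; which options you need depends on your system, and OVNI_PREFIX is the variable exported in the ovni section above):

```shell
$ ./configure --prefix=$INSTALLATION_PREFIX \
      --with-ovni=$OVNI_PREFIX \
      --with-hwloc=pkgconfig \
      --with-nanos6-clang=$INSTALLATION_PREFIX
$ make
$ make install
```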

Optionally, if you specified a valid LLVM/Clang or Mercurium installation, you can execute the Nanos6 tests by running:

$ make check

1.2.3.1. Embedding software dependencies

As mentioned above, some software dependencies may be embedded into Nanos6. This is the case for hwloc and jemalloc, which are embedded by default. The sources of these embedded dependencies are taken from the deps sub-directory in this repository. Inside that sub-directory, there are default hwloc and jemalloc source tarballs. These tarballs are automatically extracted into deps/hwloc and deps/jemalloc by our autogen.sh script.

These are the source packages that are then built when choosing --with-hwloc=embedded or --with-jemalloc=embedded. You may change the embedded software version by placing the desired tarball inside the deps folder and re-running autogen.sh with the option --embed-<SOFTWARE> <VERSION>. Currently, <SOFTWARE> can be hwloc or jemalloc and <VERSION> should be the desired version number of that software. For instance, a valid option could be:

./autogen.sh --embed-jemalloc 5.3.0

For the moment, the tarballs must follow the format deps/<SOFTWARE>-<VERSION>.tar.gz.
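The expected naming can be expressed as a small shell check (a sketch; jemalloc 5.3.0 is the version from the example above):

```shell
# Tarballs placed under deps/ must be named <SOFTWARE>-<VERSION>.tar.gz.
name="jemalloc-5.3.0.tar.gz"
case "$name" in
  *-*.tar.gz) echo "valid tarball name" ;;
  *)          echo "invalid tarball name" ;;
esac
```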

1.3. Installation of nOS-V

The nOS-V runtime is a lightweight tasking library with co-execution capabilities, designed to serve as the core of other, more complex parallel runtimes; it implements the nOS-V tasking API.

nOS-V can be obtained from the GitHub public repository or by contacting us at pm-tools@bsc.es.

1.3.1. Build requirements

The following software is required to build and install nOS-V:

  1. automake, autoconf, libtool, make and a C11 compiler

  2. libnuma

1.3.2. Optional libraries and tools

The following software is required to enable additional optional features:

  1. ovni to instrument and generate execution traces for offline performance analysis with Paraver

  2. PAPI to generate real-time statistics of hardware counters

1.3.3. Build procedure

nOS-V uses the standard GNU automake and libtool toolchain. When cloning from the repository, the build environment must be prepared with the command below. When the code is distributed as a tarball, this command is usually not needed.

$ autoreconf -fiv

Then execute the following commands:

$ ./configure --prefix=$INSTALLATION_PREFIX ...other options...
$ make
$ make install

where $INSTALLATION_PREFIX is the directory into which to install nOS-V.

The configure script accepts the following options:

  1. --with-ovni=prefix to specify the prefix of the ovni installation and enable the ovni instrumentation

  2. --with-libnuma=prefix to specify the prefix of the libnuma installation

  3. --with-papi=prefix to specify the prefix of the PAPI installation and enable monitoring features

  4. CACHELINE_WIDTH=width to specify the length in bytes of a cache line in the target architecture
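On Linux, the cache line width of the build machine can usually be queried with getconf before configuring (a sketch; 64 bytes is typical for x86-64, but the value is architecture-dependent and may not be exposed on every system):

```shell
# Query the L1 data cache line size in bytes (Linux).
getconf LEVEL1_DCACHE_LINESIZE
```

The reported value can then be passed to configure, for instance as ./configure --prefix=$INSTALLATION_PREFIX CACHELINE_WIDTH=64.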

After building, nOS-V tests can be executed using:

$ make check

1.4. Installation of NODES

The NODES (nOS-V based OmpSs-2 DEpendency System) runtime is a library designed to work on top of the nOS-V runtime. It includes most of the functionalities from its predecessor, Nanos6, whilst leaving the interaction with the system to nOS-V. NODES implements the OmpSs-2 parallel programming model, developed by the Programming Models group at the Barcelona Supercomputing Center.

NODES can be obtained from the GitHub public repository or by contacting us at pm-tools@bsc.es.

1.4.1. Build requirements

The following software is required to build and install NODES:

  1. automake, autoconf, libtool, pkg-config, make and a C and C++11 compiler

  2. boost >= 1.71

  3. nOS-V

1.4.2. Optional libraries and tools

The following software is required to enable additional optional features:

  1. ovni (>= 1.5.0) to instrument and generate execution traces for offline performance analysis with Paraver

1.4.3. Build procedure

NODES uses the standard GNU automake and libtool toolchain. When cloning from the repository, the build environment must be prepared with the command below. When the code is distributed as a tarball, this command is usually not needed.

$ autoreconf -fiv

Then execute the following commands:

$ ./configure --prefix=$INSTALLATION_PREFIX ...other options...
$ make
$ make install

where $INSTALLATION_PREFIX is the directory into which to install NODES.

The configure script accepts the following options:

  1. --with-nosv=prefix to specify the prefix of the nOS-V installation

  2. --with-boost=prefix to specify the prefix of the Boost installation

  3. --with-ovni=prefix to specify the prefix of the ovni installation and enable the ovni instrumentation

  4. --with-nodes-clang=prefix to specify the prefix of the LLVM/Clang installation that supports OmpSs-2

Optionally, passing a valid LLVM/Clang installation when configuring enables executing the NODES tests by running:

$ make check

1.5. Installation of LLVM-based compiler

The LLVM website lists the build requirements of LLVM.

You should be able to compile and install the LLVM-based compiler with the following commands:

$ cmake -S llvm -B build \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_INSTALL_PREFIX=$INSTALLATION_PREFIX \
        -DLLVM_ENABLE_PROJECTS=clang \
        -DLLVM_INSTALL_TOOLCHAIN_ONLY=ON \
        -DCLANG_DEFAULT_OMPSS2_RUNTIME=libnanos6 \
        -DCLANG_DEFAULT_NANOS6_HOME=$INSTALLATION_PREFIX \
        -DCLANG_DEFAULT_NODES_HOME=$INSTALLATION_PREFIX
$ cd build
$ make
$ make install

LLVM/Clang can compile OmpSs-2 code and generate binaries for both the Nanos6 and NODES runtime systems. Support for OmpSs-2 and these runtime systems is always enabled. The runtime an OmpSs-2 binary targets is decided when compiling the user application with clang.

During the building stage, LLVM/Clang only needs to know where to find the runtime libraries. The following CMake options specify the path to the runtime libraries and the default runtime for OmpSs-2 binaries:

  • CLANG_DEFAULT_NANOS6_HOME to specify the default path to the Nanos6 installation. If not provided, the default path is the installation prefix of LLVM/Clang. This Nanos6 path can be overridden when compiling OmpSs-2 applications by defining the NANOS6_HOME environment variable

  • CLANG_DEFAULT_NODES_HOME to specify the default path to the NODES installation. If not provided, the default path is the installation prefix of LLVM/Clang. This NODES path can be overridden when compiling OmpSs-2 applications by defining the NODES_HOME environment variable

  • CLANG_DEFAULT_OMPSS2_RUNTIME to specify which runtime library is targeted by default if the user did not specify which runtime when compiling an application. The accepted values are libnanos6 (default) and libnodes
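Once the compiler is installed, the target runtime is chosen when compiling the application. As a sketch (it assumes the installed clang is in $PATH, and that the -fompss-2 flag, which enables OmpSs-2 support, accepts libnanos6 or libnodes as a value, matching the CLANG_DEFAULT_OMPSS2_RUNTIME values above):

```shell
$ clang -fompss-2=libnanos6 app.c -o app_nanos6    # target the Nanos6 runtime
$ clang -fompss-2=libnodes  app.c -o app_nodes     # target the NODES runtime
$ clang -fompss-2 app.c -o app                     # use the configured default runtime
```

At application compile time, the NANOS6_HOME and NODES_HOME environment variables override the default installation paths configured with the CMake options above.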

More details about customizing the LLVM build can be found on the LLVM website.

1.6. Installation of Mercurium legacy compiler

Important

Mercurium is the OmpSs-2 legacy compiler and is no longer supported. You do not have to install it to use OmpSs-2. We recommend using the LLVM-based compiler instead. Check Installation of LLVM-based compiler and LLVM-based compiler.

You can find the build requirements, the configuration flags and the instructions to build Mercurium at the following link: https://github.com/bsc-pm/mcxx

You should be able to compile and install Mercurium with the following commands:

$ autoreconf -fiv
$ ./configure --prefix=$INSTALLATION_PREFIX --enable-ompss-2 --with-nanos6=$INSTALLATION_PREFIX
$ make
$ make install