4. NODES Runtime Options
This section describes how to run OmpSs-2 applications using NODES, and which runtime options are available.
4.1. Executing and controlling the number of CPUs
NODES applications can be compiled and executed as follows:
# Compile OmpSs-2 program with LLVM/Clang
$ clang -fompss-2=libnodes app.c -o app
# Execute on all available cores of the current session
$ ./app
By default, the application uses every core available to the current session. To restrict it to a subset of cores, launch it through the taskset command.
For instance:
# Execute on cores 0, 1, 2 and 4
$ taskset -c 0-2,4 ./app
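To check which cores a given taskset invocation actually grants, one option is to inspect the affinity of a child process before launching the real application. The sketch below is Linux-specific and assumes that CPU 0 is available to the current session. It reads the kernel-reported Cpus_allowed_list field and does not depend on NODES.

```shell
# Show the CPU list the current session may run on (what ./app inherits by default)
grep Cpus_allowed_list /proc/self/status

# Same query under taskset: the child process is restricted to CPU 0 only
taskset -c 0 grep Cpus_allowed_list /proc/self/status
```

The second command should report only CPU 0. Replacing the grep command with ./app applies the same restriction to the application.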
4.2. Configuration
NODES offers configuration options that enable and tune optional features. These options are specified through its configuration file (nodes.toml) using TOML 1.0 syntax. Default values for all configuration options are provided in $PREFIX/share/nodes.toml, which can be modified directly or copied to create user-defined configuration files.
Additionally, individual configuration options can be overridden through the use of the NODES_CONFIG_OVERRIDE environment variable. Multiple comma-separated options can be specified using this method.
While initializing, NODES searches for and parses a single configuration file. The file is looked up in the following order of preference:
The file specified in the NODES_CONFIG environment variable.
A file named nodes.toml present in the current directory.
The default configuration file located in $PREFIX/share/nodes.toml.
After a configuration file has been loaded, individual overrides from NODES_CONFIG_OVERRIDE will be applied.
Consider the following example where we specify the ~/nodes-custom.toml file and then override a configuration variable:
NODES_CONFIG="$HOME/nodes-custom.toml" NODES_CONFIG_OVERRIDE="ovni.enabled=true" ./application
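The discovery order described above can be sketched as a small shell helper that prints the file one would expect NODES to load. This is an illustration of the documented order and not NODES code: the function name nodes_config_path and the /usr/local fallback for PREFIX are assumptions, and the sketch does not model what NODES does when NODES_CONFIG points to a missing file.

```shell
# Hypothetical helper: prints the configuration file expected to be picked up,
# following the documented order of preference.
nodes_config_path() {
    if [ -n "${NODES_CONFIG:-}" ]; then
        # 1. Explicit path from the environment
        printf '%s\n' "$NODES_CONFIG"
    elif [ -f ./nodes.toml ]; then
        # 2. nodes.toml in the current directory
        printf '%s\n' "$PWD/nodes.toml"
    else
        # 3. Installed default (the /usr/local fallback is an assumption)
        printf '%s\n' "${PREFIX:-/usr/local}/share/nodes.toml"
    fi
}
```

Running nodes_config_path from the directory where the application will be launched shows which file would be used. Overrides from NODES_CONFIG_OVERRIDE are applied afterwards and are not modeled here.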
4.3. Environment Variables
NODES_CONFIG="$CUSTOM_PATH": Allows users to define the path of a specific configuration file.
NODES_CONFIG_OVERRIDE="$OPTIONS": Accepts a comma-separated list of options which will override those defined in the configuration file.
NODES_OVNI="1/0": (Deprecated) Enable or disable ovni instrumentation. Please use the nodes.toml configuration file instead.
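As an illustration of the comma-separated override syntax, the sketch below sets two options at once. Both option names appear elsewhere in this section (ovni.enabled and runtime.enable_inner_parallelism). Treating the latter as a boolean is an assumption, and ./application is a placeholder binary.

```shell
# Two overrides in one variable, separated by a comma.
# The value "true" for runtime.enable_inner_parallelism is assumed, not documented here.
export NODES_CONFIG_OVERRIDE="ovni.enabled=true,runtime.enable_inner_parallelism=true"

# ./application   # placeholder: uncomment to run with both overrides applied
```

Exporting the variable applies the overrides to every subsequent run in the session. Prefixing a single command with the assignment limits them to that run.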
4.4. Generating ovni traces
NODES can generate execution traces with the ovni library, which produces lightweight binary traces. ovni-instrumented libraries can be mixed with an OmpSs-2 program to obtain a single coherent trace.
To enable the generation of ovni traces, NODES must be configured with the
--with-ovni option. Once NODES has been built with ovni support, it is up to
the user to enable it, as it is disabled by default. To enable ovni instrumentation,
set the ovni.enabled option to true, either in the configuration file or through
NODES_CONFIG_OVERRIDE="ovni.enabled=true". The deprecated NODES_OVNI environment
variable can also be used: export NODES_OVNI=1.
The trace is written to the ovni/ directory and can be converted into a
Paraver trace with the ovniemu utility. The Paraver configuration files
(views) can be found in the ovni/cfg directory.
See the ovni documentation for more details.
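The conversion step can be wrapped in a small helper that checks its preconditions first. This is a minimal sketch: the function name convert_trace is hypothetical, and it assumes that ovniemu is on PATH and accepts the trace directory as its argument.

```shell
# Hypothetical helper: convert an ovni trace directory into a Paraver trace.
# Defaults to the "ovni" directory mentioned above.
convert_trace() {
    trace_dir="${1:-ovni}"
    if [ ! -d "$trace_dir" ]; then
        echo "no trace directory at $trace_dir" >&2
        return 1
    fi
    if ! command -v ovniemu >/dev/null 2>&1; then
        echo "ovniemu not found in PATH" >&2
        return 1
    fi
    # Assumed invocation: ovniemu takes the trace directory as its argument
    ovniemu "$trace_dir"
}
```

A typical sequence would be to run the application with ovni instrumentation enabled and then call convert_trace from the same directory.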
4.5. Advanced Mechanisms
NODES also supports several advanced mechanisms and prototypes. In this section, we briefly introduce them.
4.5.1. Coroutine Support
Coroutine support is available when the application is built with a compiler that supports C++20. To compile with coroutine support, the -fcoroutines flag must be passed, as shown in the example below:
clang++ -std=c++20 -fcoroutines -o test-coroutines.bin test-coroutines.cpp
For more detailed examples on the usage of Coroutines, check our correctness tests in the tests/correctness/coroutine subdirectory, and the original published paper on coroutines:
Arnau Cinca, Aleix Roca, Kevin Sala, Raúl Peñacoba, David Álvarez, Vicenç Beltran. Enhancing OmpSs-2 Suspendable Tasks by Combining Operating System and User-Level Threads with C++ Coroutines. In 2025 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Milano, Italy, 2025, pp. 67–80. https://doi.org/10.1109/IPDPS64566.2025.00015
4.5.2. Taskiter Construct
For iterative applications whose loops generate an identifiable task directed acyclic graph (DAG) that repeats itself at each iteration, the taskiter construct optimizes execution as follows:
The first iteration of the loop is executed, and its generated DAG is recorded.
The recorded DAG is transformed to generate a directed cyclic task graph (DCTG).
The generated DCTG is automatically optimized and utilized to execute the remaining iterations.
The following code, extracted from tests/correctness/taskiter/taskiter-unroll.cpp, shows an example usage of the taskiter construct:
#pragma oss taskiter shared(A) unroll(UNROLL)
for (int i = 0; i < NUM_ITERS; ++i) {
    for (int j = 0; j < tilesize; ++j) {
        #pragma oss task shared(A) firstprivate(j) inout(A[j])
        {
            A[j]++;
        }
        #pragma oss task shared(A) firstprivate(j) inout(A[j])
        {
            A[j]++;
        }
    }
}
#pragma oss taskwait
More information about the taskiter construct and its benefits can be found in its original paper.
4.5.3. Inner Parallelism
NODES offers a mechanism to parallelize inner runtime code using OmpSs-2 without having to pass through the compiler. Below we list the main features of this mechanism:
It is enabled through the runtime.enable_inner_parallelism configuration variable.
At the moment, only specific parts of the taskiter code leverage its potential.
InnerParallelism::taskLoop allows parallelizing an inner loop within the NODES runtime as-is.
InnerParallelism::taskWait allows synchronizing the internally created tasks up until that point.
Synchronizing these tasks falls entirely upon the developer, as the implicit taskwait in programs will not enforce their finalization.
4.6. Debugging
By default, NODES is optimized to execute applications as efficiently as possible in terms of performance, and will assume that
the application code is correct. Thus, it will not perform most runtime validity checks.
To enable validity checks, users must build NODES with debug options enabled by passing the --enable-debug flag at configure time.
This will enable many internal validity checks that may be violated when the application code is incorrect.
To debug an application with a regular debugger, compile its code with the usual debugging flags (for example, -g).