Executing and controlling the number of CPUs
============================================

Nanos6 applications can be compiled and executed in this way:

.. code:: sh

     # Compile OmpSs-2 program with LLVM/Clang
     $ clang -fompss-2 app.c -o app

     # Execute on all available cores of the current session
     $ ./app

The number of cores that are used is controlled by running the application through the ``taskset`` command.
For instance:

.. code:: sh

     # Execute on cores 0, 1, 2 and 4
     $ taskset -c 0-2,4 ./app
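
The CPU list given to ``taskset -c`` mixes ranges (``0-2``) and single core IDs (``4``).
As a quick sanity check before launching a run, the sketch below counts how many cores such a list selects.
``count_cpus`` is a hypothetical helper, not part of Nanos6 or ``taskset``, and it assumes non-overlapping entries without stride notation:

.. code:: sh

     # Hypothetical helper: count the CPUs selected by a taskset-style list.
     # Assumes entries do not overlap and no stride (e.g. 0-8:2) is used.
     count_cpus() {
         total=0
         old_ifs=$IFS
         IFS=,
         for item in $1; do
             case $item in
                 # A range "first-last" contributes last - first + 1 cores
                 *-*) first=${item%-*}; last=${item#*-}
                      total=$((total + last - first + 1)) ;;
                 # A single core ID contributes one core
                 *)   total=$((total + 1)) ;;
             esac
         done
         IFS=$old_ifs
         echo "$total"
     }

     count_cpus 0-2,4    # prints 4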


Runtime configuration options
=============================

The behaviour of the Nanos6 runtime can be tuned after compilation by means of a configuration file.
All former Nanos6 environment variables are obsolete and will be ignored by the runtime system.
Currently, the supported configuration file format is `TOML v1.0.0-rc1 <https://toml.io/en/v1.0.0-rc.1>`__.
The default configuration file is named ``nanos6.toml`` and can be found in the ``share`` directory of the Nanos6 installation::

     $INSTALLATION_PREFIX/share

.. note::

   The ``nanos6.toml`` file was recently moved from ``$INSTALLATION_PREFIX/share/doc/nanos6/scripts`` to the
   ``$INSTALLATION_PREFIX/share`` directory.

To override the default configuration, it is recommended to copy the default file and change the relevant options.
The first configuration file found will be interpreted, according to the following order:

1. The file pointed to by the ``NANOS6_CONFIG`` environment variable.
2. The file ``nanos6.toml`` found in the current working directory.
3. The file ``nanos6.toml`` found in the installation path (default file).
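
The lookup order above can be sketched as a small shell function.
This only mirrors the documented order and is not the runtime's own code: ``find_nanos6_config`` is a hypothetical name, the installation prefix is passed as an argument for the sake of the example, and the sketch assumes that ``NANOS6_CONFIG``, when set, is taken as given (how the runtime reacts to a missing file is not covered here):

.. code:: sh

     # Illustrative only: reports which configuration file would be picked
     # according to the documented lookup order.
     find_nanos6_config() {
         prefix=$1    # installation prefix, passed explicitly in this sketch
         if [ -n "$NANOS6_CONFIG" ]; then
             # 1. File pointed to by the NANOS6_CONFIG environment variable
             echo "$NANOS6_CONFIG"
         elif [ -f ./nanos6.toml ]; then
             # 2. nanos6.toml in the current working directory
             echo "$PWD/nanos6.toml"
         else
             # 3. Default file shipped with the installation
             echo "$prefix/share/nanos6.toml"
         fi
     }

Calling ``find_nanos6_config "$INSTALLATION_PREFIX"`` from the directory where the application will be launched shows which file is expected to take effect.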

The configuration file is organized into different sections and subsections.
Option names have the format ``<section>[.<subsection>].<option_name>``.
For instance, the dependency system that the runtime should load is specified by the ``version.dependencies`` option.
Inside the TOML configuration file, that option is placed inside the ``version`` section as follows:

.. code::

     [version]
         dependencies = "discrete"

Alternatively, if the configuration has to be changed programmatically and creating new files is not practical, the configuration options can be overridden using the ``NANOS6_CONFIG_OVERRIDE`` environment variable.
The content of this variable has to be in the format ``option1=value1,option2=value2,option3=value3,...``, that is, a comma-separated list of assignments.

For example, you can run the following command to change the dependency implementation and use CTF instrumentation::

     NANOS6_CONFIG_OVERRIDE="version.dependencies=discrete,version.instrument=ctf" ./ompss-program
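
When the list of overrides is assembled by a script, it can help to build the comma-separated string from individual assignments.
The following is a minimal sketch; ``join_overrides`` is a hypothetical helper and not something provided by Nanos6:

.. code:: sh

     # Hypothetical helper: join "option=value" arguments with commas
     join_overrides() {
         ( IFS=,; echo "$*" )
     }

     NANOS6_CONFIG_OVERRIDE=$(join_overrides \
         "version.dependencies=discrete" \
         "version.instrument=ctf")
     export NANOS6_CONFIG_OVERRIDE

     # Show the resulting value before launching the application
     echo "$NANOS6_CONFIG_OVERRIDE"

The exported variable then applies to any OmpSs-2 program started from the same shell.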

By default, the runtime system emits a warning during initialization when it detects environment variables that start with the ``NANOS6`` prefix but that it does not use.
The exceptions are the two variables described above, ``NANOS6_CONFIG`` and ``NANOS6_CONFIG_OVERRIDE``, as well as the ``NANOS6_HOME`` variable.
The latter is not relevant for the Nanos6 runtime system, but users often define it and it is relevant for the OmpSs-2 LLVM-based compiler.
The warning can be disabled by setting the ``loader.warn_envars`` configuration option to ``false``.


Runtime variants
================

There are several Nanos6 runtime variants, each one focusing on different aspects of parallel executions: performance, debugging, instrumentation, etc.
OmpSs-2 applications do not need to be recompiled to run with instrumentation, e.g., to extract Extrae traces or to generate additional information.
Instead, this is controlled through configuration options at run-time.
Users can select a specific Nanos6 variant when running an application by setting the ``version.instrument``, ``version.debug`` and ``version.dependencies`` configuration variables.
This section explains the different values for these variables.

The instrumentation is specified by the ``version.instrument`` configuration variable and can take the following values:

``version.instrument = none``
  This is the **default** value and does not enable any kind of instrumentation.
  This is the variant that should be used when executing performance experiments since it is the one that adds no instrumentation overhead.
  Continue reading this section for more information about performance runs with Nanos6.

``version.instrument = ovni``
  Instrumented with `ovni <https://github.com/bsc-pm/ovni>`__ to generate `Paraver <https://tools.bsc.es/paraver>`__ traces. See :ref:`nanos6-ovni-instrumentation`.

``version.instrument = extrae``
  Instrumented to produce `Paraver <https://tools.bsc.es/paraver>`__ traces. See :ref:`nanos6-extrae-instrumentation`.

``version.instrument = ctf``
  Instrumented to produce CTF traces and convert them to the `Paraver <https://tools.bsc.es/paraver>`__ format. See :ref:`nanos6-ctf-instrumentation`.

``version.instrument = verbose``
  Instrumented to emit a log of the execution. See :ref:`nanos6-verbose-instrumentation`.

``version.instrument = lint``
  Instrumented to support the `OmpSs-2@Linter <https://github.com/bsc-pm/ompss-2-linter>`__ tool.

.. note::

  The ``graph`` and ``stats`` instrumentations have been recently removed and are no longer available.

By default, Nanos6 loads runtime variants that are compiled using high optimization flags and most of the internal assertions turned off.
This is the configuration that should be used along with the default ``none`` instrumentation for benchmarking experiments.
However, these are not the only options that influence performance.
See :ref:`nanos6-benchmarking` for more information.

Sometimes it is useful to run an OmpSs-2 program with debug information for debugging purposes (e.g., when a program crashes).
The runtime system provides the option ``version.debug`` to load a runtime variant that has been compiled without optimizations and with all internal assertions turned on.
The default value for this option is ``false`` (no debug), but it can be set to ``true`` to load the debug variant.
Please note that enabling this option significantly degrades the performance of the runtime system.
Additionally, every instrumentation variant is available in both an optimized and a debug version.
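
As an illustration, the following sketch selects the debug build of the ``verbose`` variant for every program launched from the current shell.
It assumes that boolean options are written as ``true`` or ``false`` in the override string, as they are in the configuration file:

.. code:: sh

     # Assumption: booleans use the same true/false spelling as in nanos6.toml
     export NANOS6_CONFIG_OVERRIDE="version.debug=true,version.instrument=verbose"

     # Launch the application afterwards as usual, for example: ./app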

Finally, the last configuration variable used to specify a runtime variant is ``version.dependencies``, which is explained in the next section.


Task data dependencies
======================

.. index::
    double: Nanos6; dependencies

The Nanos6 runtime has support for different dependency implementations.
The ``discrete`` dependencies are the default dependency implementation.
This is the most optimized implementation but it does not fully support the OmpSs-2 dependency model since it does not support region dependencies.
If the user program requires region dependencies (e.g., to detect dependencies among partially overlapping dependency regions), Nanos6 provides the ``regions`` implementation, which is completely spec-compliant.
The latter is also the only implementation that supports OmpSs-2@Cluster.

The dependency implementation can be selected at run-time through the ``version.dependencies`` configuration variable. The available implementations are:

``version.dependencies = "discrete"``
  **Default** and optimized implementation not supporting region dependencies.
  Region syntax is supported but will behave as a discrete dependency to the first address.
  Scales better than the ``regions`` implementation thanks to its simpler logic and is functionally similar to the traditional OpenMP model.

``version.dependencies = "regions"``
  Supporting all dependency features.
  Default implementation in OmpSs-2@Cluster installations.

In cases where an OmpSs-2 program requires region dependency support, we recommend adding the declarative directive ``assert`` in any of the program source files, as shown below.
Then, before the program is started, the runtime system will check whether the loaded dependency implementation is ``regions`` and will abort the execution if it is not.

.. code:: c

     #pragma oss assert("version.dependencies==regions")

     int main() {
         // ...
     }

Notice that the ``assert`` directive could also check whether the runtime is using ``discrete`` dependencies.


Task scheduler
==============

.. index::
    double: Nanos6; scheduling

The scheduling infrastructure provides the following configuration options to modify the behavior of the task scheduler:

``scheduler.policy`` (default: ``fifo``)
  Specifies whether ready tasks are added to the ready queue using a FIFO (``fifo``) or a LIFO (``lifo``) policy.

``scheduler.immediate_successor`` (default: ``0.75``)
  Probability of enabling the immediate successor feature, which improves cache data reuse between successor tasks.
  If enabled, when a CPU finishes a task it starts executing the successor task (computed through their data dependencies).

``scheduler.priority`` (default: ``true``)
  Boolean indicating whether the scheduler should consider the task priorities defined by the user in the task's priority clause.


Task worksharings
=================

.. index::
    double: Nanos6; taskfor

.. important::  Task worksharings, which were implemented by the ``task for`` clause, are no longer part of the OmpSs-2 specification.


Stack size
==========

By default, Nanos6 allocates stacks of 8 MB for its worker threads.
In some codes this may not be enough.
For instance, when converting Fortran codes, some global variables may need to be converted into local variables.
This may substantially increase the amount of stack required to run the code and may exceed the space that is available.

To solve that problem, the stack size can be set through the ``misc.stack_size`` configuration variable.
Its value is expressed in bytes, but it also accepts the ``K``, ``M``, ``G``, ``T`` and ``E`` suffixes, which are interpreted as power-of-2 multipliers.
For instance:

.. code::

     [misc]
         stack_size = "16M"
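
To see how many bytes a suffixed value stands for, the sketch below expands it by hand.
``size_to_bytes`` is a hypothetical helper, not part of Nanos6, and it assumes the conventional binary exponents (2^10 for ``K``, 2^20 for ``M``, 2^30 for ``G``, 2^40 for ``T`` and 2^60 for ``E``); the runtime's own parser may differ in details:

.. code:: sh

     # Hypothetical helper: expand a size such as "16M" into bytes,
     # assuming the usual binary exponents for each suffix.
     size_to_bytes() {
         number=${1%[KMGTE]}    # strip a trailing suffix, if present
         case $1 in
             *K) shift_bits=10 ;;
             *M) shift_bits=20 ;;
             *G) shift_bits=30 ;;
             *T) shift_bits=40 ;;
             *E) shift_bits=60 ;;
             *)  shift_bits=0 ;;    # plain byte count
         esac
         echo $((number << shift_bits))
     }

     size_to_bytes 16M    # prints 16777216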