What is the difference between OpenMP and OmpSs?
================================================

Initial team and creation
-------------------------

You must compile with the ``--ompss`` flag to enable the OmpSs programming model.

While both programming models are similar in many aspects, there are some key differences.

In OpenMP your program starts with a team of one thread. You can create a new team of threads using ``#pragma omp parallel`` (or a combined parallel worksharing like ``#pragma omp parallel for`` or ``#pragma omp parallel sections``).

In OmpSs your program starts with a team of threads, but only one of them runs ``main`` (or ``PROGRAM`` in Fortran). The remaining threads wait for work. You create work using ``#pragma omp task`` or ``#pragma omp for``. One of the threads (including the one that was running ``main``) will pick up the created work and execute it.

This is the reason why ``#pragma omp parallel`` is ignored by the compiler in OmpSs mode. Combined worksharings like ``#pragma omp parallel for`` and ``#pragma omp parallel sections`` are handled as if they were ``#pragma omp for`` and ``#pragma omp sections``, respectively.

The Mercurium compiler emits a warning when it encounters a ``#pragma omp parallel`` that will be ignored.

Worksharings
------------

In OpenMP mode, our worksharing implementation for ``#pragma omp for`` (and ``#pragma omp parallel for``) uses the typical strategy of::

    begin-parallel-loop
      code-of-the-parallel-loop
    end-parallel-loop

In OmpSs mode, the implementation of ``#pragma omp for`` exploits a Nanos++ feature called *slicers*. The compiler creates a task which internally creates several more tasks, each one executing a part of the iteration space of the parallel loop.

.. highlight:: c

These two implementations are mostly equivalent except for the following case::

    int main(int argc, char** argv)
    {
        int i;
        #pragma omp parallel
        {
            /* One x per thread in OpenMP; a single x in OmpSs. */
            int x = 0;
            #pragma omp for
            for (i = 0; i < 100; i++)
            {
                x++;
            }
        }
        return 0;
    }

In OmpSs, since ``#pragma omp parallel`` is ignored, there will not be one ``x`` variable per thread (as would happen in OpenMP) but just a single ``x`` shared among all the threads running the ``#pragma omp for``.

.. highlight:: none
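
The task-creation model described under *Initial team and creation* can be sketched as follows. This is only an illustrative example, not code taken from the OmpSs distribution: the function names ``square`` and ``fill_squares`` are hypothetical, and the sketch assumes that the ``firstprivate`` and ``shared`` clauses behave in OmpSs mode as they do in OpenMP. No ``#pragma omp parallel`` is used. In OmpSs mode the waiting threads may pick up the tasks, whereas in OpenMP mode, without an enclosing parallel region, the single initial thread would run all of them.

.. code-block:: c

   #include <stdio.h>

   /* Hypothetical helper: the work performed by each task. */
   static int square(int v)
   {
       return v * v;
   }

   /* Fills out[0..n-1] with squares, one task per element,
      and returns the sum of the results. */
   int fill_squares(int *out, int n)
   {
       int i;
       int sum = 0;

       for (i = 0; i < n; i++)
       {
           /* Each task captures its own copy of i. */
           #pragma omp task firstprivate(i) shared(out)
           out[i] = square(i);
       }

       /* Wait until every task created above has finished. */
       #pragma omp taskwait

       for (i = 0; i < n; i++)
       {
           sum += out[i];
       }
       return sum;
   }

Called from ``main`` with an array of 8 elements, ``fill_squares`` returns 140 regardless of how many threads execute the tasks, because each task writes to a distinct element and the sum is computed only after the ``taskwait``.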