3. Language Directives
This chapter describes the OmpSs-2 language, that is, all the elements needed to understand how an OmpSs-2 application executes and behaves on a parallel architecture. OmpSs-2 provides a simple path for users already familiar with the OpenMP programming model to write (or port) their programs to OmpSs-2.
This description is organized around the list of OmpSs-2 directives. Each of the following sections contains a short description of the directive, its specific syntax, and the list of clauses (including the valid parameters for each clause and a short description of them). In addition, each section ends with a simple example showing how the directive can be used in a valid OmpSs-2 program.
As in OpenMP, OmpSs-2 directives in C and C++ are specified using the #pragma mechanism (provided by the base language), and in Fortran they are specified using special comments that are identified by a unique sentinel. The sentinel used in OmpSs-2 is oss. Compilers will typically ignore OmpSs-2 directives if support is disabled or not provided.
C/C++ format:
#pragma oss directive-name [clause[ [,] clause] ... ] new-line
Fortran format:
sentinel directive-name [clause[ [,] clause]...]
The sentinel depends on whether the Fortran source uses fixed or free form:
The sentinels for fixed form can be !$oss, c$oss, or *$oss. Sentinels must start in column 1. A continued directive line must have a character other than a space or a zero in column 6.
The sentinel for free form must be !$oss. This sentinel can appear in any column as long as it is not preceded by any character other than a space. A continued directive line must have an ampersand (&).
3.1. Task construct
The programmer can specify a task using the task construct.
This construct can appear inside any code block of the program, and it marks the following statement as a task.
The syntax of the task construct is the following:
#pragma oss task [clauses]
structured-block
The valid clauses for the task construct are:
private(<list>)
firstprivate(<list>)
shared(<list>)
depend(<type>: <memory-reference-list>)
<depend-type>(<memory-reference-list>)
reduction(<operator>:<memory-reference-list>)
priority(<expression>)
cost(<expression>)
if(<scalar-expression>)
final(<scalar-expression>)
wait
onready(<statement>)
label(<string>)
create(<scalar-expression>)
The private, firstprivate and shared clauses specify the data sharing attribute of the variables referenced in the construct.
A description of these clauses can be found in the Data sharing attributes section.
The depend clause lets the runtime infer additional task scheduling restrictions from the parameters it defines.
These restrictions are known as dependences.
The syntax of the depend clause includes a dependence type, followed by a colon and its associated list items.
The valid dependence types are defined in section Dependence model in the previous chapter.
In addition to this syntax, OmpSs-2 allows specifying this information using the type of dependence as the name of the clause.
Thus, the following code:
#pragma oss task depend(in: a,b,c) depend(out: d)
Is equivalent to this one:
#pragma oss task in(a,b,c) out(d)
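As a minimal sketch of how such clauses order tasks, the fragment below chains a producer and a consumer through the variable a using the short clause form. The function name produce_then_consume and its arithmetic are hypothetical and chosen only for illustration. A compiler without OmpSs-2 support ignores the pragmas and runs the statements in program order, which gives the same result.

```c
/* Hypothetical helper: the name and the arithmetic are illustrative only. */
int produce_then_consume(int seed)
{
    int a = 0;
    int d = 0;

    /* Producer: declares that it writes a. */
    #pragma oss task out(a) firstprivate(seed)
    a = seed * 2;

    /* Consumer: reads a and writes d, so it is ordered after the producer. */
    #pragma oss task in(a) out(d)
    d = a + 1;

    /* Wait for both tasks before the parent reads d. */
    #pragma oss taskwait
    return d;
}
```

Writing depend(out: a) and depend(in: a) depend(out: d) instead of the short forms would express the same dependences.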
The reduction clause defines the task as a participant in a reduction operation.
The first occurrence of a participating task defines the beginning of the scope for the reduction.
The scope ends implicitly at a taskwait or at a dependence over the memory-reference-item.
More information about task reductions in OmpSs-2 can be found in the following Master's thesis: https://upcommons.upc.edu/handle/2117/129246.
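The following sketch shows one way the reduction clause might be used to accumulate a sum across tasks. The function sum_array is hypothetical, and one task per element is far too fine-grained for real code; it only keeps the example short. The taskwait ends the reduction scope before the parent reads sum.

```c
#include <stddef.h>

/* Hypothetical helper: sums n integers using one reduction task per element. */
long sum_array(const int *values, size_t n)
{
    long sum = 0;

    for (size_t i = 0; i < n; i++) {
        /* Each task participates in the same "+" reduction over sum. */
        #pragma oss task reduction(+: sum) firstprivate(values, i)
        sum += values[i];
    }

    /* The taskwait closes the reduction scope; sum is complete afterwards. */
    #pragma oss taskwait
    return sum;
}
```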
The priority clause indicates a priority hint for the task.
Greater numbers indicate higher priority, and lower numbers indicate lower priority.
By default, tasks have priority 0.
The expression of the priority is evaluated as a signed integer.
This way, strictly positive priorities indicate higher priority than the default, and negative priorities indicate lower than default priority.
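A minimal sketch of priority hints follows, assuming two independent pieces of work that update distinct counters. The function name, the label strings and the priority values are hypothetical. Since the priority is only a hint, the code does not rely on any particular execution order.

```c
/* Hypothetical helper: urgent and background must point to distinct objects. */
void run_with_priorities(int *urgent, int *background)
{
    /* Below the default priority of 0: a candidate to run later. */
    #pragma oss task priority(-1) label("background work") firstprivate(background)
    *background += 1;

    /* Above the default priority: a candidate to run earlier. */
    #pragma oss task priority(10) label("urgent work") firstprivate(urgent)
    *urgent += 1;

    /* Both tasks have completed once the taskwait returns. */
    #pragma oss taskwait
}
```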
If the expression of the if clause evaluates to true, the execution of the newly created task can be deferred; otherwise, the current task must suspend its execution until the newly created task has completed its execution.
If the expression of the final clause evaluates to true, the newly created task will be a final task and all the task generating code encountered when executing its dynamic extent will also generate final tasks.
In addition, when executing within a final task, all the encountered task generating codes will execute these tasks immediately after their creation as if they were simple routine calls.
And finally, tasks created within a final task can use the data environment of their parent task.
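A common use of the final clause is to stop creating tasks once a recursion becomes too small to be worth the overhead. The sketch below assumes a hypothetical cut-off constant FINAL_CUTOFF and a naive recursive Fibonacci function; neither comes from the OmpSs-2 specification.

```c
/* Hypothetical cut-off: below it, subtasks run like plain function calls. */
#define FINAL_CUTOFF 10

long fib(int n)
{
    if (n < 2)
        return n;

    long x = 0;
    long y = 0;

    /* When n <= FINAL_CUTOFF the task is final, and so are all tasks
       generated while it executes. */
    #pragma oss task shared(x) firstprivate(n) final(n <= FINAL_CUTOFF)
    x = fib(n - 1);

    #pragma oss task shared(y) firstprivate(n) final(n <= FINAL_CUTOFF)
    y = fib(n - 2);

    /* x and y are only valid after both subtasks have completed. */
    #pragma oss taskwait
    return x + y;
}
```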
Tasks with the wait clause will perform a taskwait-like operation immediately after exiting from the task code.
Since it is performed outside the scope of the code of the task, this happens once the task has abandoned the stack.
For this reason, its use is restricted to tasks that, upon exiting, do not have any subtask accessing their local variables.
Otherwise, the regular taskwait shall be used instead.
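The sketch below illustrates a task that satisfies this restriction. The function fill_both is hypothetical. Its outer task creates two subtasks that only touch buffers owned by the caller and their own firstprivate copies, so no subtask refers to a local variable of the outer task, and the wait clause can stand in for a taskwait at the end of the task body.

```c
/* Hypothetical helper: first and second are caller-owned arrays of n ints. */
void fill_both(int *first, int *second, int n)
{
    #pragma oss task wait firstprivate(first, second, n)
    {
        /* Subtasks use their own copies of the pointers and of n. */
        #pragma oss task firstprivate(first, n)
        for (int i = 0; i < n; i++)
            first[i] = i;

        #pragma oss task firstprivate(second, n)
        for (int i = 0; i < n; i++)
            second[i] = 2 * i;

        /* No taskwait here: the wait clause covers the subtasks. */
    }

    /* The caller still waits before reading the buffers. */
    #pragma oss taskwait
}
```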
The onready clause allows defining an action in the form of a statement (e.g., a call to a function) that will be executed once the task becomes ready.
This is explained in more detail in section Task Onready clause.
The label clause defines a string literal that can be used by any performance or debugging tool to identify the task in a more human-readable format.
The string literal must be wrapped in double quotes.
For instance, a task that initializes an array could be labeled as label("init array").
The following C code shows an example of creating tasks using the task construct:
// Declared here so the example compiles; defined elsewhere
void do_computation(float a);

float x = 0.0;
float y = 0.0;
float z = 0.0;

int main() {
    #pragma oss task
    do_computation(x);

    #pragma oss task
    {
        do_computation(y);
        do_computation(z);
    }

    #pragma oss taskwait
    return 0;
}
When the control flow reaches a #pragma oss task construct, a new task instance is created.
The execution order of the tasks is not guaranteed.
Moreover, when the program reaches the #pragma oss taskwait, the previously created tasks may not have been executed yet by the OmpSs-2 run-time system.
After potentially being blocked in the taskwait construct for a while, it is guaranteed that both tasks have deeply completed.
The task construct is extended to allow the annotation of function declarations or definitions in addition to structured-blocks. When a function is annotated with the task construct each invocation of that function becomes a task creation point. The following C code is an example of how task functions are used:
extern void do_computation(float a);

#pragma oss task
extern void do_computation_task(float a);

float x = 0.0;

int main() {
    do_computation_task(x); // this will create a task
    do_computation(x);      // regular function call
    #pragma oss taskwait
}
The invocation of do_computation_task inside the main function creates an instance of a task.
Note that OmpSs-2 does not guarantee that the task has already been executed after returning from the regular function call do_computation(x).
Note that only the execution of the function itself is part of the task, not the evaluation of the task arguments. Another restriction is that the task is not allowed to have any return value, that is, the return type must be void.
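Because a task function must return void, results are usually passed back through a pointer argument. The sketch below assumes a hypothetical task function square_task that declares an out dependence on the pointed-to result, and a hypothetical caller sum_of_squares that reads the results only after a taskwait.

```c
/* Hypothetical task function: every call creates a task that writes *result. */
#pragma oss task out(*result)
void square_task(int value, int *result);

void square_task(int value, int *result)
{
    *result = value * value;
}

/* Hypothetical caller. */
int sum_of_squares(int a, int b)
{
    int ra = 0;
    int rb = 0;

    /* The arguments are evaluated here, by the caller, not inside the task. */
    square_task(a, &ra);
    square_task(b, &rb);

    /* ra and rb may only be read once both tasks have completed. */
    #pragma oss taskwait
    return ra + rb;
}
```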
Warning
The for clause from the task directive is no longer part of OmpSs-2.
3.1.1. Task Onready clause
The onready clause allows defining an action in the form of a statement (e.g., a call to a function) that will be executed once the task becomes ready.
The run-time system will execute the statement only once, at any moment after the task satisfies all its data dependencies and before the task runs its body.
The onready action cannot assume that it is running within a task context; it should not reach any task scheduling point.
Moreover, the action is recommended to be lightweight and should not perform blocking operations.
The onready action can register external events to the ready task to delay its execution until all the events are fulfilled.
As an example, the callback could execute an asynchronous TAMPI operation, such as TAMPI_Iwait.
Such a call would delay the task’s execution until the corresponding MPI communications are completed.
In that way, the data dependencies allow tasks to define local dependencies with other tasks on the same process, whereas the onready clause allows defining remote dependencies with other processes.
See Task external events for more information about the management of task external events.
The data sharing and dependency rules for the variables used in the onready action are the same as those that apply to the task body.
We show an example below where a task safely increases the value of a variable by two, once from the onready action and once from the task body:
#include <stdio.h>

void function(int *a)
{
    // This is the first increase because it's the onready action
    *a += 1;
}

int main()
{
    int a = 0;

    #pragma oss task inout(a) onready(function(&a))
    {
        // At this point, the onready action was already executed
        ++a;
    }
    #pragma oss taskwait

    fprintf(stdout, "a: %d\n", a); // Should print "a: 2"
}
Below there is an example where we asynchronously receive (through TAMPI services) and process remote data using a single task with an onready action:
#include <stdio.h>
#include <mpi.h>
#include <TAMPI.h>

void function(int *a, int rank)
{
    MPI_Request request;
    // Start the receive from the process identified by rank
    MPI_Irecv(a, 1, MPI_INT, rank, 0, MPI_COMM_WORLD, &request);

    // Asynchronously delay the execution of the task until the communication
    // has completed. This service will register an external event if the
    // receive has not completed immediately
    TAMPI_Iwait(&request, MPI_STATUS_IGNORE);
}

int main()
{
    // Initialize MPI...
    int a = 0;
    int rank = ...;

    #pragma oss task inout(a) onready(function(&a, rank))
    {
        fprintf(stdout, "received data: %d\n", a);
        process(a);
    }
    #pragma oss taskwait

    // Finalize MPI...
}
Please note this is just an example of how onready can be used to delay the execution of a task by registering external events.
Performing heavy operations like MPI_Irecv in an onready action is not recommended because the onready action does not run in the context of a task.
3.1.2. Task Create clause (experimental)
Important
The create clause is experimental and may change or be removed in the future without any notice.
The optional create clause can inhibit the creation of a task and just execute the body directly. It is typically used to override the behavior of the final clause for a specific task.
The clause expects a conditional argument create(cond), which is evaluated when the task is to be created. If the condition evaluates to true, the task is created, even if the task is final. Otherwise, if it evaluates to false, the task is never created.
The following create clause:
#pragma oss task create(cond)
do_work(size);
Is equivalent to this code:
if (cond) {
    #pragma oss task create(true)
    do_work(size);
} else {
    do_work(size);
}
An example of the create clause to override the final effect for some specific tasks is depicted in the following diagram:
The tasks in grey won’t be created due to the final clause, but the ones in green, with the create(true) clause, will always be created.
A task without the create clause will follow the normal creation process, following the rules imposed by the final clause if used.
The condition used in the create clause must not have side effects; otherwise, the behavior is undefined. For example, create(size++ > 20) may or may not increase the value of the variable size.
It is important to note that when the task is not created and the body runs as-is, the dependencies or data sharing clauses of that task won’t have any effect (as there won’t be any task). It is the programmer's responsibility to ensure that the program is still correct.
3.2. Taskwait construct
Apart from implicit synchronization (task data dependences), OmpSs-2 also offers a mechanism that allows users to synchronize task execution explicitly.
The taskwait construct is a stand-alone directive (with no code block associated) and specifies a wait on the deep completion of all descendant tasks, including the non-direct children tasks.
The syntax of the taskwait construct is the following:
#pragma oss taskwait [clauses]
The valid clauses for the taskwait construct are the following:
on(list-of-variables): It specifies to wait only for the subset (not all of them) of descendant tasks that declared a dependency on any of the variables that appear in the list of variables.
The on clause allows waiting only on the tasks that produce some data, in the same way as the inout clause.
It suspends the current task until all previous tasks with any dependency on the expression are completed.
The following example illustrates its use:
#include <stdio.h>

int compute1(void);
int compute2(void);

int main()
{
    int result1, result2;

    #pragma oss task out(result1)
    result1 = compute1();

    #pragma oss task out(result2)
    result2 = compute2();

    #pragma oss taskwait on(result1)
    printf("result1 = %d\n", result1);

    #pragma oss taskwait on(result2)
    printf("result2 = %d\n", result2);

    return 0;
}
3.3. Release directive
The release directive asserts that a task will no longer perform accesses that conflict with the contents of the associated depend clause.
The contents of the depend clause must be a subset of that of the task construct that is no longer referenced in the rest of the lifetime of the current task and its future subtasks.
The release directive has no associated structured block.
The syntax of the release directive is the following:
#pragma oss release [clauses]
The valid clauses for the release directive are:
depend(<type>: <memory-reference-list>)
<depend-type>(<memory-reference-list>)
The following C code shows an example of partial release of the task dependences using the release directive:
#define SIZE 4096

float x[SIZE];
float y[SIZE];

int main() {
    #pragma oss task depend(out:x,y)
    {
        for (int i=0; i<SIZE; i++) x[i] = 0.0;
        #pragma oss release depend(out:x)
        for (int i=0; i<SIZE; i++) y[i] = 0.0;
    }
}
Warning
At this moment, the run-time system only supports releasing a dependency with the same dependency type that was specified in the depend clause of the task.
3.4. Atomic construct
Warning
The following information applies to clang. Mercurium does support atomic, but without clauses.
The atomic construct ensures that a specific storage location is accessed atomically, rather than exposing it to the possibility of multiple, simultaneous reading and writing threads that may result in indeterminate values:
#pragma oss atomic
statement
The construct allows the clauses read, write, update, and capture to define the semantics for which a directive enforces atomicity. If no clause is present, the behavior is as if the update clause is specified.
read: results in an atomic read of the location designated by x. The statement has the following form: v = x;
write: results in an atomic write of the location designated by x. The statement has the following form: x = expr;
update: results in an atomic update of the location designated by x using the designated operator or intrinsic. Only the read and write of the location designated by x are performed mutually atomically. The statement has one of the following forms: x++; x--; ++x; --x; x binop= expr; x = x binop expr; x = expr binop x;
capture: is an atomic capture update. That is, an atomic update to the location designated by x using the designated operator or intrinsic while also capturing the original or final value of the location designated by x with respect to the atomic update. The original or final value of the location designated by x is written in the location designated by v. Only the read and write of the location designated by x are performed mutually atomically. The statement has one of the following forms: v = expr-stmt { v = x; expr-stmt } { expr-stmt v = x; }
where expr-stmt is either an atomic write or update statement.
The atomic construct also allows memory order clauses: relaxed, acquire, release, acq_rel, seq_cst.
If no memory order clause is specified, the default memory ordering is relaxed.
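As a minimal sketch, the function below uses an atomic update to let several tasks increment a shared counter safely. The function count_even is hypothetical. The directive is written without clauses, which behaves as update and stays within what the warning above says about Mercurium.

```c
/* Hypothetical helper: counts the even entries among n integers. */
int count_even(const int *values, int n)
{
    int count = 0;

    for (int i = 0; i < n; i++) {
        #pragma oss task shared(count) firstprivate(values, i)
        {
            if (values[i] % 2 == 0) {
                /* No clause given, so this is treated as an atomic update. */
                #pragma oss atomic
                count++;
            }
        }
    }

    /* All increments are complete once the taskwait returns. */
    #pragma oss taskwait
    return count;
}
```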
3.5. Critical construct
The critical construct allows programmers to specify regions of code that will be executed in mutual exclusion.
The associated region will be executed by a single thread at a time; other threads will wait at the beginning of the critical section until no thread is executing it.
The syntax of the critical construct is the following:
#pragma oss critical
structured-block
The syntax also allows named criticals with the following syntax:
#pragma oss critical(<name>)
structured-block
Named criticals prevent concurrency between threads with respect to all critical regions with the same name. Unnamed criticals prevent concurrency between threads with respect to all unnamed critical regions.
The critical construct has no related clauses. The beginning and ending of a critical section may be task scheduling points.
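The sketch below shows a named critical region protecting a compare-and-update sequence that a single atomic statement cannot express. The function max_of and the region name update_best are hypothetical, and the function assumes n is at least 1.

```c
/* Hypothetical helper: returns the largest of n integers (n >= 1). */
int max_of(const int *values, int n)
{
    int best = values[0];

    for (int i = 1; i < n; i++) {
        #pragma oss task shared(best) firstprivate(values, i)
        {
            /* Only one thread at a time runs a region named update_best. */
            #pragma oss critical(update_best)
            {
                if (values[i] > best)
                    best = values[i];
            }
        }
    }

    /* best holds the final maximum once all tasks have completed. */
    #pragma oss taskwait
    return best;
}
```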
3.6. Assert directive
The assert directive is a declarative directive that checks whether a runtime configuration option is enabled or has a specific value.
The directive expects a single string with comma-separated conditions.
Each condition is composed of the option name, a comparison operator (==, !=, >, >=, < and <=), and the value to compare.
All conditions are checked before starting the program, and if any fails, the program aborts showing an error with the incorrect option. The configuration options are runtime-specific, so each runtime system may define its valid configuration options. Since this directive is declarative, it can be written in any part of the program’s source code files.
The syntax of the assert directive is:
#pragma oss assert("option1==40,option2<10,option3!=stringvalue")
int main() {
    // ...
}
Check the OmpSs-2 User Guide to see the runtime configuration options that can be asserted. For instance, we can assert that a program is running using the regions dependency system and that the thread stack size is greater than 4MB with the following code:
#pragma oss assert("version.dependencies==regions,misc.stack_size>4M")
int main() {
    // ...
}
3.7. Restrictions
The point of exit of a structured block cannot be a branch out of it. The following C code shows an example where any of the exit calls may lead to undefined behavior:
#include <stdlib.h>

int main() {
    for (int i = 0; i < 10; i++) {
        if (i > 5) exit(1);

        #pragma oss task
        {
            // computation
            exit(1);
        }
    }
    #pragma oss taskwait
}
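One possible way to avoid branching out of a task, sketched below under stated assumptions, is to record the failure in a shared flag and let the caller decide what to do after the taskwait. The function run_all and the helper step_fails (which treats step 7 as the failing one) are hypothetical stand-ins for real application code.

```c
/* Hypothetical stand-in for a computation that may fail at step i. */
static int step_fails(int i)
{
    return i == 7;
}

/* Hypothetical helper: returns 0 if all n steps succeed, 1 otherwise. */
int run_all(int n)
{
    int failed = 0;

    for (int i = 0; i < n; i++) {
        #pragma oss task shared(failed) firstprivate(i)
        {
            if (step_fails(i)) {
                /* Record the failure instead of leaving the task early. */
                #pragma oss atomic
                failed |= 1;
            }
        }
    }

    /* Every task has left its structured block normally at this point. */
    #pragma oss taskwait
    return failed;
}
```

The caller can then terminate the program, for example with exit(1), outside of any task body once run_all reports a failure.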