3.1.2. CUDA specific options

The following options are specific to CUDA. Each one can be set either as an NX_ARGS option or through its environment variable:

NX_ARGS option         Environment variable   Description
--disable-cuda         NX_DISABLECUDA         Enable or disable the use of GPUs with CUDA
--gpus                 NX_GPUS                Defines the maximum number of GPUs to use
--gpu-warmup           NX_GPUWARMUP           Enable or disable warming up the GPU before running user's code
--gpu-prefetch         NX_GPUPREFETCH         Set whether data prefetching must be activated or not
--gpu-overlap          NX_GPUOVERLAP          Set whether GPU computation should be overlapped with all data transfers, whenever possible, or not
--gpu-overlap-inputs   NX_GPUOVERLAP_INPUTS   Set whether GPU computation should be overlapped with host -> device data transfers, whenever possible, or not
--gpu-overlap-outputs  NX_GPUOVERLAP_OUTPUTS  Set whether GPU computation should be overlapped with device -> host data transfers, whenever possible, or not
--gpu-max-memory       NX_GPUMAXMEM           Defines the maximum amount of GPU memory (in bytes) to use for each GPU. If this number is below 100, the amount of memory is taken as a percentage of the total device memory
--gpu-cache-policy     NX_GPU_CACHE_POLICY    Defines the cache policy for GPU architectures: write-through / write-back / do not use cache
--gpu-cublas-init      NX_GPUCUBLASINIT       Enable or disable CUBLAS initialization
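
The dual meaning of the --gpu-max-memory value (bytes, or a percentage when below 100) can be illustrated with a small sketch. This is not code from the runtime: the function name gpu_memory_limit_bytes is hypothetical, and rounding the percentage down to a whole number of bytes is an assumption, since the description above does not say how fractions are handled.

```python
def gpu_memory_limit_bytes(setting: int, total_device_bytes: int) -> int:
    """Interpret a --gpu-max-memory / NX_GPUMAXMEM value as a byte count.

    Hypothetical helper that mirrors the rule stated in the option table.
    """
    if setting <= 0:
        raise ValueError("the option expects a positive integer")
    if setting < 100:
        # Values below 100 are read as a percentage of total device memory.
        # Assumption: the result is rounded down to whole bytes.
        return total_device_bytes * setting // 100
    # Any other value is already an absolute size in bytes.
    return setting


if __name__ == "__main__":
    eight_gib = 8 * 1024**3
    print(gpu_memory_limit_bytes(50, eight_gib))           # percentage form
    print(gpu_memory_limit_bytes(2 * 1024**3, eight_gib))  # byte form
```

Under this reading, a setting of 50 on a device with 8 GiB of memory limits the runtime to 4 GiB for that GPU, while a setting of 2147483648 limits it to 2 GiB regardless of the device size.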

The following table summarizes the valid values and the default for each option:

NX_ARGS option         Environment variable   Values            Default
--disable-cuda         NX_DISABLECUDA         yes / no          Enabled
--gpus                 NX_GPUS                integer           All GPUs
--gpu-warmup           NX_GPUWARMUP           yes / no          Enabled
--gpu-prefetch         NX_GPUPREFETCH         yes / no          Disabled
--gpu-overlap          NX_GPUOVERLAP          yes / no          Disabled
--gpu-overlap-inputs   NX_GPUOVERLAP_INPUTS   yes / no          Disabled
--gpu-overlap-outputs  NX_GPUOVERLAP_OUTPUTS  yes / no          Disabled
--gpu-max-memory       NX_GPUMAXMEM           positive integer  No limit
--gpu-cache-policy     NX_GPU_CACHE_POLICY    wt/wb/nocache     wb
--gpu-cublas-init      NX_GPUCUBLASINIT       yes / no          Disabled
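
As an illustration of how the values in the table above might be used, the following sketch builds a process environment containing some of the CUDA variables and rejects values the table does not list. It is a launcher-side convenience, not part of the runtime: the names nanos_cuda_env, YES_NO_VARS and INTEGER_VARS are hypothetical, and the sketch only checks values against the table, not against what a particular runtime build accepts.

```python
import os

# Variables that accept "yes" or "no", taken from the table above.
YES_NO_VARS = {
    "NX_DISABLECUDA", "NX_GPUWARMUP", "NX_GPUPREFETCH", "NX_GPUOVERLAP",
    "NX_GPUOVERLAP_INPUTS", "NX_GPUOVERLAP_OUTPUTS", "NX_GPUCUBLASINIT",
}
# Variables that accept an integer.
INTEGER_VARS = {"NX_GPUS", "NX_GPUMAXMEM"}
# Cache policies listed in the table.
CACHE_POLICIES = {"wt", "wb", "nocache"}


def nanos_cuda_env(settings, base_env=None):
    """Return a copy of the environment with validated CUDA settings added.

    Hypothetical helper: raises ValueError for unknown names or values
    that the table does not list.
    """
    env = dict(os.environ if base_env is None else base_env)
    for name, value in settings.items():
        text = str(value)
        if name in YES_NO_VARS:
            if text not in ("yes", "no"):
                raise ValueError(f"{name} expects yes or no, got {text!r}")
        elif name in INTEGER_VARS:
            number = int(text)  # raises ValueError if not an integer
            if name == "NX_GPUMAXMEM" and number <= 0:
                raise ValueError("NX_GPUMAXMEM expects a positive integer")
        elif name == "NX_GPU_CACHE_POLICY":
            if text not in CACHE_POLICIES:
                raise ValueError(f"unknown cache policy {text!r}")
        else:
            raise ValueError(f"{name} is not a CUDA option in the table")
        env[name] = text
    return env


if __name__ == "__main__":
    env = nanos_cuda_env({"NX_GPUS": 2, "NX_GPUOVERLAP": "yes"}, base_env={})
    print(env)
```

The resulting dictionary can be passed as the env argument of subprocess.run when starting the application, for example subprocess.run(["./my_app"], env=env), where ./my_app stands for any program built against the runtime. Options left out of the dictionary keep the defaults listed in the table.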