# SLURM Inception

Like the movie, this repository lets you run a SLURM daemon inside an existing
SLURM system, so you can quickly launch your own jobs within your reservation
using a SLURM version that you control.

These scripts use the nix installation in `/gpfs/projects/bsc15/nix` to access
the SLURM binaries, as well as the tools to enter the namespace that sets up
the `/nix` directory.

All the jobs you launch in the inner SLURM system will have the /nix store
available.

## Usage

First, allocate `$N` nodes in the system:

		login$ salloc -N $N

Then, run the `start.sh` script in all nodes:

		compute$ srun ./start.sh &

Notice the ampersand: it puts srun in the background, so you regain control of
the shell on the first compute node. Now run the `shell.sh` script to open a
shell from which you can submit jobs to the inception SLURM.

		compute$ ./shell.sh
		[inception] compute$

Notice the `[inception]` mark in the prompt, which helps you remember where you are.
In that shell, all SLURM commands refer to the inception SLURM daemon:

		[inception] compute$ sinfo
		PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
		main*        up   infinite      2   idle s21r2b[47,55]
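
Before submitting work, it can be useful to check that every node has
registered with the inner daemon. The following is a minimal sketch and not
part of this repository: the `count_idle` name is made up for illustration, and
it assumes input in the form printed by `sinfo -h -o "%D %t"` (a node count and
a compact state per line).

```shell
# Hypothetical helper (not part of this repository): sum the node counts
# of all lines whose state is exactly "idle". Expects lines of the form
# "<node count> <state>" on stdin, as printed by: sinfo -h -o "%D %t"
count_idle() {
    awk '$2 == "idle" { n += $1 } END { print n + 0 }'
}

# Inside the inception shell you could then run, for example:
#   sinfo -h -o "%D %t" | count_idle
```

States with a suffix, such as `idle*`, are deliberately not counted, since the
suffix marks nodes that need attention.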

To run MPI programs, make sure you set the `--mpi=pmi2` option in srun:

		[inception] compute$ srun -N 2 --mpi=pmi2 IMB-MPI1 pingpong
		#------------------------------------------------------------
		#    Intel (R) MPI Benchmarks 2017 update 2, MPI-1 part    
		#------------------------------------------------------------
		# Date                  : Fri Jan 13 15:43:49 2023
		# Machine               : x86_64
		# System                : Linux
		# Release               : 4.4.59-92.20-default
		# Version               : #1 SMP Wed May 31 14:05:24 UTC 2017 (8cd473d)
		# MPI Version           : 3.1
		# MPI Thread Environment: 
		...
		#---------------------------------------------------
		# Benchmarking PingPong 
		# #processes = 2 
		#---------------------------------------------------
			   #bytes #repetitions      t[usec]   Mbytes/sec
					0         1000         1.51         0.00
					1         1000         1.51         0.66
					2         1000         1.51         1.33
					4         1000         1.51         2.66
					8         1000         1.51         5.30
				   16         1000         1.70         9.44
				   32         1000         1.70        18.80
				   64         1000         1.69        37.78
				  128         1000         1.72        74.46
				  256         1000         1.75       146.00
				  512         1000         1.83       279.86
				 1024         1000         1.98       518.34
				 2048         1000         2.24       915.28
				 4096         1000         2.75      1489.24
				 8192         1000         3.92      2087.41 <- notice
				16384         1000         7.33      2235.03
				32768         1000         9.13      3589.84
				65536          640        17.65      3714.06
			   131072          320        23.98      5465.19
			   262144          160        36.53      7175.80
			   524288           80        62.30      8415.50
			  1048576           40       114.09      9190.84
			  2097152           20       224.97      9321.85
			  4194304           10       418.15     10030.61

As a bonus, you also have direct access to the /nix store:

		[inception] compute$ srun -N2 --mpi=pmi2 /nix/store/lg0xzcfkd6fh09f238djjfc684cy4d9n-osu-micro-benchmarks-5.7/bin/osu_bw     
		...
		# OSU MPI Bandwidth Test v5.7
		# Size      Bandwidth (MB/s)
		1                       3.50
		2                       7.01
		4                      13.87
		8                      28.06
		16                     47.73
		32                     95.71
		64                    193.10
		128                   377.90
		256                   742.02
		512                  1410.46
		1024                 2556.12
		2048                 4203.39
		4096                 6072.21
		8192                 7539.24 <- yay
		16384                6160.90
		32768                8649.04
		65536                8705.41
		131072               8862.76
		262144               8940.29
		524288               8980.80
		1048576              9001.11
		2097152              9017.25
		4194304              9024.85


To exit and terminate the SLURM inception daemon, first exit the inception
shell:

		[inception] compute$ exit

Then bring the background srun job back to the foreground and press `^C` twice:

		compute$ fg
		^Csrun: interrupt (one more within 1 sec to abort)
		srun: StepId=27021659.1 tasks 0-1: running
		^Csrun: sending Ctrl-C to StepId=27021659.1
		srun: forcing job termination
		srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
		slurmstepd: error: *** STEP 27021659.1 ON s21r2b47 CANCELLED AT 2023-01-13T15:53:47 ***
		srun: launch/slurm: _step_signal: Terminating StepId=27021659.1

You can also use `scancel` from the `login0` node to kill the outer job
allocation.
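
As a rough sketch of that approach, the helper below looks up your jobs by name
in the outer SLURM and cancels them. It is not part of this repository: the
`cancel_outer_jobs` name is hypothetical, and it assumes you named the
allocation yourself, for example with `salloc -J inception -N $N`.

```shell
# Hypothetical helper (not part of this repository): cancel every job of
# the current user whose job name is $1. Run it on the login node, outside
# the inception shell, so that squeue and scancel talk to the outer SLURM.
cancel_outer_jobs() {
    # -h: no header, -u: only this user's jobs,
    # -n: filter by job name, -o "%i": print only the job id
    for jobid in $(squeue -h -u "$(id -un)" -n "$1" -o "%i"); do
        scancel "$jobid"
    done
}

# Example, assuming the allocation was named "inception":
#   cancel_outer_jobs inception
```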

## Configuration

You can tune the configuration of the inner SLURM by changing the
`slurm.base.conf` file. Note that some options will be appended.
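
To illustrate what appending options to a base file could look like, here is a
minimal sketch. It is an assumption, not the actual logic of `start.sh`: the
`render_conf` name is hypothetical, and the appended `SlurmctldHost`,
`NodeName` and `PartitionName` lines are only guesses at the kind of
per-allocation options that cannot be known in advance.

```shell
# Hypothetical sketch (not the repository's real script): print the base
# configuration followed by options that depend on the current allocation.
#   $1: path to the base configuration file
#   $2: host that runs the inner controller daemon
#   $3: node list of the allocation, e.g. 's21r2b[47,55]'
render_conf() {
    cat "$1"
    printf 'SlurmctldHost=%s\n' "$2"
    printf 'NodeName=%s\n' "$3"
    printf 'PartitionName=main Nodes=%s Default=YES\n' "$3"
}

# Example:
#   render_conf slurm.base.conf s21r2b47 's21r2b[47,55]' > slurm.conf
```

Whatever the real scripts append, an option you set in `slurm.base.conf` may
end up alongside a generated value for the same key, so check the final
configuration if a change does not seem to take effect.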