6.2. Sert installation
----------------------
6.2.1. Access to the nodes
~~~~~~~~~~~~~~~~~~~~~~~~~~
Sert nodes are behind a gateway.
From outside, first ssh into the DAC gateway and, from there, ssh into the sert cluster:

.. code-block:: bash

   ssh gw.ac.upc.edu
   ssh sert

If you are already inside DAC's internal network, you can ssh directly into the login nodes:

.. code-block:: bash

   ssh sert
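To skip the manual two-step login, ssh can tunnel through the gateway automatically. A minimal sketch of a ``~/.ssh/config`` entry, assuming your username is the same on the gateway and the cluster (adjust ``User`` lines if not):

```
Host sert
    HostName sert
    ProxyJump gw.ac.upc.edu
    ForwardX11 yes
```

With this entry, ``ssh sert`` from outside the network jumps through ``gw.ac.upc.edu`` transparently, and ``ForwardX11`` covers the X11 forwarding needed later for ``srun.x11``.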
You should now be logged into an entry node.

.. note::

   Entry nodes lack the resources needed to build or run applications.
   You must log into a compute node in order to build an application.
   See :ref:`sert-connect-compute` or :ref:`sert-connect-fpga`.
There, you can access the scratch file system, which is shared across all the nodes in the cluster.
It is mounted on:

.. code-block:: text

   /scratch/nas/[N]/$USER

where ``N`` is either 3 or 4.
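For example, a per-user work path on the shared scratch can be derived as below (the NAS number ``4`` is purely illustrative; use whichever of 3 or 4 you were assigned):

```shell
# Build the per-user scratch path; N=4 is an example value, not a guarantee
N=4
SCRATCH="/scratch/nas/$N/${USER:-someuser}"
echo "$SCRATCH"
```

Because this path lives on the shared scratch, anything placed there is visible from every node in the cluster.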
6.2.2. Building applications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
6.2.2.1. Connecting to a compute node
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In order to build applications, you must connect to a compute node. Since the installation lives on the scratch file system, it is accessible from all cluster nodes.
You can launch an interactive session using:

.. code-block:: bash

   srun --cpus-per-task [N] --mem [M] --pty /bin/bash

Or, if you need graphical applications:

.. code-block:: bash

   srun.x11 --cpus-per-task [N] --mem [M]

where ``N`` is the number of CPUs to use and ``M`` is the requested memory in MB.
A fairly large amount of memory is needed to build a bitstream; around 24GB should be enough for the installed Alpha Data devices. The CPU count is less critical, since some phases of the bitstream generation process must run sequentially: between 8 and 12 cores should be enough in most cases.
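Putting those suggestions together, a build session request could look like the command assembled below (the script only prints the command, so you can inspect it before running it on an entry node; the values are starting points, not requirements):

```shell
# Suggested sizing for a bitstream build session:
# ~24 GB of RAM and 8 cores (8-12 is usually enough)
CPUS=8
MEM=24576   # in MB, ~24 GB
echo srun --cpus-per-task "$CPUS" --mem "$MEM" --pty /bin/bash
```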
6.2.2.2. Setting up the environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
An installation built from every component's git master branch is maintained by the CI system.
Users can use this installation by sourcing the initialization script:

.. code-block:: bash

   source /scratch/nas/4/pmtest/opt/modules/init/bash

Then load the required modules to set up the environment:

.. code-block:: bash

   module load vivado/2020.1 ompss/x86_fpga/git

From this point, all tools needed to build and run applications should be available.
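A quick way to confirm the environment is set up is to check that the expected tools are on ``PATH``. The helper below is a generic sketch (not part of the installation); pass it the tool names your build actually uses:

```shell
# check_tools: hypothetical helper that verifies each named tool is on PATH.
# Prints the first missing tool and fails, or reports success.
check_tools() {
  for t in "$@"; do
    command -v "$t" >/dev/null 2>&1 || { echo "missing: $t"; return 1; }
  done
  echo "all tools found"
}
```

For example, after loading the modules, ``check_tools vivado`` should report success.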
6.2.3. Running applications
~~~~~~~~~~~~~~~~~~~~~~~~~~~
6.2.3.1. Connecting to the FPGA nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There are only two FPGA nodes, ``sert-1001`` and ``sert-1002``.
These nodes are in the ``fpga`` slurm partition.
Use ``srun`` to open an interactive session on these nodes:

.. code-block:: bash

   srun -A fpga -p fpga --nodelist sert-[1001,1002] --cpus-per-task [1-24] --mem-per-cpu 900 --pty /bin/bash
The ``--mem-per-cpu`` option is needed when requesting many CPUs: by default,
slurm guarantees 2GB of RAM per requested CPU, which may be too much
for the FPGA nodes, as each one has 32GB of memory.
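The arithmetic below shows why 900MB per CPU is a safe value: even a maximal 24-CPU request stays within a node's 32GB, whereas the 2GB default would ask for 48GB:

```shell
# Total memory implied by --mem-per-cpu 900 on a full 24-CPU request
CPUS=24
MEM_PER_CPU=900                 # MB
TOTAL=$((CPUS * MEM_PER_CPU))   # 21600 MB, under the node's 32768 MB
echo "$TOTAL MB requested"
```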
To run an interactive session with X11 forwarding, execute the following command (remember to enable forwarding beforehand by passing the ``-X`` option to ssh when connecting to the gateway and to the sert cluster):

.. code-block:: bash

   srun.x11 -A fpga -p fpga --nodelist sert-[1001,1002] --cpus-per-task [1-24] --mem-per-cpu 900
.. note::

   Do not use FPGA nodes for general development. Since FPGA access must be exclusive, doing so prevents other users from accessing the resources. FPGA nodes are only intended for running and debugging FPGA applications.
6.2.3.2. Load bitstream
^^^^^^^^^^^^^^^^^^^^^^^
Once the bitstream is generated, load it with the ``flash`` tool from Alpha Data:

.. code-block:: bash

   /scratch/nas/4/pmtest/opt/admpcie7v3_sdk-1.0.0/host/util-v1_8_0b2/proj/linux/flash/flash program 0 path/to/bitstream

The bitstream files to load are the ones ending in ``.bit``.
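Since only ``.bit`` files should be passed to the flash tool, a small guard can catch the wrong file before flashing. The ``check_bitstream`` helper below is a hypothetical sketch, not part of the Alpha Data SDK:

```shell
# check_bitstream: hypothetical helper, not part of the Alpha Data SDK.
# Succeeds only when the argument looks like a .bit bitstream file.
check_bitstream() {
  case "$1" in
    *.bit) echo "ok: $1" ;;
    *)     echo "error: expected a .bit file, got: $1" >&2; return 1 ;;
  esac
}
```

For example, ``check_bitstream design.bit`` succeeds, while passing any other file name fails before the flash tool is ever invoked.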
6.2.3.3. Reboot sert-[1001,1002]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The flash application loads the bitstream into a flash memory, but it is
not loaded onto the FPGA until the node reboots. To do so, you need access to
the IPMI of the node you want to use. The ``reboot-sert.sh`` script will do the work:

.. code-block:: bash

   ./reboot-sert.sh # PASSWORD [-force]

where ``#`` is 1 or 2 for node 1001 or 1002, respectively.
6.2.4. Considerations when developing applications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All communication is done through the PCI bus. The current implementation of
xtasks uses a lock to give unique access to the PCI bus to only one thread.
This means that only one thread can send or receive tasks to/from the
FPGA. Therefore, using all threads of the node (24) will usually slow down
the application due to lock contention. Use the ``--smp-workers`` and
``--fpga-helper-threads`` flags (in ``NX_ARGS``) to control the number of
threads.
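As a sketch, a launch could size the runtime like this; the worker count is an illustrative starting point (well below the 24 cores, for the lock-contention reason above), not a tuned recommendation:

```shell
# Keep SMP workers well below the node's 24 cores to limit PCI lock contention;
# the values here are illustrative starting points, not tuned recommendations.
WORKERS=8
HELPERS=1   # threads dedicated to sending/receiving FPGA tasks
NX_ARGS="--smp-workers=$WORKERS --fpga-helper-threads=$HELPERS"
export NX_ARGS
echo "$NX_ARGS"
```

The application would then be launched from the same shell so it inherits ``NX_ARGS``; adjust ``WORKERS`` up or down depending on how much SMP work the application actually has.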