5.2. Sert installation

5.2.1. Access to the nodes

Sert nodes are behind a gateway. First connect via ssh to the DAC gateway, and from there ssh into the sert cluster:

ssh gw.ac.upc.edu

Or if you are already inside DAC’s internal network, you can just ssh into the login nodes:

ssh sert
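
If you connect often, ssh can chain both hops automatically with a ProxyJump entry. A minimal sketch of an ~/.ssh/config entry, assuming your_user is your account name on both machines:

Host sert
    HostName sert
    User your_user
    ProxyJump your_user@gw.ac.upc.edu

With this entry, ssh sert also works from outside DAC's internal network.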

You should now be logged into an entry node.

Note

Entry nodes lack the resources needed to build or run applications. You must log into a compute node in order to build an application. See :ref:`sert-connect-compute` or :ref:`sert-connect-fpga`.

There, you can access the scratch file system, which is shared across all the nodes in the cluster.

It is mounted on

/scratch/nas/[N]/$USER

where N is either 3 or 4.
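
For example, to create a working directory on scratch (assuming your directory lives under nas/4; use nas/3 otherwise, and note that the directory name ompss-apps is just an example):

mkdir -p /scratch/nas/4/$USER/ompss-apps
cd /scratch/nas/4/$USER/ompss-apps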

5.2.2. Building applications

5.2.2.1. Connecting to a compute node

In order to build applications, you must connect to a compute node. Since the installation lives in the scratch file system, it is accessible from all cluster nodes.

You can launch an interactive session using:

srun --cpus-per-task [N] --mem [M] --pty /bin/bash

Or if you need graphic applications:

srun.x11 --cpus-per-task [N] --mem [M]

where N is the number of CPUs to use and M is the requested memory in MB.

A fairly large amount of memory is needed to build a bitstream; around 24GB should be enough for the installed Alpha Data devices. The number of CPUs is not critical, since some phases of the bitstream generation process run sequentially.

Using between 8 and 12 cores should be enough in most cases.
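
For example, a session sized for bitstream generation, with 8 CPUs and 24GB of memory (expressed in MB), would be:

srun --cpus-per-task 8 --mem 24576 --pty /bin/bash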

5.2.2.2. Setting up the environment

An installation built from the git master branch of all components is maintained by the CI system.

Users can use this installation by sourcing the initialization script:

source /scratch/nas/4/pmtest/opt/modules/init/bash

Then load the needed modules to set up the environment:

module load vivado/2020.1 ompss/x86_fpga/git

This sets up the environment for the toolchain: from this point, all tools needed to build and run applications should be available.
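
As a quick sanity check, you can verify that the loaded modules are active and the tools are on your PATH:

module list
which vivado

The first command should list vivado/2020.1 and ompss/x86_fpga/git; the second should resolve to the Vivado 2020.1 installation.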

5.2.3. Running applications

5.2.3.1. Connecting to the FPGA nodes

There are only two FPGA nodes, sert-1001 and sert-1002. These nodes are in the fpga slurm partition. Use srun to open an interactive session on one of these nodes:

srun -A fpga -p fpga --nodelist sert-[1001,1002] --cpus-per-task [1-24] --mem-per-cpu 900 --pty /bin/bash

The --mem-per-cpu option is needed when requesting many CPUs: by default, slurm guarantees 2GB of RAM per requested CPU, which may be too much for the FPGA nodes, as each of them has 32GB of memory.
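
For example, requesting all 24 CPUs at 900MB each allocates about 21GB, which fits within the 32GB of an FPGA node:

srun -A fpga -p fpga --nodelist sert-1001 --cpus-per-task 24 --mem-per-cpu 900 --pty /bin/bash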

To run an interactive session with X11 forwarding, execute the following command (remember to enable forwarding beforehand by passing the -X option to ssh when connecting to the gateway and to the sert cluster):

srun.x11 -A fpga -p fpga --nodelist sert-[1001,1002] --cpus-per-task [1-24] --mem-per-cpu 900

Note

Do not use FPGA nodes for general development. Since FPGA access must be exclusive, doing so prevents other users from accessing the resource. FPGA nodes are only intended for running and debugging FPGA applications.

5.2.3.2. Load bitstream

Once the bitstream is generated, load it with the flash tool from Alpha Data:

/scratch/nas/4/pmtest/opt/admpcie7v3_sdk-1.0.0/host/util-v1_8_0b2/proj/linux/flash/flash program 0 path/to/bitstream

The bitstream files to load are the ones ending in .bit.
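
For example, assuming your bitstream ended up in your scratch directory (the bitstream path below is illustrative):

FLASH=/scratch/nas/4/pmtest/opt/admpcie7v3_sdk-1.0.0/host/util-v1_8_0b2/proj/linux/flash/flash
$FLASH program 0 /scratch/nas/4/$USER/my_app/my_app.bit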

5.2.3.3. Reboot sert-[1001,1002]

The flash application loads the bitstream into a flash memory, but it is not loaded onto the FPGA until the node reboots. To do so, you need access to the IPMI of the node you want to use. The reboot-sert.sh script does the work:

./reboot-sert.sh # PASSWORD [-force]

where '#' is 1 or 2 for node sert-1001 or sert-1002, respectively.
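
For example, to reboot sert-1002 (the password shown is a placeholder for the actual IPMI password):

./reboot-sert.sh 2 MY_IPMI_PASSWORD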

5.2.4. Considerations when developing applications

All communication is done through PCIe. The current implementation of xtasks uses a lock so that only one thread at a time can access the PCIe device. This means that only one thread can send tasks to, or receive tasks from, the FPGA at any given time. Therefore, using all 24 threads of the node will usually slow the application down due to lock contention. Use the --smp-workers and --fpga-helper-threads flags (in NX_ARGS) to control the number of threads.
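
For example, a run that limits the runtime to a few SMP workers and one FPGA helper thread might look like this (the application name is illustrative):

NX_ARGS="--smp-workers=4 --fpga-helper-threads=1" ./my_fpga_app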