3.2.2. Accelerator placement options¶
This section documents how to constrain accelerators to a particular SLR of a device.
There are three flags that control accelerator placement:
- Constraints: --floorplanning_constr
- Slices: --slr_slices
- Configuration file: --placement_file
On an Alveo U200, which has three super logic regions (SLRs), external interfaces are placed as follows:
By default, Vivado decides where to place user accelerators. It sometimes spreads an accelerator kernel across two SLRs, which usually hurts timing. Users can constrain each accelerator to a single SLR to prevent it from being scattered across several. For instance, a user can specify something as follows:
Additionally, users can insert register slices at the SLR crossings to further help timing, at the cost of additional FPGA resources. Constraints and register slices are controlled by separate settings. For example, activating register slices for the previous design results in the following layout:
User flags¶
Constraints¶
Constraints affecting different sets of IPs can be individually enabled.
This is done by setting the --floorplanning_constr=<constraint level> flag, which takes one of four values:
[none],
acc,
static,
all.
These are specified as follows:
[none]¶
Nothing is constrained to a particular region. This is the default behavior.
It is selected by omitting the --floorplanning_constr flag.
acc¶
Accelerator kernels are constrained to an SLR.
static¶
Static logic is constrained to a particular region: each static logic IP is constrained to its relevant SLR. For instance, the PCI IP is constrained to the SLR that contains its I/O pins, which is SLR 1 on the U200.
Slices¶
Register slices can be automatically placed at SLR crossings to improve timing.
The --slr_slices flag controls this setting.
It can take four different values:
[none],
acc,
static,
all.
[none]¶
No register slices are created for SLR crossings. This is the default behavior.
It is selected by omitting the --slr_slices flag.
acc¶
Register slices for SLR crossings are created for accelerator-related interfaces:
- Accelerator - hw runtime
- Accelerator - DDR interconnect
static¶
Register slices are created for static logic (DDR MIGs, PCI, communication infrastructure, etc.).
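As a rough illustration of how the constraint and slice flags combine, the Python sketch below assembles the corresponding argument list. The helper name `placement_args` is hypothetical and not part of the toolchain; only the flag names and the values acc, static and all come from this section. It also assumes that --slr_slices accepts the same `=<value>` form shown for --floorplanning_constr. Passing `None` omits a flag, which corresponds to the default [none] behavior.

```python
VALID_LEVELS = ("acc", "static", "all")


def placement_args(constr=None, slices=None):
    """Return command-line arguments for the two placement flags.

    Hypothetical helper: passing None omits the flag, which is the
    documented way to get the default [none] behavior.
    """
    args = []
    for flag, level in (("--floorplanning_constr", constr),
                        ("--slr_slices", slices)):
        if level is None:
            continue  # flag omitted: default [none] setting
        if level not in VALID_LEVELS:
            raise ValueError(
                f"{flag}: expected one of {VALID_LEVELS}, got {level!r}")
        # Assumption: both flags use the --flag=value form.
        args.append(f"{flag}={level}")
    return args


if __name__ == "__main__":
    # Example: accelerator constraints plus the "all" slice setting.
    print(" ".join(placement_args(constr="acc", slices="all")))
```

Rejecting unknown values before the build starts is a design choice of this sketch. A typo in a flag value is cheaper to catch here than after a long implementation run.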
Configuration file¶
The configuration file is a JSON file that determines the placement of each accelerator instance.
It is specified using the --placement_file option.
It should contain a dictionary keyed by accelerator type.
Each accelerator type must map to a list of SLR numbers, one per instance,
indicating where each instance is going to be placed.
For instance:
{
"calculate_forces_BLOCK" : [0, 0, 1, 2, 2],
"solve_nbody_task": [1],
"update_particles_BLOCK": [1]
}
This constrains 2 of the 5 calculate_forces_BLOCK accelerators to SLR0,
one of them to SLR1 and the remaining 2 to SLR2.
The single solve_nbody_task and update_particles_BLOCK instances are both placed in SLR1.
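A placement file can be sanity-checked before it is passed to --placement_file. The following Python sketch is one possible way to do so, not part of the toolchain. The function name `check_placement` and the constant `NUM_SLRS` are hypothetical, and the check assumes SLRs are numbered from 0 and that the device has three of them, as on the Alveo U200.

```python
import json
from collections import Counter

NUM_SLRS = 3  # Assumption: Alveo U200 with SLR0..SLR2; adjust per device.


def check_placement(placement, num_slrs=NUM_SLRS):
    """Validate a placement dictionary and count instances per SLR."""
    per_slr = Counter()
    for acc_type, slrs in placement.items():
        if not isinstance(slrs, list) or not slrs:
            raise ValueError(
                f"{acc_type}: expected a non-empty list of SLR numbers")
        for slr in slrs:
            # bool is a subclass of int in Python, so reject it explicitly.
            if (isinstance(slr, bool) or not isinstance(slr, int)
                    or not 0 <= slr < num_slrs):
                raise ValueError(f"{acc_type}: invalid SLR number {slr!r}")
            per_slr[slr] += 1
    return dict(per_slr)


if __name__ == "__main__":
    text = """
    {
        "calculate_forces_BLOCK": [0, 0, 1, 2, 2],
        "solve_nbody_task": [1],
        "update_particles_BLOCK": [1]
    }
    """
    # Maps each SLR number to the number of instances assigned to it.
    print(check_placement(json.loads(text)))
```

The per-SLR count gives a quick view of how evenly instances are spread. The sketch does not check that the number of entries matches the number of instances actually generated for each accelerator type, since that information lives outside the file.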