3.2.1. AIT options

AIT's behavior can be modified through the options below. The AIT help output summarizes and briefly describes them:

usage: ait -b BOARD -n NAME

The Accelerator Integration Tool (AIT) automatically integrates OmpSs@FPGA accelerators into FPGA designs using different vendor backends.

Required:
-b BOARD, --board BOARD

board model. Supported boards by vendor:
xilinx: alveo_u200, alveo_u250, axiom, com_express, euroexa_maxilink, euroexa_maxilink_quad, simulation, zcu102, zedboard, zybo, zynq702, zynq706

-n NAME, --name NAME

project name

Generation flow:
-d DIR, --dir DIR

path where the project directory tree will be created (def: ‘./’)

--disable_IP_caching

disable IP caching. Significantly increases generation time

--disable_utilization_check

disable resources utilization check during HLS generation

--disable_board_support_check

disable board support check

--from_step FROM_STEP

initial generation step. Generation steps by vendor:
xilinx: HLS, design, synthesis, implementation, bitstream, boot
(def: ‘HLS’)

--IP_cache_location IP_CACHE_LOCATION

path where the IP cache will be located (def: ‘/var/tmp/ait/<vendor>/IP_cache/’)

--to_step TO_STEP

final generation step. Generation steps by vendor:
xilinx: HLS, design, synthesis, implementation, bitstream, boot
(def: ‘bitstream’)

Bitstream configuration:
-c CLOCK, --clock CLOCK

FPGA clock frequency in MHz (def: ‘100’)

--hwcounter

add a hardware counter to the bitstream

--wrapper_version WRAPPER_VERSION

version of accelerator wrapper shell. This information will be placed in the bitstream information

--bitinfo_note BITINFO_NOTE

custom note to add to the bitInfo

Data path:
--datainterfaces_map DATAINTERFACES_MAP

path of mappings file for the data interfaces

--memory_interleaving_stride MEM_INTERLEAVING_STRIDE

size in bytes of the stride of the memory interleaving. By default there is no interleaving

--enable_memory_bonding

bond memory channels to increase data throughput. By default there is no bonding

Hardware Runtime:
--cmdin_queue_len CMDIN_QUEUE_LEN

maximum length (64-bit words) of the queue for the hwruntime command in. This argument is mutually exclusive with --cmdin_subqueue_len

--cmdin_subqueue_len CMDIN_SUBQUEUE_LEN

length (64-bit words) of each accelerator subqueue for the hwruntime command in. This argument is mutually exclusive with --cmdin_queue_len. Must be a power of 2 (def: max(64, 1024/num_accs))

--cmdout_queue_len CMDOUT_QUEUE_LEN

maximum length (64-bit words) of the queue for the hwruntime command out. This argument is mutually exclusive with --cmdout_subqueue_len

--cmdout_subqueue_len CMDOUT_SUBQUEUE_LEN

length (64-bit words) of each accelerator subqueue for the hwruntime command out. This argument is mutually exclusive with --cmdout_queue_len. Must be a power of 2 (def: max(64, 1024/num_accs))
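The constraints on the command in/out queue options above (mutual exclusivity, power-of-2 subqueue length, and the default formula) can be sketched in Python as follows. This is not AIT's own code: the function names are hypothetical, and reading `1024/num_accs` as an integer (floor) division is an assumption. The help text does not say how AIT treats accelerator counts that are not a power of 2.

```python
def is_power_of_two(n):
    """Return True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def default_subqueue_len(num_accs):
    """Default from the help text: max(64, 1024/num_accs).

    Assumption: the division is an integer (floor) division.
    """
    if num_accs < 1:
        raise ValueError("num_accs must be at least 1")
    return max(64, 1024 // num_accs)


def check_cmd_queue_args(queue_len=None, subqueue_len=None):
    """Check one queue_len/subqueue_len pair (hypothetical helper)."""
    if queue_len is not None and subqueue_len is not None:
        raise ValueError("queue_len and subqueue_len are mutually exclusive")
    if subqueue_len is not None and not is_power_of_two(subqueue_len):
        raise ValueError("subqueue_len must be a power of 2")


if __name__ == "__main__":
    for accs in (1, 4, 16, 32):
        print(accs, default_subqueue_len(accs))
```

Under these assumptions, the default shrinks as accelerators are added until it reaches the floor of 64 words at 16 accelerators.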

--disable_spawn_queues

disable the hwruntime spawn in/out queues

--spawnin_queue_len SPAWNIN_QUEUE_LEN

length (64-bit words) of the hwruntime spawn in queue. Must be a power of 2 (def: ‘1024’)

--spawnout_queue_len SPAWNOUT_QUEUE_LEN

length (64-bit words) of the hwruntime spawn out queue. Must be a power of 2 (def: ‘1024’)

--hwruntime_interconnect HWR_INTERCONNECT

type of hardware runtime interconnection with accelerators: centralized or distributed (def: ‘centralized’)

--max_args_per_task MAX_ARGS_PER_TASK

maximum number of arguments for any task in the bitstream (def: ‘15’)

--max_deps_per_task MAX_DEPS_PER_TASK

maximum number of dependencies for any task in the bitstream (def: ‘8’)

--max_copies_per_task MAX_COPIES_PER_TASK

maximum number of copies for any task in the bitstream (def: ‘15’)

Picos:
--picos_num_dcts NUM_DCTS

number of DCTs instantiated (def: ‘1’)

--picos_tm_size PICOS_TM_SIZE

size of the TM memory (def: ‘128’)

--picos_dm_size PICOS_DM_SIZE

size of the DM memory (def: ‘512’)

--picos_vm_size PICOS_VM_SIZE

size of the VM memory (def: ‘512’)

--picos_dm_ds DATA_STRUCT

data structure of the DM memory
BINTREE: Binary search tree (not autobalanced)
LINKEDLIST: Linked list
(def: ‘BINTREE’)

--picos_dm_hash HASH_FUN

hashing function applied to dependence addresses
P_PEARSON: Parallel Pearson function
XOR
(def: ‘P_PEARSON’)

--picos_hash_t_size PICOS_HASH_T_SIZE

DCT hash table size (def: ‘64’)

User-defined files:
--user_constraints USER_CONSTRAINTS

path of user defined constraints file

--user_pre_design USER_PRE_DESIGN

path of user TCL script to be executed before the design step (not after the board base design)

--user_post_design USER_POST_DESIGN

path of user TCL script to be executed after the design step

Miscellaneous:
-h, --help

show this help message and exit

-i, --verbose_info

print extra information messages

-j JOBS, --jobs JOBS

specify the number of jobs to run simultaneously. By default it will use as many jobs as cores with at least 3GB of dedicated free memory, or the value returned by nproc, whichever is less.
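One possible reading of the default job count described above is sketched below in Python. It assumes that "cores with at least 3GB of dedicated free memory" means the free memory divided by 3 GiB, and it adds a floor of one job; both are assumptions, and `default_jobs` is a hypothetical name rather than an AIT function.

```python
GIB = 1024 ** 3


def default_jobs(free_mem_bytes, nproc, mem_per_job_bytes=3 * GIB):
    """Estimate the default for -j/--jobs (sketch, not AIT's implementation)."""
    # Number of jobs that can each get the required amount of free memory.
    by_memory = free_mem_bytes // mem_per_job_bytes
    # Take whichever limit is lower; never return fewer than one job (assumption).
    return max(1, min(by_memory, nproc))


if __name__ == "__main__":
    # 16 GiB free on an 8-core machine: memory is the limiting factor.
    print(default_jobs(16 * GIB, 8))
```

Passing `-j` explicitly overrides whatever default AIT computes.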

-k, --keep_files

keep files on error

-v, --verbose

print vendor backend messages

--version

print AIT version and exit

Xilinx-specific arguments:
--floorplanning_constr FLOORPLANNING_CONSTR

built-in floorplanning constraints for accelerators and static logic
acc: accelerator kernels are constrained to a SLR region
static: each static logic IP is constrained to its relevant SLR
all: enables both ‘acc’ and ‘static’ options
By default no floorplanning constraints are used

--placement_file PLACEMENT_FILE

json file specifying accelerator placement

--slr_slices SLR_SLICES

enable SLR crossing register slices
acc: create register slices for SLR crossing on accelerator-related interfaces
static: create register slices for static logic IPs
all: enable both ‘acc’ and ‘static’ options
By default they are disabled

--interconnect_regslice INTER_REGSLICE_LIST [INTER_REGSLICE_LIST …]

enable register slices on AXI interconnects
all: enables them on all interconnects
mem: enables them on interconnects in memory datapath
hwruntime: enables them on the AXI-stream interconnects between the hwruntime and the accelerators

--interconnect_opt OPT_STRATEGY

AXI interconnect optimization strategy: Minimize ‘area’ or maximize ‘performance’ (def: ‘area’)

--interconnect_priorities

enable priorities in the memory interconnect

--simplify_interconnection

simplify interconnection between accelerators and memory. Might negatively impact timing

--debug_intfs INTF_TYPE

choose which interfaces to mark for debug and instantiate the corresponding ILA cores
AXI: debug accelerator’s AXI interfaces
stream: debug accelerator’s AXI-Stream interfaces
both: debug both accelerator’s AXI and AXI-Stream interfaces
custom: debug user-defined interfaces
none: do not mark any interface for debug
(def: ‘none’)

--debug_intfs_list DEBUG_INTFS_LIST

path of file with the list of interfaces to debug

--ignore_eng_sample

ignore engineering sample status from chip part number

--target_language TARGET_LANG

choose target language to synthesize files to: VHDL or Verilog (def: ‘VHDL’)

environment variables:

PETALINUX_INSTALL path where Petalinux is installed
PETALINUX_BUILD path where the Petalinux project is located
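Before launching AIT from a script, it can be useful to check whether the two environment variables listed above are set. The Python sketch below does only that; `missing_petalinux_vars` is a hypothetical helper, and the help text does not state which generation steps actually read these variables.

```python
import os

# Variable names taken from the AIT help text.
PETALINUX_VARS = ("PETALINUX_INSTALL", "PETALINUX_BUILD")


def missing_petalinux_vars(env=None):
    """Return the Petalinux variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in PETALINUX_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_petalinux_vars()
    if missing:
        print("not set:", ", ".join(missing))
    else:
        print("Petalinux environment variables are set")
```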