3.2.1. AIT options
AIT behavior can be modified through the options listed here. They are summarized and briefly described in the AIT help output, reproduced below:
usage: ait -b BOARD -n NAME
The Accelerator Integration Tool (AIT) automatically integrates OmpSs@FPGA accelerators into FPGA designs using different vendor backends.
- Required:
- -b BOARD, --board BOARD
board model. Supported boards by vendor: xilinx: alveo_u200, alveo_u250, axiom, com_express, euroexa_maxilink, euroexa_maxilink_quad, simulation, zcu102, zedboard, zybo, zynq702, zynq706
- -n NAME, --name NAME
project name
- Generation flow:
- -d DIR, --dir DIR
path where the project directory tree will be created (def: ‘./’)
- --disable_IP_caching
disable IP caching. Significantly increases generation time
- --disable_utilization_check
disable resources utilization check during HLS generation
- --disable_board_support_check
disable board support check
- --from_step FROM_STEP
initial generation step. Generation steps by vendor: xilinx: HLS, design, synthesis, implementation, bitstream, boot (def: ‘HLS’)
- --IP_cache_location IP_CACHE_LOCATION
path where the IP cache will be located (def: ‘/var/tmp/ait/<vendor>/IP_cache/’)
- --to_step TO_STEP
final generation step. Generation steps by vendor: xilinx: HLS, design, synthesis, implementation, bitstream, boot (def: ‘bitstream’)
- Bitstream configuration:
- -c CLOCK, --clock CLOCK
FPGA clock frequency in MHz (def: ‘100’)
- --hwcounter
add a hardware counter to the bitstream
- --wrapper_version WRAPPER_VERSION
version of accelerator wrapper shell. This information will be placed in the bitstream information
- --bitinfo_note BITINFO_NOTE
custom note to add to the bitInfo
- Data path:
- --datainterfaces_map DATAINTERFACES_MAP
path of mappings file for the data interfaces
- --memory_interleaving_stride MEM_INTERLEAVING_STRIDE
size in bytes of the stride of the memory interleaving. By default there is no interleaving
- --enable_memory_bonding
bond memory channels to increase data throughput. By default there is no bonding
- Hardware Runtime:
- --cmdin_queue_len CMDIN_QUEUE_LEN
maximum length (64-bit words) of the queue for the hwruntime command in. This argument is mutually exclusive with --cmdin_subqueue_len
- --cmdin_subqueue_len CMDIN_SUBQUEUE_LEN
length (64-bit words) of each accelerator subqueue for the hwruntime command in. This argument is mutually exclusive with --cmdin_queue_len. Must be power of 2. Def. max(64, 1024/num_accs)
- --cmdout_queue_len CMDOUT_QUEUE_LEN
maximum length (64-bit words) of the queue for the hwruntime command out. This argument is mutually exclusive with --cmdout_subqueue_len
- --cmdout_subqueue_len CMDOUT_SUBQUEUE_LEN
length (64-bit words) of each accelerator subqueue for the hwruntime command out. This argument is mutually exclusive with --cmdout_queue_len. Must be power of 2. Def. max(64, 1024/num_accs)
- --disable_spawn_queues
disable the hwruntime spawn in/out queues
- --spawnin_queue_len SPAWNIN_QUEUE_LEN
length (64-bit words) of the hwruntime spawn in queue. Must be power of 2 (def: ‘1024’)
- --spawnout_queue_len SPAWNOUT_QUEUE_LEN
length (64-bit words) of the hwruntime spawn out queue. Must be power of 2 (def: ‘1024’)
- --hwruntime_interconnect HWR_INTERCONNECT
type of hardware runtime interconnection with accelerators: centralized or distributed (def: ‘centralized’)
- --max_args_per_task MAX_ARGS_PER_TASK
maximum number of arguments for any task in the bitstream (def: ‘15’)
- --max_deps_per_task MAX_DEPS_PER_TASK
maximum number of dependencies for any task in the bitstream (def: ‘8’)
- --max_copies_per_task MAX_COPIES_PER_TASK
maximum number of copies for any task in the bitstream (def: ‘15’)
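The sizing rules for the command in/out queues can be illustrated with a short sketch. The helper names (`is_power_of_two`, `default_subqueue_len`, `check_queue_args`) are hypothetical and not part of AIT, and the sketch assumes that `1024/num_accs` is an integer division, which the help text does not state.

```python
def is_power_of_two(n: int) -> bool:
    """Return True when n is a positive power of two (1, 2, 4, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def default_subqueue_len(num_accs: int) -> int:
    """Documented default: max(64, 1024/num_accs).

    Assumption: the division is an integer (floor) division.
    """
    if num_accs < 1:
        raise ValueError("num_accs must be at least 1")
    return max(64, 1024 // num_accs)


def check_queue_args(queue_len=None, subqueue_len=None):
    """Mirror the documented constraints for one direction (command in or out)."""
    if queue_len is not None and subqueue_len is not None:
        raise ValueError("queue_len and subqueue_len are mutually exclusive")
    if subqueue_len is not None and not is_power_of_two(subqueue_len):
        raise ValueError("subqueue_len must be a power of 2")


if __name__ == "__main__":
    for accs in (1, 4, 16, 32):
        print(accs, default_subqueue_len(accs))
```

Under these assumptions, 4 accelerators give a default subqueue of 256 words, and 16 or more accelerators hit the 64-word floor.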
- Picos:
- --picos_num_dcts NUM_DCTS
number of DCTs instantiated (def: ‘1’)
- --picos_tm_size PICOS_TM_SIZE
size of the TM memory (def: ‘128’)
- --picos_dm_size PICOS_DM_SIZE
size of the DM memory (def: ‘512’)
- --picos_vm_size PICOS_VM_SIZE
size of the VM memory (def: ‘512’)
- --picos_dm_ds DATA_STRUCT
data structure of the DM memory
BINTREE: Binary search tree (not autobalanced)
LINKEDLIST: Linked list
(def: ‘BINTREE’)
- --picos_dm_hash HASH_FUN
hashing function applied to dependence addresses
P_PEARSON: Parallel Pearson function
XOR
(def: ‘P_PEARSON’)
- --picos_hash_t_size PICOS_HASH_T_SIZE
DCT hash table size (def: ‘64’)
- User-defined files:
- --user_constraints USER_CONSTRAINTS
path of user-defined constraints file
- --user_pre_design USER_PRE_DESIGN
path of user TCL script to be executed before the design step (not after the board base design)
- --user_post_design USER_POST_DESIGN
path of user TCL script to be executed after the design step
- Miscellaneous:
- -h, --help
show this help message and exit
- -i, --verbose_info
print extra information messages
- -j JOBS, --jobs JOBS
specify the number of jobs to run simultaneously. By default it will use as many jobs as cores with at least 3GB of dedicated free memory, or the value returned by nproc, whichever is less.
- -k, --keep_files
keep files on error
- -v, --verbose
print vendor backend messages
- --version
print AIT version and exit
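The default for -j can be read as the smaller of two limits: how many 3GB slices of free memory are available, and the processor count reported by nproc. The sketch below is one possible reading, not AIT's actual implementation. The function name `default_jobs`, the treatment of 3GB as 3 GiB, and the lower bound of one job are all assumptions.

```python
GIB = 1024 ** 3  # assumption: "3GB" in the help text is treated as 3 GiB


def default_jobs(free_mem_bytes: int, nproc: int, mem_per_job: int = 3 * GIB) -> int:
    """Estimate a default job count from free memory and processor count."""
    # One job per full 3 GiB slice of free memory.
    jobs_by_memory = free_mem_bytes // mem_per_job
    # Take the smaller limit; never go below one job (assumption).
    return max(1, min(jobs_by_memory, nproc))


if __name__ == "__main__":
    print(default_jobs(16 * GIB, 8))
    print(default_jobs(64 * GIB, 8))
```

For instance, under these assumptions a machine reporting 8 processors and 16 GiB of free memory would get 5 jobs, while the same machine with 64 GiB free would be limited to 8 by the processor count.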
- Xilinx-specific arguments:
- --floorplanning_constr FLOORPLANNING_CONSTR
built-in floorplanning constraints for accelerators and static logic
acc: accelerator kernels are constrained to an SLR region
static: each static logic IP is constrained to its relevant SLR
all: enables both ‘acc’ and ‘static’ options
By default no floorplanning constraints are used
- --placement_file PLACEMENT_FILE
json file specifying accelerator placement
- --slr_slices SLR_SLICES
enable SLR crossing register slices
acc: create register slices for SLR crossing on accelerator-related interfaces
static: create register slices for static logic IPs
all: enable both ‘acc’ and ‘static’ options
By default they are disabled
- --interconnect_regslice INTER_REGSLICE_LIST [INTER_REGSLICE_LIST …]
enable register slices on AXI interconnects
all: enables them on all interconnects
mem: enables them on interconnects in memory datapath
hwruntime: enables them on the AXI-stream interconnects between the hwruntime and the accelerators
- --interconnect_opt OPT_STRATEGY
AXI interconnect optimization strategy: Minimize ‘area’ or maximize ‘performance’ (def: ‘area’)
- --interconnect_priorities
enable priorities in the memory interconnect
- --simplify_interconnection
simplify interconnection between accelerators and memory. Might negatively impact timing
- --debug_intfs INTF_TYPE
choose which interfaces to mark for debug and instantiate the corresponding ILA cores
AXI: debug accelerator’s AXI interfaces
stream: debug accelerator’s AXI-Stream interfaces
both: debug both accelerator’s AXI and AXI-Stream interfaces
custom: debug user-defined interfaces
none: do not mark any interface for debug
(def: ‘none’)
- --debug_intfs_list DEBUG_INTFS_LIST
path of file with the list of interfaces to debug
- --ignore_eng_sample
ignore engineering sample status from chip part number
- --target_language TARGET_LANG
choose target language to synthesize files to: VHDL or Verilog (def: ‘VHDL’)
- Environment variables:
- PETALINUX_INSTALL
path where Petalinux is installed
- PETALINUX_BUILD
path where the Petalinux project is located
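As an illustration of how the required options combine with the generation-flow ones, the following sketch assembles an `ait` command line and checks the step range before running anything. It uses only flags listed in the help above. The function `build_ait_command`, the project name `my_project`, and the rule that `--from_step` must not come after `--to_step` are assumptions made for the example, not documented AIT behavior.

```python
import shlex

# Generation steps for the xilinx vendor, in the order given by the help text.
XILINX_STEPS = ["HLS", "design", "synthesis", "implementation", "bitstream", "boot"]


def build_ait_command(board, name, from_step="HLS", to_step="bitstream",
                      clock=100, extra=()):
    """Return the argv list for an ait invocation (hypothetical helper)."""
    for step in (from_step, to_step):
        if step not in XILINX_STEPS:
            raise ValueError(f"unknown generation step: {step!r}")
    # Assumption: the initial step may not come after the final step.
    if XILINX_STEPS.index(from_step) > XILINX_STEPS.index(to_step):
        raise ValueError("from_step must not come after to_step")
    argv = ["ait", "-b", board, "-n", name,
            "--from_step", from_step, "--to_step", to_step,
            "-c", str(clock)]
    argv.extend(extra)
    return argv


if __name__ == "__main__":
    # Stop after synthesis, add the hardware counter, show backend messages.
    cmd = build_ait_command("zcu102", "my_project", to_step="synthesis",
                            extra=["--hwcounter", "-v"])
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on a machine where AIT and the vendor tools are installed; printing it first makes it easy to review the exact command.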