3.2.1. AIT options

AIT's behavior can be modified through its command-line options. They are summarized and briefly described in the AIT help output, reproduced here:

usage: ait -b BOARD -n NAME
The Accelerator Integration Tool (AIT) automatically integrates OmpSs@FPGA accelerators into FPGA designs using different vendor backends.

Required:
  -b BOARD, --board BOARD
                        board model. Supported boards by vendor:
                        xilinx: alveo_u200, alveo_u250, alveo_u280, alveo_u280_hbm, alveo_u55c, com_express, kv260, simulation, zcu102, zedboard, zybo, zynq702, zynq706
  -n NAME, --name NAME  project name

Generation flow:
  -d DIR, --dir DIR     path where the project directory tree will be created
                        (def: './')
  --disable_IP_caching  disable IP caching. Significantly increases generation time
  --disable_utilization_check
                        disable resources utilization check during HLS generation
  --disable_board_support_check
                        disable board support check
  --from_step FROM_STEP
                        initial generation step. Generation steps by vendor:
                        xilinx: HLS, design, synthesis, implementation, bitstream, boot
                        (def: 'HLS')
  --IP_cache_location IP_CACHE_LOCATION
                        path where the IP cache will be located
                        (def: '/var/tmp/ait/<vendor>/IP_cache/')
  --to_step TO_STEP     final generation step. Generation steps by vendor:
                        xilinx: HLS, design, synthesis, implementation, bitstream, boot
                        (def: 'bitstream')

Bitstream configuration:
  -c CLOCK, --clock CLOCK
                        FPGA clock frequency in MHz
                        (def: '100')
  --hwcounter           add a hardware counter to the bitstream
  --wrapper_version WRAPPER_VERSION
                        version of accelerator wrapper shell. This information will be placed in the bitstream information
  --bitinfo_note BITINFO_NOTE
                        custom note to add to the bitInfo

Data path:
  --datainterfaces_map DATAINTERFACES_MAP
                        path of mappings file for the data interfaces
  --memory_interleaving_stride MEM_INTERLEAVING_STRIDE
                        size in bytes of the stride of the memory interleaving. By default there is no interleaving
  --disable_creator_ports
                        disable memory access ports in the task-creation accelerators

Hardware Runtime:
  --cmdin_queue_len CMDIN_QUEUE_LEN
                        maximum length (64-bit words) of the queue for the hwruntime command in
                        This argument is mutually exclusive with --cmdin_subqueue_len
  --cmdin_subqueue_len CMDIN_SUBQUEUE_LEN
                        length (64-bit words) of each accelerator subqueue for the hwruntime command in.
                        This argument is mutually exclusive with --cmdin_queue_len
                        Must be power of 2
                        Def. max(64, 1024/num_accs)
  --cmdout_queue_len CMDOUT_QUEUE_LEN
                        maximum length (64-bit words) of the queue for the hwruntime command out
                        This argument is mutually exclusive with --cmdout_subqueue_len
  --cmdout_subqueue_len CMDOUT_SUBQUEUE_LEN
                        length (64-bit words) of each accelerator subqueue for the hwruntime command out.
                        This argument is mutually exclusive with --cmdout_queue_len
                        Must be power of 2
                        Def. max(64, 1024/num_accs)
  --disable_spawn_queues
                        disable the hwruntime spawn in/out queues
  --spawnin_queue_len SPAWNIN_QUEUE_LEN
                        length (64-bit words) of the hwruntime spawn in queue. Must be power of 2
                        (def: '1024')
  --spawnout_queue_len SPAWNOUT_QUEUE_LEN
                        length (64-bit words) of the hwruntime spawn out queue. Must be power of 2
                        (def: '1024')
  --hwruntime_interconnect HWR_INTERCONNECT
                        type of hardware runtime interconnection with accelerators
                        centralized
                        distributed
                        (def: 'centralized')
  --max_args_per_task MAX_ARGS_PER_TASK
                        maximum number of arguments for any task in the bitstream
                        (def: '15')
  --max_deps_per_task MAX_DEPS_PER_TASK
                        maximum number of dependencies for any task in the bitstream
                        (def: '8')
  --max_copies_per_task MAX_COPIES_PER_TASK
                        maximum number of copies for any task in the bitstream
                        (def: '15')
  --enable_pom_axilite  enable the POM axilite interface with debug counters

Picos:
  --picos_num_dcts NUM_DCTS
                        number of DCTs instantiated
                        (def: '1')
  --picos_tm_size PICOS_TM_SIZE
                        size of the TM memory
                        (def: '128')
  --picos_dm_size PICOS_DM_SIZE
                        size of the DM memory
                        (def: '512')
  --picos_vm_size PICOS_VM_SIZE
                        size of the VM memory
                        (def: '512')
  --picos_dm_ds DATA_STRUCT
                        data structure of the DM memory
                        BINTREE: Binary search tree (not autobalanced)
                        LINKEDLIST: Linked list
                        (def: 'BINTREE')
  --picos_dm_hash HASH_FUN
                        hashing function applied to dependence addresses
                        P_PEARSON: Parallel Pearson function
                        XOR
                        (def: 'P_PEARSON')
  --picos_hash_t_size PICOS_HASH_T_SIZE
                        DCT hash table size
                        (def: '64')

User-defined files:
  --user_constraints USER_CONSTRAINTS
                        path of user defined constraints file
  --user_pre_design USER_PRE_DESIGN
                        path of user TCL script to be executed before the design step (not after the board base design)
  --user_post_design USER_POST_DESIGN
                        path of user TCL script to be executed after the design step

Miscellaneous:
  -h, --help            show this help message and exit
  -i, --verbose_info    print extra information messages
  --dump_board_info     dump board info json for the specified board
  -j JOBS, --jobs JOBS  specify the number of jobs to run simultaneously
                        By default it will use as many jobs as cores with at least 5GB of dedicated free memory, or the value returned by `nproc`, whichever is less.
  --mem_per_job MEM_PER_JOB
                        specify the memory per core used to estimate the number of jobs to launch (def: 5G)
  -k, --keep_files      keep files on error
  -v, --verbose         print vendor backend messages
  --version             print AIT version and exit

Xilinx-specific arguments:
  --floorplanning_constr FLOORPLANNING_CONSTR
                        built-in floorplanning constraints for accelerators and static logic
                        acc: accelerator kernels are constrained to a SLR region
                        static: each static logic IP is constrained to its relevant SLR
                        all: enables both 'acc' and 'static' options
                        By default no floorplanning constraints are used
  --placement_file PLACEMENT_FILE
                        json file specifying accelerator placement
  --slr_slices SLR_SLICES
                        enable SLR crossing register slices
                        acc: create register slices for SLR crossing on accelerator-related interfaces
                        static: create register slices for static logic IPs
                        all: enable both 'acc' and 'static' options
                        By default they are disabled
  --regslice_pipeline_stages REGSLICE_PIPELINE_STAGES
                        number of register slice pipeline stages per SLR
                        'x:y:z': add between 1 and 5 stages in master:middle:slave SLRs
                        auto: let Vivado choose the number of stages
                        (def: auto)
  --interconnect_regslice INTER_REGSLICE_LIST [INTER_REGSLICE_LIST ...]
                        enable register slices on AXI interconnects
                        all: enables them on all interconnects
                        mem: enables them on interconnects in memory datapath
                        hwruntime: enables them on the AXI-stream interconnects between the hwruntime and the accelerators
  --interconnect_opt OPT_STRATEGY
                        AXI interconnect optimization strategy: Minimize 'area' or maximize 'performance'
                        (def: 'area')
  --interconnect_priorities
                        enable priorities in the memory interconnect
  --simplify_interconnection
                        simplify interconnection between accelerators and memory. Might negatively impact timing
  --power_monitor       enable power monitoring infrastructure
  --thermal_monitor     enable thermal monitoring infrastructure
  --debug_intfs INTF_TYPE
                        choose which interfaces to mark for debug and instantiate the corresponding ILA cores
                        AXI: debug accelerator's AXI interfaces
                        stream: debug accelerator's AXI-Stream interfaces
                        both: debug both accelerator's AXI and AXI-Stream interfaces
                        custom: debug user-defined interfaces
                        none: do not mark any interface for debug
                        (def: 'none')
  --debug_intfs_list DEBUG_INTFS_LIST
                        path of file with the list of interfaces to debug
  --ignore_eng_sample   ignore engineering sample status from chip part number
  --target_language TARGET_LANG
                        choose target language to synthesize files to: vhdl or verilog
                        (def: 'verilog')

  environment variables:
    PETALINUX_BUILD   path where the Petalinux project is located
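
As an illustration of how the options above combine, the following Python sketch assembles an AIT command line and evaluates the documented default for the command subqueue length, max(64, 1024/num_accs). The helper names (`build_ait_command`, `default_subqueue_len`, `is_power_of_two`) and the project name `my_project` are hypothetical and not part of AIT. The use of integer division is an assumption, since the help text does not say how non-integer results are handled. Only flags listed in the help are used.

```python
import shlex

# Generation steps for the xilinx vendor, as listed in the AIT help.
XILINX_STEPS = ["HLS", "design", "synthesis", "implementation", "bitstream", "boot"]


def is_power_of_two(n):
    """Return True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def default_subqueue_len(num_accs):
    """Documented default for the subqueue length: max(64, 1024/num_accs).

    Integer division is an assumption of this sketch, not confirmed by the help.
    """
    if num_accs < 1:
        raise ValueError("num_accs must be at least 1")
    return max(64, 1024 // num_accs)


def build_ait_command(board, name, clock_mhz=100, from_step="HLS",
                      to_step="bitstream", cmdin_subqueue_len=None):
    """Assemble an argument list for ait (hypothetical helper)."""
    for step in (from_step, to_step):
        if step not in XILINX_STEPS:
            raise ValueError(f"unknown generation step: {step}")
    cmd = ["ait", "-b", board, "-n", name, "-c", str(clock_mhz),
           "--from_step", from_step, "--to_step", to_step]
    if cmdin_subqueue_len is not None:
        # The help states that the subqueue length must be a power of 2.
        if not is_power_of_two(cmdin_subqueue_len):
            raise ValueError("--cmdin_subqueue_len must be a power of 2")
        cmd += ["--cmdin_subqueue_len", str(cmdin_subqueue_len)]
    return cmd


if __name__ == "__main__":
    # Example values: zcu102 board, 200 MHz clock, stop after the design step,
    # and a subqueue sized for 4 accelerators.
    cmd = build_ait_command("zcu102", "my_project", clock_mhz=200,
                            to_step="design",
                            cmdin_subqueue_len=default_subqueue_len(4))
    print(shlex.join(cmd))
```

Under these assumptions, four accelerators give a subqueue of 256 words, while 32 accelerators fall back to the 64-word minimum. Because --cmdin_subqueue_len and --cmdin_queue_len are mutually exclusive, the sketch only ever emits the former.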