TensorRTExtension#

Refer to the official NVIDIA TensorRT documentation for the support matrix and further details.

  • UUID: d43f23e4-b9bf-11eb-9d18-2b7be630552b

  • Version: 2.7.0

  • Author: NVIDIA

  • License: Proprietary

Components#

nvidia::gxf::TensorRtInference#

Codelet that takes input tensors and feeds them into TensorRT for inference.

  • Component ID: 06a7f0e0-b9c0-11eb-8cd6-23c9c2070107

  • Base Type: nvidia::gxf::Codelet

Parameters#

model_file_path

Model File Path. Path to the ONNX model to be loaded.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING


engine_file_path

Engine File Path. Path where the generated engine is serialized to and loaded from.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING


force_engine_update

Force Engine Update. Always rebuild the engine regardless of any existing engine file. The conversion may take minutes. Defaults to false.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: false


input_tensor_names

Input Tensor Names. Names of the input tensors, in the order they are fed into the model.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING


input_binding_names

Input Binding Names. Names of the input bindings in the model, in the same order as input_tensor_names.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING


output_tensor_names

Output Tensor Names. Names of the output tensors, in the order they are retrieved from the model.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING


output_binding_names

Output Binding Names. Names of the output bindings in the model, in the same order as output_tensor_names.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING


pool

Pool. Allocator instance for output tensors.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Allocator


cuda_stream_pool

Cuda Stream Pool. Instance of gxf::CudaStreamPool used to allocate the CUDA stream.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::CudaStreamPool


max_workspace_size

Max Workspace Size. Size of the workspace in bytes. Defaults to 64 MB.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_INT64

  • Default: 67108864


dla_core

DLA core to use. Fallback to GPU is always enabled. By default only the GPU is used.

  • Flags: GXF_PARAMETER_FLAGS_OPTIONAL

  • Type: GXF_PARAMETER_TYPE_INT64


max_batch_size

Max Batch Size. Maximum possible batch size in case the first dimension is dynamic and is used as the batch size.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_INT32

  • Default: 1


enable_fp16

Enable FP16 Mode. Enable inference with FP16 and FP32 fallback.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: false


verbose

Enable verbose logging on the console. Defaults to false.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: false


relaxed_dimension_check

Relaxed Dimension Check. Ignore dimensions of size 1 in the input tensor dimension check.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: true


clock

Clock. Clock instance used for the publish time.

  • Flags: GXF_PARAMETER_FLAGS_OPTIONAL

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Clock


dev_id

Device Id. The device on which to create the CUDA stream.

  • Flags: GXF_PARAMETER_FLAGS_OPTIONAL

  • Type: GXF_PARAMETER_TYPE_INT32

  • Default: 0


rx

RX. List of receivers from which input tensors are taken.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Receiver


tx

TX. Transmitter on which output tensors are published.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Transmitter
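

Example#

For reference, below is a minimal sketch of how this codelet might be configured in a GXF application YAML file. The entity layout, file paths, tensor names, and binding names are illustrative assumptions only; the sketch also assumes the standard GXF components nvidia::gxf::DoubleBufferReceiver, nvidia::gxf::DoubleBufferTransmitter, nvidia::gxf::UnboundedAllocator, and nvidia::gxf::CudaStreamPool are available, and it omits the scheduler and scheduling terms a complete application would need.

name: inference
components:
- name: input_tensors
  type: nvidia::gxf::DoubleBufferReceiver
- name: output_tensors
  type: nvidia::gxf::DoubleBufferTransmitter
- name: allocator
  type: nvidia::gxf::UnboundedAllocator
- name: stream_pool
  type: nvidia::gxf::CudaStreamPool
- name: tensor_rt
  type: nvidia::gxf::TensorRtInference
  parameters:
    # Illustrative paths; the engine is (re)generated here when forced or missing.
    model_file_path: /tmp/model.onnx
    engine_file_path: /tmp/model.engine
    force_engine_update: false
    # Tensor names and binding names are hypothetical; binding names must match
    # the model and follow the same order as the corresponding tensor names.
    input_tensor_names: [input]
    input_binding_names: [input_1:0]
    output_tensor_names: [output]
    output_binding_names: [dense_1]
    pool: allocator
    cuda_stream_pool: stream_pool
    max_workspace_size: 67108864   # 64 MB default
    max_batch_size: 1
    enable_fp16: false
    verbose: false
    rx: [input_tensors]
    tx: output_tensors

In this sketch, handle-typed parameters (pool, cuda_stream_pool, rx, tx) refer to sibling components by name; incoming messages arriving on input_tensors are expected to carry tensors named as in input_tensor_names, and the published message on output_tensors carries tensors named as in output_tensor_names.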