TensorRTExtension#
Refer to the official NVIDIA TensorRT documentation for the support matrix and further details.
UUID: d43f23e4-b9bf-11eb-9d18-2b7be630552b
Version: 2.7.0
Author: NVIDIA
License: Proprietary
Components#
nvidia::gxf::TensorRtInference#
Codelet that takes input tensors and feeds them into TensorRT for inference. A minimal configuration sketch is shown after the parameter list below.
Component ID: 06a7f0e0-b9c0-11eb-8cd6-23c9c2070107
Base Type: nvidia::gxf::Codelet
Parameters#
model_file_path
Model File Path. Path to the ONNX model to be loaded.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
engine_file_path
Engine File Path. Path where the generated engine is serialized to and loaded from.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
force_engine_update
Force Engine Update. Always update the engine regardless of any existing engine file. Such conversion may take minutes. Defaults to false.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_BOOL
Default: false
input_tensor_names
Input Tensor Names. Names of input tensors in the order to be fed into the model.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
input_binding_names
Input Binding Names. Names of the input bindings in the model, in the same order as provided in input_tensor_names.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
output_tensor_names
Output Tensor Names. Names of output tensors in the order to be retrieved from the model.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
output_binding_names
Output Binding Names. Names of the output bindings in the model, in the same order as provided in output_tensor_names.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
pool
Pool. Allocator instance for output tensors.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::Allocator
cuda_stream_pool
Cuda Stream Pool. Instance of gxf::CudaStreamPool used to allocate a CUDA stream.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::CudaStreamPool
max_workspace_size
Max Workspace Size. Size of the workspace in bytes. Defaults to 64 MB.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_INT64
Default: 67108864
dla_core
DLA Core. DLA core to use. Fallback to GPU is always enabled. By default, only the GPU is used.
Flags: GXF_PARAMETER_FLAGS_OPTIONAL
Type: GXF_PARAMETER_TYPE_INT64
max_batch_size
Max Batch Size. Maximum possible batch size when the first dimension is dynamic and used as the batch size.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_INT32
Default: 1
enable_fp16
Enable FP16 Mode. Enable inference with FP16 and FP32 fallback.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_BOOL
Default: false
verbose
Enable verbose logging on the console. Defaults to false.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_BOOL
Default: false
relaxed_dimension_check
Relaxed Dimension Check. Ignore dimensions of size 1 in the input tensor dimension check.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_BOOL
Default: true
clock
Clock. Instance of a clock used for the publish time.
Flags: GXF_PARAMETER_FLAGS_OPTIONAL
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::Clock
dev_id
Device Id. Device on which to create the CUDA stream.
Flags: GXF_PARAMETER_FLAGS_OPTIONAL
Type: GXF_PARAMETER_TYPE_INT32
Default: 0
rx
RX. List of receivers from which to take input tensors.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::Receiver
tx
TX. Transmitter to publish output tensors.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::Transmitter
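The codelet is typically instantiated in a GXF application graph YAML file. The snippet below is a minimal sketch of one possible entity wiring nvidia::gxf::TensorRtInference to an allocator, a CUDA stream pool, a receiver, and a transmitter; the entity name, component names, file paths, and tensor/binding names are illustrative placeholders, not values prescribed by this extension.

name: inference
components:
- name: input
  type: nvidia::gxf::DoubleBufferReceiver
- name: output
  type: nvidia::gxf::DoubleBufferTransmitter
- name: allocator
  type: nvidia::gxf::UnboundedAllocator
- name: stream_pool
  type: nvidia::gxf::CudaStreamPool
- name: tensor_rt
  type: nvidia::gxf::TensorRtInference
  parameters:
    model_file_path: /tmp/model.onnx
    engine_file_path: /tmp/model.plan
    input_tensor_names: [input_tensor]
    input_binding_names: [input_binding]
    output_tensor_names: [output_tensor]
    output_binding_names: [output_binding]
    pool: allocator
    cuda_stream_pool: stream_pool
    max_batch_size: 1
    enable_fp16: false
    force_engine_update: false
    rx: [input]
    tx: output

Handle-typed parameters such as pool, cuda_stream_pool, rx, and tx refer to other components by name; a component living in a different entity would normally be referenced in the entity_name/component_name form.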