TensorRTExtension#
Refer to the official NVIDIA TensorRT documentation for the support matrix and further details.
UUID: d43f23e4-b9bf-11eb-9d18-2b7be630552b
Version: 2.7.0
Author: NVIDIA
License: Proprietary
Components#
nvidia::gxf::TensorRtInference#
Codelet that takes input tensors and feeds them into TensorRT for inference. A minimal configuration sketch is shown after the parameter list below.
Component ID: 06a7f0e0-b9c0-11eb-8cd6-23c9c2070107
Base Type: nvidia::gxf::Codelet
Parameters#
model_file_path
Model File Path. Path to the ONNX model to be loaded.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
engine_file_path
Engine File Path. Path where the generated engine is serialized to and loaded from.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
force_engine_update
Force Engine Update. Always update the engine regardless of any existing engine file. Such conversion may take minutes. Defaults to false.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_BOOL
Default: false
input_tensor_names
Input Tensor Names. Names of input tensors in the order to be fed into the model.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
input_binding_names
Input Binding Names. Names of the input bindings in the model, in the same order as provided in input_tensor_names.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
output_tensor_names
Output Tensor Names. Names of output tensors in the order to be retrieved from the model.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
output_binding_names
Output Binding Names. Names of the output bindings in the model, in the same order as provided in output_tensor_names.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
pool
Pool. Allocator instance for output tensors.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::Allocator
cuda_stream_pool
Cuda Stream Pool. Instance of gxf::CudaStreamPool used to allocate a CUDA stream.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::CudaStreamPool
max_workspace_size
Max Workspace Size. Size of the workspace in bytes. Defaults to 64 MB.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_INT64
Default: 67108864
dla_core
DLA Core. DLA core to use. Fallback to GPU is always enabled. By default, only the GPU is used.
Flags: GXF_PARAMETER_FLAGS_OPTIONAL
Type: GXF_PARAMETER_TYPE_INT64
max_batch_size
Max Batch Size. Maximum possible batch size when the first dimension is dynamic and used as the batch size.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_INT32
Default: 1
enable_fp16
Enable FP16 Mode. Enable inference with FP16 and FP32 fallback.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_BOOL
Default: false
verbose
Enable verbose logging on the console. Defaults to false.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_BOOL
Default: false
relaxed_dimension_check
Relaxed Dimension Check. Ignore dimensions of size 1 in the input tensor dimension check.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_BOOL
Default: true
clock
Clock. Instance of a clock used for the publish time.
Flags: GXF_PARAMETER_FLAGS_OPTIONAL
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::Clock
dev_id
Device Id. Device on which to create the CUDA stream.
Flags: GXF_PARAMETER_FLAGS_OPTIONAL
Type: GXF_PARAMETER_TYPE_INT32
Default: 0
rx
RX. List of receivers from which to take input tensors.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::Receiver
tx
TX. Transmitter to publish output tensors.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::Transmitter
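The codelet is typically instantiated in a GXF application graph YAML file. The snippet below is a minimal sketch of one possible entity wiring nvidia::gxf::TensorRtInference to an allocator, a CUDA stream pool, a receiver, and a transmitter; the entity name, component names, file paths, and tensor/binding names are illustrative placeholders, not values prescribed by this extension.

name: inference
components:
- name: input
  type: nvidia::gxf::DoubleBufferReceiver
- name: output
  type: nvidia::gxf::DoubleBufferTransmitter
- name: allocator
  type: nvidia::gxf::UnboundedAllocator
- name: stream_pool
  type: nvidia::gxf::CudaStreamPool
- name: tensor_rt
  type: nvidia::gxf::TensorRtInference
  parameters:
    model_file_path: /tmp/model.onnx
    engine_file_path: /tmp/model.plan
    input_tensor_names: [input_tensor]
    input_binding_names: [input_binding]
    output_tensor_names: [output_tensor]
    output_binding_names: [output_binding]
    pool: allocator
    cuda_stream_pool: stream_pool
    max_batch_size: 1
    enable_fp16: false
    force_engine_update: false
    rx: [input]
    tx: output

Handle-typed parameters such as pool, cuda_stream_pool, rx, and tx refer to other components by name; a component living in a different entity would normally be referenced in the entity_name/component_name form.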