DeepStream SDK 8.0 for NVIDIA dGPU/X86 and Jetson#
DeepStream Release Notes#
1.1 What’s New
1.1.1 DS 8.0
1.1.2 DS 7.1 (Previous Release)
1.1.3 Graph Composer 5.1.0
1.2 Differences since DeepStream 6.1 and Above
1.3 Breaking Changes
2.0 Limitations
3.0 Notes
About this Release#
These release notes are for the NVIDIA® DeepStream SDK for NVIDIA® Turing®, NVIDIA® Ampere®, NVIDIA® Hopper®, NVIDIA® Ada Lovelace®, NVIDIA® Blackwell®, and NVIDIA® Jetson Thor™.
What’s New#
The following new features are supported in this DeepStream SDK release:
DS 8.0#
Support for Blackwell and Jetson Thor
Supports Triton 25.03 on x86 and Triton 25.08 on Jetson
Jetson package based on JP 7.0 (r38.2 BSP)
Pyservicemaker enhancements: prepare and activate API calls added to the service-maker pipeline API, and configurable component selection.
New applications using pyservicemaker
Action recognition app
Smart record app
DeepStream test5 app using Flow API
Kafka test app using pipeline APIs
Added MediaExtractor support to Jetson (servicemaker)
Added MaskTracker, a new multi-object tracker that uses Segment Anything Model 2 as its visual engine.
Added pose estimation to Single and Multi-View 3D Tracking.
Support for dynamic stream handling in the demuxer
Open-sourced components:
nvll_osd library
smart record library “gst-nvdssr”
nvimageenc gst-plugin for x86
nvimagedec gst-plugin for x86
nvdsudpsrc gst-plugin
nvdsudpsink gst-plugin
nvdsanalytics gst-plugin
nvvideoconvert: Support U16 formats
REST API support for the nvdsanalytics and nvtracker plugins.
Added support for CUDA-enabled OpenCV in dsexample to demonstrate GPU matrix usage.
nvdsudpsrc/nvdsudpsink plugin specific enhancements:
Configure audio frame size.
RTP timestamp read/write/passthrough to enable A/V resync.
Support stream duplication for redundancy (ST 2022-7 implementation).
Added nv3dsink support on x86
Inference Builder: open-source tool to create inference microservices across multiple AI frameworks.
Added nvdsdynamicsrcbin plugin for dynamic stream addition without reinitialization of the decoder.
Added support for high quality pixel formats in nvvideotestsrc
Support for TAO 6.0 models
DS 7.1 (Previous Release)#
Supports Triton 24.08 and Rivermax v1.40/v1.50.
Jetson package based on JP 6.1 (r36.4 BSP).
New Service Maker framework in Python (Alpha): a new application layer that removes the need to understand the GStreamer application programming paradigm and enables development in Python.
Support for the Gray 16 LE format.
Postprocessing plugin to support output tensor meta from custom preprocessing
Support for access to tensor metadata with Service Maker C++ APIs
nvvideoconvert to support UYVY (8-bit YCbCr-4:2:2) on x86/dGPU
Enhanced Single-View 3D Tracking.
Improved ReID Accuracy in Tracker.
NVIDIA TAO toolkit (previously called NVIDIA Transfer Learning Toolkit) models from NVIDIA-AI-IOT/deepstream_tao_apps (branch: release/tao_ds7.1ga) integrated into SDK.
Improved stability.
Source code release
nvurisrcbin
Message broker: protocol adapter code and nvmsgbroker API code release (amqp, azure, kafka, mqtt, redis, nvmsgbroker)
Python bindings and samples updates:
Build system update: new build system using PyPA to support pip 24.2
Pybind11 version updated to v2.13.0
New bindings:
NvDsObjEncOutParams, NvDsObjEncUsrArgs
nvds_obj_enc_create_context(), nvds_obj_enc_process(), nvds_obj_enc_finish(), nvds_obj_enc_destroy_context()
NvDsAnalyticsObjInfo.objStatus
NvDsObjReid
Graph Composer 5.1.0#
Graph Composer 5.1.0 adds enhanced features to the SDK along with an updated compute stack.
Graph Execution Engine
Supported on Ubuntu 24.04 x86_64 and NVIDIA Jetson.
Version updated to 5.1.0.
Graph Composer
Version updated to 5.1.0.
x86 only - Ubuntu 24.04.
Container Builder
No change in version.
Registry
No change in version.
Extensions update
Minor version of all the extensions are updated.
The following enhancements have been added:
CUDA Green Contexts for resource partitioning and energy efficiency. CUDA Green Contexts provide a lightweight alternative to traditional CUDA contexts, enabling developers to create distinct spatial partitions of GPU resources such as Streaming Multiprocessors (SMs). This partitioning allows targeted resource allocation and management within the same CUDA programming model: developers can partition GPU resources, create resource descriptors, and manage multiple contexts with specific SM allocations for optimized performance and power consumption. A minimal driver API sketch is shown after this list.
CPU core pinning support. This feature allows developers to pin application threads to specific physical CPU cores, enabling fine-grained control over CPU resource allocation and management. By pinning critical tasks to designated cores, developers can reduce contention and improve overall efficiency. A minimal Linux affinity sketch is shown after this list.
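The following is a minimal sketch of GPU partitioning with CUDA Green Contexts using the CUDA driver API (cuGreenCtxCreate and related calls, available in recent CUDA toolkits). It is illustrative only: the SM group size and error handling are assumptions, not the extension's actual implementation.

```cpp
// Minimal sketch: partition a GPU's SMs with CUDA Green Contexts (CUDA driver API).
// Illustrative assumptions: one partition of at least 16 SMs; simple exit-on-error handling.
#include <cuda.h>
#include <cstdio>
#include <cstdlib>

static void check(CUresult res, const char *what) {
    if (res != CUDA_SUCCESS) {
        const char *msg = nullptr;
        cuGetErrorString(res, &msg);
        std::fprintf(stderr, "%s failed: %s\n", what, msg ? msg : "unknown");
        std::exit(1);
    }
}

int main() {
    check(cuInit(0), "cuInit");
    CUdevice dev;
    check(cuDeviceGet(&dev, 0), "cuDeviceGet");

    // Query the device's SM resource and carve out one group of at least 16 SMs.
    CUdevResource input = {};
    check(cuDeviceGetDevResource(dev, &input, CU_DEV_RESOURCE_TYPE_SM),
          "cuDeviceGetDevResource");

    unsigned int nbGroups = 1;                 // request one partition; the rest goes to 'remaining'
    CUdevResource split = {}, remaining = {};
    check(cuDevSmResourceSplitByCount(&split, &nbGroups, &input, &remaining,
                                      0 /*useFlags*/, 16 /*min SMs per group*/),
          "cuDevSmResourceSplitByCount");

    // Wrap the partition in a resource descriptor and create a green context on it.
    CUdevResourceDesc desc;
    check(cuDevResourceGenerateDesc(&desc, &split, 1), "cuDevResourceGenerateDesc");

    CUgreenCtx greenCtx;
    check(cuGreenCtxCreate(&greenCtx, desc, dev, CU_GREEN_CTX_DEFAULT_STREAM),
          "cuGreenCtxCreate");

    // Work submitted on this stream is confined to the partition's SMs.
    CUstream stream;
    check(cuGreenCtxStreamCreate(&stream, greenCtx, CU_STREAM_NON_BLOCKING, 0),
          "cuGreenCtxStreamCreate");

    // ... launch kernels on 'stream' ...

    check(cuStreamDestroy(stream), "cuStreamDestroy");
    check(cuGreenCtxDestroy(greenCtx), "cuGreenCtxDestroy");
    return 0;
}
```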
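Below is a minimal sketch of CPU core pinning on Linux using pthread_setaffinity_np; the core index is an illustrative assumption and this is not the extension's actual implementation.

```cpp
// Minimal sketch: pin the calling thread to a specific CPU core on Linux.
// The core index (2) is an illustrative assumption, not the extension's actual logic.
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <cstdio>

// Pin the current thread to the given core; returns 0 on success, an errno value otherwise.
static int pin_current_thread_to_core(int core_id) {
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(core_id, &cpuset);
    return pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
}

int main() {
    // Example: keep a latency-critical thread on core 2 to reduce contention.
    int err = pin_current_thread_to_core(2);
    if (err != 0) {
        std::fprintf(stderr, "pthread_setaffinity_np failed: %d\n", err);
        return 1;
    }
    std::printf("Thread pinned to core 2\n");
    return 0;
}
```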
Note:
DS + Triton graph with CUDA-13.0 on x86 is not supported
The TensorFlow backend is not supported in the NvTritonExt extension
README update
For /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-can-orientation-app/README, please refer to the below instructions.
Refer to the official Basler page and install the pylon Camera Software Suite (x86 or ARM version) from https://www.baslerweb.com/en/products/basler-pylon-camera-software-suite/
The pylon 7.2.1 version is suggested because it is compatible with the PFS file ‘basler_cam_emulation_0815-0000.pfs’
Download specific pylon 7.2.1 Camera Software Suite Debian packages:
For x86, get them from https://www.baslerweb.com/en/downloads/software-downloads/software-pylon-7-2-1-linux-x86-64bit-debian/
For Jetson, get them from https://www.baslerweb.com/en/downloads/software-downloads/software-pylon-7-2-1-linux-arm-64bit-debian/
DS Triton graph with CUDA-13.0 on Thor
There is a known failure while building the container with CUDA-13.0 for Thor using container_builder.
There is a known failure while running the DS Triton graph with CUDA-13.0 on Thor on bare metal.
Removed support of Graph Composer on Windows.
Differences since DeepStream 6.1 and Above#
gstreamer1.0-libav, libav, OSS encoder/decoder plugins (x264/x265), and audioparsers packages are removed from DeepStream dockers from DeepStream 6.1 onwards. You may install these packages based on your requirements (gstreamer1.0-plugins-good / gstreamer1.0-plugins-bad / gstreamer1.0-plugins-ugly). While running DeepStream applications inside dockers, you may see the following warning:
WARNING from src_elem: No decoder available for type ‘audio/mpeg, mpegversion=(int)4, framed=(boolean)true, stream-format=(string)raw, level=(string)2, base-profile=(string)lc, profile=(string)lc, codec_data=(buffer)119056e500, rate=(int)48000, channels=(int)2’.
Debug info: gsturidecodebin.c(920): unknown_type_cb ():
To avoid such warnings, install gstreamer1.0-libav and gstreamer1.0-plugins-good inside docker.
Specifically, for deepstream-nmos, deepstream-avsync-app, and the Python-based deepstream-imagedata-multistream app, you need to install gstreamer1.0-libav and gstreamer1.0-plugins-good.
Gst-nveglglessink plugin is deprecated. Use Gst-nv3dsink plugin for Jetson instead.
Breaking Changes#
From DeepStream 8.0, the following models and their corresponding applications are removed: FaceDetect, FaceDetectIR, PeopleSegNet, BodyPoseNet, GestureNet, EmotionNet, HeartRateNet, GazeNet, and the facial landmark estimation models; and the YOLO OSS, SSD, DSSD, YOLOv3, YOLOv4, YOLOv4-tiny, FasterRCNN, and DenseNet models.
DeepStream audio support and the ASR and TTS plugins are not supported. Use the NVIDIA Riva speech SDK instead.
Removed support for TensorFlow, UFF, and Caffe models.
The TensorFlow backend has been deprecated starting in Triton 25.03; the last release of Triton Inference Server with the TensorFlow backend is 25.02. To continue using the TensorFlow backend with version 25.03 and onward, users must compile it themselves from source, since pre-built versions are no longer available.
For more details, refer to https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/rel-25-03.html#rel-25-03
Removed INT8 calibration support for the TAO models supported in previous DeepStream releases. Lower performance will be observed with the default models because they have moved from INT8 to FP16 mode.
nvv4l2decoder does not support JPEG decode for Thor OpenRM.
DLA is not supported on Jetson Thor.
For protocol adapters, passing passwords through the config file will be deprecated from the next release.
Some of the open-source libraries related to codecs have been removed, so you might see warnings like the ones below, which can be safely ignored:
(gst-plugin-scanner:1433): GStreamer-WARNING **: 18:05:56.454: Failed to load plugin ‘/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstfaad.so’: libfaad.so.2: cannot open shared object file: No such file or directory
/bin/bash: line 1: lsmod: command not found
/bin/bash: line 1: modprobe: command not found
Limitations#
This section provides details about issues discovered during development and QA but not resolved in this release.
DeepStream on Jetson is based on L4T BSP version r38.2. Refer to the “Known Issues” section in the Jetson release notes.
With V4L2 codecs, a maximum of only 1024 (decode + encode) instances are provided. The maximum number of instances can be increased by making changes in the open-source code.
detected-min-w and detected-min-h must be set to values larger than 32 in the primary inference configuration file (config_infer_primary.txt) for gst-dsexample on Jetson.
The Kafka protocol adapter sometimes does not automatically reconnect when the Kafka broker to which it is connected goes down and comes back up. This requires the application to be restarted.
If the nvds log file ds.log has been deleted, to restart logging you must delete the file /run/rsyslogd.pid within the container before reenabling logging by running the setup_nvds_logger.sh script. This is described in the “nvds_logger: Logging Framework” sub-section in the “Gst-nvmsgbroker” section.
Running a DeepStream application over SSH (via PuTTY) with X11 forwarding does not work.
DeepStream currently expects model network width to be a multiple of 4 and network height to be a multiple of 2.
Triton Inference Server implementation in DeepStream currently supports a single GPU. The models need to be configured to use a single GPU.
For some models, the output in DeepStream is not exactly the same as observed with the TAO Toolkit. This is due to input scaling algorithm differences.
Dynamic resolution change support is Alpha quality.
On-the-fly model update only supports the same type of model with the same network parameters.
The Rivermax SDK is not part of DeepStream, so the following warning is observed (gst-plugin-scanner:33257):
GStreamer-WARNING **: 11:38:46.882: Failed to load plugin ‘/usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_udp.so’: librivermax.so.0: cannot open shared object file: No such file or directory
You can ignore this warning safely.
When using Composer WebSocket streaming, errors like “Error while sending buffer: invalid state” are sometimes seen, or the window becomes unresponsive. Refreshing the browser page might fix it.
Composer WebRTC Streaming is supported only on RTX GPUs.
On Jetson, when the screen is idle, FPS is lowered for DeepStream applications. This behavior is by design to save power. If you do not want the screen to idle, refer to the FAQ for a workaround.
RDMA functionality is only supported on x86, and only in the x86 Triton docker for now.
You cannot build DeepStream out of the box in Jetson dockers, except in the Triton variant.
There can be a performance drop from TensorRT to Triton for some models (5 to 15%). In such cases, you may want to use the nvinfer plugin instead of the nvinferserver plugin.
NVRM: XID errors are sometimes seen when running 200+ streams on Ampere, Hopper, and Ada.
NVRM: XID errors seen on some setups with gst-dsexample and transfer learning sample apps.
Sometimes during deepstream-testsr app execution, the assertion “GStreamer-CRITICAL **: 12:55:35.006: gst_pad_link_full: assertion ‘GST_IS_PAD (sinkpad)’ failed” is seen; it can be safely ignored.
For some models, during engine file generation, the error “[TRT]: 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error” is observed from TensorRT; it has no impact on functionality and can be safely ignored.
The deepstream-server app is not supported with the new nvstreammux plugin.
The TAO PointPillars model works only in FP32 mode.
REST API support covers only a few components (decoder, preprocessor, and nvinfer, along with stream addition/deletion support) with limited configuration options. However, you can extend the functionality with the steps mentioned in the SDK documentation.
While adding and removing streams continuously using the REST API at frequent intervals, a memory surge is observed. This is due to new libc behavior on Ubuntu 24.04. Export the following environment variables before running an application that uses the REST API:
export MALLOC_ARENA_MAX=1
export MALLOC_MMAP_MAX_=0
export MALLOC_MMAP_THRESHOLD_=131072
export MALLOC_TRIM_THRESHOLD_=131072
With the Basler camera on Jetson, only images with a width that is a multiple of 4 are supported.
In some cases, performance with Python sample apps may be lower than C version.
While running the deepstream-opencv-test app, the warning “gst_caps_features_set_parent_refcount: assertion ‘refcount == NULL’ failed” is observed. It has no impact on functionality and can be safely ignored.
The following errors are observed in Jetson dockers (with no impact on functionality):
While decoding: /bin/dash: 1: lsmod: not found and /bin/dash: 1: modprobe: not found.
A performance drop is seen for some models with nvinferserver in gRPC mode when run on ARM SBSA compared to x86.
A minor performance drop is observed compared to the DS 6.4 release when using the NvDCF performance configuration. To improve performance in this case, set the environment variable NVDS_DISABLE_CUDADEV_BLOCKINGSYNC=1.
For Azure, messages sent do not match the messages received on the server side.
Performance in WSL is not on par with a native Ubuntu system. There is a known throughput issue while running multiple decode instances in WSL, so you may observe lower FPS compared to the Ubuntu baseline.
Image encode is not supported in WSL.
In WSL2, a black screen with the log “MESA: error: Failed to attach to x11 shm” is observed with pipelines using nveglglessink or other display sinks. Use filesink instead in such use cases.
Numpy 2.x is not supported by PyDS.
While running ~200+ streams simultaneously, a kernel crash may be observed randomly, which may stall the application for some time.
Inference Builder based DeepStream app samples hang on the B200 platform when the batch size exceeds 16.
Notes#
REST API commands only work after the video shows up on the host screen.
Graph Composer is deprecated.
On Jetson, you may observe a segfault while running display test cases using the nv3dsink component. This is a known issue and is expected to be fixed in an upcoming JP 7.0 update.
DeepStream Python bindings will be deprecated from the next release onwards. Using pyservicemaker instead of the Python bindings is recommended.
NVIDIA® DeepStream SDK 8.0 supports TAO model-based applications (https://developer.nvidia.com/tao-toolkit). For more details, see NVIDIA-AI-IOT/deepstream_tao_apps (branch: release/tao_ds8.0ga).
On vGPU, only CUDA device memory (NVBUF_MEM_CUDA_DEVICE) is supported.
Note
OpenCV Deprecation: OpenCV is deprecated by default. However, you can enable OpenCV support in plugins such as nvinfer (nvdsinfer) and dsexample (gst-dsexample) by setting WITH_OPENCV=1 in the Makefile of these components. Refer to the component README for detailed instructions.
Docker Note: If your application requires OpenCV and you are using Docker, ensure that the libopencv-dev package is installed inside the Docker container.
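As a hedged illustration of CUDA-enabled OpenCV (GPU matrix) usage of the kind dsexample demonstrates when built with WITH_OPENCV=1, the sketch below uploads a host frame to a cv::cuda::GpuMat, applies a CUDA Gaussian filter, and downloads the result. It assumes OpenCV is built with the CUDA modules (cudafilters); the frame size, pixel format, and blur parameters are illustrative, not the plugin's actual code.

```cpp
// Minimal sketch of CUDA-enabled OpenCV (GPU matrix) usage.
// Assumes OpenCV built with CUDA modules; not gst-dsexample's actual implementation.
#include <opencv2/core.hpp>
#include <opencv2/core/cuda.hpp>
#include <opencv2/cudafilters.hpp>

int main() {
    // Host-side frame (e.g., an RGBA buffer copied out of a video surface).
    cv::Mat host_frame(1080, 1920, CV_8UC4, cv::Scalar(0, 0, 0, 255));

    // Upload to a GPU matrix, run a CUDA Gaussian filter, and download the result.
    cv::cuda::GpuMat gpu_frame, gpu_blurred;
    gpu_frame.upload(host_frame);

    auto blur = cv::cuda::createGaussianFilter(gpu_frame.type(), gpu_frame.type(),
                                               cv::Size(31, 31), 0);
    blur->apply(gpu_frame, gpu_blurred);

    cv::Mat result;
    gpu_blurred.download(result);
    return 0;
}
```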
Deploying Applications in a Docker Container#
Refer to the DeepStream docker section for detailed information on using the DeepStream SDK inside dockers.
Notice
THE INFORMATION IN THIS DOCUMENT AND ALL OTHER INFORMATION CONTAINED IN NVIDIA DOCUMENTATION REFERENCED IN THIS DOCUMENT IS PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE INFORMATION FOR THE PRODUCT, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s aggregate and cumulative liability towards customer for the product described in this document shall be limited in accordance with the NVIDIA terms and conditions of sale for the product. THE NVIDIA PRODUCT DESCRIBED IN THIS DOCUMENT IS NOT FAULT TOLERANT AND IS NOT DESIGNED, MANUFACTURED OR INTENDED FOR USE IN CONNECTION WITH THE DESIGN, CONSTRUCTION, MAINTENANCE, AND/OR OPERATION OF ANY SYSTEM WHERE THE USE OR A FAILURE OF SUCH SYSTEM COULD RESULT IN A SITUATION THAT THREATENS THE SAFETY OF HUMAN LIFE OR SEVERE PHYSICAL HARM OR PROPERTY DAMAGE (INCLUDING, FOR EXAMPLE, USE IN CONNECTION WITH ANY NUCLEAR, AVIONICS, LIFE SUPPORT OR OTHER LIFE CRITICAL APPLICATION). NVIDIA EXPRESSLY DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY OF FITNESS FOR SUCH HIGH RISK USES. NVIDIA SHALL NOT BE LIABLE TO CUSTOMER OR ANY THIRD PARTY, IN WHOLE OR IN PART, FOR ANY CLAIMS OR DAMAGES ARISING FROM SUCH HIGH RISK USES.
NVIDIA makes no representation or warranty that the product described in this document will be suitable for any specified use without further testing or modification. Testing of all parameters of each product is not necessarily performed by NVIDIA. It is customer’s sole responsibility to ensure the product is suitable and fit for the application planned by customer and to do the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA product and may result in additional or different conditions and/or requirements beyond those contained in this document. NVIDIA does not accept any liability related to any default, damage, costs or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document, or (ii) customer product designs.
Other than the right for customer to use the information in this document with the product, no other license, either expressed or implied, is hereby granted by NVIDIA under this document. Reproduction of information in this document is permissible only if reproduction is approved by NVIDIA in writing, is reproduced without alteration, and is accompanied by all associated conditions, limitations, and notices.
Trademarks
NVIDIA, the NVIDIA logo, TensorRT, NVIDIA Ampere, NVIDIA Hopper, and NVIDIA Tesla are trademarks and/or registered trademarks of NVIDIA Corporation in the United States and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.
Copyright © 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.