.. _DS_plugin_nvds_text_to_speech:

Gst-nvds_text_to_speech (Alpha)
===============================

The `Gst-nvds_text_to_speech` plugin performs speech synthesis on the input text. It is supported on both x86 and Jetson platforms.
The plugin provides a mechanism to load a custom text-to-speech (TTS) low-level library at runtime.
By default, the plugin loads the `DS-Riva` TTS library (``libnvds_riva_tts.so``) to perform speech synthesis.
The library communicates with the TTS service of the NVIDIA Riva SDK, which performs speech synthesis using optimized Riva TTS models.


.. note::
   * The `Gst-nvds_text_to_speech` plugin is being released as an alpha feature.
   * The `DS-Riva` TTS library uses the gRPC API to access the Riva TTS service. The Riva TTS service must be started before using this plugin. The gRPC C++ libraries (v1.38) must be installed on the client side.

.. note::
    The DS-Riva TTS library (``libnvds_riva_tts.so``) works with NVIDIA Riva Release 2.0.0 or later.

The plugin accepts text (UTF-8) GStreamer buffers (``GstBuffers``) from the upstream component and transforms the text into audio GStreamer buffers as output.

The DS-Riva TTS library (``libnvds_riva_tts.so``) generates raw audio data in S16LE format (signed 16-bit little-endian) at a 22050 Hz sample rate. Library settings are configured through a YAML file, which is specified by setting the ``config-file`` property on the `nvds_text_to_speech` GStreamer plugin. The file has multiple parts that cover plugin control and Riva TTS service configuration.

As shown in the diagram below, input text is sent to the Riva TTS service for speech synthesis. The final output is available as S16LE PCM audio at 22050 Hz.

.. image:: /content/DS_plugin_gst-nvds_text_to_speech.png
         :align: center
         :alt: Gst-Nvds_text_to_speech
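
The output format above fixes the relationship between buffer size and audio duration. The following Python sketch works through that arithmetic. The sample rate and sample width come from the text above; the channel count of 1 (mono) is an assumption made for illustration and is not stated in this document, and the helper names are hypothetical.

.. code-block:: python

   SAMPLE_RATE_HZ = 22050   # sample rate stated above
   BYTES_PER_SAMPLE = 2     # S16LE: signed 16-bit samples, 2 bytes each
   CHANNELS = 1             # ASSUMPTION: mono output (not stated in this document)


   def pcm_duration_seconds(num_bytes: int) -> float:
       """Return the playback duration of a raw S16LE buffer of num_bytes bytes."""
       bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
       return num_bytes / bytes_per_second


   def pcm_size_bytes(seconds: float) -> int:
       """Return the number of bytes needed to hold `seconds` of raw S16LE audio."""
       num_samples = int(round(seconds * SAMPLE_RATE_HZ))
       return num_samples * BYTES_PER_SAMPLE * CHANNELS


   if __name__ == "__main__":
       print(pcm_duration_seconds(44100))  # 1.0
       print(pcm_size_bytes(2.5))          # 110250

Under the mono assumption, one second of synthesized speech occupies 44100 bytes, which can help when sizing downstream buffers or queues.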

Inputs and Outputs
-------------------

This section summarizes the inputs, outputs, and communication facilities of the `Gst-nvds_text_to_speech` plugin with the DS-Riva TTS implementation.

* Input

  * Text GStreamer buffers

* Control parameters

  * ``customlib-name``: Custom TTS library that the plugin loads to perform speech synthesis. Default: the DS-Riva TTS library (``libnvds_riva_tts.so``)
  * ``create-speech-ctx-func``: Symbol name of the function that creates the TTS speech context. Default: ``create_text_to_speech_ctx``
  * ``config-file``: A text file that configures the plugin and the DS-Riva TTS service requests.

* Output

  * Raw audio GStreamer buffers containing the synthesized speech
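
As an illustration of how the control parameters fit together, the following Python sketch assembles the element portion of a ``gst-launch-1.0`` command line. It is a sketch under stated assumptions, not a complete pipeline: the helper name is hypothetical, the configuration path in the usage example is a placeholder, and the upstream text source and downstream audio elements are deliberately omitted because they depend on the application.

.. code-block:: python

   import shlex

   # Element name as used in this document; property names are taken
   # from the control parameters listed above.
   ELEMENT = "nvds_text_to_speech"


   def tts_element_description(config_file,
                               customlib_name=None,
                               create_speech_ctx_func=None):
       """Build the element fragment for a shell gst-launch-1.0 command line.

       Hypothetical helper for illustration. Only properties that are
       explicitly given are emitted, so the plugin defaults apply otherwise.
       """
       props = {"config-file": config_file}
       if customlib_name is not None:
           props["customlib-name"] = customlib_name
       if create_speech_ctx_func is not None:
           props["create-speech-ctx-func"] = create_speech_ctx_func
       # shlex.quote protects values containing spaces or shell metacharacters.
       parts = [ELEMENT] + [f"{key}={shlex.quote(value)}"
                            for key, value in props.items()]
       return " ".join(parts)


   if __name__ == "__main__":
       # Placeholder path; prints:
       # nvds_text_to_speech config-file=riva_tts_conf.yml
       print(tts_element_description("riva_tts_conf.yml"))

Leaving ``customlib-name`` and ``create-speech-ctx-func`` unset keeps the default DS-Riva TTS library and its default context-creation symbol.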

Features
---------

The following table summarizes the features of the plugin.

.. csv-table:: Gst-nvds_text_to_speech plugin features
  :file: ../text/tables/Gst-nvds_text_to_speech tables/DS_Plugin_gst-nvds_text_to_speech_features.csv
  :widths: 30, 30, 30
  :header-rows: 1


DS-Riva TTS YAML File Configuration Specifications
-----------------------------------------------------

The DS-Riva TTS configuration file uses the YAML 1.2 file format: https://yaml.org/spec/1.2/spec.html.

* The configuration file consists of multiple parts. An example is located at
  ``/opt/nvidia/deepstream/deepstream/sources/apps/audio_apps/deepstream_asr_tts_app/riva_tts_conf.yml``.
  Each part has a ``name`` node that holds a unique part name and a ``detail`` node that holds the settings.

* The ``name: riva_server`` part configures the Riva server URI in its corresponding ``detail:`` node.

* The ``name: riva_tts_stream`` part configures the features supported by the Riva TTS service in its corresponding ``detail:`` node.

* The ``name: ds_riva_tts_plugin`` part configures the DS-Riva TTS settings in its corresponding ``detail:`` node.

* A separator line containing ``---`` is placed between adjacent parts, as defined by the YAML specification.
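
The multi-part layout described above can be sketched as follows. The three part names come from this document; the keys under ``detail`` are placeholders only, so consult the property tables and the shipped example file for the real settings. The small splitter is a standard-library illustration of the ``---`` separation with hypothetical helper names; for real parsing, a YAML library (for example ``yaml.safe_load_all`` from PyYAML) is a better choice.

.. code-block:: python

   # Skeleton of a three-part configuration. The keys under "detail" are
   # PLACEHOLDERS for illustration, not real DS-Riva TTS settings.
   EXAMPLE_CONFIG = """\
   name: riva_server
   detail:
     placeholder_key: placeholder_value
   ---
   name: riva_tts_stream
   detail:
     placeholder_key: placeholder_value
   ---
   name: ds_riva_tts_plugin
   detail:
     placeholder_key: placeholder_value
   """


   def split_parts(text):
       """Split multi-part YAML text into lists of lines on '---' separators."""
       parts, current = [], []
       for line in text.splitlines():
           if line.rstrip() == "---":
               parts.append(current)
               current = []
           else:
               current.append(line)
       parts.append(current)
       # Drop parts that contain only blank lines.
       return [part for part in parts if any(line.strip() for line in part)]


   def part_names(text):
       """Return the top-level ``name`` value of each part, in file order."""
       names = []
       for part in split_parts(text):
           name = None
           for line in part:
               if line.startswith("name:"):
                   name = line.split(":", 1)[1].strip()
                   break
           names.append(name)
       return names


   if __name__ == "__main__":
       print(part_names(EXAMPLE_CONFIG))
       # ['riva_server', 'riva_tts_stream', 'ds_riva_tts_plugin']

A check like this can confirm that a hand-edited configuration file still contains all three expected parts before it is passed to the plugin.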

Gst Properties
----------------

The following tables describe the Gst properties of the `Gst-nvds_text_to_speech` plugin.

.. csv-table:: riva_server: Configuration properties for the Riva low-level library
  :file: ../text/tables/Gst-nvds_text_to_speech tables/DS_Plugin_gst-nvds_text_to_speech_gst_riva_server_properties.csv
  :widths: 20, 20, 20, 20
  :header-rows: 1

.. csv-table:: riva_tts_stream: Configuration properties for Riva TTS service request
  :file: ../text/tables/Gst-nvds_text_to_speech tables/DS_Plugin_gst-nvds_text_to_speech_gst_riva_tts_stream_properties.csv
  :widths: 20, 20, 20, 20
  :header-rows: 1

.. csv-table:: ds_riva_tts_plugin: Configuration properties for DS-Riva TTS library settings
  :file: ../text/tables/Gst-nvds_text_to_speech tables/DS_Plugin_gst-nvds_text_to_speech_gst_riva_tts_plugin_properties.csv
  :widths: 20, 20, 20, 20
  :header-rows: 1

Riva TTS Service Initiation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Refer to https://docs.nvidia.com/deeplearning/riva/user-guide/docs/quick-start-guide.html#local-deployment-using-quick-start-scripts for the procedure to start the Riva TTS service.


gRPC C++ Installation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The gRPC C++ shared libraries (v1.38) must be installed for the DS-Riva TTS library to access the Riva TTS gRPC service.
To install the libraries, follow the steps at https://grpc.io/docs/languages/cpp/quickstart/ and add ``-DBUILD_SHARED_LIBS=ON``
to the cmake build options. Using ``make -j4`` instead of ``make -j`` is recommended.

Alternatively, use the included script, which performs the same steps, to install the gRPC C++ libraries::

      $ cd /opt/nvidia/deepstream/deepstream/sources/apps/audio_apps/deepstream_asr_app
      $ sudo chmod +x gRPC_installation.sh
      $ ./gRPC_installation.sh

Run the following command to add the installation path to the ``LD_LIBRARY_PATH``
environment variable::

   $ export LD_LIBRARY_PATH=$HOME/.local/lib:$LD_LIBRARY_PATH

The gRPC C++ libraries are pre-installed on the DeepStream dGPU docker images.
In the dGPU docker container, run the following command to add the installation path to the ``LD_LIBRARY_PATH``
environment variable::

   $ export LD_LIBRARY_PATH=$HOME/.local/lib:$LD_LIBRARY_PATH


Sample Application
------------------
A sample application using the plugin is available here: ``sources/apps/audio_apps/deepstream_asr_tts_app``. Please follow the ``README`` to run the tests.