Training Tutorials

These hands-on tutorials show how to train with NeMo Gym environments using supported training frameworks. If you're interested in integrating another training framework, see the Training Framework Integration Guide.

Tip: See Training Approaches for a refresher on when to use GRPO, SFT, or DPO.
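
To make the GRPO choice above concrete, here is a minimal sketch of the group-relative advantage that gives GRPO its name: each prompt is sampled several times, and every rollout's reward is normalized against the mean and standard deviation of its own group. This is a generic illustration of the algorithm, not NeMo Gym or NeMo RL code; the function name and epsilon are our own.

```python
# Sketch of GRPO's group-relative advantage (illustrative, not the NeMo RL API).
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize one prompt's group of rollout rewards to zero mean, unit scale."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division stable when all rewards in the group are equal.
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four rollouts of the same prompt, scored 0 or 1 by the environment:
# successful rollouts get positive advantage, failed ones negative.
advantages = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
print(advantages)
```

Because advantages are computed within each prompt's group, GRPO needs no learned value function, which is why it pairs naturally with reward-scoring environments like those in NeMo Gym.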

RL (GRPO)

NeMo RL

Tutorial series: GRPO training to improve multi-step tool calling in the Workplace Assistant environment, scaling from single-node to multi-node training.

RL Training with NeMo RL using GRPO
OpenRLHF

Review the example agent executor, which shows how to use NeMo Gym environments with OpenRLHF:

https://github.com/OpenRLHF/OpenRLHF/blob/main/examples/python/agent_func_nemogym_executor.py
Unsloth

GRPO training on instruction-following and reasoning environments.

RL Training with Unsloth
NeMo Customizer

Coming soon

VeRL

Coming soon

Multi-Environment Training

Run multiple training environments simultaneously for rollout collection.

Multi-Environment Training
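
The pattern behind mixed-environment rollout collection can be sketched with plain asyncio: launch rollouts against several environments concurrently and gather them into one batch for the trainer. This is a hypothetical illustration only; the environment names and the `rollout` coroutine are stand-ins, not the NeMo Gym API.

```python
# Hypothetical sketch of interleaved rollout collection across environments.
# The rollout() coroutine and environment names are illustrative stand-ins.
import asyncio
import random

async def rollout(env_name, prompt_id):
    # Stand-in for the real environment round-trip (model call + scoring).
    await asyncio.sleep(random.random() * 0.01)
    return {"env": env_name, "prompt_id": prompt_id, "reward": random.random()}

async def collect_mixed_batch(env_names, prompts_per_env):
    # Schedule every rollout up front so environments are sampled concurrently.
    tasks = [
        rollout(env, i)
        for env in env_names
        for i in range(prompts_per_env)
    ]
    return await asyncio.gather(*tasks)

batch = asyncio.run(collect_mixed_batch(["workplace_assistant", "math"], 4))
print(len(batch))  # 8 rollouts drawn from both environments
```

Collecting from several environments in one batch is what lets a single policy improve on multiple skills at once, which is the motivation for the tutorial above.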

SFT & DPO

Offline Training with Rollouts

Transform rollouts into training data for supervised fine-tuning (SFT) and direct preference optimization (DPO).

Offline Training with Rollouts (SFT/DPO) - Experimental
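
The rollout-to-training-data transformation can be sketched as follows: high-reward rollouts become SFT examples, and pairing a high- and low-reward rollout for the same prompt yields a DPO preference record. The rollout schema (`prompt_id`, `reward`, `messages`) is an assumed illustrative shape, not the exact NeMo Gym format.

```python
# Hypothetical sketch of converting rollouts into SFT and DPO records.
# The rollout dict schema here is an assumption for illustration.
def to_sft(rollouts, min_reward=1.0):
    """Keep only high-reward rollouts as supervised fine-tuning examples."""
    return [{"messages": r["messages"]} for r in rollouts if r["reward"] >= min_reward]

def to_dpo_pairs(rollouts):
    """Pair the best and worst rollout per prompt as chosen/rejected."""
    by_prompt = {}
    for r in rollouts:
        by_prompt.setdefault(r["prompt_id"], []).append(r)
    pairs = []
    for group in by_prompt.values():
        group.sort(key=lambda r: r["reward"], reverse=True)
        # Only emit a pair when there is a genuine preference gap.
        if len(group) >= 2 and group[0]["reward"] > group[-1]["reward"]:
            pairs.append({"chosen": group[0]["messages"],
                          "rejected": group[-1]["messages"]})
    return pairs

rollouts = [
    {"prompt_id": 0, "reward": 1.0,
     "messages": [{"role": "assistant", "content": "good"}]},
    {"prompt_id": 0, "reward": 0.0,
     "messages": [{"role": "assistant", "content": "bad"}]},
]
print(len(to_sft(rollouts)), len(to_dpo_pairs(rollouts)))  # 1 1
```

The key design point is that SFT keeps only the trajectories you want imitated, while DPO exploits the reward contrast between trajectories for the same prompt; the tutorial above covers the actual supported formats.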