`nemo_rl.data.processors`#

Contains data processors for evaluation.

Module Contents#

`helpsteer3_data_processor`	Process a HelpSteer3 preference datum into a DatumSpec for GRPO training.
`math_data_processor`	Process a datum dictionary (directly loaded from dataset) into a DatumSpec for the Math Environment.
`math_hf_data_processor`	Process a datum dictionary (directly loaded from data/hf_datasets/openmathinstruct2.py) into a DatumSpec for the Reward Model Environment.
`_construct_multichoice_prompt`	Construct prompt from question and options.
`multichoice_qa_processor`	Process a datum dictionary (directly loaded from dataset) into a DatumSpec for multiple-choice problems.
`register_processor`

`TokenizerType`
`PROCESSOR_REGISTRY`

nemo_rl.data.processors.helpsteer3_data_processor( datum_dict: dict[str, Any], task_data_spec: nemo_rl.data.interfaces.TaskDataSpec, tokenizer: nemo_rl.data.processors.TokenizerType, max_seq_length: int, idx: int, ) → nemo_rl.data.interfaces.DatumSpec#

Process a HelpSteer3 preference datum into a DatumSpec for GRPO training.

This function converts HelpSteer3 preference data to work with GRPO by:

nemo_rl.data.processors.math_data_processor( datum_dict: dict[str, Any], task_data_spec: nemo_rl.data.interfaces.TaskDataSpec, tokenizer: nemo_rl.data.processors.TokenizerType, max_seq_length: int, idx: int, ) → nemo_rl.data.interfaces.DatumSpec#: Process a datum dictionary (directly loaded from dataset) into a DatumSpec for the Math Environment.

nemo_rl.data.processors.math_hf_data_processor( datum_dict: dict[str, Any], task_data_spec: nemo_rl.data.interfaces.TaskDataSpec, tokenizer: nemo_rl.data.processors.TokenizerType, max_seq_length: int, idx: int, ) → nemo_rl.data.interfaces.DatumSpec#: Process a datum dictionary (directly loaded from data/hf_datasets/openmathinstruct2.py) into a DatumSpec for the Reward Model Environment.

nemo_rl.data.processors._construct_multichoice_prompt( prompt: str, question: str, options: dict[str, str], ) → str#: Construct prompt from question and options.

nemo_rl.data.processors.multichoice_qa_processor( datum_dict: dict[str, Any], task_data_spec: nemo_rl.data.interfaces.TaskDataSpec, tokenizer: nemo_rl.data.processors.TokenizerType, max_seq_length: int, idx: int, ) → nemo_rl.data.interfaces.DatumSpec#: Process a datum dictionary (directly loaded from dataset) into a DatumSpec for multiple-choice problems.

nemo_rl.data.processors.PROCESSOR_REGISTRY: Dict[str, nemo_rl.data.interfaces.TaskDataProcessFnCallable]#: ‘cast(…)’

nemo_rl.data.processors.register_processor( processor_name: str, processor_function: nemo_rl.data.interfaces.TaskDataProcessFnCallable, ) → None#