mergeProcessor

Classes

MergeProcessor

Module Contents

class mergeProcessor.MergeProcessor
merge_added_rewards(original_file_path: str, save_to_temp_folder: bool = False) → None

Merges reward entries for various human values, stored in temporary files, into a single JSON file. The result is saved either to a temporary folder or over the original file.

This method looks in the temporary folder for files whose names match the pattern of the original file, adds the reward entries from those files to the original JSON data, and then either saves the result in the temp folder or overwrites the original.

Parameters:
  • original_file_path (str) – Path to the original JSON file.

  • save_to_temp_folder (bool, optional) – If True, saves the merged file in a temp folder instead of overwriting the original. Defaults to False.
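
The merge described above could be sketched roughly as follows. This is a minimal illustration, not the module's actual implementation. The function name merge_added_rewards_sketch, the location of the temp folder (a temp/ directory next to the original file), the temp-file naming (<original stem>_<suffix>.json), the record layout (a JSON list of dicts aligned by index), and the returned output path are all assumptions made for the example.

```python
import glob
import json
import os


def merge_added_rewards_sketch(original_file_path, save_to_temp_folder=False):
    """Hypothetical sketch; the file layout and record format are assumed."""
    folder, filename = os.path.split(original_file_path)
    stem, ext = os.path.splitext(filename)
    temp_dir = os.path.join(folder, "temp")  # assumed temp folder location

    with open(original_file_path, encoding="utf-8") as f:
        records = json.load(f)  # assumed: a list of dicts

    # Assumed naming: temp/<stem>_<suffix>.json, with records aligned by index.
    pattern = os.path.join(temp_dir, glob.escape(stem) + "_*" + ext)
    for temp_path in sorted(glob.glob(pattern)):
        with open(temp_path, encoding="utf-8") as f:
            extra_records = json.load(f)
        for record, extra in zip(records, extra_records):
            for key, value in extra.items():
                # Add only the keys the original record does not have yet.
                record.setdefault(key, value)

    if save_to_temp_folder:
        out_path = os.path.join(temp_dir, filename)
    else:
        out_path = original_file_path
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2)
    # Returned for convenience in this sketch; the documented method returns None.
    return out_path
```

Writing to the temp folder first leaves the original file untouched, which makes it possible to inspect the merged result before overwriting anything.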

Example

>>> processor = MergeProcessor()
>>> processor.merge_added_rewards("results/Llama27b-chat-Anthropic-harmless.json", save_to_temp_folder=True)

Command-line usage:

$ python mergeProcessor.py merge_added_rewards --original_file_path="results/Llama27b-chat-Anthropic-harmless.json" --save_to_temp_folder=True

merge_gendata_bypattern(json_file_pattern: str) → None

Merges multiple JSON files matched by a pattern into a single output file.

This method collects the JSON files matching the specified glob pattern, merges their data into one JSON array, and saves the result one directory level above ‘temp/’. The segment matching ‘_*to*’ is removed from the filename before saving.

Parameters:
  • json_file_pattern (str) – The glob pattern to match JSON files for merging. Example: ‘results/temp/*_val=all_*to*.json’
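
A minimal sketch of this pattern-based merge is shown below. It is an illustration under stated assumptions, not the module's actual code. The function name merge_gendata_bypattern_sketch is hypothetical, each matched file is assumed to hold a JSON array, the ‘_*to*’ segment is assumed to be a numeric index range such as ‘_0to100’, and ordering the files by the start of that range is a choice made for this example.

```python
import glob
import json
import os
import re

# Assumed form of the '_*to*' segment: a numeric range such as '_0to100'.
RANGE_SEGMENT = re.compile(r"_(\d+)to\d+")


def range_start(path):
    """Return the numeric start of the range in the filename, or 0 if absent."""
    match = RANGE_SEGMENT.search(os.path.basename(path))
    return int(match.group(1)) if match else 0


def merge_gendata_bypattern_sketch(json_file_pattern):
    """Hypothetical sketch; the file contents and naming are assumed."""
    paths = sorted(glob.glob(json_file_pattern), key=range_start)
    if not paths:
        raise FileNotFoundError(f"no files match {json_file_pattern!r}")

    merged = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            merged.extend(json.load(f))  # assumed: each file is a JSON array

    temp_dir, first_name = os.path.split(paths[0])
    out_name = RANGE_SEGMENT.sub("", first_name)  # drop the '_*to*' segment
    out_path = os.path.join(os.path.dirname(temp_dir), out_name)  # one level above temp/
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(merged, f, indent=2)
    # Returned for convenience in this sketch; the documented method returns None.
    return out_path
```

Sorting by the numeric range start rather than by filename avoids placing a chunk such as ‘_10to12’ before ‘_2to4’, which plain lexicographic ordering would do.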

Example

>>> processor = MergeProcessor()
>>> processor.merge_gendata_bypattern("results/temp/Llama27b-chat-Anthropic-harmless_lam=2.018,1.393,1.498,0.008,0.015,0.088_val=all_*to*.json")

Command-line usage:

$ python mergeProcessor.py merge_gendata_bypattern --json_file_pattern="results/temp/Llama27b-chat-Anthropic-harmless_lam=2.018,1.393,1.498,0.008,0.015,0.088_val=all_*to*.json"