rl4lms.data_pools package

Submodules

rl4lms.data_pools.custom_text_generation_pools module

class rl4lms.data_pools.custom_text_generation_pools.ToTTo(samples: List[Sample])

Bases: TextGenPool

classmethod prepare(split: str, representation: str = 'subtable', **args) -> TextGenPool

A factory method to instantiate the data pool.

static gen_split_name(split: str)

class rl4lms.data_pools.custom_text_generation_pools.CommonGen(samples: List[Sample])

Bases: TextGenPool

classmethod prepare(split: str, concept_separator_token: str = ' ', concept_end_token=' ', prefix: str = 'summarize: ') -> TextGenPool

A factory method to instantiate the data pool.
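The parameter names suggest that prepare builds each prompt from a prefix, the concept words joined by concept_separator_token, and a closing concept_end_token. The sketch below illustrates that reading only; build_commongen_prompt is a hypothetical helper, not part of rl4lms, and the actual assembly inside CommonGen.prepare may differ.

```python
from typing import List


def build_commongen_prompt(
    concepts: List[str],
    concept_separator_token: str = " ",
    concept_end_token: str = " ",
    prefix: str = "summarize: ",
) -> str:
    """Hypothetical helper: turn a list of concept words into one prompt string."""
    # Assumption: concepts are joined by the separator, then closed by the end token.
    joined = concept_separator_token.join(concepts)
    return prefix + joined + concept_end_token


# Defaults mirror the documented defaults of CommonGen.prepare.
default_prompt = build_commongen_prompt(["dog", "frisbee", "catch"])
custom_prompt = build_commongen_prompt(
    ["dog", "frisbee", "catch"],
    concept_separator_token=", ",
    concept_end_token=".",
    prefix="generate a sentence with: ",
)
print(repr(default_prompt))
print(repr(custom_prompt))
```

Keeping the separator and end token as parameters makes it easy to match whatever input format the underlying language model was pre-trained or fine-tuned on.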

static gen_split_name(split: str)

class rl4lms.data_pools.custom_text_generation_pools.Xsum(samples: List[Sample])

Bases: TextGenPool

classmethod prepare(split: str, prompt_suffix: str = 'TL;DR:')

A factory method to instantiate the data pool.

class rl4lms.data_pools.custom_text_generation_pools.CNNDailyMail(samples: List[Sample])

Bases: TextGenPool

classmethod prepare(split: str, prompt_suffix: str = '', prompt_prefix: str = '', truncate_article: int | None = None, max_size: int | None = None)

A factory method to instantiate the data pool.
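One plausible reading of the prepare parameters is that prompt_prefix and prompt_suffix wrap the article text, truncate_article shortens the article, and max_size caps how many samples are kept. The sketch below assumes that truncation counts whitespace-separated words; the library may count a different unit. build_summarization_prompt and cap_samples are hypothetical names, not rl4lms functions.

```python
from typing import List, Optional


def build_summarization_prompt(
    article: str,
    prompt_prefix: str = "",
    prompt_suffix: str = "",
    truncate_article: Optional[int] = None,
) -> str:
    """Hypothetical helper: wrap an optionally truncated article in a prompt."""
    if truncate_article is not None:
        # Assumption: the limit counts whitespace-separated words.
        article = " ".join(article.split()[:truncate_article])
    return prompt_prefix + article + prompt_suffix


def cap_samples(items: List[str], max_size: Optional[int] = None) -> List[str]:
    """Hypothetical helper: keep at most max_size items (all of them if None)."""
    return items if max_size is None else items[:max_size]


articles = ["The quick brown fox jumps over the lazy dog.", "A second article."]
prompts = [
    build_summarization_prompt(
        article,
        prompt_prefix="Summarize: ",
        prompt_suffix=" TL;DR:",
        truncate_article=4,
    )
    for article in cap_samples(articles, max_size=1)
]
print(prompts)
```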

class rl4lms.data_pools.custom_text_generation_pools.IMDB(samples: List[Sample])

Bases: TextGenPool

IMDB dataset for the sentiment continuation task.

classmethod prepare(split: str, seed: int)

A factory method to instantiate the data pool.

class rl4lms.data_pools.custom_text_generation_pools.IMDBForSeq2Seq(samples: List[Sample])

Bases: TextGenPool

IMDB dataset in seq2seq format for training a supervised generator.

classmethod prepare(split: str, positive_ratio: int = 1.0)

A factory method to instantiate the data pool.

rl4lms.data_pools.custom_text_generation_pools.download_file_using_url(url: str, dest_path: str)

class rl4lms.data_pools.custom_text_generation_pools.NarrativeQA(samples: List[Sample])

Bases: TextGenPool

classmethod normalize_text(text, strip: bool)

classmethod prepare(split: str)

A factory method to instantiate the data pool.

class rl4lms.data_pools.custom_text_generation_pools.WMT(samples: List[Sample])

Bases: TextGenPool

classmethod get_dataset(wmt_id: str, source_language: str, target_language: str, split: str)

classmethod prepare(wmt_id: str, split: str, source_language: str, target_language: str, prompt_suffix: str = '', prompt_prefix: str = '')

A factory method to instantiate the data pool.

class rl4lms.data_pools.custom_text_generation_pools.WMT14PreprocessedEnDe(samples: List[Sample])

Bases: TextGenPool

classmethod get_dataset(split: str)

classmethod prepare(split: str, prompt_suffix: str = '', prompt_prefix: str = '')

A factory method to instantiate the data pool.

class rl4lms.data_pools.custom_text_generation_pools.WMT16NewsOnlyDatasetEnDe(samples: List[Sample])

Bases: TextGenPool

classmethod get_dataset(split: str)

classmethod prepare(split: str, prompt_suffix: str = '', prompt_prefix: str = '')

A factory method to instantiate the data pool.

class rl4lms.data_pools.custom_text_generation_pools.IWSLT2017EnDe(samples: List[Sample])

Bases: TextGenPool

classmethod get_dataset(split: str)

classmethod prepare(split: str, prompt_suffix: str = '', prompt_prefix: str = '')

A factory method to instantiate the data pool.

class rl4lms.data_pools.custom_text_generation_pools.CRD3DialogueGeneration(samples: List[Sample])

Bases: TextGenPool

SOURCE_URL = 'https://github.com/RevanthRameshkumar/CRD3/archive/refs/heads/master.zip'
DEST_BASE_FOLDER = 'crd3'
DEST_EXTRACTED_FOLDER = 'CRD3-master'
ZIP_FILE_NAME = 'master.zip'
PATH_TO_ALIGNED_DATA = 'data/aligned data'
PATH_TO_CLEANED_DATA = 'data/cleaned data'

classmethod prepare(split: str, max_context_size: int)

A factory method to instantiate the data pool.

class rl4lms.data_pools.custom_text_generation_pools.DailyDialog(samples: List[Sample])

Bases: TextGenPool

EOU_TOKEN = '<EOU>'

classmethod prepare(split: str, context_size: int)

A factory method to instantiate the data pool.
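The EOU_TOKEN attribute and the context_size parameter suggest that each prompt is a window of preceding utterances, each terminated by the end-of-utterance marker, with the next utterance as the reference. A minimal sketch under that assumption follows; build_dialogue_pairs is a hypothetical helper, and the exact spacing and windowing used by DailyDialog.prepare may differ.

```python
from typing import List, Tuple

EOU_TOKEN = "<EOU>"  # same value as the documented class attribute


def build_dialogue_pairs(
    utterances: List[str], context_size: int
) -> List[Tuple[str, str]]:
    """Hypothetical helper: pair each utterance with a window of prior turns."""
    pairs = []
    for i in range(1, len(utterances)):
        # Take at most context_size utterances that precede turn i.
        window = utterances[max(0, i - context_size):i]
        # Assumption: every context utterance is closed by the EOU marker.
        prompt = " ".join(f"{utterance} {EOU_TOKEN}" for utterance in window)
        pairs.append((prompt, utterances[i]))
    return pairs


dialogue = ["Hi there.", "Hello!", "How are you?"]
pairs = build_dialogue_pairs(dialogue, context_size=2)
for prompt, reference in pairs:
    print(repr(prompt), "->", repr(reference))
```

A smaller context_size shortens the prompt at the cost of dropping earlier turns; with context_size=1 only the immediately preceding utterance is kept.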

rl4lms.data_pools.text_generation_pool module

class rl4lms.data_pools.text_generation_pool.Sample(id: str, prompt_or_input_text: str, references: List[str], meta_data: Dict[str, Any] = None)

Bases: object

id: str
prompt_or_input_text: str
references: List[str]
meta_data: Dict[str, Any] = None

__init__(id: str, prompt_or_input_text: str, references: List[str], meta_data: Dict[str, Any] | None = None) -> None

class rl4lms.data_pools.text_generation_pool.TextGenPool(samples: List[Sample])

Bases: object

__init__(samples: List[Sample])

sample() -> Sample

abstract classmethod prepare(**args) -> TextGenPool

A factory method to instantiate the data pool.

split(split_ratios: List[float]) -> List[TextGenPool]
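To show how the pieces documented above fit together, the sketch below defines stand-in Sample and TextGenPool classes that mirror the documented signatures, plus a hypothetical ToyPool subclass. The method bodies are assumptions: sample is taken to draw uniformly at random, and split is taken to cut the samples into consecutive slices sized by split_ratios. In real code you would import both classes from rl4lms.data_pools.text_generation_pool instead of redefining them.

```python
import random
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Sample:
    """Stand-in mirroring the documented fields of Sample."""
    id: str
    prompt_or_input_text: str
    references: List[str]
    meta_data: Optional[Dict[str, Any]] = None


class TextGenPool(ABC):
    """Stand-in mirroring the documented interface; bodies are assumptions."""

    def __init__(self, samples: List[Sample]):
        self._samples = samples

    def __len__(self) -> int:
        return len(self._samples)

    def sample(self) -> Sample:
        # Assumption: draw one sample uniformly at random.
        return random.choice(self._samples)

    @classmethod
    @abstractmethod
    def prepare(cls, **args) -> "TextGenPool":
        """A factory method to instantiate the data pool."""
        raise NotImplementedError

    def split(self, split_ratios: List[float]) -> List["TextGenPool"]:
        # Assumption: consecutive slices, each holding ratio * len(self) samples.
        pools, start = [], 0
        for ratio in split_ratios:
            end = start + int(len(self) * ratio)
            pools.append(type(self)(self._samples[start:end]))
            start = end
        return pools


class ToyPool(TextGenPool):
    """Hypothetical pool that generates synthetic samples instead of loading data."""

    @classmethod
    def prepare(cls, split: str, size: int = 10) -> "ToyPool":
        samples = [
            Sample(
                id=f"{split}_{i}",
                prompt_or_input_text=f"prompt {i}",
                references=[f"reference {i}"],
            )
            for i in range(size)
        ]
        return cls(samples)


pool = ToyPool.prepare("train", size=10)
train_pool, val_pool = pool.split([0.8, 0.2])
print(len(train_pool), len(val_pool))
print(pool.sample())
```

Because split builds the new pools with type(self) in this sketch, both halves are ToyPool instances and can be used wherever the original pool was expected.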
