Welcome to RL4LMs’s documentation!
Note
The documentation is currently under active development.
RL4LMs provides easily customizable building blocks for training language models, including implementations of on-policy algorithms, reward functions, metrics, datasets, and LM-based actor-critic policies.
Github Repository: https://github.com/allenai/RL4LMs
Paper Link: https://arxiv.org/abs/2210.01241
Website Link: https://rl4lms.apps.allenai.org/
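A typical entry point is the bundled training script, which reads a YAML task config. The following is a minimal usage sketch based on the repository README; the config path shown is one of the bundled examples and may vary across versions:

    # Train a T5 policy on summarization with PPO, using a bundled task config
    python scripts/training/train_text_generation.py \
        --config_path scripts/training/task_configs/summarization/t5_ppo.yml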
Main Characteristics
RL4LMs is thoroughly tested and benchmarked with over 2,000 experiments on the GRUE benchmark, covering a comprehensive set of:
- 7 different Natural Language Processing (NLP) tasks:
  - Summarization
  - Generative Commonsense Reasoning
  - IMDB Sentiment-based Text Continuation
  - Table-to-Text Generation
  - Abstractive Question Answering
  - Machine Translation
  - Dialogue Generation
- 20+ different types of NLG metrics, which can be used as reward functions:
  - Lexical metrics (e.g., ROUGE, BLEU, SacreBLEU, METEOR)
  - Semantic metrics (e.g., BERTScore, BLEURT)
  - Task-specific metrics (e.g., PARENT, CIDEr, SPICE)
  - Scores from pre-trained classifiers (e.g., sentiment scores)
- On-policy algorithms: PPO, A2C, TRPO, and the novel NLPO (Natural Language Policy Optimization)
- Actor-critic policies supporting causal LMs (e.g., GPT-2/3) and seq2seq LMs (e.g., T5, BART)
All of these building blocks are customizable, allowing users to train transformer-based LMs to optimize any arbitrary reward function on any dataset of their choice.
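To illustrate this customizability, below is a minimal sketch of a user-defined reward function. It follows the RewardFunction interface shown in the repository README; the KeywordReward class is a hypothetical example, and the exact Observation attribute and registry method names are assumptions based on that README:

    from typing import Any, Dict

    from rl4lms.envs.text_generation.observation import Observation
    from rl4lms.envs.text_generation.registry import RewardFunctionRegistry
    from rl4lms.envs.text_generation.reward import RewardFunction


    class KeywordReward(RewardFunction):
        """Hypothetical sparse reward: +1 if the generation mentions a keyword."""

        def __init__(self, keyword: str = "great") -> None:
            super().__init__()
            self._keyword = keyword

        def __call__(
            self,
            prev_observation: Observation,
            action: int,
            current_observation: Observation,
            done: bool,
            meta_info: Dict[str, Any] = None,
        ) -> float:
            # Rewards are typically sparse: score the full generation once the
            # episode ends, and return 0 for intermediate token steps.
            if done:
                return float(self._keyword in current_observation.context_text)
            return 0.0


    # Register the reward under a name that a task's YAML config can refer to
    # (method name assumed from the README's custom-components section).
    RewardFunctionRegistry.add("keyword_reward", KeywordReward)

Once registered, the reward can be selected by name in a task config and optimized with any of the on-policy algorithms listed above.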
Citing RL4LMs
To cite this project in publications:
@article{Ramamurthy2022IsRL,
  title={Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization},
  author={Rajkumar Ramamurthy and Prithviraj Ammanabrolu and Kiant{\'e} Brantley and Jack Hessel and Rafet Sifa and Christian Bauckhage and Hannaneh Hajishirzi and Yejin Choi},
  journal={arXiv preprint arXiv:2210.01241},
  url={https://arxiv.org/abs/2210.01241},
  year={2022}
}