Welcome to RL4LMs’s documentation!

Note

The documentation is currently under active development

RL4LMs provides easily customizable building blocks for training language models, including implementations of on-policy algorithms, reward functions, metrics, datasets and LM based actor-critic policies

Github Repository: https://github.com/allenai/RL4LMs

Paper Link: https://arxiv.org/abs/2210.01241

Website Link: https://rl4lms.apps.allenai.org/

Main Characteristics

RL4LMs is thoroughly tested and benchmarked with over 2000 experiments (GRUE benchmark) on a comprehensive set of:

  • 7 different Natural Language Processing (NLP) Tasks:
    • Summarization

    • Generative Commonsense Reasoning

    • IMDB Sentiment-based Text Continuation

    • Table-to-text generation

    • Abstractive Question Answering

    • Machine Translation

    • Dialogue Generation

  • Different types of NLG metrics (20+) which can be used as reward functions:
    • Lexical Metrics (eg: ROUGE, BLEU, SacreBLEU, METEOR)

    • Semantic Metrics (eg: BERTSCORE, BLEURT)

    • Task specific metrics (eg: PARENT, CIDER, SPICE)

    • Scores from pre-trained classifiers (eg: Sentiment scores)

  • On-policy algorithms of PPO, A2C, TRPO and novel NLPO (Natural Language Policy Optimization)

  • Actor-Critic Policies supporting causal LMs (eg. GPT-2/3) and seq2seq LMs (eg. T5, BART)

All of these building blocks can be customizable allowing users to train transformer-based LMs to optimize any arbitrary reward function on any dataset of their choice.

Todo

Solve the problem with rl4lms.algorithms not being shown

Citing RL4LMs

To cite this project in publications:

@inproceedings{Ramamurthy2022IsRL,
   title={Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization},
   author={Rajkumar Ramamurthy and Prithviraj Ammanabrolu and Kiant{\'e} Brantley and Jack Hessel and Rafet Sifa and Christian Bauckhage and Hannaneh Hajishirzi and Yejin Choi},
   journal={arXiv preprint arXiv:2210.01241},
   url={https://arxiv.org/abs/2210.01241},
   year={2022}
}

Indices and tables