1.
    08
    Attention Is All You Need
    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Łukasz Kaiser, Illia Polosukhin 2017 Intermediate
  2.
    10
    Improving Language Understanding by Generative Pre-Training
    Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever 2018 Intermediate
  3.
    11
    BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
    Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova 2018 Intermediate
  4.
    12
    Language Models are Few-Shot Learners
    Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei 2020 Intermediate
  5.
    15
    Training Language Models to Follow Instructions with Human Feedback
    Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan Lowe 2022 Intermediate