Further reading — Paper 06

Blogs, videos, code, and Indian-language resources. Start at the top and work down.

The original papers

Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to Sequence Learning with Neural Networks. NeurIPS. https://arxiv.org/abs/1409.3215 The paper we just read. Remarkably readable compared to modern papers — pay special attention to their explanation of the reverse-input trick in the introduction.
Cho, K. et al. (2014). Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. EMNLP. https://arxiv.org/abs/1406.1078 Published a few months before Sutskever’s paper, it introduced the RNN encoder-decoder idea using a new gated unit (later named the GRU) instead of LSTMs. Complementary reading.

Jay Alammar — “Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)”. https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/ Gorgeous animated walkthrough. The first half covers pure seq2seq; the second half previews Paper 07 (attention).
Chris Olah & Shan Carter — “Attention and Augmented Recurrent Neural Networks” (Distill, 2016). https://distill.pub/2016/augmented-rnns/ Classic visual explanation of attention and other ways of augmenting RNNs, with translation as one of its examples.
Lilian Weng — “Attention? Attention!” (2018). https://lilianweng.github.io/posts/2018-06-24-attention/ Starts with seq2seq, then builds up to Transformer-era attention. Excellent bridge between Papers 06 → 07 → 08.

Andrew Ng — “Sequence to Sequence Models” (DeepLearning.AI Coursera). https://www.coursera.org/lecture/nlp-sequence-models/basic-models-HAPhR An exceptionally clear introduction to the basic encoder-decoder model. Free to audit.
Stanford CS224N — “Machine Translation, Seq2Seq and Attention”. https://www.youtube.com/watch?v=XXtpJxZBa2c Chris Manning’s lecture covering the same material at grad-student depth.
Andrej Karpathy — “The Unreasonable Effectiveness of Recurrent Neural Networks”. https://karpathy.github.io/2015/05/21/rnn-effectiveness/ Not about seq2seq specifically, but the best intuition-builder for why RNNs generate coherent text at all.

Official PyTorch seq2seq tutorial — “NLP From Scratch: Translation with a Sequence to Sequence Network and Attention”. https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html Walks through building a French-to-English translator from scratch, starting with pure seq2seq and then adding attention. Directly expands on the toy code in Section 6.
TensorFlow NMT tutorial. https://www.tensorflow.org/text/tutorials/nmt_with_attention The TensorFlow equivalent, Spanish-to-English this time.
OpenNMT. https://opennmt.net/ An open-source toolkit for NMT that started in the seq2seq era and still supports the architecture.

AI4Bharat — IndicTrans2. https://github.com/AI4Bharat/IndicTrans2 Open-source, pretrained translation models covering all 22 scheduled Indian languages. A Transformer-based descendant of this paper’s encoder-decoder architecture.
AI4Bharat — IndicBART. https://github.com/AI4Bharat/indic-bart An encoder-decoder Transformer pre-trained on Indian languages; a modern descendant of seq2seq.
Bhashini (National Language Translation Mission). https://bhashini.gov.in/ India’s government-backed translation platform. Read about the Digital India Bhashini Division (DIBD) and the underlying tech.
Samanantar corpus (AI4Bharat). https://ai4bharat.iitm.ac.in/samanantar About 49 million parallel sentence pairs between English and 11 Indian languages. Excellent training dataset if you want to train your own seq2seq model from scratch.
iNLTK (Natural Language Toolkit for Indic Languages). https://inltk.readthedocs.io/ Simple Python API for tokenization, embeddings, and text generation in Hindi, Tamil, Bengali, and more.

IIT Madras — AI4Bharat. https://ai4bharat.org/ Home of much of the current state-of-the-art open-source Indian-language NLP work.
IIT Bombay — CFILT (Center for Indian Language Technology). http://www.cfilt.iitb.ac.in/ Long history of Indian-language MT, including seq2seq-era work and the Hindi WordNet.
IIIT Hyderabad — LTRC (Language Technologies Research Centre). http://ltrc.iiit.ac.in/ Parsing and translation research for Indian languages.

If you’re continuing through this series: