Basic-Seq2Seq-Learning

Basic seq2seq model including simplest encoder & decoder and attention-based ones

This repository implements the "most" basic seq2seq learning as a small step in my own project.

Basically, it contains the essential ideas of the following papers:

SimpleSeq2Seq model: Sequence to Sequence Learning with Neural Networks

  • The simplest RNN encoder and RNN decoder (a sketch follows)
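
As a rough illustration (class and parameter names here are hypothetical, not the repository's actual code), a minimal PyTorch sketch of this setup, where the encoder's final hidden state initializes the decoder:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):                      # src: (batch, src_len)
        embedded = self.embedding(src)           # (batch, src_len, emb_dim)
        outputs, hidden = self.rnn(embedded)     # hidden: (1, batch, hid_dim)
        return outputs, hidden                   # hidden initializes the decoder

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, token, hidden):            # token: (batch, 1)
        embedded = self.embedding(token)         # (batch, 1, emb_dim)
        output, hidden = self.rnn(embedded, hidden)
        logits = self.out(output.squeeze(1))     # (batch, vocab_size)
        return logits, hidden
```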

ContextSeq2Seq model: Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

  • Feeds the context vector into every decoder step, both as an RNN input and as a classifier input (a sketch follows)
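
A minimal sketch of the corresponding decoder step, again with hypothetical names, assuming the context vector is the encoder's final hidden state and is concatenated to both the RNN input and the classifier input at every step:

```python
import torch
import torch.nn as nn

class ContextDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # The RNN sees the token embedding concatenated with the context vector.
        self.rnn = nn.GRU(emb_dim + hid_dim, hid_dim, batch_first=True)
        # The classifier sees hidden state, context vector, and token embedding.
        self.out = nn.Linear(hid_dim + hid_dim + emb_dim, vocab_size)

    def forward(self, token, hidden, context):   # context: (batch, 1, hid_dim)
        embedded = self.embedding(token)         # (batch, 1, emb_dim)
        rnn_input = torch.cat([embedded, context], dim=2)
        output, hidden = self.rnn(rnn_input, hidden)
        logits = self.out(
            torch.cat([output, context, embedded], dim=2).squeeze(1))
        return logits, hidden
```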

BahdanauAttSeq2Seq model: Neural Machine Translation by Jointly Learning to Align and Translate

  • Bahdanau attention mechanism: the attention weights and context vector are first computed from the decoder's previous hidden state and the encoder's hidden states; all of that information is then fed into the decoder RNN, and classification uses the decoder's current hidden state (a sketch follows)
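
A minimal sketch of one Bahdanau-style decoder step, with hypothetical names and an additive scoring function, assuming equal encoder and decoder hidden sizes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BahdanauDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.attn = nn.Linear(hid_dim + hid_dim, hid_dim)
        self.v = nn.Linear(hid_dim, 1, bias=False)
        self.rnn = nn.GRU(emb_dim + hid_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, token, hidden, enc_outputs):
        # hidden: (1, batch, hid_dim); enc_outputs: (batch, src_len, hid_dim)
        src_len = enc_outputs.size(1)
        # Score each source position against the *previous* decoder state.
        prev = hidden.transpose(0, 1).repeat(1, src_len, 1)
        energy = torch.tanh(self.attn(torch.cat([prev, enc_outputs], dim=2)))
        weights = F.softmax(self.v(energy).squeeze(2), dim=1)   # (batch, src_len)
        context = torch.bmm(weights.unsqueeze(1), enc_outputs)  # (batch, 1, hid_dim)
        # Feed token embedding and context into the decoder RNN,
        # then classify from the current hidden state.
        embedded = self.embedding(token)                        # (batch, 1, emb_dim)
        output, hidden = self.rnn(torch.cat([embedded, context], dim=2), hidden)
        logits = self.out(output.squeeze(1))
        return logits, hidden, weights
```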

LuongAttSeq2Seq model: Effective Approaches to Attention-based Neural Machine Translation

  • Global attention mechanism: the decoder RNN first produces the current hidden state; the attention weights and context vector are then computed from that current hidden state and the encoder's hidden states, and all of that information is fed to the classifier (a sketch follows)
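
A minimal sketch of one Luong-style decoder step, using the dot scoring function as one of the paper's options (the repository may use a different score); names and dimensions are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LuongDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.attn_combine = nn.Linear(hid_dim + hid_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, token, hidden, enc_outputs):
        embedded = self.embedding(token)                 # (batch, 1, emb_dim)
        # The RNN runs *first*; its current state drives the attention.
        output, hidden = self.rnn(embedded, hidden)      # output: (batch, 1, hid_dim)
        scores = torch.bmm(output, enc_outputs.transpose(1, 2))  # dot score
        weights = F.softmax(scores, dim=2)               # (batch, 1, src_len)
        context = torch.bmm(weights, enc_outputs)        # (batch, 1, hid_dim)
        # Attentional vector fed to the classifier.
        attn_h = torch.tanh(self.attn_combine(
            torch.cat([context, output], dim=2)))
        logits = self.out(attn_h.squeeze(1))
        return logits, hidden, weights.squeeze(1)
```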

This implementation offers the following features:

  • An easy-to-use and easy-to-understand code framework with simple pre-processing and decoding
  • Mini-batching (although the decoder runs one step at a time; see the sketch below)
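
For concreteness, a hypothetical sketch of that step-by-step, mini-batched decoding loop with teacher forcing, assuming a decoder with the attention-style signature sketched above:

```python
import torch

def decode_batch(decoder, hidden, enc_outputs, trg):
    """trg: (batch, trg_len) gold target tokens, starting with <sos>."""
    batch_size, trg_len = trg.size()
    all_logits = []
    token = trg[:, 0:1]                        # <sos> tokens, shape (batch, 1)
    for t in range(1, trg_len):
        logits, hidden, _ = decoder(token, hidden, enc_outputs)
        all_logits.append(logits)
        token = trg[:, t:t + 1]                # teacher forcing: feed gold token
    return torch.stack(all_logits, dim=1)      # (batch, trg_len - 1, vocab_size)
```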

TODO:

  • Adding a mask for attention
  • Making the passing of dimension parameters clearer
  • Comparing performance against the results reported in the papers on standard datasets
  • Beam search
  • Implementing "Attention Is All You Need" (the Transformer)
  • Training seq2seq models with reinforcement learning

Some resources I referred to during the implementation:

https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html

https://github.com/bentrevett/pytorch-seq2seq

https://github.com/spro/practical-pytorch/tree/master/seq2seq-translation

https://github.com/IBM/pytorch-seq2seq

https://github.com/MaximumEntropy/Seq2Seq-PyTorch

https://github.com/pytorch/tutorials/tree/master/intermediate_source

Please feel free to point out any errors, whether in the implementation or in my understanding of the papers. Thanks.
