This repository will contain the code for the paper "A Window-based Approach to Data Augmentation for Text Normalization."
TODO:
- Port code from the Google repository, focusing on the character-based approach. (done)
- Convert code to work with SOTA libraries. (done)
- Restructure code to handle multiple inputs. (done)
- Develop data for (plausibly) Chumash and Pomo, consent permitting.
- Revisit handling issues in sents_util.py. (likely done; to be rechecked)
- Generate requirements.txt.