A PHP implementation of OpenAI's BPE tokenizer tiktoken.
Updated Jun 16, 2024 - PHP
Byte-Pair Encoding tokenizer for training large language models on huge datasets
(1) Train large language models to help people with automatic essay scoring. (2) Extract essay features and train a new tokenizer to build tree models for score prediction.
Natural language processing
The basic BPE-tokenizer in NLP
Self-made byte-pair encoding tokenizer