In the previous tutorial, we explored LLM tokenization and learned how to use BPE and WordPiece tokenization with the tokenizers library. In this second part, we will learn how to use the SentencePiece and Byte-level BPE methods.
The tutorial will cover:
- Introduction to SentencePiece
- Implementing SentencePiece Tokenization
- Introduction to Byte-level BPE
- Implementing Byte-level BPE Tokenization
- Conclusion
Let's get started.