Technical Posts
Sub-Word Tokenization: Breaking Words Like a Pro
Take a detour before diving into transformers and explore sub-word tokenization techniques like Byte-Pair Encoding, WordPiece, and Unigram models. Learn how they handle rare words, reduce vocabulary size, and make models more efficient!