Sanjay KUMAR
Skillsoft-issued completion badges are earned by viewing the required percentage of the course content or by receiving a passing score when an assessment is required.

Hugging Face, a leading company in the field of artificial intelligence (AI), offers a comprehensive platform that enables developers and researchers to build, train, and deploy state-of-the-art machine learning (ML) models, with a strong emphasis on open collaboration and community-driven development.
In this course, you will discover the extensive libraries and tools Hugging Face offers, including the Transformers library, which provides access to a vast array of pre-trained models and datasets.
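For a feel of what working with the Transformers library looks like, here is a minimal illustrative sketch (not code from the course itself) that loads a default pre-trained model through the library's pipeline API:

```python
# Minimal sketch: using a pre-trained model via the Transformers pipeline API.
# Requires: pip install transformers
from transformers import pipeline

# On first use, the pipeline downloads a default pre-trained model for the task.
classifier = pipeline("sentiment-analysis")
print(classifier("Hugging Face makes working with transformer models straightforward."))
# Example output: [{'label': 'POSITIVE', 'score': 0.99...}]
```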
Next, you will set up your working environment in Google Colab. You will also explore the critical components of the text preprocessing pipeline: normalizers and pre-tokenizers.
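As an illustrative sketch of those two components (the install command and sample strings are assumptions, not material from the course), the Hugging Face tokenizers library exposes normalizers and pre-tokenizers directly:

```python
# In a Google Colab cell you would typically first run:
#   !pip install transformers tokenizers datasets
from tokenizers import normalizers
from tokenizers.normalizers import NFD, Lowercase, StripAccents
from tokenizers.pre_tokenizers import Whitespace

# A normalizer cleans raw text: Unicode normalization, lowercasing, accent stripping.
normalizer = normalizers.Sequence([NFD(), Lowercase(), StripAccents()])
print(normalizer.normalize_str("Héllo Wörld"))  # -> "hello world"

# A pre-tokenizer splits normalized text into word-level pieces with character offsets.
pre_tokenizer = Whitespace()
print(pre_tokenizer.pre_tokenize_str("Hello, world!"))
# -> [('Hello', (0, 5)), (',', (5, 6)), ('world', (7, 12)), ('!', (12, 13))]
```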
Finally, you will master various tokenization techniques, including byte pair encoding (BPE), WordPiece, and Unigram tokenization, which are essential for working with transformer models. Through hands-on exercises, you will build and train BPE and WordPiece tokenizers, configuring normalizers and pre-tokenizers to fine-tune these tokenization methods.
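As a rough sketch of that hands-on workflow (the tiny corpus, vocabulary size, and special tokens below are illustrative assumptions, not the course's own exercises), a BPE tokenizer can be built and trained with the tokenizers library; swapping in WordPiece and WordPieceTrainer follows the same pattern:

```python
# Minimal sketch: building and training a BPE tokenizer from scratch.
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer
from tokenizers.pre_tokenizers import Whitespace

corpus = [
    "Tokenization splits text into subword units.",
    "Byte pair encoding merges the most frequent symbol pairs.",
]

tokenizer = Tokenizer(BPE(unk_token="[UNK]"))  # BPE model with an unknown token
tokenizer.pre_tokenizer = Whitespace()         # split on whitespace/punctuation first

trainer = BpeTrainer(
    vocab_size=200,
    special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"],
)
tokenizer.train_from_iterator(corpus, trainer)

print(tokenizer.encode("tokenization merges frequent pairs").tokens)
```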
Issued on: September 19, 2024
Expires on: Does not expire