🖥️ Explore Jupyter Notebooks for mastering large language models on supercomputers, with resources from leading experts in the field.
Updated Sep 2, 2025 - Jupyter Notebook
Final project for the course "Large Scale AI Engineering" at ETH Zürich, spring semester 2025 (FS2025). Implements and benchmarks pretokenization and Distributed Data Parallel (DDP) for efficient LLM training on the CSCS Alps supercomputer.