jsulz 15 hours ago

Hugging Face is replacing Git LFS on the Hub with a more efficient, chunk-based storage system. Using content-defined chunking (CDC), we version variable-sized chunks of data instead of entire files. This dramatically reduces upload/download times by transferring only modified chunks—eliminating the need to re-upload or re-download massive files for small changes. It’s a game-changer for managing the ever-growing datasets and models hosted on the Hub.

Read the blog post to dive into the details of CDC and how it improves performance and scalability for ML workflows and let me know what you think!