𝐙𝐢𝐩𝐍𝐍 v0.5.0 introduces compression with CPU multithreading!
Keep models 𝐜𝐨𝐦𝐩𝐫𝐞𝐬𝐬𝐞𝐝 𝐚𝐥𝐥 𝐭𝐡𝐞 𝐰𝐚𝐲 𝐭𝐨 𝐭𝐡𝐞 𝐆𝐏𝐔,
saving both 𝐭𝐫𝐚𝐧𝐬𝐟𝐞𝐫 𝐭𝐢𝐦𝐞 and 𝐬𝐭𝐨𝐫𝐚𝐠𝐞 𝐬𝐩𝐚𝐜𝐞:
1️⃣From @huggingface to storage
2️⃣From storage to GPU

Git: github.com/zipnn/zipnn
Next, GPU!
📈🤖
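
A minimal round-trip sketch of that flow, assuming the ZipNN Python package exposes a ZipNN class with compress/decompress methods over byte buffers as its README shows; the array below is an illustrative stand-in for real model weights:

```python
# Sketch only: class/method names (ZipNN, compress, decompress) follow the
# project README at github.com/zipnn/zipnn; check there for the current API.
import numpy as np
from zipnn import ZipNN

zpn = ZipNN()  # assumed default byte-buffer mode

# Illustrative stand-in for weights pulled from a Hugging Face checkpoint.
weights = (np.random.randn(1024, 1024) * 0.02).astype(np.float32)

compressed = zpn.compress(weights.tobytes())   # 1️⃣ compress before writing to storage
restored = zpn.decompress(compressed)          # 2️⃣ decompress only when the bytes are needed

assert bytes(restored) == weights.tobytes()    # lossless round trip
print(f"ratio: {weights.nbytes / len(compressed):.2f}x")
```

The idea is that the compressed bytes are what sits on disk and what travels; decompression happens as late as possible.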

Leshem Choshen

And compression is now super fast!
💻Performance on Mac M1:
✅𝐂𝐨𝐦𝐩𝐫𝐞𝐬𝐬𝐢𝐨𝐧: 7 GB/s
✅𝐃𝐞𝐜𝐨𝐦𝐩𝐫𝐞𝐬𝐬𝐢𝐨𝐧: 8 GB/s
Wait until multithreading comes to the GPU and you decompress only on demand
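
To get throughput numbers in the same units on your own machine, here is a rough timing sketch (reusing the assumed byte-buffer API from the sketch above; the payload is synthetic and the GB/s you see will vary by hardware and data):

```python
import time
import numpy as np
from zipnn import ZipNN

zpn = ZipNN()  # same assumed API as in the earlier sketch

# ~128 MB of synthetic float32 data standing in for a model shard.
payload = np.random.randn(32_000_000).astype(np.float32).tobytes()

t0 = time.perf_counter()
compressed = zpn.compress(payload)
t1 = time.perf_counter()
restored = zpn.decompress(compressed)
t2 = time.perf_counter()

gb = len(payload) / 2**30
print(f"compression:   {gb / (t1 - t0):.2f} GB/s")
print(f"decompression: {gb / (t2 - t1):.2f} GB/s")
```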

𝐏𝐚𝐩𝐞𝐫: alphaxiv.org/abs/2411.05239

@LChoshen Cool! I haven't read the whole paper yet, but I thought the source of the compressibility was interesting. Here is a quote from the paper:

"We identify the source of model compressibility as the floating point range that actually exists in models. Specifically, we find that the exponent component in a floating point parameter is highly skewed and therefore very compressible."
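
That claim is straightforward to sanity-check. The sketch below (my own illustration, not the authors' code) extracts the 8-bit IEEE-754 exponent field from float32 values and measures its empirical entropy; the Gaussian array is a hypothetical stand-in for real checkpoint weights:

```python
import numpy as np

# Hypothetical stand-in for trained weights; swap in a real checkpoint tensor.
weights = (np.random.randn(1_000_000) * 0.02).astype(np.float32)

bits = weights.view(np.uint32)
exponent = ((bits >> 23) & 0xFF).astype(np.int64)  # 8-bit exponent field of IEEE-754 float32

counts = np.bincount(exponent, minlength=256)
probs = counts[counts > 0] / counts.sum()
entropy = -(probs * np.log2(probs)).sum()

print(f"distinct exponent values used: {(counts > 0).sum()} / 256")
print(f"exponent entropy: {entropy:.2f} bits out of 8")
# Low entropy in this byte is the skew the quote points to: an entropy coder
# can store the exponent in far fewer bits, which is where lossless
# compression gains come from.
```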