#datacompression

#Python-Blosc2 is hitting 1 million weekly downloads on PyPI! 🎉 pypacktrends.com/?packages=blo

Users are rapidly adopting #Blosc2, which now accounts for over 95% of downloads compared to Blosc1. 📈 This success is thanks to our amazing users and community contributors. 🙏 We're dedicated to making Python-Blosc2 even better. 🚀

Our motto: Compress Better, Compute Bigger! 💪

So @rl_dane introduced me to #bzip3 as a replacement for #bzip2. Let's turn some bz2 files into bz3 to see the difference.

First example: 90k opus files

The Hey Snips wake word dataset. It has ~90k opus files in a 3.1GB tar file. bzip2 produces the same 3.1GB, which is as expected since Opus audio is already compressed. bzip3 produced 3.0GB but used tons of computation power. Not worth the 100MB.

Second example: Windows 7 VirtualBox VM image

Windows7.vdi is a Windows 7 VM image for the "special" days. I think I have to get rid of it. But while it is still there, let's see how each performs. It is 16GB uncompressed. bzip2 -9 gives 7.0GB. bzip3 gives 6.3GB, but at the expense of about 3x the CPU time. Deleting all of them anyway. Down with Windows.

Third example: Pure XML text file

A pure XML file with Persian and English characters. Uncompressed it is 1.7GB. bzip2 -9 gives 276MB, while bzip3 gives 260MB.

Final example: Creating a simple bomb

So I did this:

dd if=/dev/zero of=./justzero bs=2G count=6

So now I have a 16GB file containing only zero bytes. bzip2 -9 gives 672KB. bzip3 gives 46KB.
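A scaled-down version of this experiment can be sketched with Python's standard `bz2` module. This is only a rough illustration: the helper name `compressed_size_of_zeros` and the 16 MiB size are assumptions chosen so it runs quickly, and bzip3 is not covered because it has no standard-library binding.

```python
import bz2


def compressed_size_of_zeros(total_bytes, chunk_bytes=1 << 20, level=9):
    """Stream `total_bytes` zero bytes through bzip2 and return the output size.

    Hypothetical helper: feeds the compressor chunk by chunk so the whole
    input never has to sit in memory at once.
    """
    compressor = bz2.BZ2Compressor(level)
    chunk = bytes(chunk_bytes)  # a buffer of zero bytes
    out_size = 0
    remaining = total_bytes
    while remaining > 0:
        n = min(remaining, chunk_bytes)
        out_size += len(compressor.compress(chunk[:n]))
        remaining -= n
    out_size += len(compressor.flush())  # emit the final block and trailer
    return out_size


total = 16 * (1 << 20)  # 16 MiB of zeros, far smaller than the file in the post
size = compressed_size_of_zeros(total)
print(f"{total} zero bytes -> {size} bytes with bzip2 level 9")
```

Because the input is streamed, the same function could in principle be pointed at much larger sizes, at the cost of CPU time.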

Conclusion

Thank you @rl_dane

Real nice thing!

New blog series: @folkertdev shows how we use SIMD in the zlib-rs project.

SIMD is crucial to good performance, but learning how to use it can be daunting. In this series we'll show concrete examples of using SIMD in a real world project.

Part 1 explains how the compiler already uses SIMD for us, how to evaluate whether it's doing a good job, and how to use a more optimal version when the current CPU supports it.

tweedegolf.nl/en/blog/153/simd

@trifectatech

tweedegolf.nl: SIMD in zlib-rs (part 1): Autovectorization and target features

New versions of Jubako projects have been released! 🚀

- Jubako 0.3.3
- Arx 0.3.2 (including tar2arx and zip2arx)
- Waj 0.3.0

There are nice improvements on the Jubako and Arx side, and Waj finally rejoins the `0.3` series, a few months after Arx!

You can `cargo install` them or simply get the binaries from release pages at github.com/jubako

As far as we know, our zlib-rs is the fastest WASM zlib implementation today.
Knowing SIMD is incredibly effective for the zlib algorithms, we were excited to use the WASM SIMD instructions. Read about the work and results:

trifectatech.org/blog/fastest-

Special thanks to our sponsor Devolutions for supporting the WASM SIMD milestone.

And to our maintainers, @folkertdev and @bjorn3 for the amazing work.

#rustlang #datacompression @awakecoding

trifectatech.org: The fastest WASM zlib

We're happy to see zlib-rs, the Rust implementation of zlib, move to Trifecta Tech Foundation and find its long-term home.

Our team started the initial development of zlib-rs in Dec 2023 as a Prossimo project. Work on WebAssembly optimizations (🙏 Devolutions) is almost complete and will be released soon.

Read the details here: trifectatech.org/blog/new-home

@trifectatech
@ProssimoISRG

trifectatech.org: Trifecta Tech Foundation is the new home for memory safe zlib

Reading up on JPEG [1], I discovered the "Lena" story [2], which bounced me to the Suzanne Vega "Mother of MP3" writeup [3].

As a nerdy geek, the compression designs are my primary interest, but the backstories are interesting* too.

* Putting aside the inappropriateness of the original Lena image source. Men! (I'm a man, ftr).

[1] en.wikipedia.org/wiki/JPEG#Los
[2] wired.com/story/finding-lena-t
[3] archive.nytimes.com/opinionato

Our zlib-rs project implements a memory-safe and performant drop-in replacement for zlib, a widely-used data compression library.

@folkertdev shares the status quo of zlib-rs, including the good news that performance for the highest compression level is on par with the zlib-ng fork of zlib.

Read the blog for all the details:

tweedegolf.nl/en/blog/134/curr

@trifectatech

tweedegolf.nl: Current zlib-rs performance

Mario Paint Data Overflow Error
There is a subtle flaw in SNES art creation tool Mario Paint. It has 32K of Save RAM, which is not technically enough to save an entire project, normal and animation canvases included. The program uses data compression to get everything to fit, and the compression is good enough that most of the time everything can be squeezed in,
setsideb.com/mario-paint-data-
#retro #datacompression #error #mario #mariopaint #overflow #snes
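The "compress it and hope it fits" idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the 48K project size is made up, `fits_in_save_ram` is a hypothetical helper, and `zlib` stands in for whatever compression scheme Mario Paint actually uses, which the post does not describe.

```python
import os
import zlib

SAVE_RAM_BYTES = 32 * 1024  # 32K of Save RAM, as stated in the post
PROJECT_BYTES = 48 * 1024   # hypothetical uncompressed project size


def fits_in_save_ram(project, capacity=SAVE_RAM_BYTES):
    """Return True if the compressed project fits in the save area.

    zlib is only a stand-in compressor for this sketch.
    """
    return len(zlib.compress(project, 9)) <= capacity


blank_canvas = bytes(PROJECT_BYTES)       # highly repetitive, compresses well
noisy_canvas = os.urandom(PROJECT_BYTES)  # random data, barely compresses

print("blank canvas fits:", fits_in_save_ram(blank_canvas))
print("noisy canvas fits:", fits_in_save_ram(noisy_canvas))
```

A project that is larger than the save area only fits when it is compressible enough, so sufficiently noisy content can exceed the limit.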


@FroehlichMarcel

All models are either lossy or lossless flamboyant data compression trees?

I recalled reading this short, brilliant, thoughtful take:

Prefix-Free Code & Huffman Coding

» The specialty of Huffman tree compared to an ordinary prefix-free code tree is that it minimizes the probability weighted mean of code length in the system «
leimao.github.io/blog/Huffman-
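The quoted property can be checked on a small example. The following is a minimal sketch of the textbook Huffman construction (repeatedly merging the two lightest subtrees), not code taken from the linked post; the function names and the four-symbol distribution are assumptions for illustration.

```python
import heapq
from itertools import count


def huffman_code(weights):
    """Return {symbol: bitstring} for a dict mapping symbol -> positive weight."""
    if len(weights) == 1:
        return {next(iter(weights)): "0"}
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(w, next(tiebreak), {sym: ""}) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


def mean_code_length(weights, code):
    """Probability-weighted mean code length."""
    total = sum(weights.values())
    return sum(w * len(code[s]) for s, w in weights.items()) / total


def is_prefix_free(code):
    """No codeword is a prefix of another (adjacent check on sorted words)."""
    words = sorted(code.values())
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)
print(code)
print("mean length:", mean_code_length(probs, code))  # 1.75 bits per symbol
print("prefix-free:", is_prefix_free(code))
```

For comparison, a fixed 2-bit code for the same four symbols is also prefix-free but has a mean length of 2 bits, so the Huffman tree is shorter on average for this distribution.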

Another clear-cut explanation:
control.com/technical-articles

#PrefixCode
#PrefixFreeCodeTree
#HuffmanCode
#HuffmanTree
#DataCompression
#DataLoss

Lei Mao's Log Book · Prefix-Free Code and Huffman Coding