#perf


... so, the final lesson here is that too many abstractions can impede the implementation's performance! many software abstractions introduce computational overhead, if they are not *very* carefully tailored to the application at hand. for something like mobile gaming on low-power devices, you need to treat each byte of processing energy with a high level of respect. that is what Vizflow was designed for #performance #perf #software #engineering #science #visualization #dataScience #art


the biggest question was: "can HTML5 games perform well on mobile?" and the answer for a while was "¯\_(ツ)_/¯"...

well, as someone who feels that visualizations and games are ultimately the exact same thing, and who likes both a lot, i wanted to try to answer this question for myself!

so, i set out to make some games using D3. That effort ended up failing, due to #performance (#perf) reasons, but it inspired me to make Vizflow :)

ES6 borrowed several ideas from CoffeeScript (arrow functions, for example)

github.com/dannyko/gpdk


anyway, a thing that this #dsl is: slow! on every frame it recompiles the layout. granted, the layout is simple. still it should be possible to compile it all to a thunk, and only recompile on edit...

that's the "editable ui" saga, anyway. and i'm not sure at all whether the dsl is the bottleneck, but once i add like 40 tracks and 40 scenes the rendering drops to <10FPS

so, #cargoflamegraph it is... #perf #profiler #flamegraph
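A minimal sketch of the "compile it all to a thunk, and only recompile on edit" idea from the post above. Everything here is hypothetical: `LayoutSpec`, `LayoutCache`, and the stand-in "compile" step are illustrative names, not the author's DSL. The point is only the caching pattern: the per-frame path calls a stored closure, and an edit drops it.

```rust
/// Hypothetical layout description: just a list of track names.
struct LayoutSpec {
    tracks: Vec<String>,
}

/// A compiled layout: a closure ("thunk") that can be called every frame.
type Thunk = Box<dyn Fn() -> usize>;

struct LayoutCache {
    spec: LayoutSpec,
    compiled: Option<Thunk>,
    /// Counts how often the expensive compile step actually ran.
    compile_count: usize,
}

impl LayoutCache {
    fn new(spec: LayoutSpec) -> Self {
        Self { spec, compiled: None, compile_count: 0 }
    }

    /// Any edit to the spec invalidates the cached thunk.
    fn edit(&mut self, change: impl FnOnce(&mut LayoutSpec)) {
        change(&mut self.spec);
        self.compiled = None;
    }

    /// Called once per frame: compiles only when no thunk is cached.
    fn render(&mut self) -> usize {
        if self.compiled.is_none() {
            self.compile_count += 1;
            // Stand-in for the expensive layout compilation:
            // precompute the row count and capture it in the closure.
            let rows = self.spec.tracks.len();
            self.compiled = Some(Box::new(move || rows));
        }
        (self.compiled.as_ref().unwrap())()
    }
}

fn main() {
    let spec = LayoutSpec { tracks: vec!["a".to_string(), "b".to_string()] };
    let mut cache = LayoutCache::new(spec);

    // 60 frames without edits: one compile.
    for _ in 0..60 {
        cache.render();
    }

    // One edit, then another frame: one more compile.
    cache.edit(|s| s.tracks.push("c".to_string()));
    let rows = cache.render();

    println!("rows = {}, compiles = {}", rows, cache.compile_count);
}
```

Whether this helps depends on what the flamegraph says; if the DSL is not the bottleneck, caching it will not move the frame rate.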

Fun stuff: #linux #perf can't copy userspace stacks over 64k bytes because the size field is a u16. You need that copied stack to do unwinding with DWARF.

I have a #rust program that has a single frame twice as big. It's 3 frames combined into one, but still.

The default stack size to copy in perf is just 8k.
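The numbers in this post can be worked through in a few lines. This is a sketch under assumptions: the rounding of the cap down to a multiple of 8 bytes is how I understand the perf tool to derive its limit, and the 128 KiB frame is a hypothetical stand-in for "twice as big".

```rust
/// Largest value a u16 size field can hold (65535).
const U16_MAX: u32 = u16::MAX as u32;

/// Default stack dump size mentioned in the post (8k).
const DEFAULT_DUMP: u32 = 8192;

/// Round `n` down to a multiple of `align` (`align` must be a power of two).
fn round_down(n: u32, align: u32) -> u32 {
    n & !(align - 1)
}

/// Assumed cap: u16 maximum rounded down to an 8-byte boundary.
fn max_stack_dump_size() -> u32 {
    round_down(U16_MAX, 8)
}

/// Can a stack region of `bytes` be copied in full?
fn fits(bytes: u32) -> bool {
    bytes <= max_stack_dump_size()
}

fn main() {
    // Hypothetical frame size: twice the 64k limit.
    let big_frame: u32 = 128 * 1024;

    println!("max dump size: {} bytes", max_stack_dump_size());
    println!("default fits:  {}", fits(DEFAULT_DUMP));
    println!("128 KiB frame fits: {}", fits(big_frame));
}
```

The dump size can be raised from the default with `perf record --call-graph dwarf,<size>`, but only up to that cap, so a single frame larger than the cap cannot be captured whole.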

:freebsd_logo: FBSD 14.x Kernel Build :freebsd_logo:

81 seconds to compile a copy of GENERIC_KCSAN kernel on GhostBSD 24.10 (FreeBSD 14.1 base).

That's generally acceptable performance for an often silent Micro-ATX workstation (EPYC 4564P 16C/32T, 4.5GHz, 128/ECC, MB: H13SAE-MF). Potential improvements abound, sort of, given two requirements:

1. Low-dB acoustics, not "pitperf maxxing*"
2. Usage for mid-level AI/ML on VMs for LLMs

What could be improved?
a) Upgrade GPU: 2x A4000 → Ada Gen
b) Upgrade NVMe: 2x M.2 PCIe Gen4 → Gen5
c) Swap 4x 32GB ECC → 4x 48GB ECC
d) Swap 4x DDR5-4800 → DDR5-5200

Cost/Benefit on those potential upgrades?
a) Cost = $$$, Benefit = ~10-25% vector perf
b) Cost = $, Benefit = ~1.5x I/O perf
c) Cost = $$, Benefit = 128GB → 192GB 🤤
d) Cost = $$$, Benefit = not a big deal

* PiT-Perf == Point In Time Performance
* Maxxing == Engaging in Applied Maximalism

#freebsd #foss #oss
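A rough check of options c) and d) above, as a sketch. It assumes a dual-channel memory configuration (not stated in the post) and uses theoretical peak bandwidth only: transfer rate times 8 bytes per 64-bit channel. Real-world gains would be smaller.

```rust
/// Theoretical peak bandwidth in GB/s:
/// mega-transfers per second * 8 bytes per transfer * number of channels.
fn peak_bandwidth_gbs(mega_transfers: f64, channels: u32) -> f64 {
    mega_transfers * 8.0 * channels as f64 / 1000.0
}

/// Relative gain of `new` over `old`, in percent.
fn gain_percent(old: f64, new: f64) -> f64 {
    (new / old - 1.0) * 100.0
}

fn main() {
    // Assumption: two memory channels.
    let channels = 2;

    // d) DDR5-4800 -> DDR5-5200
    let old_bw = peak_bandwidth_gbs(4800.0, channels);
    let new_bw = peak_bandwidth_gbs(5200.0, channels);
    println!(
        "bandwidth: {:.1} -> {:.1} GB/s ({:.1}% gain)",
        old_bw,
        new_bw,
        gain_percent(old_bw, new_bw)
    );

    // c) 4x 32GB -> 4x 48GB
    let old_cap = 4 * 32;
    let new_cap = 4 * 48;
    println!(
        "capacity: {} -> {} GB ({:.0}% gain)",
        old_cap,
        new_cap,
        gain_percent(old_cap as f64, new_cap as f64)
    );
}
```

Under these assumptions the speed bump is worth roughly 8% of peak bandwidth while the capacity swap adds 50%, which is consistent with rating d) as "not a big deal".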

…The

“Let’s load a progress bar so we can take over 30 seconds to compile and render a 250 row pseudo-table and tell ourselves that this is #perf ormant”

…guide to React-ing to requests

- - -

The

“Let’s set targets for ‘net zero’ that are beyond the current parliament and tell ourselves we are world leaders”

…guide to reacting to scientific findings

Does anyone have any good #linux or #bsd resources explaining the nitty-gritty details of how #perf tends to suffer under max load due to timing windows being missed, tasks needing to be retried, etc.? I'm thinking about process stalls due to higher iowait, memory allocation pressure leading to inefficient paging, disk command queue saturation leading to inefficient process wake-ups, etc. Conference slides or recordings would be great. Blog posts too.