#statistics

85 posts, 35 participants, 12 posts today

Found 27 new servers and 25 servers died off since 9 hours ago

18,960 servers checked today. 17,057,968 Total Users with 1,253,964 Active Users today

Check out the #fediverse stats

History of servers found and deleted

Help others find a home, send them to fediverse.observer

fediverse.observer: Fediverse Observer checks all sites in the fediverse and gives you an easy way to find a home from a map or list or automatically. Fediverse Sites Status. Find a Fediverse server to sign up for, find one close to you!

Next was an amazing talk by Russell Poldrack on reproducibility in cognitive neuroscience at the UW eScience Institute. Poldrack combines incisive statistical commentary with devastating analyses and experiments demonstrating the extremely brittle nature of neuroscience results, moving on to suggest approaches to improve the field. Highly recommend youtube.com/watch?v=haNjW58rbWM (4/7) #statistics #neuroscience #science

Found 22 new servers and 7 servers died off since 5 hours ago

18,966 servers checked today. 17,057,968 Total Users with 1,253,964 Active Users today

Check out the #fediverse stats

History of servers found and deleted

Help others find a home, send them to fediverse.observer


Why Every Biotech Research Group Needs a Data Lakehouse

Start tiny and scale fast without vendor lock-in

Every biotech lab has data, tons of it, and the problem is the same at every scale: accessing data across experiments is hard. Often data simply gets lost on somebody’s laptop, with a pretty plot on a poster as the only clue it ever existed. The problem becomes almost insurmountable once you try to track multiple data types. Any kind of data management used to carry a large overhead. New technology such as DuckDB and its new lakehouse format, DuckLake, aims to make a lakehouse easy to adopt and to scale with your data, all while avoiding vendor lock-in.

American Scoter Duck from Birds of America (1827) by John James Audubon (1785–1851), etched by Robert Havell (1793–1878).

The data dilemma in modern biotech

High-content microscopy, single-cell sequencing, ELISAs, flow-cytometry FCS files, lab-notebook PDFs: today’s wet-lab output is a torrent of heterogeneous, petabyte-scale assets. Traditional “raw-files-in-folders + SQL warehouse for analytics” architectures break down when you need to query an image-derived feature next to a CRISPR guide list under GMP audit. A lakehouse merges the cheap, schema-agnostic storage of a data lake with the ACID guarantees, time travel, and governance of a warehouse, all on one platform. Research teams at the discovery or clinical-trial stage can get faster insights, less duplication, and smoother compliance when they adopt a lakehouse model.

Lakehouse super-powers for biotech

  • Native multimodal storage: Keep raw TIFF stacks, Parquet tables, FASTQ files, and instrument logs side-by-side while preserving original resolution.
  • Column-level lineage & time-travel: Reproduce an analysis exactly as of “assay-plate upload on 2025-07-14” for FDA, EMA, or GLP audits.
  • In-place analytics for AI/ML: Push DuckDB/Spark/Trino compute to the data; no ETL ping-pong before model training.
  • Cost-elastic scaling: Store on low-cost S3/MinIO today; spin up GPU instances tomorrow without re-ingesting data.
  • Open formats: Iceberg/Delta/Hudi (and now DuckLake) keep your Parquet files portable and your exit costs near zero.
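
To make the time-travel idea concrete, here is a minimal sketch of how an "as of" lookup can work when a catalog tracks snapshots. It uses Python's built-in sqlite3 as a stand-in catalog. The table names, columns, file paths, and the `files_as_of` helper are all hypothetical and chosen for illustration; this is not the actual schema of DuckLake, Iceberg, or Delta.

```python
import sqlite3

# Hypothetical catalog layout (an assumption, not any real format's schema):
# each snapshot has a commit date, and each data file records the snapshot
# that added it and, if applicable, the snapshot that removed it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE snapshot (id INTEGER PRIMARY KEY, committed_at TEXT NOT NULL);
CREATE TABLE data_file (
    path       TEXT NOT NULL,
    added_in   INTEGER NOT NULL,  -- snapshot that added the file
    removed_in INTEGER            -- snapshot that dropped it, NULL if still live
);
INSERT INTO snapshot VALUES (1, '2025-07-10'), (2, '2025-07-14'), (3, '2025-07-20');
INSERT INTO data_file VALUES
    ('plates/batch_a.parquet',       1, NULL),
    ('plates/batch_b.parquet',       2, 3),
    ('plates/batch_b_fixed.parquet', 3, NULL);
""")


def files_as_of(conn, date):
    """Return the files visible in the latest snapshot committed on or before `date`."""
    snap = conn.execute(
        "SELECT MAX(id) FROM snapshot WHERE committed_at <= ?", (date,)
    ).fetchone()[0]
    if snap is None:  # no snapshot existed yet at that date
        return []
    rows = conn.execute(
        "SELECT path FROM data_file "
        "WHERE added_in <= ? AND (removed_in IS NULL OR removed_in > ?) "
        "ORDER BY path",
        (snap, snap),
    ).fetchall()
    return [r[0] for r in rows]


audit_view = files_as_of(conn, "2025-07-14")
current_view = files_as_of(conn, "2025-07-20")
```

Because old files are only marked as removed rather than deleted, the audit view for 2025-07-14 still resolves to the files that were live on that day, even after a later snapshot replaced one of them.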

DuckLake: an open lakehouse format to prevent lock-in

DuckLake is still new and not quite production-ready, but it comes from the same team as DuckDB, and I expect them to deliver high quality as 2025 progresses. Data lakes, and even lakehouses, are not new at all. Iceberg and Delta pioneered open table formats, but they still scatter JSON/Avro manifests across object storage and bolt on a separate catalog database. DuckLake flips the design: all metadata lives in a normal SQL database, while data stays in Parquet on blob storage. The result is simpler and faster, with cross-table ACID transactions, and you can back the catalog with Postgres, MySQL, MotherDuck, or even DuckDB itself.

Key take-aways:

  • No vendor lock-in: Because operations are defined as plain SQL, any SQL-compatible engine can read or write DuckLake—good-bye proprietary catalogs.
  • Start on a laptop, finish on a cluster: DuckDB + DuckLake runs fine on your MacBook; point the same tables at MinIO-on-prem or S3 later without refactoring code.
  • Cross-table transactions: Need to update an assay table and its QC log atomically? One transaction—something Iceberg and Delta still treat as an “advanced feature.”
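
The cross-table point can be illustrated with any ordinary SQL database. The sketch below uses Python's built-in sqlite3 purely as a stand-in; it does not use DuckLake itself, and the `assay` and `qc_log` tables and the `record_plate` helper are hypothetical. It only shows the behavior being described: either both tables are updated, or neither is.

```python
import sqlite3

# Stand-in SQL database (an assumption for illustration, not DuckLake).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assay  (plate_id TEXT PRIMARY KEY, readout REAL NOT NULL);
CREATE TABLE qc_log (
    plate_id TEXT NOT NULL,
    status   TEXT NOT NULL CHECK (status IN ('pass', 'fail'))
);
""")


def record_plate(conn, plate_id, readout, status):
    """Insert an assay row and its QC entry in one transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute("INSERT INTO assay VALUES (?, ?)", (plate_id, readout))
            conn.execute("INSERT INTO qc_log VALUES (?, ?)", (plate_id, status))
        return True
    except sqlite3.IntegrityError:
        return False


ok = record_plate(conn, "P1", 0.93, "pass")
# 'maybe' violates the CHECK constraint, so the assay insert is rolled back too.
rejected = record_plate(conn, "P2", 0.10, "maybe")

assay_rows = conn.execute("SELECT COUNT(*) FROM assay").fetchone()[0]
qc_rows = conn.execute("SELECT COUNT(*) FROM qc_log").fetchone()[0]
```

After the failed second call, plate P2 appears in neither table, so the assay data and its QC log cannot drift apart.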

Psst… if you don’t understand or don’t care what ACID, manifests, or object stores mean, assign a grad student. It’s not complicated.

Found 35 new servers and 53 servers died off since 10 hours ago

18,951 servers checked today. 17,060,577 Total Users with 1,088,208 Active Users today

Check out the #fediverse stats

History of servers found and deleted

Help others find a home, send them to fediverse.observer


Five usability and design sites I use:

1. Discount software-design method:
fivesketches.com/quality-softw

2. UK's methods and patterns:
design-system.service.gov.uk/

3. Nielsen Norman Group's posts about user research and design:
nngroup.com/articles/

4. MeasuringU's posts about research statistics:
measuringu.com/

5. USA's definitions and resources:
digital.gov/resources

fivesketches.com: Generate quality solutions for software- and content design challenges – Usability research and analysis

Found 25 new servers and 35 servers died off since 9 hours ago

18,960 servers checked today. 17,060,577 Total Users with 1,088,208 Active Users today

Check out the #fediverse stats

History of servers found and deleted

Help others find a home, send them to fediverse.observer
