#rdkit


chemfp 5.0b2 is out. Get it while it's hot! For Linux:

python -m pip install chemfp==5.0b2 -i chemfp.com/packages/

I'm still updating the documentation. See 'What's new in 5.0' at chemfp.com/docs/whats_new_in_5

* shardsearch - search many target files

* simhistogram - histogram all the scores

* FPB file now handles 1B+ records

* sparse count fingerprints
- new FPC format
- rdkit2fpc to make them with
- fpc2fps to convert to binary fps
- fps2fpc for the other way


It's official - the upcoming chemfp 5.0 release will have limited support for sparse count fingerprints, in addition to the normal binary fingerprints.

The new format is "FPC", a variant of the FPS format. Details at chemfp.com/fpc_format/.

There will also be "rdkit2fpc" for the four count fingerprint generators.

Plus "fpc2fps" with several methods to convert sparse count features -> binary.

And "fps2fpc" for the reverse (it's just a list of on-bit indices.)
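To make the two directions concrete, here is a rough sketch in plain Python. It does not use chemfp and is not how fpc2fps or fps2fpc are implemented. The function names are made up for illustration, the modulo folding is just one generic way to turn sparse count features into bits (the post does not say which methods fpc2fps offers), and the bit order within each byte (bit 0 is the lowest bit of the first byte) is an assumption.

```python
def fold_counts_to_bits(counts, num_bits=2048):
    """Fold a sparse count fingerprint {feature_id: count} into on-bit indices.

    Assumption: plain modulo folding that ignores the count values.
    This is a generic technique, not necessarily one chemfp provides.
    """
    if num_bits <= 0:
        raise ValueError("num_bits must be positive")
    return sorted({feature_id % num_bits
                   for feature_id, count in counts.items()
                   if count > 0})


def bits_to_hex(on_bits, num_bits):
    """Pack on-bit indices into a hex-encoded binary fingerprint.

    Assumption: bit i lives in byte i // 8 at position i % 8.
    """
    buf = bytearray((num_bits + 7) // 8)
    for bit in on_bits:
        buf[bit // 8] |= 1 << (bit % 8)
    return buf.hex()


def hex_to_on_bits(hex_fp):
    """The reverse direction: list the on-bit indices of a hex fingerprint."""
    data = bytes.fromhex(hex_fp)
    return [byte_index * 8 + bit
            for byte_index, byte in enumerate(data)
            for bit in range(8)
            if (byte >> bit) & 1]


# Hypothetical sparse count fingerprint: feature id -> count
counts = {1: 3, 10: 1, 4294967295: 2}
on_bits = fold_counts_to_bits(counts, num_bits=16)
hex_fp = bits_to_hex(on_bits, num_bits=16)
print(on_bits, hex_fp, hex_to_on_bits(hex_fp))
```

Folding loses information in two ways: the counts are dropped, and distinct features can collide on the same bit. That is presumably why the reverse direction can only give back on-bit indices.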


Join Franciszek Job at EuroSciPy as he presents a scalable framework to unify chemical datasets from sources like PubChem, UniChem & COCONUT.

Canonicalize with RDKit
Scale via Dask
Deduplicate with InChI keys
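The deduplication step can be sketched roughly as below. This is not the framework from the talk: `rdkit_inchikey` and `deduplicate` are hypothetical names, and the `(source, SMILES)` record layout is assumed for illustration. The only RDKit calls used are `Chem.MolFromSmiles` and `Chem.MolToInchiKey` (the latter needs an RDKit build with InChI support). RDKit is imported lazily so the merge logic can be tried with any other key function.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Record = Tuple[str, str]  # (source name, SMILES); an assumed layout


def rdkit_inchikey(smiles: str) -> Optional[str]:
    """Return the InChIKey for a SMILES string, or None if it cannot be parsed."""
    from rdkit import Chem  # lazy import: only needed when this key is used
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return Chem.MolToInchiKey(mol) or None


def deduplicate(
    records: Iterable[Record],
    key_fn: Callable[[str], Optional[str]] = rdkit_inchikey,
) -> Dict[str, List[str]]:
    """Group records by structure key, remembering which sources had each one."""
    merged: Dict[str, List[str]] = {}
    for source, smiles in records:
        key = key_fn(smiles)
        if key is None:
            continue  # skip structures that could not be processed
        merged.setdefault(key, []).append(source)
    return merged
```

Computing the key is independent per record, so that is the part a Dask map over partitions would parallelize. The final grouping needs all records with the same key to end up together.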

Ideal for ML pretraining, benchmarking, and chemical data analysis.

Schedule: lnkd.in/eaAxwUN2
Tickets: lnkd.in/end9aYzE
