#similarity

Here is a typical Russian milblogger channel. Note the “owning the libs” style of naming.

Are you still surprised MAGA supports Russia?

I think they actually look up to Russia, honestly, and would love nothing more than for the USA to become more like Russia.

Russian brashness and embracing being the bully excites them. Maybe even makes them jealous.

I’m excited to share my newest blog post, "Don't use cosine similarity carelessly"

p.migdal.pl/blog/2025/01/dont-

We often rely on cosine similarity to compare embeddings—it's like “duct tape” for vector comparisons. But just like duct tape, it can quietly mask deeper problems. Sometimes, embeddings pick up a “wrong kind” of similarity, matching questions to questions instead of questions to answers or getting thrown off by formatting quirks and typos rather than the text's real meaning.
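For readers who want to see the failure mode concretely, here is a minimal sketch in plain Python. The three vectors are made-up toy values, not output from any real embedding model: the first coordinate is assumed to encode "this text is phrased as a question", and the other two are assumed to encode topic. The names `question_a`, `question_b`, and `answer_a` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors (assumed for illustration only, not real embeddings).
# Coordinate 0: "is phrased as a question"; coordinates 1-2: topic.
question_a = [0.9, 0.1, 0.3]  # a question about topic A
question_b = [0.9, 0.3, 0.1]  # a question about a different topic
answer_a = [0.1, 0.1, 0.3]    # an answer about topic A

q_to_q = cosine_similarity(question_a, question_b)
q_to_a = cosine_similarity(question_a, answer_a)

# With these toy values the two questions score higher than the
# question and its matching answer, because the "question" coordinate
# dominates the dot product.
print(f"question vs. question: {q_to_q:.3f}")
print(f"question vs. answer:   {q_to_a:.3f}")
```

In this toy setup the question-to-question score comes out higher than the question-to-answer score, which is the "wrong kind" of similarity described above. Real embeddings have hundreds of dimensions and no such clean axes, so treat this only as an intuition pump.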

In my post, I discuss what can go wrong with off-the-shelf cosine similarity and share practical alternatives. If you’ve ever wondered why your retrieval system returns oddly matched items or how to refine your embeddings for more meaningful results, this is for you!

I want to thank Max Salamonowicz and Grzegorz Kossakowski for their feedback after my flash talk at the Warsaw AI Breakfast, Rafał Małanij for inviting me to give a talk at the Python Summit, and everyone who asked curious questions at the conference and on LinkedIn.

p.migdal.pl
Don't use cosine similarity carelessly
Cosine similarity - the duct tape of AI. Convenient but often misused. Let's find out how to use it better.

Continued thread

PS: If only they had had a slightly invested #phylogeneticist at hand, they could easily have learned a lot about the strengths and weaknesses of their data and preferred tree (a Bayesian MRC, by the way, is a summary tree of the various competing topologies sampled in the MCMC chain, not a phylogenetic tree).

Here's a quick #NeighbourNet based on their "toutes" matrix (inferred in less than a minute), annotated.
Overall #similarity makes #clades, surprise, surprise.