#recsys

Published in Information Retrieval Research: "On the challenges of studying bias in Recommender Systems: The effect of data characteristics and algorithm configuration" by Savvina Daniil, Manel Slokom, Mirjam Cuper, Cynthia Liem, Jacco van Ossenbruggen, and Laura Hollink.

doi.org/10.54195/irrj.19607


New at my blog: for a few years now, teaching #recsys has left me bothered by the need for a biased lift computation that keeps product-relatedness measures informative rather than dominated by low-information items. So this term, I sat down to solve the problem. md.ekstrandom.net/blog/2025/01

Michael Ekstrand on the Web · Biased Lift for Related-Item Recommendation: biasing the lift item association metric to reduce the propensity to learn high association scores based on a small number of instances.
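The blog post has the actual derivation; as a rough sketch of the general idea only (my assumption of what biasing lift can look like, not necessarily the formulation the post uses: the function name, the damping parameter, and the shrinkage form below are all illustrative), classic lift can be shrunk by a pseudo-count so that item pairs supported by only a few co-occurrences cannot reach extreme scores:

```python
import numpy as np

def damped_lift(co_counts, item_counts, n_baskets, damping=10.0):
    """Lift-style item association score with a damping term (sketch only).

    co_counts[i, j] : number of baskets (or users) containing both items i and j
    item_counts[i]  : number of baskets containing item i (assumed > 0 for all i)
    n_baskets       : total number of baskets
    damping         : pseudo-count (hypothetical parameter); pairs supported by
                      only a few co-occurrences are shrunk toward zero instead
                      of receiving extreme lift scores
    """
    co_counts = np.asarray(co_counts, dtype=float)
    item_counts = np.asarray(item_counts, dtype=float)

    # Classic lift: P(i, j) / (P(i) * P(j)) = N * n_ij / (n_i * n_j).
    lift = n_baskets * co_counts / np.outer(item_counts, item_counts)

    # Shrinkage factor n_ij / (n_ij + damping): near 1 for well-observed pairs,
    # near 0 for pairs seen only a handful of times.
    return lift * co_counts / (co_counts + damping)
```

With damping=0 this reduces to plain lift; larger values push sparsely observed pairs toward zero.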

Do you appreciate news? Do you want to help #recsys research? POPROX News is open for subscriptions!

The Platform for OPen Recommendation and Online eXperimentation is an #NSFfunded project to help scientists and journalists explore better ways to select and present news stories. Our MVP is a daily personalized newsletter — if you'd like to receive that, and periodic surveys, sign up at user.poprox.ai/enroll?source=f

After signup, you'll receive a confirmation email — check spam if needed.

POPROX is a personalized news service that delivers a daily, ad-free email newsletter featuring stories from the Associated Press, tailored to your interests.

At the #dayofdh2024: introducing students to the basics of #textencoding (♥️#unicode, #xml, @TEIConsortium), #geneticediting, and the #criticalApparatus; managing dhinfra.at (#gpu clusters for the Austrian #dh community); talking about the scholarly editing of the Ceija #Stojka notebooks ceijastojka.org/12924460-the-n; and chatting with Dominik Kowald from the KNOW Center Graz about #recsys for DH. Now on to preparing my presentation on gams.uni-graz.at/o:voccod, the #SKOS version of Denis Muzerelle's Vocabulaire de la #Codicologie.

dhinfra.at · DHInfra.at - Digital Humanities Infrastructure Austria

What if #RecSys put as much energy into explicitly anti-fascist recommendations as it does into viewpoint-diverse recommendations?

What if the goal isn’t to model tolerance for diversity, but to identify items that will increase the reader’s empathy and understanding of their neighbors and their needs?

This paper provides some thoughts applicable to that direction. doi.org/10.1108/JD-01-2020-000

doi.org · On the problem of oppressive tastes in the public library | Emerald Insight

Part2: #dailyreport #negativesampleing #sampling #llm #recsys
Recall from Part 1: the negative term minimizes the similarity between the target word w and the negative samples.

As binary classification: negative sampling transforms the problem into a series of binary classification tasks, in which the model learns to distinguish positive from negative samples.

Example "The dog is playing with a bone," and assume a
window size of 2 positive samples for the target word
"dog" would include:
- ("dog", "The")
- ("dog", "is")
- ("dog", "playing")
- ("dog", "with")
- ("dog", "a")
- ("dog", "bone")

Negative Samples: ("dog", "car"), ("dog", "apple"),
("dog", "house"), ("dog", "tree")

Calc, for the ("dog", "bone") pair and the four negatives above:
: L = log(sigmoid(v_dog * v_bone)) + log(sigmoid(-v_dog * v_car)) + log(sigmoid(-v_dog * v_apple)) + log(sigmoid(-v_dog * v_house)) + log(sigmoid(-v_dog * v_tree))
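To make the numbers concrete, here is that same calculation run with toy vectors (the 8-dimensional random embeddings below are made up purely for illustration; in a real model they would be learned):

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)

rng = np.random.default_rng(0)

# Made-up 8-dimensional embeddings, just to make the arithmetic concrete.
words = ["dog", "bone", "car", "apple", "house", "tree"]
vecs = {w: rng.normal(size=8) for w in words}

positive = log_sigmoid(vecs["dog"] @ vecs["bone"])
negatives = sum(log_sigmoid(-(vecs["dog"] @ vecs[w]))
                for w in ["car", "apple", "house", "tree"])

objective = positive + negatives   # training pushes this value upward
print(objective)
```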

Part2: #dailyreport #negativesampleing #sampling #llm #recsys
We use:
: L = log(sigmoid(v_w * v_c)) + sum(log(sigmoid(-v_w * v_neg_i))) for i in range(k)
where:
- v_w - vector representation of the target word
- v_c - vector representation of the context word
- v_neg_i - vector representations of the k negative samples
- k - number of negative samples
- log(sigmoid(v_w * v_c)) - positive term; uses the dot product (or cosine similarity, if the vectors are normalized)
- sum(log(sigmoid(-v_w * v_neg_i))) for i in range(k) - negative term; minimizes the similarity between the target word w and the negative samples
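A minimal NumPy sketch of that objective as a reusable function (the function and argument names are mine; real trainers batch this over many pairs and update the vectors by SGD on it):

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    return -np.logaddexp(0.0, -x)

def sgns_objective(v_w, v_c, v_negs):
    """Skip-gram negative-sampling objective for one (target, context) pair.

    v_w    : target-word vector, shape (d,)
    v_c    : context-word vector, shape (d,)
    v_negs : vectors of the k negative samples, shape (k, d)
    """
    pos = log_sigmoid(v_w @ v_c)              # pull target and context together
    neg = log_sigmoid(-(v_negs @ v_w)).sum()  # push target away from negatives
    return pos + neg                          # maximized during training
```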

Part1: #dailyreport #negativesampleing #sampling #llm #recsys
Negative sampling is used in NLP, RecSys, retrieval, and classification tasks to address the computational challenges associated with large vocabularies or item sets. It modifies the training objective: instead of computing the softmax over the entire vocabulary, it focuses on distinguishing the target word from a few randomly selected "noise" or "negative" words.

Instead of the full-softmax loss:
: softmax(x_i) = e^(x_i) / (sum of e^(x_j) for all j from 1 to n)
: L = -log(p(w | c)) = -log(softmax(x_i))
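For contrast, a toy NumPy sketch of the two objectives side by side (vocabulary size, dimensionality, and word indices are made up): the full-softmax loss has to touch every output vector to compute its normalizer, while negative sampling only touches the positive context vector and k sampled negatives.

```python
import numpy as np

def log_sigmoid(x):
    return -np.logaddexp(0.0, -x)   # stable log(sigmoid(x))

rng = np.random.default_rng(0)
vocab_size, dim = 50_000, 100
W_in = rng.normal(scale=0.1, size=(vocab_size, dim))    # target-word vectors
W_out = rng.normal(scale=0.1, size=(vocab_size, dim))   # context-word vectors

target, context = 123, 456   # hypothetical word indices

# Full softmax: the normalizer sums over all 50,000 output vectors.
scores = W_out @ W_in[target]                 # shape (vocab_size,)
m = scores.max()
log_p_context = scores[context] - (m + np.log(np.exp(scores - m).sum()))
softmax_loss = -log_p_context                 # L = -log p(context | target)

# Negative sampling: only 1 positive and k negative output vectors are used.
k = 5
neg_ids = rng.integers(0, vocab_size, size=k)   # random "noise" words
ns_loss = -(log_sigmoid(W_out[context] @ W_in[target])
            + log_sigmoid(-(W_out[neg_ids] @ W_in[target])).sum())

print(softmax_loss, ns_loss)
```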