#negativesampleing

Anoncheg
Part2 (continued): #dailyreport #negativesampleing #sampling #llm #recsys

For binary classification: negative sampling turns the problem into a series of binary classification tasks, where the model learns to distinguish between positive and negative samples.

Example: "The dog is playing with a bone." With a context window wide enough to cover the whole sentence, positive samples for the target word "dog" would include:
- ("dog", "The")
- ("dog", "is")
- ("dog", "playing")
- ("dog", "with")
- ("dog", "a")
- ("dog", "bone")

Negative samples: ("dog", "car"), ("dog", "apple"), ("dog", "house"), ("dog", "tree")

For the positive pair ("dog", "bone") the objective is:
: L = log σ(v_dog · v_bone) + log σ(−v_dog · v_car) + log σ(−v_dog · v_apple) + log σ(−v_dog · v_house) + log σ(−v_dog · v_tree)
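Not from the original post: a minimal NumPy sketch of the worked example above, treating each (target, context) pair as a binary classification. The random embedding table and word list are illustrative assumptions, just to make it runnable.

import numpy as np

rng = np.random.default_rng(0)
dim = 50
# Toy embedding table: one random vector per word (stand-in for learned embeddings).
vocab = ["The", "dog", "is", "playing", "with", "a", "bone", "car", "apple", "house", "tree"]
emb = {w: rng.normal(size=dim) for w in vocab}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Each pair is a binary classification: "is this a real (target, context) pair?"
positives = [("dog", w) for w in ["The", "is", "playing", "with", "a", "bone"]]   # label 1
negatives = [("dog", w) for w in ["car", "apple", "house", "tree"]]               # label 0

objective = (
    sum(np.log(sigmoid(emb[t] @ emb[c])) for t, c in positives)        # reward real pairs
    + sum(np.log(sigmoid(-(emb[t] @ emb[c]))) for t, c in negatives)   # penalize noise pairs
)
print(objective)  # value to maximize during training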
Anoncheg
Part2: #dailyreport #negativesampleing #sampling #llm #recsys
We use:
: L = log(sigmoid(v_w * v_c)) + sum(log(sigmoid(-v_w * v_neg_i))) for i in range(k)
where:
- v_w - vector representation of the target word
- v_c - vector representation of the context word
- v_neg_i - vector representations of the k negative samples
- k - number of negative samples
- log(sigmoid(v_w * v_c)) - positive term, using the dot product (or cosine similarity)
- sum(log(sigmoid(-v_w * v_neg_i))) for i in range(k) - negative term: it minimizes the similarity between the target word w and the negative samples
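Not from the original post: a minimal NumPy sketch of the objective above, assuming v_w, v_c and the negative vectors V_neg are already given; the function name is an illustrative choice.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_sampling_objective(v_w, v_c, V_neg):
    # L = log(sigmoid(v_w · v_c)) + sum_i log(sigmoid(-v_w · v_neg_i))
    # v_w: (d,) target word vector, v_c: (d,) context word vector,
    # V_neg: (k, d) matrix of k negative-sample vectors.
    positive = np.log(sigmoid(v_w @ v_c))              # pull target and context together
    negative = np.log(sigmoid(-(V_neg @ v_w))).sum()   # push target away from noise words
    return positive + negative                         # maximized during training

rng = np.random.default_rng(1)
d, k = 100, 5
print(negative_sampling_objective(rng.normal(size=d), rng.normal(size=d), rng.normal(size=(k, d))))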
Anoncheg
Part1: #dailyreport #negativesampleing #sampling #llm #recsys
Negative sampling is used in NLP, RecSys, retrieval and classification tasks to address the computational challenges associated with large vocabularies or item sets. It modifies the training objective: instead of computing the softmax over the entire vocabulary, it focuses on distinguishing the target word from a few randomly selected "noise" or "negative" words.

Instead of the loss:
: softmax(x_i) = e^(x_i) / (sum of e^(x_j) for all j from 1 to n)
: L = -log(p(w | c)) = -log(softmax(x_i))
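Not from the original post: a sketch of the full-softmax loss being replaced, to show why it is expensive — every step needs a dot product with all n output vectors. Shapes and names (W_out, v_c) are assumptions for illustration.

import numpy as np

def full_softmax_loss(v_c, W_out, target_idx):
    # L = -log softmax(x)[target], with x_j = v_c · w_j for every vocabulary word j.
    # v_c: (d,) context vector, W_out: (n, d) output embeddings, target_idx: index of w.
    logits = W_out @ v_c                      # n dot products -- the expensive part
    logits -= logits.max()                    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[target_idx]             # -log p(w | c)

rng = np.random.default_rng(2)
W_out = rng.normal(size=(50_000, 100))        # 50k-word vocabulary, 100-dim vectors
print(full_softmax_loss(rng.normal(size=100), W_out, target_idx=42))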