Smart negative sampling strategies by KulikovNikita · Pull Request #88 · sb-ai-lab/RePlay

KulikovNikita · 2026-02-02T04:55:48Z

Two negative sampling strategies were introduced:

FrequencyNegativeSamplingTransform - based on how often each negative index has already been chosen

Computes the frequency of earlier choices (the number of times each item has already been sampled)
Adjusts the sampling probabilities according to these frequencies: softmax(1.0 / (1.0 + frequency))
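
To make the weighting concrete, here is a minimal sketch of the rule above in plain Python. It is not the code from this PR: the function names, the with-replacement sampling, and the point at which counts are updated are all assumptions made for illustration.

```python
import math
import random


def frequency_sampling_probs(counts: list[int]) -> list[float]:
    """Return softmax(1 / (1 + count)) over all items."""
    scores = [1.0 / (1.0 + c) for c in counts]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_negatives(
    counts: list[int], num_negative_samples: int, rng: random.Random
) -> list[int]:
    """Draw negatives with the probabilities above and update counts in place.

    Assumption: sampling is done with replacement, and counts are updated
    once after the whole draw rather than after every single pick.
    """
    probs = frequency_sampling_probs(counts)
    picked = rng.choices(range(len(counts)), weights=probs, k=num_negative_samples)
    for item in picked:
        counts[item] += 1
    return picked
```

Because the scores 1.0 / (1.0 + frequency) always lie in (0, 1], the ratio between any two probabilities stays below e, so the adjustment is a mild push toward rarely sampled items rather than a hard preference.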

ThresholdNegativeSamplingTransform - works like FrequencyNegativeSamplingTransform, but ignores maxed-out items completely
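
One possible reading of "ignores maxed-out items" is sketched below: items whose count has reached a cap get zero probability, and the softmax is taken over the rest. The `max_count` parameter, the function name, and the error raised when every item is capped are hypothetical; the PR description does not say how the threshold is defined or what happens in that edge case.

```python
import math


def threshold_sampling_probs(counts: list[int], max_count: int) -> list[float]:
    """softmax(1 / (1 + count)) restricted to items with count < max_count.

    `max_count` is a hypothetical parameter used only for this illustration.
    """
    allowed = [c < max_count for c in counts]
    if not any(allowed):
        # Assumption: the sketch fails loudly when nothing is left to sample.
        raise ValueError("every item has reached max_count")
    scores = [1.0 / (1.0 + c) for c in counts]
    # Stabilise the softmax using the largest score among allowed items.
    max_score = max(s for s, ok in zip(scores, allowed) if ok)
    exps = [
        math.exp(s - max_score) if ok else 0.0  # capped items get zero mass
        for s, ok in zip(scores, allowed)
    ]
    total = sum(exps)
    return [e / total for e in exps]
```

With counts of [0, 2, 5, 7] and max_count=5, the last two items receive probability exactly zero and the first two share all of the mass, with the never-sampled item weighted highest.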

Metrics on MovieLens (as in the example):
0) UniformNegativeSamplingTransform(num_negative_samples = 128) - baseline

Validation (81st epoch):

| k | 1 | 10 | 20 | 5 |
|---|---|---|---|---|
| map | 0.019705 | 0.060155 | 0.067919 | 0.048319 |
| ndcg | 0.019705 | 0.091545 | 0.120256 | 0.062412 |
| recall | 0.019705 | **0.196721** | 0.311144 | 0.105647 |

Link to proof.

Test (best epoch):

| k | 1 | 10 | 20 | 5 |
|---|---|---|---|---|
| MAP | 0.016065 | 0.054039 | 0.061749 | 0.043969 |
| Precision | 0.016065 | 0.017655 | 0.014458 | 0.020073 |
| Recall | 0.016065 | **0.176549** | 0.289169 | 0.100364 |

Link to proof.

1) FrequencyNegativeSamplingTransform(num_negative_samples = 128)

Validation (96th epoch):

| k | 1 | 10 | 20 | 5 |
|---|---|---|---|---|
| map | 0.01954 | 0.060201 | 0.068178 | 0.048722 |
| ndcg | 0.01954 | 0.091547 | 0.121043 | 0.063432 |
| recall | 0.01954 | **0.196225** | 0.313794 | 0.108627 |

Link to proof.

Test (best checkpoint):

| k | 1 | 10 | 20 | 5 |
|---|---|---|---|---|
| MAP | 0.020371 | 0.061189 | 0.068575 | 0.051101 |
| Precision | 0.020371 | 0.018731 | 0.014790 | 0.022093 |
| Recall | 0.020371 | **0.187314** | 0.295793 | 0.110467 |

Link to proof.

Note: I suspect that the better results on the test fold are due to the model seeing a more diverse set of embeddings.

2) ThresholdNegativeSamplingTransform(num_negative_samples = 128)

Validation (99th epoch):

| k | 1 | 10 | 20 | 5 |
|---|---|---|---|---|
| map | 0.01391 | 0.056347 | 0.064440 | 0.044130 |
| ndcg | 0.01391 | 0.089694 | 0.119503 | 0.059647 |
| recall | 0.01391 | **0.201358** | 0.319921 | 0.107468 |

Link to proof.

Test (best checkpoint):

| k | 1 | 10 | 20 | 5 |
|---|---|---|---|---|
| MAP | 0.016065 | 0.056637 | 0.064067 | 0.045948 |
| Precision | 0.016065 | 0.018665 | 0.014798 | 0.021133 |
| Recall | 0.016065 | **0.186651** | 0.295959 | 0.105664 |

Link to proof.

neil-kulikov added 5 commits February 2, 2026 08:06

Gitignore updated to recent examples

2f29eb4

Unnecessary files removed

584b54a

Documentation updated

5e98b75

Formatted and checked

bd614aa

Ruff format applied

3e03d8e

KulikovNikita force-pushed the feature/smart-sampiling branch from 73cc43b to 3e03d8e on February 2, 2026 05:13

KulikovNikita wants to merge 5 commits into sb-ai-lab:main from KulikovNikita:feature/smart-sampiling
