feat: difficulty based curriculum sampling strategy #428

Open
RUFFY-369 wants to merge 5 commits into NousResearch:main from RUFFY-369:feat/curriculum-scheduler

RUFFY-369 commented Mar 30, 2026

PR Type

  • Non-Environment PR - Complete Description, Related Issues & Type of Change sections

📝 General Information

Description

Implemented an easy-first CurriculumScheduler to improve sample efficiency on complex tasks. It maps training items to difficulty bins and shifts the sampling distribution as the model reaches "competence" thresholds.

I added three main strategies: easy_first, hard_first, and weighted_uniform. The goal is to let the model master the basics before the pipeline introduces high-difficulty edge cases.
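
A minimal sketch of how the three strategies named above could turn into per-bin sampling weights. This is not the PR's code: `bin_weights`, `sample_bin`, the `competence` argument, and the rule that unlocks bins in proportion to competence are all illustrative assumptions, and `weighted_uniform` is assumed here to mean equal weight per bin.

```python
import math
import random


def bin_weights(strategy: str, n_bins: int, competence: float) -> list[float]:
    """Return normalized sampling weights over difficulty bins (index 0 = easiest).

    `competence` is a hypothetical progress signal in [0, 1].
    """
    if n_bins < 1:
        raise ValueError("n_bins must be >= 1")
    if not 0.0 <= competence <= 1.0:
        raise ValueError("competence must be in [0, 1]")

    if strategy == "weighted_uniform":
        # Assumption: every bin is equally likely, regardless of bin size.
        raw = [1.0] * n_bins
    elif strategy in ("easy_first", "hard_first"):
        # Assumption: the share of unlocked bins grows with competence,
        # and at least one bin is always available.
        unlocked = max(1, math.ceil(competence * n_bins))
        raw = [1.0 if i < unlocked else 0.0 for i in range(n_bins)]
        if strategy == "hard_first":
            raw.reverse()  # unlock from the hardest bin downwards
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    total = sum(raw)
    return [w / total for w in raw]


def sample_bin(weights: list[float], rng: random.Random) -> int:
    """Draw one bin index according to the given weights."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

Under these assumptions, `bin_weights("easy_first", 4, 0.5)` puts all probability mass on the two easiest bins, and the remaining bins open up as `competence` approaches 1.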

Related Issues

Type of Change

  • New feature (non-breaking change which adds functionality)

✅ Developer & Reviewer Checklist

  • Code follows project style (black, isort, flake8 pass with pre-commit)
  • I have performed a self-review of my own code
  • New and existing unit tests pass locally with my changes (22/22 verified)
  • Docstrings added for all new public classes / functions
  • If .env vars required, did you add it to the .env.example in repo root? (N/A)

RUFFY-369 and others added 5 commits March 28, 2026 03:39
Add CurriculumScheduler to atroposlib/envs/ with:
- EMA-based per-item difficulty tracking from reward signals
- Quantile-based difficulty binning (configurable N bins)
- Three sampling strategies: uniform, easy_first, competence_based
- Competence-based strategy cites Platanios et al. 2019
- Opt-in integration in BaseEnv via 3 config fields
- WandB metrics for difficulty distribution tracking
- Checkpoint save/load support
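
The first two bullets (EMA difficulty tracking and quantile binning) could look roughly like the sketch below. The class name `DifficultyTracker`, the `alpha` smoothing factor, the convention that difficulty is `1 - reward` for rewards in [0, 1], and rank-based bin assignment are assumptions made for illustration; the PR's actual implementation may differ.

```python
class DifficultyTracker:
    """Hypothetical per-item difficulty tracker (not the PR's class)."""

    def __init__(self, alpha: float = 0.1, n_bins: int = 3) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        if n_bins < 1:
            raise ValueError("n_bins must be >= 1")
        self.alpha = alpha
        self.n_bins = n_bins
        self.difficulty: dict[str, float] = {}

    def update(self, item_id: str, reward: float) -> float:
        """Fold one reward into the item's EMA difficulty and return it.

        Assumption: rewards lie in [0, 1], so a low reward means a hard item.
        """
        observed = 1.0 - reward
        prev = self.difficulty.get(item_id)
        if prev is None:
            new = observed  # first observation seeds the average
        else:
            new = (1.0 - self.alpha) * prev + self.alpha * observed
        self.difficulty[item_id] = new
        return new

    def bins(self) -> dict[str, int]:
        """Assign each tracked item to a quantile bin (0 = easiest).

        Items are ranked by difficulty and split into n_bins groups of
        near-equal size; ties keep their insertion order.
        """
        ranked = sorted(self.difficulty, key=self.difficulty.get)
        n = len(ranked)
        return {
            item: min(self.n_bins - 1, rank * self.n_bins // n)
            for rank, item in enumerate(ranked)
        }
```

Ranking items rather than cutting on fixed difficulty values keeps the bins populated even when most rewards cluster near 0 or 1.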

22/22 tests passing.
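
For context on the competence-based strategy, Platanios et al. 2019 define a square-root competence schedule, c(t) = min(1, sqrt(t (1 - c0^2) / T + c0^2)), where c0 is the initial competence and T is the number of steps until full competence. A sketch of that formula follows; whether the PR uses this exact schedule is not stated, and the function name and default `c0` are assumptions.

```python
import math


def sqrt_competence(step: int, total_steps: int, c0: float = 0.01) -> float:
    """Square-root competence schedule from Platanios et al. 2019.

    Returns the fraction of the difficulty-sorted data the model may sample
    from at `step`, rising from c0 at step 0 to 1.0 at `total_steps`.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0.0 <= c0 <= 1.0:
        raise ValueError("c0 must be in [0, 1]")
    value = math.sqrt(step * (1.0 - c0**2) / total_steps + c0**2)
    return min(1.0, value)
```

The square root makes competence grow quickly early on and more slowly later, so each newly unlocked hard example is introduced against a larger pool of already-available data.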