feat: difficulty based curriculum sampling strategy by RUFFY-369 · Pull Request #428 · NousResearch/atropos

RUFFY-369 · 2026-03-30T21:05:29Z

PR Type

Non-Environment PR - Complete Description, Related Issues & Type of Change sections

📝 General Information

Description

Implemented an easy-first CurriculumScheduler to improve sample efficiency on complex tasks. It maps training items to difficulty bins and shifts the sampling distribution as the model reaches "competence" thresholds.
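For reviewers unfamiliar with competence thresholds, here is a minimal sketch of one common schedule, the square-root competence function from Platanios et al. 2019 (the paper the commit message cites). It is not taken from this PR's diff: the names `sqrt_competence`, `unlocked_bins`, and the default `c0=0.1` are hypothetical and only illustrate the idea of unlocking harder bins as training progresses.

```python
import math


def sqrt_competence(step: int, total_steps: int, c0: float = 0.1) -> float:
    """Square-root competence schedule in the style of Platanios et al. (2019).

    Returns a value in [c0, 1]: the fraction of the difficulty range the
    model is allowed to sample from at `step`. `c0` is the initial
    competence (hypothetical default, not from the PR).
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so steps past the horizon stay at full competence.
    progress = min(max(step, 0), total_steps) / total_steps
    return min(1.0, math.sqrt(progress * (1.0 - c0**2) + c0**2))


def unlocked_bins(competence: float, n_bins: int) -> int:
    """Number of difficulty bins (easiest first) available at this competence."""
    return max(1, min(n_bins, math.ceil(competence * n_bins)))
```

With this schedule the easiest bin is always available, and the hardest bin only becomes available once competence approaches 1.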

I added three main strategies: easy_first, hard_first, and weighted_uniform. The goal is to let the model master the basics before the pipeline introduces high-difficulty edge cases.
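To make the three strategy names concrete, the sketch below shows one way per-bin sampling weights could be derived. It is an illustration under stated assumptions, not the PR's implementation: the function names, the linear ramp, the `progress` parameter, and in particular the reading of `weighted_uniform` as "equal weight per bin" are all hypothetical, since the description does not define them.

```python
import random
from typing import List, Sequence


def bin_weights(strategy: str, n_bins: int, progress: float) -> List[float]:
    """Return normalized sampling weights for bins 0 (easiest) .. n_bins-1 (hardest).

    `progress` in [0, 1] is an assumed training-progress signal:
    0 = start of training, 1 = end.
    """
    progress = min(1.0, max(0.0, progress))
    uniform = [1.0] * n_bins
    # Linear ramp favouring easy bins: easiest gets n_bins, hardest gets 1.
    easy = [float(n_bins - i) for i in range(n_bins)]
    if strategy == "weighted_uniform":
        raw = uniform  # assumption: equal weight per bin
    elif strategy == "easy_first":
        # Start biased toward easy bins, drift to uniform as progress -> 1.
        raw = [(1 - progress) * e + progress * u for e, u in zip(easy, uniform)]
    elif strategy == "hard_first":
        # Mirror image: start biased toward hard bins.
        raw = [(1 - progress) * h + progress * u
               for h, u in zip(reversed(easy), uniform)]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    total = sum(raw)
    return [w / total for w in raw]


def sample_item(bins: Sequence[Sequence[str]], weights: Sequence[float],
                rng: random.Random) -> str:
    """Pick a bin by weight, then an item uniformly within it (empty bins skipped)."""
    candidates = [(b, w) for b, w in zip(bins, weights) if b]
    if not candidates:
        raise ValueError("no items to sample")
    chosen = rng.choices([b for b, _ in candidates],
                         weights=[w for _, w in candidates], k=1)[0]
    return rng.choice(list(chosen))
```

Sampling a bin first and an item second keeps the per-bin probabilities independent of how many items each bin holds.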

Related Issues

Part of [Enhancement] RL Training Infrastructure Stabilization & Observability #431 (RL Infrastructure Enhancements)
Depends on feat: online reward normalization (Welford’s algorithm) #427

Type of Change

New feature (non-breaking change which adds functionality)

✅ Developer & Reviewer Checklist

Code follows project style (black, isort, flake8 pass with pre-commit)
I have performed a self-review of my own code
New and existing unit tests pass locally with my changes (22/22 verified)
Docstrings added for all new public classes / functions
If .env vars required, did you add it to the .env.example in repo root? (N/A)

Add CurriculumScheduler to atroposlib/envs/ with:
- EMA-based per-item difficulty tracking from reward signals
- Quantile-based difficulty binning (configurable N bins)
- Three sampling strategies: uniform, easy_first, competence_based
- Competence-based strategy cites Platanios et al. 2019
- Opt-in integration in BaseEnv via 3 config fields
- WandB metrics for difficulty distribution tracking
- Checkpoint save/load support

22/22 tests passing.
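As a rough illustration of the first two items and the checkpoint item in the commit message, the sketch below tracks a per-item EMA of rewards, converts it to a difficulty score, and assigns rank-based quantile bins. It is not the code from this PR: the class name `EmaDifficultyTracker`, its methods, the `alpha` default, and the assumption that rewards lie in [0, 1] (so that difficulty can be taken as `1 - EMA(reward)`) are all hypothetical.

```python
from typing import Any, Dict, Hashable


class EmaDifficultyTracker:
    """Tracks per-item difficulty as 1 - EMA(reward), assuming rewards in [0, 1]."""

    def __init__(self, alpha: float = 0.1, n_bins: int = 3) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        if n_bins < 1:
            raise ValueError("n_bins must be >= 1")
        self.alpha = alpha
        self.n_bins = n_bins
        self.ema_reward: Dict[Hashable, float] = {}

    def update(self, item_id: Hashable, reward: float) -> None:
        """Fold a new reward into the item's exponential moving average."""
        prev = self.ema_reward.get(item_id)
        if prev is None:
            self.ema_reward[item_id] = reward  # first observation seeds the EMA
        else:
            self.ema_reward[item_id] = self.alpha * reward + (1 - self.alpha) * prev

    def difficulty(self, item_id: Hashable) -> float:
        """Higher means harder: items that earn low reward score close to 1."""
        return 1.0 - self.ema_reward[item_id]

    def assign_bins(self) -> Dict[Hashable, int]:
        """Rank-based quantile binning: bin 0 holds the easiest items."""
        ranked = sorted(self.ema_reward, key=self.difficulty)
        n = len(ranked)
        return {item: min(self.n_bins - 1, rank * self.n_bins // n)
                for rank, item in enumerate(ranked)}

    def state_dict(self) -> Dict[str, Any]:
        """Plain-dict snapshot suitable for saving alongside a checkpoint."""
        return {"alpha": self.alpha, "n_bins": self.n_bins,
                "ema_reward": dict(self.ema_reward)}

    def load_state_dict(self, state: Dict[str, Any]) -> None:
        """Restore a snapshot produced by state_dict()."""
        self.alpha = state["alpha"]
        self.n_bins = state["n_bins"]
        self.ema_reward = dict(state["ema_reward"])
```

Binning by rank rather than by fixed difficulty cut-offs keeps the bins roughly equal in size even when most rewards cluster near 0 or 1.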

for more information, see https://pre-commit.ci

RUFFY-369 and others added 5 commits March 28, 2026 03:39

style: fix linting and imports in curriculum scheduler

38731e0

fix: pin antlr4-python3-runtime for compatibility

95c3e49

Merge branch 'NousResearch:main' into feat/curriculum-scheduler

a708f74

[pre-commit.ci] auto fixes from pre-commit.com hooks

55ddbf4

for more information, see https://pre-commit.ci

This was referenced Mar 30, 2026

feat: API performance tracking and final infra integration #430

Open

[Enhancement] RL Training Infrastructure Stabilization & Observability #431

Open

RUFFY-369 wants to merge 5 commits into NousResearch:main from RUFFY-369:feat/curriculum-scheduler
