Integrate AIDE ML code optimization by efsiatras · Pull Request #2258 · oumi-ai/oumi

efsiatras · 2026-03-15T14:13:20Z

Description

This PR adds a new oumi aide command that brings agentic code optimization to Oumi using AIDE, the open-source ML engineering agent by Weco AI.

Why AIDE

Unlike oumi tune, which searches a predefined hyperparameter space, AIDE uses an LLM to write, execute, and iteratively improve Python code through tree search (draft → debug → improve). It can therefore explore code-level changes that parameter search cannot reach, such as reward function design, evaluation logic, and training strategies.
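The draft → debug → improve loop can be pictured with a toy greedy tree search. This is an illustrative sketch only, not AIDE's actual implementation: `Node`, `tree_search`, `propose`, `evaluate`, and `num_drafts` are hypothetical names, and the LLM call and code execution are replaced by caller-supplied functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Node:
    """One candidate solution in the search tree (hypothetical structure)."""

    code: str
    metric: Optional[float] = None  # None means the run failed (buggy node)
    parent: Optional["Node"] = None
    children: list["Node"] = field(default_factory=list)


def tree_search(
    propose: Callable[[Optional[Node], str], str],
    evaluate: Callable[[str], Optional[float]],
    steps: int,
    num_drafts: int = 2,
) -> Optional[Node]:
    """Draft a few roots, then debug failed leaves or improve the best node."""
    nodes: list[Node] = []
    for _ in range(steps):
        roots = [n for n in nodes if n.parent is None]
        working = [n for n in nodes if n.metric is not None]
        # Failed nodes that nobody has tried to repair yet.
        broken = [n for n in nodes if n.metric is None and not n.children]

        if len(roots) < num_drafts:
            parent, action = None, "draft"
        elif broken:
            parent, action = broken[0], "debug"
        elif working:
            parent, action = max(working, key=lambda n: n.metric), "improve"
        else:
            parent, action = None, "draft"

        code = propose(parent, action)  # stand-in for the LLM call
        child = Node(code=code, metric=evaluate(code), parent=parent)
        if parent is not None:
            parent.children.append(child)
        nodes.append(child)

    working = [n for n in nodes if n.metric is not None]
    return max(working, key=lambda n: n.metric) if working else None
```

Here a higher metric is assumed to be better; a real search would also need a minimize/maximize switch and a budget for repeated debug attempts.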

AIDE implements the algorithm from arXiv:2502.13138 and has been independently validated by OpenAI (MLE-bench, 4x more medals than the best linear agent across 75 Kaggle competitions), Meta (LLM Speedrunning, AI Research Agents), Sakana AI (AI Scientist-v2), and METR (RE-Bench).

This PR integrates the open-source research version (aideml) which runs fully local. Weco AI also offers a separate production platform with experiment tracking and cloud-hybrid architecture.

Architecture

The integration mirrors the existing Optuna pattern:

core/configs/params/aide_params.py → AideParams (mirrors TuningParams)
core/configs/aide_config.py → AideConfig (mirrors TuningConfig)
core/agentic/base_agentic_optimizer.py → BaseAgenticOptimizer ABC (mirrors BaseTuner)
core/agentic/aide_optimizer.py → AideOptimizer (mirrors OptunaTuner)
core/agentic/workspace_helper.py → Helper script injected into workspace
builders/agentic.py → build_agentic_optimizer() (mirrors build_tuner)
aide.py → Orchestration (mirrors tune.py)
cli/aide_cmd.py → CLI command (mirrors cli/tune.py)
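The ABC-plus-builder layering listed above can be sketched as follows. The class and function names come from the list, but every signature, field, and method body here is an assumption made for illustration (including `optimizer_type` and the `_OPTIMIZERS` registry); the real classes wrap AIDE and will differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class AideParams:
    """Stand-in for the real params class; these fields are assumptions."""

    steps: int = 5
    optimizer_type: str = "aide"  # hypothetical selector used by the builder


class BaseAgenticOptimizer(ABC):
    """Interface an agentic optimizer backend would implement."""

    def __init__(self, params: AideParams) -> None:
        self.params = params

    @abstractmethod
    def search(self) -> None:
        """Run the optimization loop."""

    @abstractmethod
    def get_best_solution(self) -> Optional[dict[str, Any]]:
        """Return the best result found, or None if nothing succeeded."""

    def cleanup(self) -> None:
        """Release resources; a no-op by default."""


class AideOptimizer(BaseAgenticOptimizer):
    """Placeholder backend: records step indices instead of calling AIDE."""

    def __init__(self, params: AideParams) -> None:
        super().__init__(params)
        self.journal: list[int] = []

    def search(self) -> None:
        self.journal = list(range(self.params.steps))

    def get_best_solution(self) -> Optional[dict[str, Any]]:
        if not self.journal:  # empty journal: nothing to report
            return None
        return {"step": self.journal[-1]}


# Hypothetical registry mapping a type name to its optimizer class.
_OPTIMIZERS = {"aide": AideOptimizer}


def build_agentic_optimizer(params: AideParams) -> BaseAgenticOptimizer:
    """Factory: look up the backend class and instantiate it."""
    try:
        cls = _OPTIMIZERS[params.optimizer_type]
    except KeyError:
        raise ValueError(
            f"Unknown optimizer type: {params.optimizer_type!r}"
        ) from None
    return cls(params)
```

Keeping construction behind a builder means the orchestration and CLI layers depend only on the abstract interface, which is the same separation the Optuna integration uses.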

Optimization surfaces

CONFIG_SEARCH — modifies training hyperparameters (learning rate, optimizer, LoRA rank, etc.)
REWARD_FUNCTION — designs reward functions for GRPO/RLHF
EVAL_FUNCTION — generates custom evaluation functions
FULL_PIPELINE — writes complete training scripts with no constraints
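One way to picture how the four surfaces could drive the agent's task description is an enum plus a lookup table. The member names match the list above; the string values, the task wording, and the `build_task_description` signature are assumptions for illustration.

```python
from enum import Enum


class AideOptimizationSurface(str, Enum):
    """The four surfaces; the string values are assumed, not taken from the PR."""

    CONFIG_SEARCH = "config_search"
    REWARD_FUNCTION = "reward_function"
    EVAL_FUNCTION = "eval_function"
    FULL_PIPELINE = "full_pipeline"


# Hypothetical task text per surface, paraphrasing the list above.
_TASKS = {
    AideOptimizationSurface.CONFIG_SEARCH: (
        "Modify training hyperparameters in the base config"
    ),
    AideOptimizationSurface.REWARD_FUNCTION: (
        "Design a reward function for GRPO/RLHF training"
    ),
    AideOptimizationSurface.EVAL_FUNCTION: (
        "Generate a custom evaluation function"
    ),
    AideOptimizationSurface.FULL_PIPELINE: (
        "Write a complete training script"
    ),
}


def build_task_description(
    surface: AideOptimizationSurface, metric: str, maximize: bool
) -> str:
    """Combine the surface-specific task with the optimization goal."""
    direction = "maximize" if maximize else "minimize"
    return f"{_TASKS[surface]}. Goal: {direction} {metric}."
```

Subclassing `str` lets a surface be written as a plain string in YAML and converted with `AideOptimizationSurface("config_search")`.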

Workspace helper

Instead of exposing raw Oumi APIs to the AIDE agent, we inject an oumi_helper.py into its workspace at runtime. The agent calls run_trial(), test_reward(), or test_eval() — these handle config loading from YAML, environment setup, and metric extraction. This works with any model, trainer, or config.
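The injection step might look roughly like the sketch below. `prepare_workspace` and the contents of `HELPER_SOURCE` are hypothetical: the stub `run_trial()` only echoes its input, whereas the real helper loads the YAML config, sets up the environment, and extracts metrics.

```python
import shutil
from pathlib import Path

# Stand-in helper source; the real oumi_helper.py is more involved.
HELPER_SOURCE = '''\
"""Hypothetical stand-in for the injected helper module."""


def run_trial(overrides):
    """Pretend to run one training trial and report its metrics."""
    # A real helper would load the YAML config, apply the overrides,
    # launch training, and extract metrics from the result.
    return {"overrides": dict(overrides), "metrics": {}}
'''


def prepare_workspace(workspace: Path, base_config: Path) -> Path:
    """Copy the base config and write the helper into the agent workspace."""
    workspace.mkdir(parents=True, exist_ok=True)
    shutil.copy(base_config, workspace / base_config.name)
    helper_path = workspace / "oumi_helper.py"
    helper_path.write_text(HELPER_SOURCE, encoding="utf-8")
    return helper_path
```

Because the helper is an ordinary file in the workspace, agent-generated scripts can simply `import oumi_helper` without knowing anything about Oumi's internal APIs.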

Dependency handling

aideml pins exact versions of pandas/numpy/scipy that conflict with Oumi's requirements, but the code runs fine with newer versions. For uv users, override-dependencies in pyproject.toml resolves this automatically. For pip users, the workaround is documented in the extra definition.

What's included

Source — 17 new files following Oumi conventions (copyright headers, dataclass configs, try/except optional imports, builder pattern, device cleanup, distributed support, telemetry).

Tests — 59 tests across 3 files:

test_aide_params.py — param validation, YAML roundtrip, CLI overrides
test_aide_optimizer.py — task descriptions for all 4 surfaces, config conversion, optimizer lifecycle
test_cli_aide.py — CLI with mocked AIDE runs

Notebooks — 3 tutorials:

AIDE Agentic Optimization — CONFIG_SEARCH, the main tutorial
AIDE Reward Function Design — tests existing reward as baseline, lets AIDE redesign it
AIDE Custom Evaluation — dataset exploration, eval pipeline breakdown, AIDE generation

Config and examples:

configs/recipes/smollm/aide/135m/aide.yaml with smollm-135m alias
scripts/examples/aide/run_aide_optimization.py
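The "try/except optional imports" convention mentioned under Source can be sketched generically as below. `optional_import` and `require` are hypothetical helper names, and the error wording is an assumption; the sketch only shows the shape of the pattern, which lets the core package import cleanly when the extra is not installed.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Import a module if available; return None instead of raising."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def require(module: Optional[ModuleType], package: str, extra: str) -> ModuleType:
    """Fail with an actionable message at the point of use, not at import time."""
    if module is None:
        raise RuntimeError(
            f"{package} is not installed. "
            f"Install the '{extra}' extra to use this feature."
        )
    return module
```

A caller would typically run `aide = optional_import("aide")` at module level (assuming `aide` is the import name that aideml provides) and call `require(aide, "aideml", "aide")` only inside the code path that actually needs it.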

How to test

# Install
uv pip install -e ".[aide]"

# Unit tests (no GPU or API key needed)
pytest tests/unit/core/configs/params/test_aide_params.py tests/unit/core/agentic/test_aide_optimizer.py tests/unit/cli/test_cli_aide.py -v

# CLI
oumi aide --help

# End-to-end (requires LLM access + GPU)
oumi aide -c smollm-135m --aide.steps=5

Related issues

New feature; no existing issue.

Before submitting

This PR only changes documentation.
Did you read the contributor guideline Pull Request guidelines?
Did you link the issue(s) related to this PR in the section above?
Did you add / update tests where needed?

Introduce configuration dataclasses for AIDE ML integration, following the exact same pattern as Optuna/TuningParams integration:
- AideParams, AideLLMParams, AideSearchParams, AideExecParams in core/configs/params/aide_params.py (mirrors tuning_params.py)
- AideConfig in core/configs/aide_config.py (mirrors tuning_config.py)
- AideOptimizationSurface enum for 4 surfaces: CONFIG_SEARCH, REWARD_FUNCTION, EVAL_FUNCTION, FULL_PIPELINE
- Exports from core/configs/__init__.py
- AliasType.AIDE for CLI config resolution
- 32 unit tests covering validation, YAML roundtrip, CLI overrides

Core integration layer following the Optuna/BaseTuner pattern:
- BaseAgenticOptimizer ABC in core/agentic/ (mirrors BaseTuner)
- AideOptimizer with try/except import, wraps AIDE Agent/Journal/Interpreter
- build_agentic_optimizer() factory in builders/agentic.py (mirrors build_tuner)
- aide.py top-level orchestration with full lifecycle: dir creation, logging setup, telemetry, device info, search loop, distributed cleanup
- CLI command oumi aide with device_cleanup and limit_per_process_memory
- aide() exported from oumi.__init__ (mirrors tune())
- AideResult dataclass for structured optimization results
- aide optional extra in pyproject.toml (mirrors tune extra for optuna)
- All 4 surfaces: CONFIG_SEARCH, REWARD_FUNCTION, EVAL_FUNCTION, FULL_PIPELINE

- SmolLM 135M AIDE recipe: configs/recipes/smollm/aide/135m/aide.yaml
- smollm-135m alias with AliasType.AIDE for CLI config resolution
- Example script: scripts/examples/aide/run_aide_optimization.py demonstrating programmatic AIDE usage (mirrors custom_evaluation.py)
- 13 unit tests for AideOptimizer: task description building for all 4 surfaces, OmegaConf config conversion, optimizer init/cleanup, search summary, empty journal handling, AideResult construction
- pytest.skip pattern for optional aideml dependency (mirrors optuna tests)
- Fixed empty journal handling in get_best_solution()

- Add tests/unit/cli/test_cli_aide.py with 5 CLI tests (mirrors test_cli_synth.py pattern: basic run, overrides, result display, missing config error, oumi:// prefix resolution)
- Add complete_aide_config() to cli/completions.py for shell tab completion
- Wire autocompletion to CLI config option
- Fix get_search_summary() crash on empty journal (try/except + type cast)
- 50 tests passing across params, optimizer, and CLI test files

…nts:
- Workspace helper (oumi_helper.py) with run_trial(), test_reward(), test_eval(), get_config_fields(); the agent imports these instead of raw Oumi API, eliminating pad_token/dataclass/path bugs
- Copy base config YAML into workspace with multi-root path resolution
- Fix isatty crash (OUMI_DISABLE_RICH_LOGGING + try/except in logging.py)
- Fix save_run Path crash, missing OmegaConf fields (exp_name etc.)
- Restore env var in cleanup(), add error logging, fix type hints
- Detailed step logging: action, plan, error output, analysis
- uv override-dependencies for clean aideml>=0.2.2 install

…entic-optimization # Conflicts: # pyproject.toml

…entic-optimization

oelachqar · 2026-04-14T22:10:34Z

Hi @efsiatras,

Thank you so much for the contribution! Were you able to use the integration? It would be interesting to see some results demonstrating its effectiveness for LLM fine-tuning.

efsiatras added 5 commits March 10, 2026 14:32

efsiatras marked this pull request as ready for review March 15, 2026 16:08

sentry bot reviewed Mar 15, 2026

Comment thread src/oumi/core/agentic/workspace_helper.py Outdated

efsiatras added 2 commits March 15, 2026 18:56

Fix train() return None: pass additional_tuning_kwargs

f7b6f36

Merge branch 'main' into efsiatras/add-aide-agentic-optimization

0db1387

sentry bot reviewed Mar 16, 2026

Comment thread src/oumi/core/agentic/workspace_helper.py

efsiatras added 7 commits March 16, 2026 17:56

Add defensive checks in workspace_helper.py

7548b10

Merge branch 'main' into efsiatras/add-aide-agentic-optimization

be130be

Merge remote-tracking branch 'origin/main' into efsiatras/add-aide-ag…

9b12fa9

…entic-optimization # Conflicts: # pyproject.toml

Merge branch 'main' into efsiatras/add-aide-agentic-optimization

d7615db

Merge branch 'main' into efsiatras/add-aide-agentic-optimization

373f6de

Merge branch 'main' into efsiatras/add-aide-agentic-optimization

567c7eb

Merge remote-tracking branch 'origin/main' into efsiatras/add-aide-ag…

3cf6ffa

…entic-optimization

efsiatras wants to merge 14 commits into oumi-ai:main from efsiatras:efsiatras/add-aide-agentic-optimization

2 participants