88 changes: 60 additions & 28 deletions README.md
@@ -11,48 +11,48 @@

# HydroGym: Reinforcement Learning for Fluid Dynamics

**88 environments | 6 solver backends | 2D & 3D | Ready for RL training**
**61+ environments | 6 solver backends | 2D & 3D | Ready for RL training**

HydroGym is a comprehensive platform for applying reinforcement learning to fluid dynamics and flow control. With environments ranging from canonical benchmarks to turbulent flows, HydroGym provides a standardized Gymnasium-compatible interface for training RL agents on challenging CFD problems.

> **Paper**: Lagemann, C., et al. (2025). *HydroGym: A reinforcement learning platform for fluid dynamics.* arXiv:2512.17534 [[arxiv]](https://arxiv.org/abs/2512.17534)

## Key Features

- **Diverse Environments**: 88 pre-configured environments across 6 CFD solvers
- **Diverse Environments**: 61+ pre-configured environments across 6 CFD solvers
- **Standard RL Interface**: Gymnasium-compatible API works with Stable-Baselines3, RLlib, and other RL libraries
- **Compute Efficient**: Highly optimized GPU & CPU backends for efficient RL deployment ranging from local workstations to exascale HPC systems
- **Scalable**: MPI-parallelized solvers with distributed RL training support
- **Multiple Backends**: Finite Element (Firedrake), Lattice Boltzmann (MAIA LBM), Finite Volume (MAIA FV), Spectral Element (NEK5000), Fully Differentiable solvers (JAX-Fluids)
- **2D & 3D**: From simple 2D benchmarks to complex 3D turbulent flows (Re up to 400,000)
- **Research-Ready**: Includes checkpoints, observation strategies, and reward formulations managed by a complementary HuggingFace repository
- **Research-Ready**: Managed by a complementary HuggingFace repository

## Quick Start with Docker (Recommended)

**We strongly recommend using our pre-configured Docker containers** for hassle-free setup:

```bash
# For NVIDIA GPUs (CUDA)
docker pull clagemann/maia-cuda-12.8.1:latest
docker pull clagemann/hydrogym-nvhpc-26.1_cuda-12.9_hopper_blackwell:latest
# or
docker pull clagemann/hydrogym-nvhpc-26.1_cuda-12.9_turing_ampere:latest

# For AMD GPUs (ROCm)
docker pull clagemann/maia-rocm-6.3.3:latest
docker pull clagemann/hydrogym-rocm-6.3.3:latest

# Run container
docker run -it --gpus all clagemann/maia-cuda-12.8.1:latest
docker run -it --gpus all clagemann/hydrogym-nvhpc-26.1_cuda-12.9_turing_ampere:latest
```
## Available Environments

HydroGym provides **88 environments** across 6 solver backends:
HydroGym provides **61 environments** across 6 solver backends:

| Solver Backend | Count | Description | Dimensions |
|----------------|-------|-------------|------------|
| **Firedrake** (FEM) | 20 | Canonical flow control benchmarks | 2D |
| **MAIA LBM** | 55 | Lattice Boltzmann method environments | 2D, 3D |
| **MAIA Structured FV** | 8 | High-Reynolds turbulent boundary layers | 3D |
| **NEK5000** | 1 | Spectral element turbulent channel flow | 3D |
| **NEK5000** | 2 | Spectral element turbulent channel flow | 3D |
| **JAX** | 2 | Differentiable fluid dynamics | 2D, 3D |
| **JAX-Fluids** | 2 | Compressible jet engine control | 2D, 3D |
| **JAX-Fluids** | 2 | Compressible shock vector control | 2D, 3D |

### Environment Categories

@@ -65,25 +65,29 @@ HydroGym provides **88 environments** across 6 solver backends:
- Square cylinder (Re=200-3900, 2D/3D)
- Sphere (Re=300-3700, 3D)
- Cube (Re=300-3700, 3D)
- Turbulent channel flow (Re_tau=180, 3D)
- Turbulent channel flow (Re_tau=206, 3D)

**Airfoil Control**:
- NACA0012 steady (Re=100-50000, AOA=12-40°, 2D/3D)
- NACA0012 with gust disturbance (Re=100-50000, 2D/3D)

**High Reynolds Number Flows**:
- Zero-pressure-gradient turbulent boundary layer with jet/surface wave actuation (Re_Tau=1000-5000, 3D)
- Zero-pressure-gradient turbulent boundary layer with jet/surface wave actuation (Re_Tau=180-2200, 3D)
- DRA2303 airfoil with jet/surface wave actuation (Re=400000, Ma=0.2-0.7, 3D)
- NACA0012 airfoil with jet actuation (Re=200000, 3D)

**Fully Differentiable Flows**:
- Jet engine thrust vectoring (TVC/TVD, Ma=2.2, 2D/3D)
- Kolmogorov flow (Re=1000, 2D)
- Shock-Vector Control in single divergent nozzle (SVC, Ma>1.0, 2D/3D)
- Turbulent channel flow (Re_tau=180, 3D)
- Kolmogorov flow (up to Re=1000, 2D)

## Environment Checkpoints

See [`existing_environments.yaml`](existing_environments.yaml) for complete list with exact naming conventions.
All required environment checkpoints are available via [HuggingFace](https://huggingface.co/datasets/dynamicslab/HydroGym-environments/tree/main) and are downloaded on the fly when an environment is first created (internet connection required). If no internet connection is available at runtime (e.g. on compute nodes in HPC clusters), you can pre-download the environment files as outlined in [examples/maia/README.md](examples/maia/README.md).
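
For offline nodes, one possible pre-download route is the `huggingface_hub` client. The sketch below is an assumption, not the procedure documented in `examples/maia/README.md`: it calls `snapshot_download` on the dataset repository linked above, and the `allow_patterns` filter assumes that the files for one environment live under a path containing the environment name (a hypothetical layout; `Cylinder_2D_Re200` is used only as an illustrative name). How HydroGym locates a pre-downloaded copy at runtime is not covered here.

```python
from pathlib import Path

# Dataset repository linked in the README.
REPO_ID = "dynamicslab/HydroGym-environments"


def build_download_kwargs(env_name: str, local_dir: str) -> dict:
    """Assemble keyword arguments for huggingface_hub.snapshot_download.

    ASSUMPTION: files for one environment sit under a path that contains
    the environment name; adjust the pattern to the real repository layout.
    """
    return {
        "repo_id": REPO_ID,
        "repo_type": "dataset",  # the checkpoints are hosted as a dataset repo
        "allow_patterns": [f"*{env_name}*"],
        "local_dir": str(Path(local_dir) / env_name),
    }


def predownload(env_name: str, local_dir: str = "./hydrogym_cache") -> str:
    """Download the matching files; run this where internet access exists."""
    # Imported lazily so build_download_kwargs works without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(**build_download_kwargs(env_name, local_dir))


# Example call on a login node (hypothetical environment name):
#   predownload("Cylinder_2D_Re200")
```

Separating the argument construction from the download call makes it easy to inspect what would be fetched before any network traffic happens.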

## Examples

HydroGym includes comprehensive examples for each solver backend:
HydroGym includes comprehensive examples for each solver backend (internet connection required). We highly recommend using the provided Docker containers:

### Firedrake Examples

@@ -93,10 +97,10 @@ See [examples/firedrake/getting_started/](examples/firedrake/getting_started/) f
cd examples/firedrake/getting_started

# Test environment interactively
python test_firedrake_env.py --environment cylinder --num-steps 10
./run_example_docker.sh

# Train with Stable-Baselines3
python train_sb3_firedrake.py --env cylinder --algo PPO --total-timesteps 100000
./run_example_docker.sh train
```

### MAIA Examples
@@ -106,26 +110,52 @@ See [examples/maia/getting_started/](examples/maia/getting_started/) for MPMD co
```bash
cd examples/maia/getting_started

# Prepare workspace (downloads from Hugging Face Hub)
python prepare_workspace.py --env Cylinder_2D_Re200 --work-dir ./test_run
# Prepare workspace (downloads from Hugging Face Hub), then
# run with MPMD execution (1 Python + 1 MAIA process on GPU)
./run_example_docker.sh

# Run with MPMD execution (1 Python + 1 MAIA process)
cd test_run
mpirun -np 1 python ../test_maia_env.py --environment Cylinder_2D_Re200 : -np 1 maia properties.toml
# Prepare workspace and train with Stable-Baselines3
./run_example_docker.sh train
```

### NEK5000 Examples

See [examples/nek/getting_started/](examples/nek/getting_started/) for interface patterns.

```bash
cd examples/nek/getting_started/1_nekenv_single
cd examples/nek/getting_started

# Test single-agent environment
mpirun -np 1 python test_nek_direct.py --steps 100 : -np 10 nek5000
cd 1_nekenv_single
./run_nekenv_docker.sh

# ... or train with PettingZoo wrapper and SB3
cd 3_pettingzoo
./run_pettingzoo_docker.sh train

# ... or run zero-shot transfer learning
cd 6_zeroshot_wing_demo
./run_pettingzoo_docker.sh
```

# Train with SB3
mpirun -np 1 python train_sb3_nek_direct.py --env TCFmini_3D_Re180 --algo PPO : -np 10 nek5000
### JAX Examples

See [examples/jax/getting_started/](examples/jax/getting_started/) for detailed documentation.

```bash
cd examples/jax/getting_started

# Test Kolmogorov flow environment
cd 1_kolmogorov
./run_nekenv_docker.sh

# ... or test channel flow environment
cd 2_channel
./run_channel_docker.sh strong_actuation

# ... or run zero-shot transfer learning
cd 3_ppo
./run_ppo_docker.sh --env channel --num-envs 1 --num-steps 10 --num-minibatches 5
```

## Training RL Agents
@@ -158,6 +188,8 @@ model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100000)
```
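
Because the environments follow the Gymnasium API, a trained or hand-written policy can be evaluated with the usual `reset`/`step` loop. The sketch below is self-contained and purely illustrative: `ToyFlowEnv` is a hypothetical stand-in that only mimics the Gymnasium method signatures (it is not a HydroGym environment and performs no CFD), and `rollout` is an assumed helper name.

```python
from typing import Callable, Optional, Tuple


class ToyFlowEnv:
    """Hypothetical stand-in with Gymnasium-style reset/step signatures."""

    def __init__(self, horizon: int = 5) -> None:
        self.horizon = horizon
        self.t = 0
        self.state = 0.0

    def reset(self, seed: Optional[int] = None) -> Tuple[float, dict]:
        # seed is accepted for API parity; the toy dynamics are deterministic.
        self.t = 0
        self.state = 1.0
        return self.state, {}

    def step(self, action: float) -> Tuple[float, float, bool, bool, dict]:
        # Toy linear dynamics: the disturbance decays and the action shifts it.
        self.state = 0.5 * self.state + action
        self.t += 1
        reward = -(self.state ** 2)          # penalize deviation from zero
        terminated = False                   # no failure state in this toy
        truncated = self.t >= self.horizon   # episode ends on a time limit
        return self.state, reward, terminated, truncated, {}


def rollout(env, policy: Callable[[float], float], seed: int = 0) -> Tuple[float, int]:
    """Run one episode and return (cumulative reward, number of steps)."""
    obs, _info = env.reset(seed=seed)
    total, steps, done = 0.0, 0, False
    while not done:
        obs, reward, terminated, truncated, _info = env.step(policy(obs))
        total += reward
        steps += 1
        done = terminated or truncated
    return total, steps


# Compare an uncontrolled episode with a simple proportional controller.
baseline_return, baseline_steps = rollout(ToyFlowEnv(), lambda obs: 0.0)
controlled_return, _ = rollout(ToyFlowEnv(), lambda obs: -0.5 * obs)
```

With a real environment and a Stable-Baselines3 model, `policy` could wrap `model.predict(obs, deterministic=True)[0]`.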

See also the provided [examples/](examples/) for more details on how to use the individual solver backends for training.

## Advanced Features

- **Checkpoint management**: Automatic loading from Hugging Face Hub
1 change: 1 addition & 0 deletions examples/maia/getting_started/prepare_workspace.py
@@ -10,6 +10,7 @@
"""

import argparse

from hydrogym.maia.workspace import prepare_maia_workspace # avoids mpi4py init

if __name__ == "__main__":
6 changes: 3 additions & 3 deletions examples/maia/getting_started/train_sb3_maia.py
@@ -40,10 +40,10 @@
tensorboard --logdir logs/
"""

import sys
import argparse
from pathlib import Path
import sys
from datetime import datetime
from pathlib import Path
from typing import List, Tuple

import numpy as np
@@ -98,9 +98,9 @@ def train_single_agent(args):

# Import SB3 components
try:
from stable_baselines3.common.callbacks import CheckpointCallback
from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize
from stable_baselines3.common.callbacks import CheckpointCallback

if args.algo == "PPO":
from stable_baselines3 import PPO as Algorithm
99 changes: 0 additions & 99 deletions existing_environments.yaml

This file was deleted.
