Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
77 commits
Select commit Hold shift + click to select a range
1b4c179
Update README.md with arxiv
RobertTLange Sep 25, 2025
2fb7548
add google gemini embeding model
takeruhukushima Sep 25, 2025
27af71c
fix: Fix database summary when patch_name metadata is missing
dexhunter Sep 25, 2025
9586cdb
Update README.md
RobertTLange Sep 26, 2025
396c66a
Merge pull request #2 from dexhunter/fix/display
RobertTLange Sep 26, 2025
a60bc9e
docs: change repo name on the onboarding doc
Koki-Kazaore Sep 28, 2025
0003552
Update README
Aladoro Sep 28, 2025
be2e203
Added a doc explaining how to add suport for a local LLM and embeddin…
vicruz99 Oct 12, 2025
bf0c1d4
Add rust to supported languages
LiaCastaneda Oct 13, 2025
77d1819
Ensure setuptools discovers subpackages
iwiwi Oct 14, 2025
929f072
Mark shinka.webui as a package
iwiwi Oct 14, 2025
59a338c
Merge pull request #18 from SakanaAI/fix-packaging
RobertTLange Oct 15, 2025
23ace36
fix apply_full.py when the patch has incomplete (0,1) markers instead…
51616 Oct 24, 2025
06209a2
Merge pull request #21 from 51616/fix-full-patch-no-markers-bug
RobertTLange Oct 27, 2025
c9c468b
Merge pull request #12 from vicruz99/feature/local-models
RobertTLange Oct 27, 2025
c5b1abe
Update README.md
RobertTLange Oct 27, 2025
ccc1326
Merge branch 'main' into lia/add-support-for-rust
RobertTLange Oct 27, 2025
e8ef6de
Merge pull request #15 from LiaCastaneda/lia/add-support-for-rust
RobertTLange Oct 27, 2025
d2211b2
Merge pull request #7 from Koki-Kazaore/docs/change_repo_name
RobertTLange Oct 27, 2025
ded4576
Update inspirations.py - archive
RobertTLange Oct 27, 2025
7ceea8c
Merge pull request #1 from takeruhukushima/main
RobertTLange Oct 27, 2025
ee6e8a5
Update dependencies gemini embed
RobertTLange Oct 27, 2025
a759778
Update dbase.py path default
RobertTLange Oct 30, 2025
c097a88
Fix reasoning token sampling
RobertTLange Oct 30, 2025
6d5e208
Fix anthropic budget sampling
RobertTLange Oct 30, 2025
9b4d7c7
fix shinka_launch --help
RobertTLange Nov 2, 2025
d7a3f7e
fix wrap_eval catch
RobertTLange Nov 2, 2025
397e0fd
add documentation for resuming experiments
RobertTLange Nov 2, 2025
f6896dc
fix OAI dependency db for visualization
RobertTLange Nov 2, 2025
94a2805
Merge pull request #28 from SakanaAI/fix_minor
RobertTLange Nov 2, 2025
1d9d498
Fix init program island copying -> archive
RobertTLange Nov 2, 2025
2f01b3e
fix:GEMINI_API_KEY name error
takeruhukushima Nov 3, 2025
12738f2
Merge pull request #29 from takeruhukushima/rename_gemini_api
RobertTLange Nov 3, 2025
f5f7e68
use dependency-groups.dev
ifsheldon Nov 8, 2025
14739fc
Add support for Claude Sonnet 4.5 (claude-sonnet-4-5-20250929)
arun-pathiban-ddog Nov 8, 2025
7dd7245
Merge pull request #35 from ifsheldon/dev-group
RobertTLange Nov 8, 2025
4f0708b
Merge pull request #36 from arun-pathiban-ddog/add-claude-sonnet-4.5-…
RobertTLange Nov 8, 2025
ed9f51f
Add Swift language support
jeethu Nov 3, 2025
0437118
ignore warning for correct behavior when no improvement is detected, …
Aladoro Nov 11, 2025
831ddf6
Merge pull request #40 from SakanaAI/ignore-logsubtract-warning
RobertTLange Nov 11, 2025
259e786
Allow boolean flags for eval jobs
jm424 Nov 12, 2025
8a615a4
Merge pull request #41 from jm424/jai/allow_eval_job_bool_flags
RobertTLange Nov 13, 2025
3251a70
Add json support
jeremycochoy Nov 17, 2025
1ac33cc
Merge pull request #46 from jeremycochoy/feature/json_support
RobertTLange Nov 19, 2025
3fb579c
Merge branch 'main' into jeethu/swift
RobertTLange Nov 19, 2025
929090c
Merge pull request #37 from jeethu/jeethu/swift
RobertTLange Nov 19, 2025
ed8f1b4
llm: Add GPT-5.1 and Gemini 3 Pro models
jm424 Nov 19, 2025
70e485f
Merge pull request #48 from jm424/jai/add-newer-models
RobertTLange Nov 20, 2025
ecf762b
Update README.md
RobertTLange Nov 22, 2025
c686d7f
Update getting_started.md
RobertTLange Nov 22, 2025
bad5b37
Update apply_diff.py
RobertTLange Dec 3, 2025
e72fb60
[example] add autoformalization example from natural language to lean
Racemuis Dec 18, 2025
4cd2dc4
[feat] add lean 4 evaluation
Racemuis Dec 18, 2025
7333b8b
[feat] add lean 4 edit support
Racemuis Dec 18, 2025
7a1dc9d
[example] add an autoformulation example based on group theory
Racemuis Dec 18, 2025
91924ee
update gemini embedding price
ifsheldon Dec 26, 2025
580cefd
add gemini-3-flash-preview
ifsheldon Dec 26, 2025
5ee0e49
Merge pull request #59 from ifsheldon/gemini-embedding
RobertTLange Dec 27, 2025
9f9917c
Merge pull request #60 from ifsheldon/gemini3-flash
RobertTLange Dec 27, 2025
a510262
add GPT 5.2
ifsheldon Dec 26, 2025
6c278f2
Merge pull request #61 from ifsheldon/gpt52
RobertTLange Dec 29, 2025
c5ef8dc
[feat] remove specific sampling params to increase compatibility with…
Racemuis Jan 5, 2026
e9e9545
[core] add lean-interact dependency
Racemuis Jan 5, 2026
ba29305
[fix] add file name to proof generation function
Racemuis Jan 5, 2026
18c57ce
[fix] add try-catch clause for proof parsing
Racemuis Jan 5, 2026
e250a51
[core] add autoformalization example config
Racemuis Jan 5, 2026
329abc8
add Lean block markers to global definition
Racemuis Jan 5, 2026
926a373
Merge branch 'SakanaAI:main' into lean_support
Racemuis Jan 5, 2026
db37955
[remove] remove lean verification from async apply
Racemuis Feb 16, 2026
431e1ba
[remove] remove prover_model dependencies from Shinka's core
Racemuis Feb 16, 2026
68b8048
[remove] remove prover_model dependencies from Shinka's core
Racemuis Feb 16, 2026
cd6e9f8
[core] move lean utils to 'examples' folder
Racemuis Feb 16, 2026
ba18166
chore: merge main into lean_support
RobertTLange Mar 13, 2026
5363d5e
fix: clean lean utility lint issues
RobertTLange Mar 13, 2026
a049a4d
docs: scope lean setup to autoformalization example
RobertTLange Mar 13, 2026
7dbb5a0
feat: improve autoformalization lean workflow
RobertTLange Mar 13, 2026
6457d52
fix: tighten autoformalization evaluation semantics
RobertTLange Mar 13, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ The framework supports **parallel evaluation of candidates** locally or on a Slu
| ⚡ **[Async Evolution](https://github.com/SakanaAI/ShinkaEvolve/blob/main/docs/async_evolution.md)** | High-performance async pipeline (5-10x speedup) | Concurrent processing, proposal/eval concurrency tuning |
| 🧠 **[Local LLM](https://github.com/SakanaAI/ShinkaEvolve/blob/main/docs/support_local_llm.md)** | How to connect and use local LLMs with Shinka | Running open-source models, integration tips, performance notes |
| 🤖 **[Agentic Usage](https://github.com/SakanaAI/ShinkaEvolve/blob/main/docs/agentic_usage.md)** | Run Shinka with Claude/Codex skills | CLI install, skill placement, setup/run workflows |
| ∀ **[Lean Autoformalization](https://github.com/SakanaAI/ShinkaEvolve/tree/main/examples/autoformalization)** | Lean 4 example task for theorem formalization | Lean task setup, proof generation, autoformalization workflow |


## Installation & Quick Start 🚀
Expand Down Expand Up @@ -99,6 +100,7 @@ For detailed installation instructions and usage examples, see the [Getting Star
| ⭕ [Circle Packing](https://github.com/SakanaAI/ShinkaEvolve/tree/main/examples/circle_packing) | Optimize circle packing to maximize radii. | `LocalJobConfig` |
| 🎮 [Game 2048](https://github.com/SakanaAI/ShinkaEvolve/tree/main/examples/game_2048) | Optimize a policy for the Game of 2048. | `LocalJobConfig` |
| ∑ [Julia Prime Counting](https://github.com/SakanaAI/ShinkaEvolve/tree/main/examples/julia_prime_counting) | Optimize a Julia solver for prime-count queries. | `LocalJobConfig` |
| ∀ [Lean Autoformalization](https://github.com/SakanaAI/ShinkaEvolve/tree/main/examples/autoformalization) | Formalize a natural-language group-theory statement in Lean 4. | `LocalJobConfig` or `SlurmCondaJobConfig` |
| ✨ [Novelty Generator](https://github.com/SakanaAI/ShinkaEvolve/tree/main/examples/novelty_generator) | Generate creative, surprising outputs (e.g., ASCII art). | `LocalJobConfig` |


Expand Down
1 change: 1 addition & 0 deletions docs/getting_started.md
Original file line number Diff line number Diff line change
Expand Up @@ -393,6 +393,7 @@ The `run_shinka_eval` function returns three values:
| **Circle Packing** | Optimize circle arrangements | Geometric optimization |
| **Agent Design** | Design AI agent scaffolds | Algorithm architecture |
| **ALE-Bench** | Optimize competitive programming solutions | Code optimization |
| **Lean Autoformalization** | Formalize mathematical statements in Lean 4 | Theorem formalization and proof generation |
| **Novelty Generator** | Generate diverse creative outputs | Open-ended exploration |


Expand Down
104 changes: 104 additions & 0 deletions examples/autoformalization/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,104 @@
# Autoformalization Example

Lean 4 example task for ShinkaEvolve. The evolving program is a partial Lean theorem statement in [`initial.lean`](./initial.lean). The evaluator asks an LLM prover to complete that statement, checks the generated proof with Lean, and scores the result from the produced proof.

## What It Does

- evolves Lean 4 theorem/problem formulations rather than Python code
- keeps Lean-specific proof generation and verification inside this example directory
- uses [`evaluate.py`](./evaluate.py) as the Shinka entrypoint
- uses [`utils_lean.py`](./utils_lean.py) for:
- proof-prompt construction
- prover calls through an OpenAI-compatible client
- Lean validation through `lean-interact`

## Prerequisites

- Python environment with ShinkaEvolve installed
- `OPENAI_API_KEY` set for the prover model
- Lean toolchain available locally
- `lean-interact` installed into the same environment
- `lake` available on `PATH`

Lean setup is mandatory for this example. If `lake` is missing, evaluation will fail.

## Setup

From repo root:

```bash
uv venv --python 3.11
source .venv/bin/activate
uv pip install -e .
uv pip install lean-interact
```

Install Lean via `elan` if you do not already have it:

```bash
curl https://raw.githubusercontent.com/leanprover/elan/master/elan-init.sh -sSf | sh -s -- -y
export PATH="$HOME/.elan/bin:$PATH"
```

Verify the Lean toolchain before running the example:

```bash
which lean
which lake
```

If you use plain `pip` instead of `uv`:

```bash
pip install -e .
pip install lean-interact
```

Set the prover key:

```bash
export OPENAI_API_KEY=...
```

Optional: choose a different prover model at runtime with `--prover_model`.

## Files

- `initial.lean`: seed Lean formalization with `EVOLVE-BLOCK` markers
- `evaluate.py`: Shinka evaluation entrypoint; runs proof generation + scoring
- `utils_lean.py`: Lean prompt construction, proof post-processing, Lean validation

## Running It

Single evaluation:

```bash
python examples/autoformalization/evaluate.py \
--program_path examples/autoformalization/initial.lean \
--results_dir results/autoformalization_manual \
--lean_timeout 300
```

Hydra/Shinka launch:

```bash
python examples/autoformalization/run_evo.py --lean_timeout 300
```

The task preset lives at [`shinka/configs/task/autoformalization.yaml`](../../shinka/configs/task/autoformalization.yaml).

## Evaluation Flow

1. Shinka mutates the editable block in `initial.lean`.
2. `evaluate.py` passes the current Lean file to `generate_proof`.
3. The prover model completes the proof from the Lean prompt.
4. `utils_lean.py` post-processes the output and validates it with Lean.
5. Metrics are written to `metrics.json` and `correct.json`.

## Notes

- `lean-interact` is example-local on purpose; it is not required for the base Shinka package install.
- This example currently assumes an OpenAI-compatible prover client.
- Lean verification can be slow on first run because toolchain / dependency setup may be downloaded.
- `lake` must resolve successfully in the shell that runs `evaluate.py` or `run_evo.py`.
- If Lean setup or `mathlib` fetch is slow, raise `--lean_timeout` (default now `300` seconds).
Loading
Loading