Merged
3 changes: 2 additions & 1 deletion .github/workflows/benchmarks.yml
@@ -42,6 +42,7 @@ concurrency:
env:
CARGO_TERM_COLOR: always
RUST_BACKTRACE: 1
UV_VERSION: "0.11.8"
BENCHMARK_TIMEOUT: 1800 # 30 min; pre-computed seeds + reduced 5D counts keep runtime well under this
DELAUNAY_BENCH_DISCOVER_SEEDS_LIMIT: 256 # fallback only; ci_performance_suite uses pre-computed seeds

@@ -64,7 +65,7 @@ jobs:
- name: Install uv (Python package manager)
uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0
with:
version: "latest"
version: ${{ env.UV_VERSION }}

- name: Verify uv installation
run: uv --version
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -25,7 +25,7 @@ env:
MARKDOWNLINT_VERSION: "0.47.0"
SHFMT_VERSION: "3.12.0"
TYPOS_VERSION: "1.43.4"
UV_VERSION: "0.9.21"
UV_VERSION: "0.11.8"

jobs:
build:
3 changes: 2 additions & 1 deletion .github/workflows/generate-baseline.yml
@@ -25,6 +25,7 @@ permissions:
env:
CARGO_TERM_COLOR: always
RUST_BACKTRACE: 1
UV_VERSION: "0.11.8"
# Seed search limit for both old (pre-v0.8) and current env var names.
# Old tags read DELAUNAY_BENCH_SEED_SEARCH_LIMIT; current code reads
# DELAUNAY_BENCH_DISCOVER_SEEDS_LIMIT. Setting both ensures backward
@@ -54,7 +55,7 @@ jobs:
- name: Install uv (Python package manager)
uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0
with:
version: "latest"
version: ${{ env.UV_VERSION }}

- name: Verify uv installation
run: uv --version
17 changes: 14 additions & 3 deletions .github/workflows/profiling-benchmarks.yml
@@ -41,7 +41,6 @@ permissions:
env:
CARGO_TERM_COLOR: always
RUST_BACKTRACE: 1
RUST_TOOLCHAIN: 1.92.0

jobs:
comprehensive-profiling:
@@ -56,7 +55,6 @@ jobs:
- name: Install Rust toolchain
uses: actions-rust-lang/setup-rust-toolchain@2b1f5e9b395427c92ee4e3331786ca3c37afe2d7 # v1.16.0
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
cache: false
rustflags: ""

@@ -112,6 +110,11 @@ jobs:
} >> "$GITHUB_ENV"
fi

- name: Capture profiling environment metadata
env:
BENCH_FILTER_VALUE: ${{ github.event.inputs.benchmark_filter || '' }}
run: ./scripts/ci/capture_profiling_metadata.sh

- name: Build profiling suite
run: |
# Build with the same perf profile used by `cargo bench --profile perf`
@@ -197,6 +200,7 @@ jobs:

- \`profiling_output.log\`: Complete benchmark output
- \`memory_profiling_detailed.log\`: Detailed memory allocation analysis
- \`environment_metadata.md\`: Code ref, compiler, profile, and filter metadata
- \`criterion/\`: HTML reports and detailed timing data

EOF
@@ -253,7 +257,6 @@ jobs:
- name: Install Rust toolchain
uses: actions-rust-lang/setup-rust-toolchain@2b1f5e9b395427c92ee4e3331786ca3c37afe2d7 # v1.16.0
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
cache: false
rustflags: ""

@@ -273,6 +276,13 @@ jobs:
echo "Running allocation API tests..."
cargo test --test allocation_api --features count-allocations --verbose

- name: Capture memory profiling environment metadata
env:
PROFILE_METADATA_TITLE: Memory Profiling Environment
PROFILE_METADATA_FILTER: memory_profiling
PROFILE_METADATA_MODE: development
run: ./scripts/ci/capture_profiling_metadata.sh

- name: Run memory scaling benchmarks
env:
PROFILING_DEV_MODE: "1"
@@ -292,5 +302,6 @@ jobs:
with:
name: memory-stress-results-${{ github.run_number }}
path: |
profiling-results/
target/criterion/
retention-days: 14
5 changes: 0 additions & 5 deletions Cargo.toml
@@ -67,11 +67,6 @@ name = "circumsphere_containment"
path = "benches/circumsphere_containment.rs"
harness = false

[[bench]]
name = "microbenchmarks"
path = "benches/microbenchmarks.rs"
harness = false

[[bench]]
name = "topology_guarantee_construction"
path = "benches/topology_guarantee_construction.rs"
171 changes: 141 additions & 30 deletions benches/PERFORMANCE_RESULTS.md
@@ -3,51 +3,157 @@
This file contains performance benchmarks and analysis for the delaunay library.
The results are automatically generated and updated by the benchmark infrastructure.

**Last Updated**: 2026-04-25 15:39:16 UTC
**Last Updated**: 2026-04-27 19:30:43 UTC
**Generated By**: benchmark_utils.py
**Git Commit**: 7e42be8fba9abe571d0137710fbd7ed0151ebc85
**Git Commit**: 5f3e02917d813463716f7e2f009d6096d89148da
**Hardware**: Apple M4 Max (16 cores)
**Memory**: 64.0 GB
**OS**: macOS
**Rust**: rustc 1.95.0 (59807616e 2026-04-14)

## Performance Results Summary

### Circumsphere Performance Results
### Public API Performance Contract (`ci_performance_suite`)

This suite is the versioned benchmark contract for public Delaunay workflows.
It covers construction, hull extraction, validation, incremental insertion,
boundary traversal, and explicit bistellar flip roundtrips.

#### Construction

Public API: `DelaunayTriangulation::new_with_options`

| Benchmark ID | Dimension | Input | Variant | Mean | 95% CI |
|--------------|-----------|-------|---------|------|--------|
| `tds_new_2d/tds_new/10` | 2D | 10 | well-conditioned | 143.4 µs | 143.1 µs - 143.7 µs |
| `tds_new_2d/tds_new_adversarial/10` | 2D | 10 | adversarial | 336.1 µs | 334.7 µs - 337.6 µs |
| `tds_new_2d/tds_new/25` | 2D | 25 | well-conditioned | 904.6 µs | 902.6 µs - 906.8 µs |
| `tds_new_2d/tds_new_adversarial/25` | 2D | 25 | adversarial | 3.557 ms | 3.526 ms - 3.586 ms |
| `tds_new_2d/tds_new/50` | 2D | 50 | well-conditioned | 3.055 ms | 3.046 ms - 3.065 ms |
| `tds_new_2d/tds_new_adversarial/50` | 2D | 50 | adversarial | 16.089 ms | 16.055 ms - 16.121 ms |
| `tds_new_3d/tds_new/10` | 3D | 10 | well-conditioned | 1.004 ms | 999.9 µs - 1.009 ms |
| `tds_new_3d/tds_new_adversarial/10` | 3D | 10 | adversarial | 2.876 ms | 2.868 ms - 2.884 ms |
| `tds_new_3d/tds_new/25` | 3D | 25 | well-conditioned | 14.925 ms | 14.882 ms - 14.969 ms |
| `tds_new_3d/tds_new_adversarial/25` | 3D | 25 | adversarial | 33.642 ms | 33.512 ms - 33.773 ms |
| `tds_new_3d/tds_new/50` | 3D | 50 | well-conditioned | 74.230 ms | 73.980 ms - 74.482 ms |
| `tds_new_3d/tds_new_adversarial/50` | 3D | 50 | adversarial | 167.721 ms | 166.922 ms - 168.499 ms |
| `tds_new_4d/tds_new/10` | 4D | 10 | well-conditioned | 12.852 ms | 12.774 ms - 12.936 ms |
| `tds_new_4d/tds_new_adversarial/10` | 4D | 10 | adversarial | 9.161 ms | 9.115 ms - 9.206 ms |
| `tds_new_4d/tds_new/25` | 4D | 25 | well-conditioned | 287.991 ms | 286.462 ms - 289.393 ms |
| `tds_new_4d/tds_new_adversarial/25` | 4D | 25 | adversarial | 231.443 ms | 230.582 ms - 232.428 ms |
| `tds_new_4d/tds_new/50` | 4D | 50 | well-conditioned | 1.632 s | 1.624 s - 1.645 s |
| `tds_new_4d/tds_new_adversarial/50` | 4D | 50 | adversarial | 1.283 s | 1.280 s - 1.286 s |
| `tds_new_5d/tds_new/10` | 5D | 10 | well-conditioned | 24.993 ms | 24.906 ms - 25.072 ms |
| `tds_new_5d/tds_new_adversarial/10` | 5D | 10 | adversarial | 27.704 ms | 27.550 ms - 27.834 ms |
| `tds_new_5d/tds_new/25` | 5D | 25 | well-conditioned | 1.461 s | 1.457 s - 1.466 s |
| `tds_new_5d/tds_new_adversarial/25` | 5D | 25 | adversarial | 1.353 s | 1.350 s - 1.357 s |
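
The Dimension, Input, and Variant columns in these tables appear to be derivable from the benchmark ID itself. A minimal Python sketch of that mapping follows. The function name `parse_benchmark_id` and its parsing rules are assumptions inferred from the IDs shown here, not the actual logic in `benchmark_utils.py`.

```python
import re


def parse_benchmark_id(benchmark_id: str) -> dict:
    """Split a Criterion-style benchmark ID into the table's columns.

    Hypothetical helper: the rules below are inferred from the IDs in the
    tables, not taken from the project's tooling.
    """
    segments = benchmark_id.split("/")

    # Dimension: the first "_<n>d" token, e.g. "_2d" in "tds_new_2d".
    dim_match = re.search(r"_(\d+)d(?=_|/|$)", benchmark_id)
    dimension = f"{dim_match.group(1)}D" if dim_match else None

    # Input: the trailing segment, parsed as an integer when it is numeric.
    last = segments[-1]
    input_size = int(last) if last.isdigit() else last

    # Variant: IDs containing "adversarial" are adversarial, all others
    # are treated as well-conditioned.
    variant = "adversarial" if "adversarial" in benchmark_id else "well-conditioned"

    return {"dimension": dimension, "input": input_size, "variant": variant}


if __name__ == "__main__":
    print(parse_benchmark_id("tds_new_2d/tds_new_adversarial/10"))
    # {'dimension': '2D', 'input': 10, 'variant': 'adversarial'}
```

IDs with a non-numeric final segment, such as `bistellar_flips_4d/k1_roundtrip`, come back with the raw segment as the input, so a real implementation would need an extra rule for them.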

#### Boundary facets

Public API: `DelaunayTriangulation::boundary_facets`

| Benchmark ID | Dimension | Input | Variant | Mean | 95% CI |
|--------------|-----------|-------|---------|------|--------|
| `boundary_facets/boundary_facets_2d/50` | 2D | 50 | well-conditioned | 15.9 µs | 15.9 µs - 15.9 µs |
| `boundary_facets/boundary_facets_2d_adversarial/50` | 2D | 50 | adversarial | 16.4 µs | 16.3 µs - 16.4 µs |
| `boundary_facets/boundary_facets_3d/50` | 3D | 50 | well-conditioned | 66.2 µs | 65.8 µs - 66.5 µs |
| `boundary_facets/boundary_facets_3d_adversarial/50` | 3D | 50 | adversarial | 65.4 µs | 65.1 µs - 65.8 µs |
| `boundary_facets/boundary_facets_4d/50` | 4D | 50 | well-conditioned | 270.1 µs | 267.8 µs - 272.3 µs |
| `boundary_facets/boundary_facets_4d_adversarial/50` | 4D | 50 | adversarial | 255.7 µs | 253.8 µs - 257.6 µs |
| `boundary_facets/boundary_facets_5d/25` | 5D | 25 | well-conditioned | 245.5 µs | 242.4 µs - 248.5 µs |
| `boundary_facets/boundary_facets_5d_adversarial/25` | 5D | 25 | adversarial | 233.8 µs | 231.4 µs - 236.3 µs |

#### Convex hull

Public API: `ConvexHull::from_triangulation`

| Benchmark ID | Dimension | Input | Variant | Mean | 95% CI |
|--------------|-----------|-------|---------|------|--------|
| `convex_hull/from_triangulation_2d/50` | 2D | 50 | well-conditioned | 16.0 µs | 16.0 µs - 16.1 µs |
| `convex_hull/from_triangulation_2d_adversarial/50` | 2D | 50 | adversarial | 16.5 µs | 16.5 µs - 16.6 µs |
| `convex_hull/from_triangulation_3d/50` | 3D | 50 | well-conditioned | 66.3 µs | 66.0 µs - 66.6 µs |
| `convex_hull/from_triangulation_3d_adversarial/50` | 3D | 50 | adversarial | 66.3 µs | 66.0 µs - 66.5 µs |
| `convex_hull/from_triangulation_4d/50` | 4D | 50 | well-conditioned | 271.7 µs | 270.0 µs - 273.3 µs |
| `convex_hull/from_triangulation_4d_adversarial/50` | 4D | 50 | adversarial | 256.6 µs | 254.9 µs - 258.4 µs |
| `convex_hull/from_triangulation_5d/25` | 5D | 25 | well-conditioned | 247.4 µs | 245.4 µs - 249.2 µs |
| `convex_hull/from_triangulation_5d_adversarial/25` | 5D | 25 | adversarial | 229.6 µs | 227.0 µs - 232.3 µs |

#### Validation

Public API: `DelaunayTriangulation::validate`

| Benchmark ID | Dimension | Input | Variant | Mean | 95% CI |
|--------------|-----------|-------|---------|------|--------|
| `validation/validate_3d/50` | 3D | 50 | well-conditioned | 1.071 ms | 1.057 ms - 1.088 ms |
| `validation/validate_3d_adversarial/50` | 3D | 50 | adversarial | 1.652 ms | 1.643 ms - 1.662 ms |
| `validation/validate_4d/50` | 4D | 50 | well-conditioned | 43.553 ms | 43.383 ms - 43.729 ms |
| `validation/validate_4d_adversarial/50` | 4D | 50 | adversarial | 39.152 ms | 38.994 ms - 39.326 ms |
| `validation/validate_5d/25` | 5D | 25 | well-conditioned | 78.675 ms | 78.339 ms - 78.994 ms |
| `validation/validate_5d_adversarial/25` | 5D | 25 | adversarial | 72.246 ms | 71.893 ms - 72.631 ms |

#### Incremental insert

Public API: `DelaunayTriangulation::insert`

| Benchmark ID | Dimension | Input | Variant | Mean | 95% CI |
|--------------|-----------|-------|---------|------|--------|
| `incremental_insert/insert_2d/10` | 2D | 10 | well-conditioned | 1.098 ms | 1.095 ms - 1.102 ms |
| `incremental_insert/insert_2d_adversarial/10` | 2D | 10 | adversarial | 2.071 ms | 2.067 ms - 2.075 ms |
| `incremental_insert/insert_3d/10` | 3D | 10 | well-conditioned | 5.988 ms | 5.960 ms - 6.018 ms |
| `incremental_insert/insert_3d_adversarial/10` | 3D | 10 | adversarial | 48.951 ms | 48.658 ms - 49.245 ms |
| `incremental_insert/insert_4d/6` | 4D | 6 | well-conditioned | 259.223 ms | 258.041 ms - 260.310 ms |
| `incremental_insert/insert_4d_adversarial/6` | 4D | 6 | adversarial | 431.328 ms | 429.736 ms - 433.006 ms |
| `incremental_insert/insert_5d/4` | 5D | 4 | well-conditioned | 930.065 ms | 927.662 ms - 932.270 ms |
| `incremental_insert/insert_5d_adversarial/4` | 5D | 4 | adversarial | 445.154 ms | 443.820 ms - 446.406 ms |

#### Bistellar flips

Public API: `BistellarFlips`

| Benchmark ID | Dimension | Input | Variant | Mean | 95% CI |
|--------------|-----------|-------|---------|------|--------|
| `bistellar_flips_4d/k1_roundtrip` | 4D | roundtrip | well-conditioned | 38.0 µs | 37.8 µs - 38.2 µs |
| `bistellar_flips_4d/k2_roundtrip` | 4D | roundtrip | well-conditioned | 40.6 µs | 40.4 µs - 40.8 µs |
| `bistellar_flips_4d/k3_roundtrip` | 4D | roundtrip | well-conditioned | 40.1 µs | 40.0 µs - 40.3 µs |

### Circumsphere Predicate Performance

This focused predicate suite tracks `la-stack`-backed circumsphere and
insphere query performance independently from full triangulation workflows.

#### Version 0.7.6 Results (2026-04-25)

#### Single Query Performance (2D)

| Test Case | insphere | insphere_distance | insphere_lifted | Winner |
|-----------|----------|------------------|-----------------|---------|
| Basic 2D | 15 ns | 25 ns | 7 ns | **insphere_lifted** |
| Boundary vertex | 2 ns | 24 ns | 196 ns | **insphere** |
| Far vertex | 15 ns | 25 ns | 7 ns | **insphere_lifted** |
| Basic 2D | 15 ns | 26 ns | 7 ns | **insphere_lifted** |
| Boundary vertex | 2 ns | 25 ns | 260 ns | **insphere** |
| Far vertex | 15 ns | 24 ns | 8 ns | **insphere_lifted** |

#### Single Query Performance (3D)

| Test Case | insphere | insphere_distance | insphere_lifted | Winner |
|-----------|----------|------------------|-----------------|---------|
| Basic 3D | 2.1 µs | 25 ns | 17 ns | **insphere_lifted** |
| Boundary vertex | 2 ns | 26 ns | 432 ns | **insphere** |
| Far vertex | 2.1 µs | 26 ns | 17 ns | **insphere_lifted** |
| Basic 3D | 2.8 µs | 26 ns | 18 ns | **insphere_lifted** |
| Boundary vertex | 2 ns | 26 ns | 563 ns | **insphere** |
| Far vertex | 2.8 µs | 26 ns | 17 ns | **insphere_lifted** |

#### Single Query Performance (4D)

| Test Case | insphere | insphere_distance | insphere_lifted | Winner |
|-----------|----------|------------------|-----------------|---------|
| Basic 4D | 5.1 µs | 53 ns | 2.9 µs | **insphere_distance** |
| Boundary vertex | 2 ns | 60 ns | 1.5 µs | **insphere** |
| Far vertex | 3.2 µs | 53 ns | 1.8 µs | **insphere_distance** |
| Basic 4D | 6.7 µs | 56 ns | 3.7 µs | **insphere_distance** |
| Boundary vertex | 2 ns | 57 ns | 1.9 µs | **insphere** |
| Far vertex | 4.4 µs | 54 ns | 2.5 µs | **insphere_distance** |

#### Single Query Performance (5D)

| Test Case | insphere | insphere_distance | insphere_lifted | Winner |
|-----------|----------|------------------|-----------------|---------|
| Basic 5D | 8.3 µs | 80 ns | 4.8 µs | **insphere_distance** |
| Boundary vertex | 2 ns | 81 ns | 2.3 µs | **insphere** |
| Far vertex | 4.9 µs | 79 ns | 2.8 µs | **insphere_distance** |
| Basic 5D | 10.4 µs | 82 ns | 6.0 µs | **insphere_distance** |
| Boundary vertex | 2 ns | 82 ns | 2.9 µs | **insphere** |
| Far vertex | 6.3 µs | 81 ns | 3.8 µs | **insphere_distance** |

## Triangulation Data Structure Performance

@@ -88,13 +194,13 @@ The results are automatically generated and updated by the benchmark infrastruct
| 10 | 27.463 ms | 0.364 Kelem/s | 1.0x |
| 25 | 5956.682 ms | 0.004 Kelem/s | 216.9x |

## Key Findings
## Circumsphere Predicate Analysis

### Performance Ranking

1. **insphere_distance** - (best in 4D, 5D) - Best average performance
2. **insphere_lifted** - (best in 2D, 3D) - ~33.6x average vs fastest
3. **insphere** - ~70.4x slower than fastest on average
2. **insphere_lifted** - (best in 2D, 3D) - ~42.6x average vs fastest
3. **insphere** - ~89.2x slower than fastest on average

### Numerical Accuracy Analysis

@@ -105,19 +211,19 @@ Based on random test cases:
- **insphere_distance vs insphere_lifted**: 100.0% agreement
- **All three methods agree**: 100.0% (expected due to different numerical approaches)

## Recommendations
### Recommendations

### Method Selection Guide
#### Method Selection Guide

**All three methods are mathematically correct** (they produce valid insphere test results).
Choose based on your specific requirements:

#### Performance Optimization by Dimension
##### Performance Optimization by Dimension

- **`insphere_distance`**: (best in 4D, 5D) - Best average performance
- **`insphere_lifted`**: (best in 2D, 3D) - ~33.6x average vs fastest
- **`insphere_lifted`**: (best in 2D, 3D) - ~42.6x average vs fastest

#### General Recommendations
##### General Recommendations

**For maximum performance**: Choose the method that performs best in your target dimension (see above)

@@ -127,20 +233,20 @@ and uses the standard determinant-based approach with well-understood numerical
**For algorithm transparency**: `insphere_distance` explicitly calculates the circumcenter,
making it excellent for educational purposes, debugging, and algorithm validation

#### Performance Comparison
##### Performance Comparison

Average performance across all non-boundary test cases:

- `insphere_distance`: 46 ns (best in 4D, 5D)
- `insphere_lifted`: 1.5 µs (best in 2D, 3D)
- `insphere`: 3.2 µs (third fastest)
- `insphere_distance`: 47 ns (best in 4D, 5D)
- `insphere_lifted`: 2.0 µs (best in 2D, 3D)
- `insphere`: 4.2 µs (third fastest)
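
As a sanity check on the updated averages, the sketch below recomputes them from the rounded values in the updated 2D to 5D single-query tables (Basic and Far vertex rows only). The helper names are hypothetical, and the exact averaging in `benchmark_utils.py` is not shown in this diff, so this is only an approximation. With the rounded table values it gives about 47 ns, 2.0 µs, and 4.2 µs, and slowdowns of roughly 42.8x and 89.1x. These are close to the reported ~42.6x and ~89.2x, and the small gap is presumably due to rounding in the displayed timings.

```python
# Non-boundary timings in nanoseconds, transcribed from the updated
# 2D, 3D, 4D, and 5D tables (Basic and Far vertex rows, in that order).
TIMINGS_NS = {
    "insphere": [15, 15, 2800, 2800, 6700, 4400, 10400, 6300],
    "insphere_distance": [26, 24, 26, 26, 56, 54, 82, 81],
    "insphere_lifted": [7, 8, 18, 17, 3700, 2500, 6000, 3800],
}


def mean_ns(values: list[float]) -> float:
    """Arithmetic mean of a list of timings."""
    return sum(values) / len(values)


def slowdown_vs_fastest(timings: dict[str, list[float]]) -> dict[str, float]:
    """Ratio of each method's mean timing to the smallest mean timing."""
    means = {name: mean_ns(values) for name, values in timings.items()}
    fastest = min(means.values())
    return {name: mean / fastest for name, mean in means.items()}


if __name__ == "__main__":
    for name, values in TIMINGS_NS.items():
        print(f"{name}: mean {mean_ns(values):.1f} ns")
    for name, ratio in slowdown_vs_fastest(TIMINGS_NS).items():
        print(f"{name}: {ratio:.1f}x vs fastest")
```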

## Conclusion
### Conclusion

All three methods are mathematically correct and produce valid results. Performance characteristics vary by dimension:

- `insphere_distance` (best in 4D, 5D) - Best average performance
- `insphere_lifted` (best in 2D, 3D) - ~33.6x average vs fastest
- `insphere_lifted` (best in 2D, 3D) - ~42.6x average vs fastest

For general-purpose applications, choose based on your primary use case:

@@ -188,6 +294,11 @@ The disagreements between methods are expected due to:

## Benchmark Structure

The `ci_performance_suite.rs` benchmark is the primary regression and
release-summary suite. It emits a versioned `api_benchmark_manifest` and
covers public construction, hull, validation, insertion, boundary, and
bistellar-flip workflows across supported dimensions.

The `circumsphere_containment.rs` benchmark includes:

- **Random queries**: Batch processing performance with 1000 random test points
@@ -203,7 +314,7 @@ This file is automatically generated from benchmark results. To update:
# Generate performance summary with current data
uv run benchmark-utils generate-summary

# Run fresh perf-profile benchmarks and generate summary (includes numerical accuracy)
# Run fresh perf-profile public API and circumsphere benchmarks
uv run benchmark-utils generate-summary --run-benchmarks --profile perf

# Generate baseline results for regression testing