3 changes: 2 additions & 1 deletion .github/workflows/benchmarks.yml
@@ -42,6 +42,7 @@ concurrency:
env:
CARGO_TERM_COLOR: always
RUST_BACKTRACE: 1
UV_VERSION: "0.11.8"
BENCHMARK_TIMEOUT: 1800 # 30 min; pre-computed seeds + reduced 5D counts keep runtime well under this
DELAUNAY_BENCH_DISCOVER_SEEDS_LIMIT: 256 # fallback only; ci_performance_suite uses pre-computed seeds

@@ -64,7 +65,7 @@ jobs:
- name: Install uv (Python package manager)
uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0
with:
version: "latest"
version: ${{ env.UV_VERSION }}

- name: Verify uv installation
run: uv --version
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -25,7 +25,7 @@ env:
MARKDOWNLINT_VERSION: "0.47.0"
SHFMT_VERSION: "3.12.0"
TYPOS_VERSION: "1.43.4"
UV_VERSION: "0.9.21"
UV_VERSION: "0.11.8"

jobs:
build:
3 changes: 2 additions & 1 deletion .github/workflows/generate-baseline.yml
@@ -25,6 +25,7 @@ permissions:
env:
CARGO_TERM_COLOR: always
RUST_BACKTRACE: 1
UV_VERSION: "0.11.8"
# Seed search limit for both old (pre-v0.8) and current env var names.
# Old tags read DELAUNAY_BENCH_SEED_SEARCH_LIMIT; current code reads
# DELAUNAY_BENCH_DISCOVER_SEEDS_LIMIT. Setting both ensures backward
@@ -54,7 +55,7 @@ jobs:
- name: Install uv (Python package manager)
uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0
with:
version: "latest"
version: ${{ env.UV_VERSION }}

- name: Verify uv installation
run: uv --version
17 changes: 14 additions & 3 deletions .github/workflows/profiling-benchmarks.yml
@@ -41,7 +41,6 @@ permissions:
env:
CARGO_TERM_COLOR: always
RUST_BACKTRACE: 1
RUST_TOOLCHAIN: 1.92.0

jobs:
comprehensive-profiling:
@@ -56,7 +55,6 @@ jobs:
- name: Install Rust toolchain
uses: actions-rust-lang/setup-rust-toolchain@2b1f5e9b395427c92ee4e3331786ca3c37afe2d7 # v1.16.0
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
cache: false
rustflags: ""

@@ -112,6 +110,11 @@ jobs:
} >> "$GITHUB_ENV"
fi

- name: Capture profiling environment metadata
env:
BENCH_FILTER_VALUE: ${{ github.event.inputs.benchmark_filter || '' }}
run: ./scripts/ci/capture_profiling_metadata.sh

- name: Build profiling suite
run: |
# Build with the same perf profile used by `cargo bench --profile perf`
@@ -197,6 +200,7 @@ jobs:

- \`profiling_output.log\`: Complete benchmark output
- \`memory_profiling_detailed.log\`: Detailed memory allocation analysis
- \`environment_metadata.md\`: Code ref, compiler, profile, and filter metadata
- \`criterion/\`: HTML reports and detailed timing data

EOF
@@ -253,7 +257,6 @@ jobs:
- name: Install Rust toolchain
uses: actions-rust-lang/setup-rust-toolchain@2b1f5e9b395427c92ee4e3331786ca3c37afe2d7 # v1.16.0
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
cache: false
rustflags: ""

@@ -273,6 +276,13 @@ jobs:
echo "Running allocation API tests..."
cargo test --test allocation_api --features count-allocations --verbose

- name: Capture memory profiling environment metadata
env:
PROFILE_METADATA_TITLE: Memory Profiling Environment
PROFILE_METADATA_FILTER: memory_profiling
PROFILE_METADATA_MODE: development
run: ./scripts/ci/capture_profiling_metadata.sh

- name: Run memory scaling benchmarks
env:
PROFILING_DEV_MODE: "1"
@@ -292,5 +302,6 @@ jobs:
with:
name: memory-stress-results-${{ github.run_number }}
path: |
profiling-results/
target/criterion/
retention-days: 14
5 changes: 0 additions & 5 deletions Cargo.toml
@@ -67,11 +67,6 @@ name = "circumsphere_containment"
path = "benches/circumsphere_containment.rs"
harness = false

[[bench]]
name = "microbenchmarks"
path = "benches/microbenchmarks.rs"
harness = false

[[bench]]
name = "topology_guarantee_construction"
path = "benches/topology_guarantee_construction.rs"
171 changes: 141 additions & 30 deletions benches/PERFORMANCE_RESULTS.md
@@ -3,51 +3,157 @@
This file contains performance benchmarks and analysis for the delaunay library.
The results are automatically generated and updated by the benchmark infrastructure.

**Last Updated**: 2026-04-25 15:39:16 UTC
**Last Updated**: 2026-04-27 19:30:43 UTC
**Generated By**: benchmark_utils.py
**Git Commit**: 7e42be8fba9abe571d0137710fbd7ed0151ebc85
**Git Commit**: 5f3e02917d813463716f7e2f009d6096d89148da
**Hardware**: Apple M4 Max (16 cores)
**Memory**: 64.0 GB
**OS**: macOS
**Rust**: rustc 1.95.0 (59807616e 2026-04-14)

## Performance Results Summary

### Circumsphere Performance Results
### Public API Performance Contract (`ci_performance_suite`)

This suite is the versioned benchmark contract for public Delaunay workflows.
It covers construction, hull extraction, validation, incremental insertion,
boundary traversal, and explicit bistellar flip roundtrips.

#### Construction

Public API: `DelaunayTriangulation::new_with_options`

| Benchmark ID | Dimension | Input | Variant | Mean | 95% CI |
|--------------|-----------|-------|---------|------|--------|
| `tds_new_2d/tds_new/10` | 2D | 10 | well-conditioned | 143.4 µs | 143.1 µs - 143.7 µs |
| `tds_new_2d/tds_new_adversarial/10` | 2D | 10 | adversarial | 336.1 µs | 334.7 µs - 337.6 µs |
| `tds_new_2d/tds_new/25` | 2D | 25 | well-conditioned | 904.6 µs | 902.6 µs - 906.8 µs |
| `tds_new_2d/tds_new_adversarial/25` | 2D | 25 | adversarial | 3.557 ms | 3.526 ms - 3.586 ms |
| `tds_new_2d/tds_new/50` | 2D | 50 | well-conditioned | 3.055 ms | 3.046 ms - 3.065 ms |
| `tds_new_2d/tds_new_adversarial/50` | 2D | 50 | adversarial | 16.089 ms | 16.055 ms - 16.121 ms |
| `tds_new_3d/tds_new/10` | 3D | 10 | well-conditioned | 1.004 ms | 999.9 µs - 1.009 ms |
| `tds_new_3d/tds_new_adversarial/10` | 3D | 10 | adversarial | 2.876 ms | 2.868 ms - 2.884 ms |
| `tds_new_3d/tds_new/25` | 3D | 25 | well-conditioned | 14.925 ms | 14.882 ms - 14.969 ms |
| `tds_new_3d/tds_new_adversarial/25` | 3D | 25 | adversarial | 33.642 ms | 33.512 ms - 33.773 ms |
| `tds_new_3d/tds_new/50` | 3D | 50 | well-conditioned | 74.230 ms | 73.980 ms - 74.482 ms |
| `tds_new_3d/tds_new_adversarial/50` | 3D | 50 | adversarial | 167.721 ms | 166.922 ms - 168.499 ms |
| `tds_new_4d/tds_new/10` | 4D | 10 | well-conditioned | 12.852 ms | 12.774 ms - 12.936 ms |
| `tds_new_4d/tds_new_adversarial/10` | 4D | 10 | adversarial | 9.161 ms | 9.115 ms - 9.206 ms |
| `tds_new_4d/tds_new/25` | 4D | 25 | well-conditioned | 287.991 ms | 286.462 ms - 289.393 ms |
| `tds_new_4d/tds_new_adversarial/25` | 4D | 25 | adversarial | 231.443 ms | 230.582 ms - 232.428 ms |
| `tds_new_4d/tds_new/50` | 4D | 50 | well-conditioned | 1.632 s | 1.624 s - 1.645 s |
| `tds_new_4d/tds_new_adversarial/50` | 4D | 50 | adversarial | 1.283 s | 1.280 s - 1.286 s |
| `tds_new_5d/tds_new/10` | 5D | 10 | well-conditioned | 24.993 ms | 24.906 ms - 25.072 ms |
| `tds_new_5d/tds_new_adversarial/10` | 5D | 10 | adversarial | 27.704 ms | 27.550 ms - 27.834 ms |
| `tds_new_5d/tds_new/25` | 5D | 25 | well-conditioned | 1.461 s | 1.457 s - 1.466 s |
| `tds_new_5d/tds_new_adversarial/25` | 5D | 25 | adversarial | 1.353 s | 1.350 s - 1.357 s |

#### Boundary facets

Public API: `DelaunayTriangulation::boundary_facets`

| Benchmark ID | Dimension | Input | Variant | Mean | 95% CI |
|--------------|-----------|-------|---------|------|--------|
| `boundary_facets/boundary_facets_2d/50` | 2D | 50 | well-conditioned | 15.9 µs | 15.9 µs - 15.9 µs |
| `boundary_facets/boundary_facets_2d_adversarial/50` | 2D | 50 | adversarial | 16.4 µs | 16.3 µs - 16.4 µs |
| `boundary_facets/boundary_facets_3d/50` | 3D | 50 | well-conditioned | 66.2 µs | 65.8 µs - 66.5 µs |
| `boundary_facets/boundary_facets_3d_adversarial/50` | 3D | 50 | adversarial | 65.4 µs | 65.1 µs - 65.8 µs |
| `boundary_facets/boundary_facets_4d/50` | 4D | 50 | well-conditioned | 270.1 µs | 267.8 µs - 272.3 µs |
| `boundary_facets/boundary_facets_4d_adversarial/50` | 4D | 50 | adversarial | 255.7 µs | 253.8 µs - 257.6 µs |
| `boundary_facets/boundary_facets_5d/25` | 5D | 25 | well-conditioned | 245.5 µs | 242.4 µs - 248.5 µs |
| `boundary_facets/boundary_facets_5d_adversarial/25` | 5D | 25 | adversarial | 233.8 µs | 231.4 µs - 236.3 µs |

#### Convex hull

Public API: `ConvexHull::from_triangulation`

| Benchmark ID | Dimension | Input | Variant | Mean | 95% CI |
|--------------|-----------|-------|---------|------|--------|
| `convex_hull/from_triangulation_2d/50` | 2D | 50 | well-conditioned | 16.0 µs | 16.0 µs - 16.1 µs |
| `convex_hull/from_triangulation_2d_adversarial/50` | 2D | 50 | adversarial | 16.5 µs | 16.5 µs - 16.6 µs |
| `convex_hull/from_triangulation_3d/50` | 3D | 50 | well-conditioned | 66.3 µs | 66.0 µs - 66.6 µs |
| `convex_hull/from_triangulation_3d_adversarial/50` | 3D | 50 | adversarial | 66.3 µs | 66.0 µs - 66.5 µs |
| `convex_hull/from_triangulation_4d/50` | 4D | 50 | well-conditioned | 271.7 µs | 270.0 µs - 273.3 µs |
| `convex_hull/from_triangulation_4d_adversarial/50` | 4D | 50 | adversarial | 256.6 µs | 254.9 µs - 258.4 µs |
| `convex_hull/from_triangulation_5d/25` | 5D | 25 | well-conditioned | 247.4 µs | 245.4 µs - 249.2 µs |
| `convex_hull/from_triangulation_5d_adversarial/25` | 5D | 25 | adversarial | 229.6 µs | 227.0 µs - 232.3 µs |

#### Validation

Public API: `DelaunayTriangulation::validate`

| Benchmark ID | Dimension | Input | Variant | Mean | 95% CI |
|--------------|-----------|-------|---------|------|--------|
| `validation/validate_3d/50` | 3D | 50 | well-conditioned | 1.071 ms | 1.057 ms - 1.088 ms |
| `validation/validate_3d_adversarial/50` | 3D | 50 | adversarial | 1.652 ms | 1.643 ms - 1.662 ms |
| `validation/validate_4d/50` | 4D | 50 | well-conditioned | 43.553 ms | 43.383 ms - 43.729 ms |
| `validation/validate_4d_adversarial/50` | 4D | 50 | adversarial | 39.152 ms | 38.994 ms - 39.326 ms |
| `validation/validate_5d/25` | 5D | 25 | well-conditioned | 78.675 ms | 78.339 ms - 78.994 ms |
| `validation/validate_5d_adversarial/25` | 5D | 25 | adversarial | 72.246 ms | 71.893 ms - 72.631 ms |

#### Incremental insert

Public API: `DelaunayTriangulation::insert`

| Benchmark ID | Dimension | Input | Variant | Mean | 95% CI |
|--------------|-----------|-------|---------|------|--------|
| `incremental_insert/insert_2d/10` | 2D | 10 | well-conditioned | 1.098 ms | 1.095 ms - 1.102 ms |
| `incremental_insert/insert_2d_adversarial/10` | 2D | 10 | adversarial | 2.071 ms | 2.067 ms - 2.075 ms |
| `incremental_insert/insert_3d/10` | 3D | 10 | well-conditioned | 5.988 ms | 5.960 ms - 6.018 ms |
| `incremental_insert/insert_3d_adversarial/10` | 3D | 10 | adversarial | 48.951 ms | 48.658 ms - 49.245 ms |
| `incremental_insert/insert_4d/6` | 4D | 6 | well-conditioned | 259.223 ms | 258.041 ms - 260.310 ms |
| `incremental_insert/insert_4d_adversarial/6` | 4D | 6 | adversarial | 431.328 ms | 429.736 ms - 433.006 ms |
| `incremental_insert/insert_5d/4` | 5D | 4 | well-conditioned | 930.065 ms | 927.662 ms - 932.270 ms |
| `incremental_insert/insert_5d_adversarial/4` | 5D | 4 | adversarial | 445.154 ms | 443.820 ms - 446.406 ms |

#### Bistellar flips

Public API: `BistellarFlips`

| Benchmark ID | Dimension | Input | Variant | Mean | 95% CI |
|--------------|-----------|-------|---------|------|--------|
| `bistellar_flips_4d/k1_roundtrip` | 4D | roundtrip | well-conditioned | 38.0 µs | 37.8 µs - 38.2 µs |
| `bistellar_flips_4d/k2_roundtrip` | 4D | roundtrip | well-conditioned | 40.6 µs | 40.4 µs - 40.8 µs |
| `bistellar_flips_4d/k3_roundtrip` | 4D | roundtrip | well-conditioned | 40.1 µs | 40.0 µs - 40.3 µs |

### Circumsphere Predicate Performance

This focused predicate suite tracks `la-stack`-backed circumsphere and
insphere query performance independently from full triangulation workflows.

#### Version 0.7.6 Results (2026-04-25)

#### Single Query Performance (2D)

| Test Case | insphere | insphere_distance | insphere_lifted | Winner |
|-----------|----------|------------------|-----------------|---------|
| Basic 2D | 15 ns | 25 ns | 7 ns | **insphere_lifted** |
| Boundary vertex | 2 ns | 24 ns | 196 ns | **insphere** |
| Far vertex | 15 ns | 25 ns | 7 ns | **insphere_lifted** |
| Basic 2D | 15 ns | 26 ns | 7 ns | **insphere_lifted** |
| Boundary vertex | 2 ns | 25 ns | 260 ns | **insphere** |
| Far vertex | 15 ns | 24 ns | 8 ns | **insphere_lifted** |

#### Single Query Performance (3D)

| Test Case | insphere | insphere_distance | insphere_lifted | Winner |
|-----------|----------|------------------|-----------------|---------|
| Basic 3D | 2.1 µs | 25 ns | 17 ns | **insphere_lifted** |
| Boundary vertex | 2 ns | 26 ns | 432 ns | **insphere** |
| Far vertex | 2.1 µs | 26 ns | 17 ns | **insphere_lifted** |
| Basic 3D | 2.8 µs | 26 ns | 18 ns | **insphere_lifted** |
| Boundary vertex | 2 ns | 26 ns | 563 ns | **insphere** |
| Far vertex | 2.8 µs | 26 ns | 17 ns | **insphere_lifted** |

#### Single Query Performance (4D)

| Test Case | insphere | insphere_distance | insphere_lifted | Winner |
|-----------|----------|------------------|-----------------|---------|
| Basic 4D | 5.1 µs | 53 ns | 2.9 µs | **insphere_distance** |
| Boundary vertex | 2 ns | 60 ns | 1.5 µs | **insphere** |
| Far vertex | 3.2 µs | 53 ns | 1.8 µs | **insphere_distance** |
| Basic 4D | 6.7 µs | 56 ns | 3.7 µs | **insphere_distance** |
| Boundary vertex | 2 ns | 57 ns | 1.9 µs | **insphere** |
| Far vertex | 4.4 µs | 54 ns | 2.5 µs | **insphere_distance** |

#### Single Query Performance (5D)

| Test Case | insphere | insphere_distance | insphere_lifted | Winner |
|-----------|----------|------------------|-----------------|---------|
| Basic 5D | 8.3 µs | 80 ns | 4.8 µs | **insphere_distance** |
| Boundary vertex | 2 ns | 81 ns | 2.3 µs | **insphere** |
| Far vertex | 4.9 µs | 79 ns | 2.8 µs | **insphere_distance** |
| Basic 5D | 10.4 µs | 82 ns | 6.0 µs | **insphere_distance** |
| Boundary vertex | 2 ns | 82 ns | 2.9 µs | **insphere** |
| Far vertex | 6.3 µs | 81 ns | 3.8 µs | **insphere_distance** |

## Triangulation Data Structure Performance

@@ -88,13 +194,13 @@ The results are automatically generated and updated by the benchmark infrastruct
| 10 | 27.463 ms | 0.364 Kelem/s | 1.0x |
| 25 | 5956.682 ms | 0.004 Kelem/s | 216.9x |
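
As a rough cross-check of the two rows above, the throughput and scaling columns can be recomputed from the point count and mean time. This is only a sketch: the header row is not visible in this hunk, so the column meanings (points, mean time, throughput in Kelem/s, time relative to the first row) are assumed, and the helper name `throughput_kelem_per_s` is hypothetical rather than part of `benchmark_utils.py`.

```python
# Rows copied from the table above as (points, mean time in ms).
# Column meanings are assumed because the header row is outside this hunk.
rows = [(10, 27.463), (25, 5956.682)]


def throughput_kelem_per_s(points: int, time_ms: float) -> float:
    """Thousands of elements processed per second (hypothetical helper)."""
    return points / (time_ms / 1000.0) / 1000.0


# Scaling is expressed relative to the first (smallest) input size.
baseline_ms = rows[0][1]
summary = [
    (points, round(throughput_kelem_per_s(points, ms), 3), round(ms / baseline_ms, 1))
    for points, ms in rows
]

for points, kelem, scaling in summary:
    print(f"{points} points: {kelem:.3f} Kelem/s, {scaling:.1f}x")
```

Under these assumptions the recomputed values agree with the table: going from 10 to 25 points multiplies the mean time by roughly 217.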

## Key Findings
## Circumsphere Predicate Analysis

### Performance Ranking

1. **insphere_distance** - (best in 4D, 5D) - Best average performance
2. **insphere_lifted** - (best in 2D, 3D) - ~33.6x average vs fastest
3. **insphere** - ~70.4x slower than fastest on average
2. **insphere_lifted** - (best in 2D, 3D) - ~42.6x average vs fastest
3. **insphere** - ~89.2x slower than fastest on average

### Numerical Accuracy Analysis

@@ -105,19 +211,19 @@ Based on random test cases:
- **insphere_distance vs insphere_lifted**: 100.0% agreement
- **All three methods agree**: 100.0% (expected due to different numerical approaches)

## Recommendations
### Recommendations

### Method Selection Guide
#### Method Selection Guide

**All three methods are mathematically correct** (they produce valid insphere test results).
Choose based on your specific requirements:

#### Performance Optimization by Dimension
##### Performance Optimization by Dimension

- **`insphere_distance`**: (best in 4D, 5D) - Best average performance
- **`insphere_lifted`**: (best in 2D, 3D) - ~33.6x average vs fastest
- **`insphere_lifted`**: (best in 2D, 3D) - ~42.6x average vs fastest

#### General Recommendations
##### General Recommendations

**For maximum performance**: Choose the method that performs best in your target dimension (see above)

@@ -127,20 +233,20 @@ and uses the standard determinant-based approach with well-understood numerical
**For algorithm transparency**: `insphere_distance` explicitly calculates the circumcenter,
making it excellent for educational purposes, debugging, and algorithm validation

#### Performance Comparison
##### Performance Comparison

Average performance across all non-boundary test cases:

- `insphere_distance`: 46 ns (best in 4D, 5D)
- `insphere_lifted`: 1.5 µs (best in 2D, 3D)
- `insphere`: 3.2 µs (third fastest)
- `insphere_distance`: 47 ns (best in 4D, 5D)
- `insphere_lifted`: 2.0 µs (best in 2D, 3D)
- `insphere`: 4.2 µs (third fastest)
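
The updated averages and the "x vs fastest" ratios can be approximately reproduced from the single-query tables. The sketch below assumes that the second row of each old/new pair in this diff is the updated value, that "non-boundary" means the Basic and Far vertex rows, and that the ratios are ratios of these averages. It uses the rounded table figures, so small differences from the reported ~42.6x and ~89.2x are expected. The names `TIMINGS_NS`, `averages`, and `ratios` are illustrative only.

```python
# Basic and Far vertex rows from the updated 2D-5D tables, in nanoseconds.
# These are the rounded figures shown in the tables, not raw Criterion data.
TIMINGS_NS = {
    "insphere": [15, 15, 2800, 2800, 6700, 4400, 10400, 6300],
    "insphere_distance": [26, 24, 26, 26, 56, 54, 82, 81],
    "insphere_lifted": [7, 8, 18, 17, 3700, 2500, 6000, 3800],
}

# Mean time per method across all eight non-boundary cases.
averages = {name: sum(values) / len(values) for name, values in TIMINGS_NS.items()}

# Express each method's mean as a multiple of the fastest method's mean.
fastest = min(averages.values())
ratios = {name: avg / fastest for name, avg in averages.items()}

# Print methods from fastest to slowest average.
for name in sorted(averages, key=averages.get):
    print(f"{name}: {averages[name]:.1f} ns, {ratios[name]:.1f}x vs fastest")
```

The averages are dominated by the 4D and 5D cases, where `insphere` and `insphere_lifted` take microseconds. This is why `insphere_lifted` shows a large average ratio even though it wins in 2D and 3D.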

## Conclusion
### Conclusion

All three methods are mathematically correct and produce valid results. Performance characteristics vary by dimension:

- `insphere_distance` (best in 4D, 5D) - Best average performance
- `insphere_lifted` (best in 2D, 3D) - ~33.6x average vs fastest
- `insphere_lifted` (best in 2D, 3D) - ~42.6x average vs fastest

For general-purpose applications, choose based on your primary use case:

@@ -188,6 +294,11 @@ The disagreements between methods are expected due to:

## Benchmark Structure

The `ci_performance_suite.rs` benchmark is the primary regression and
release-summary suite. It emits a versioned `api_benchmark_manifest` and
covers public construction, hull, validation, insertion, boundary, and
bistellar-flip workflows across supported dimensions.

The `circumsphere_containment.rs` benchmark includes:

- **Random queries**: Batch processing performance with 1000 random test points
@@ -203,7 +314,7 @@ This file is automatically generated from benchmark results. To update:
# Generate performance summary with current data
uv run benchmark-utils generate-summary

# Run fresh perf-profile benchmarks and generate summary (includes numerical accuracy)
# Run fresh perf-profile public API and circumsphere benchmarks
uv run benchmark-utils generate-summary --run-benchmarks --profile perf

# Generate baseline results for regression testing