
feat(rook): add gateway usage accounting #763

Merged
yacosta738 merged 2 commits into main from feat/rook-usage-accounting-685
May 3, 2026

Conversation

@yacosta738
Contributor

Related Issues

Fixes #685


Summary

Implements real Rook gateway usage accounting in place of the previous /api/usage placeholder.

  • Adds a persisted SQLite usage_events ledger and usage service wiring through RookRegistry.
  • Records safe request/outcome metadata for /v1/chat/completions, including invalid requests, unrouted requests, upstream failures, buffered successes, and streaming successes.
  • Extracts provider-reported token usage from buffered upstream responses when present, while leaving cost and streaming token accounting as nullable future extensions.
  • Replaces the admin usage endpoint with real summary totals grouped by model, vendor, and outcome.
  • Updates gateway specs and the acceptance regression matrix to document the new real accounting contract and redaction rules.
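
As a rough illustration of the ledger described in the bullets above, the sketch below models a usage event that carries only safe metadata and folds events into per-model totals in memory. The names `UsageEvent`, `Aggregate`, and `summarize_by_model` are hypothetical stand-ins, not the PR's actual SQLite-backed types or functions; only the four outcome labels are taken from this PR.

```rust
use std::collections::BTreeMap;

/// Outcome labels named in this PR; the enum itself is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    InvalidRequest,
    RouteRejected,
    UpstreamError,
    Success,
}

/// Hypothetical, simplified event: safe metadata only, never prompts or bodies.
#[derive(Debug, Clone)]
struct UsageEvent {
    logical_model: String,
    outcome: Outcome,
    stream: bool,
    /// `None` when the provider reported no usage (for example, streaming).
    total_tokens: Option<u64>,
}

/// Hypothetical aggregate; field names are assumptions, not the real contract.
#[derive(Debug, Default, PartialEq, Eq)]
struct Aggregate {
    requests: u64,
    successes: u64,
    failures: u64,
    stream_requests: u64,
    total_tokens: u64,
}

impl Aggregate {
    fn add(&mut self, event: &UsageEvent) {
        self.requests += 1;
        if event.outcome == Outcome::Success {
            self.successes += 1;
        } else {
            self.failures += 1;
        }
        if event.stream {
            self.stream_requests += 1;
        }
        // Unknown token counts contribute nothing rather than a guess.
        self.total_tokens += event.total_tokens.unwrap_or(0);
    }
}

/// Groups events by logical model, the in-memory analogue of a GROUP BY.
fn summarize_by_model(events: &[UsageEvent]) -> BTreeMap<String, Aggregate> {
    let mut groups: BTreeMap<String, Aggregate> = BTreeMap::new();
    for event in events {
        groups
            .entry(event.logical_model.clone())
            .or_default()
            .add(event);
    }
    groups
}
```

Keeping token counts as `Option<u64>` mirrors the PR's stance that streaming token accounting stays nullable for now, so an unknown count is never confused with zero at the event level.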

Tested Information

Ran the following validation locally:

cargo fmt --manifest-path "clients/rook/Cargo.toml" --all -- --check
cargo test --manifest-path "clients/rook/Cargo.toml" usage
cargo test --manifest-path "clients/rook/Cargo.toml" chat_completion
cargo clippy --manifest-path "clients/rook/Cargo.toml" --all-targets -- -D warnings

All passed.

Pre-commit and pre-push hooks also passed, including the offline doc link check.


Documentation Impact

  • Docs updated in:
    • openspec/specs/gateway/spec.md
    • openspec/specs/gateway/rook-acceptance-regression-matrix.md
  • No docs update required because: N/A
  • I verified the documentation matches the current behavior.

Breaking Changes

No breaking changes expected. /api/usage now returns the documented real usage summary instead of the former placeholder response; this is the intended feature change for #685.


Checklist

  • I have checked that there isn’t already a PR solving the same problem.
  • I have read the Contributing Guidelines.
  • I ensured my code follows the project's style guidelines.
  • I have added or updated tests that prove my fix is effective or that my feature works.
  • I have updated the documentation, or I explained above why no documentation update is needed.
  • I verified the documentation matches the current behavior.
  • I have documented any breaking changes in the Breaking Changes section.
  • I have linked the related issue (if any).

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented May 3, 2026

Deploying corvus with Cloudflare Pages

Latest commit: a532f60
Status: ✅  Deploy successful!
Preview URL: https://6796dd7c.corvus-42x.pages.dev
Branch Preview URL: https://feat-rook-usage-accounting-6.corvus-42x.pages.dev


@coderabbitai
Contributor

coderabbitai Bot commented May 3, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 51ae140f-892f-4f10-aebf-cab4003f9735

📥 Commits

Reviewing files that changed from the base of the PR and between a786870 and a532f60.

📒 Files selected for processing (9)
  • clients/rook/migrations/0007_usage_events.sql
  • clients/rook/src/admin/handlers.rs
  • clients/rook/src/admin/mod.rs
  • clients/rook/src/admin/types.rs
  • clients/rook/src/db/usage.rs
  • clients/rook/src/gateway/handlers.rs
  • clients/rook/src/gateway/streaming.rs
  • clients/rook/src/transport/rate_limit.rs
  • openspec/specs/gateway/spec.md
📝 Walkthrough

Walkthrough

This PR implements real usage accounting for the Rook gateway by adding a persistent usage_events table, recording request metadata (model, tokens, outcome, latency) per chat completion attempt, and replacing the placeholder /api/usage endpoint with a real aggregation and summary service. The implementation spans database migrations, a usage service layer, gateway instrumentation, and updated admin API handlers with accompanying tests and specification updates.

Changes

Usage Accounting Implementation

Layer / File(s) Summary
Database Schema & Persistence
clients/rook/migrations/0007_usage_events.sql, clients/rook/src/db/mod.rs (migration registration), clients/rook/src/db/usage.rs
New usage_events table stores event ID, timing, model/vendor/account, stream flag, outcome/status, latency, token counts, and cost with indexes on occurred_at, logical_model, vendor, account_id, outcome. Core functions insert_usage_event and summarize_usage provide record storage and time-windowed aggregation (totals + grouped top-N by model/vendor/outcome).
Service Layer & Registry Wiring
clients/rook/src/services/usage.rs, clients/rook/src/registry/mod.rs
UsageService trait defines record and summary operations. SqliteUsageService wraps SqliteDb and delegates to persistence functions. Registry integrates SqliteUsageService as a singleton and exposes a usage() accessor.
Admin API Response Contract
clients/rook/src/admin/types.rs
Replaces UsageStatusView placeholder with full contract: UsageSummaryPeriod (Hour/Day/Month with snake_case serde), UsageSummaryWindowView (period + RFC3339 since/until), UsageAggregateView (request/success/fail/stream counts, token totals, optional cost), UsageGroupView (key + flattened aggregate), UsageSummaryView (available flag + window + totals + grouped arrays).
Admin Endpoint Handler
clients/rook/src/admin/handlers.rs
New GetUsageQuery struct with optional limit and default period. Handler handle_get_usage now computes time window from period, clamps limit to [1,100], fetches summary via state.registry.usage().summary(...), and returns populated UsageSummaryView with window timestamps and grouped aggregates via helper functions usage_window_start, aggregate_view, group_views.
Gateway Usage Recording
clients/rook/src/gateway/handlers.rs
On each /v1/chat/completions attempt, record a StoredUsageEvent with latency, outcome (invalid_request / route_rejected / success / upstream_error), stream flag, and extracted token usage (for buffered success only). Token extraction via extract_token_usage parses upstream response JSON. Events spawned async via spawn_usage_record to avoid blocking. Refactored handle_chat_completions to create shared started_at timestamp and pass to buffered/streaming sub-handlers.
Tests & Integration
clients/rook/src/admin/mod.rs, clients/rook/src/server/mod.rs, clients/rook/src/registry/mod.rs (test), clients/rook/src/gateway/handlers.rs (tests)
Updated admin tests to assert real usage summary (available: true, correct totals, model keys). Integration test /api/usage now expects zeroed real data instead of placeholder. Registry test validates record→summary round-trip. Gateway tests poll usage summary via wait_for_usage_requests helper and assert counts/tokens/outcomes for buffered success, streaming success, failures, and invalid JSON.
Specification & Documentation
openspec/specs/gateway/spec.md, openspec/specs/gateway/rook-acceptance-regression-matrix.md
Requirement R25 evolves from returning M1 placeholder (available: false) to specifying real persisted usage accounting with safe (prompt/secret/raw-body-free) event metadata and aggregated summary response. New UsageSummaryView contract documents response structure and safe-field constraints. Acceptance matrix marks "Usage analytics/accounting" as Implemented (not Deferred) and updates slice #594 label and slice #599 positioning.
Minor Cleanup
clients/rook/src/transport/rate_limit.rs
pruning closure simplified from multi-line block to expression-bodied predicate (no logic change).
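
The window and limit handling summarized in the Admin Endpoint Handler row can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the default limit of 10 and the fixed 30-day month are guesses, the function signatures are hypothetical, and `std::time::SystemTime` stands in for whatever timestamp type the handler actually uses. Only the period names, the `[1, 100]` clamp, and the `usage_window_start` helper name come from the PR.

```rust
use std::time::{Duration, SystemTime};

/// Period names from the PR; `Day` is the default per the review notes.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
enum UsageSummaryPeriod {
    Hour,
    #[default]
    Day,
    Month,
}

/// Assumed default when `limit` is omitted; the PR does not state the value.
const DEFAULT_LIMIT: u32 = 10;

/// Clamps the caller-supplied limit into the documented [1, 100] range.
fn clamp_limit(limit: Option<u32>) -> u32 {
    limit.unwrap_or(DEFAULT_LIMIT).clamp(1, 100)
}

/// Computes the start of the window ending at `until`.
/// Assumption: a "month" is treated as a fixed 30 days.
fn usage_window_start(period: UsageSummaryPeriod, until: SystemTime) -> SystemTime {
    let span = match period {
        UsageSummaryPeriod::Hour => Duration::from_secs(60 * 60),
        UsageSummaryPeriod::Day => Duration::from_secs(24 * 60 * 60),
        UsageSummaryPeriod::Month => Duration::from_secs(30 * 24 * 60 * 60),
    };
    // Fall back to the epoch instead of panicking on underflow.
    until.checked_sub(span).unwrap_or(SystemTime::UNIX_EPOCH)
}
```

Clamping rather than rejecting out-of-range limits keeps the endpoint forgiving while still bounding the size of each grouped array.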

Sequence Diagram

sequenceDiagram
    participant Client
    participant GatewayHandler as Gateway Handler
    participant UsageService as Usage Service
    participant SqliteDb as SQLite DB
    participant AdminAPI as Admin API Handler

    Note over Client,AdminAPI: Usage Recording Flow

    Client->>GatewayHandler: POST /v1/chat/completions
    activate GatewayHandler
    
    GatewayHandler->>GatewayHandler: Parse request, start timer
    GatewayHandler->>GatewayHandler: Route to upstream
    alt Upstream Success
        GatewayHandler->>GatewayHandler: Extract tokens from response
    else Upstream Error or Route Failure
        GatewayHandler->>GatewayHandler: Classify outcome
    end
    
    GatewayHandler->>UsageService: spawn_usage_record(event)
    activate UsageService
    UsageService->>SqliteDb: insert_usage_event(StoredUsageEvent)
    activate SqliteDb
    SqliteDb->>SqliteDb: Write row to usage_events
    deactivate SqliteDb
    deactivate UsageService
    
    GatewayHandler->>Client: Response
    deactivate GatewayHandler

    Note over Client,AdminAPI: Usage Query Flow

    Client->>AdminAPI: GET /api/usage?period=day&limit=10
    activate AdminAPI
    
    AdminAPI->>AdminAPI: Parse query, compute window
    AdminAPI->>UsageService: summary(UsageSummaryQuery)
    activate UsageService
    
    UsageService->>SqliteDb: aggregate_totals(since, until)
    activate SqliteDb
    SqliteDb->>SqliteDb: SUM requests, tokens over window
    deactivate SqliteDb
    
    UsageService->>SqliteDb: aggregate_group(logical_model, limit)
    activate SqliteDb
    SqliteDb->>SqliteDb: GROUP BY logical_model, order, limit
    deactivate SqliteDb
    
    UsageService->>AdminAPI: UsageSummary {totals, by_model, …}
    deactivate UsageService
    
    AdminAPI->>AdminAPI: Build UsageSummaryView with RFC3339 window
    AdminAPI->>Client: JSON {available: true, window, totals, by_model, …}
    deactivate AdminAPI

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

area:rust, area:docs, area:web, risk:high, risk:security

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 43.64% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed Title clearly summarizes the main change: implementing real gateway usage accounting.
Description check ✅ Passed Description includes related issues, comprehensive summary of changes, testing validation, documentation updates, and breaking changes acknowledgment per template.
Linked Issues check ✅ Passed All coding requirements from #685 are met: real data persisted, safe metadata recorded, aggregation supported, production-safe endpoint provided, and secrets/payloads redacted.
Out of Scope Changes check ✅ Passed Changes are tightly scoped to usage accounting: database migrations, service layer, handlers, types, registry wiring, and documentation updates are all within scope.



Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
openspec/specs/gateway/spec.md (1)

21-21: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Stale "placeholder" language in the Purpose section contradicts R25.

Line 21 still reads "placeholder usage endpoint" but R25 now mandates that GET /api/usage returns real, persisted accounting data. A reader of the Purpose section alone would incorrectly conclude the endpoint is still a stub.

📝 Suggested fix
-settings, and the placeholder usage endpoint, including redacted response views and coexistence
+settings, and the usage accounting endpoint, including redacted response views and coexistence

As per coding guidelines, docs must stay aligned with code changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/specs/gateway/spec.md` at line 21, Update the Purpose section's
wording that currently reads "placeholder usage endpoint" to reflect R25 by
describing the endpoint as a real, persisted usage/accounting endpoint (e.g.,
mention that GET /api/usage returns real persisted accounting data per R25) and
remove any implication that it is a stub; ensure the updated sentence references
GET /api/usage and R25 so readers know the endpoint is implemented and returns
persisted data.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@clients/rook/migrations/0007_usage_events.sql`:
- Around line 21-25: The current single-column indexes
(idx_usage_events_occurred_at, idx_usage_events_logical_model,
idx_usage_events_vendor, idx_usage_events_account_id, idx_usage_events_outcome)
cause poor plan choices for the /api/usage time-bounded, grouped queries; add
composite time-window indexes that pair occurred_at with each grouping key to
support the WHERE occurred_at range plus GROUP BY on
logical_model/vendor/outcome/account_id (e.g., indexes on (logical_model,
occurred_at), (vendor, occurred_at), (outcome, occurred_at), and optionally
(account_id, occurred_at)); update the migration to create these composite
indexes alongside or instead of the single-column ones so SQLite can use the
range filter and grouping key together for efficient aggregation.

In `@clients/rook/src/admin/handlers.rs`:
- Around line 252-255: handle_get_usage currently accepts Query<GetUsageQuery>
which lets Axum produce its own rejection for malformed params; change the
extractor to Query<Result<GetUsageQuery,
axum::extract::rejection::QueryRejection>> (or the generic Query<Result<_, _>>)
so you can match on Err and return your structured error shape: map parsing
errors to (StatusCode::BAD_REQUEST, Json(AdminErrorResponse { ... })) and on
Ok(query) continue normal processing to produce Json<UsageSummaryView>; update
references inside handle_get_usage to use the unwrapped query on success and
ensure AdminErrorResponse is used for all query-parsing failures.

In `@clients/rook/src/admin/types.rs`:
- Around line 582-610: Add a non-empty grouped entry to the test to assert the
flattened group shape: construct a UsageGroupView in the
UsageSummaryView.by_model (e.g., key "gpt-4o" with an aggregate
UsageAggregateView having requests = 1) and serialize to json; then add
assertions that json["by_model"][0]["key"] == "gpt-4o" and
json["by_model"][0]["requests"] == 1 to verify UsageGroupView (with
#[serde(flatten)]) flattens key and aggregate fields at the same level.
- Around line 261-266: The UsageSummaryWindowView struct currently types the
since and until fields as String which permits invalid timestamps; change the
types of UsageSummaryWindowView::since and ::until from String to
chrono::DateTime<chrono::Utc> and remove the .to_rfc3339() conversions at call
sites (e.g., in the handler that builds UsageSummaryWindowView) so you pass
DateTime<Utc> directly; rely on chrono's serde feature to produce RFC3339 on
serialization and update any imports/uses referencing these fields accordingly.

In `@clients/rook/src/db/usage.rs`:
- Around line 157-174: The query currently interpolates a raw &str `column` into
the SQL (in the format! block) creating a SQL injection risk; replace that with
a small allowlisted enum (e.g. Column { ProjectId, OrganizationId, Model, ... })
and change the function signature (the function that builds this SQL — e.g.
summarize_usage) to accept this enum instead of &str; map the enum to the exact
identifier string via a match (returning safe literals like "project_id",
"organization_id", etc.) and use that matched literal in the format! call,
leaving all other values as SQL parameters bound via sqlx::query_as, then update
all call sites to pass the enum variant. Ensure no user-controlled string is
ever interpolated directly into the SQL.
- Around line 81-95: The three bindings for event.prompt_tokens,
event.completion_tokens, and event.total_tokens silently convert overflowed u64
-> i64 to NULL via .and_then(|v| i64::try_from(v).ok()); change them to
propagate the conversion error instead of swallowing it: replace each
.and_then(... .ok()) with code that attempts i64::try_from(value) and returns
Err on failure (e.g., i64::try_from(value).map_err(|e| /* wrap into the
function's error type */)?), so the calling function (the usage persistence
routine) observes and returns the conversion error rather than persisting NULL;
use the same error-wrapping convention used elsewhere in this module (see how
latency_ms is handled) when mapping the conversion error.
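
The two `db/usage.rs` suggestions above can be sketched together. This is only an illustration: `UsageGroupColumn`, `group_query`, `TokenOverflow`, and `token_to_sql` are hypothetical names, the SQL text is a simplified guess at the aggregation query, and the real code would bind values through its database layer and wrap errors in the module's own error type.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Allowlisted grouping keys; variants mirror the groupings named in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UsageGroupColumn {
    LogicalModel,
    Vendor,
    Outcome,
}

impl UsageGroupColumn {
    /// Maps each variant to a fixed identifier, so no caller string reaches SQL.
    fn as_sql(self) -> &'static str {
        match self {
            UsageGroupColumn::LogicalModel => "logical_model",
            UsageGroupColumn::Vendor => "vendor",
            UsageGroupColumn::Outcome => "outcome",
        }
    }
}

/// Builds a simplified grouped query; values remain `?` placeholders.
fn group_query(column: UsageGroupColumn) -> String {
    format!(
        "SELECT {col} AS group_key, COUNT(*) AS requests FROM usage_events \
         WHERE occurred_at >= ? AND occurred_at < ? \
         GROUP BY {col} ORDER BY requests DESC LIMIT ?",
        col = column.as_sql()
    )
}

/// Hypothetical error for a token count that does not fit in SQLite's i64.
#[derive(Debug, PartialEq, Eq)]
struct TokenOverflow {
    field: &'static str,
    value: u64,
}

impl fmt::Display for TokenOverflow {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} value {} does not fit in i64", self.field, self.value)
    }
}

impl std::error::Error for TokenOverflow {}

/// Converts an optional token count, surfacing overflow instead of storing NULL.
fn token_to_sql(field: &'static str, value: Option<u64>) -> Result<Option<i64>, TokenOverflow> {
    value
        .map(|v| i64::try_from(v).map_err(|_| TokenOverflow { field, value: v }))
        .transpose()
}
```

With this shape, a genuinely absent count still maps to `Ok(None)`, while an out-of-range count becomes an error the caller has to handle.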

In `@clients/rook/src/gateway/handlers.rs`:
- Around line 135-161: spawn_usage_record currently fire-and-forgets the ledger
write (tokio::spawn calling usage.record(event).await) which can lose writes on
crash; change it to perform the persistence inline or via a bounded background
worker that supports flush-on-shutdown: either await usage.record(event).await
directly inside spawn_usage_record and return/propagate errors, or push the
StoredUsageEvent into an existing bounded sender and ensure the worker drains
and awaits pending writes during application shutdown; update references to
spawn_usage_record, usage.record, and any tokio::spawn usage so the record is
not dropped silently (log errors as before but ensure the call completes or is
flushed on shutdown).
- Around line 341-353: The current code calls spawn_usage_record(...) with
stream: true and outcome "success" immediately when starting the SSE, causing
truncated/aborted streams to be misreported; instead, wrap the response stream
so that you only call spawn_usage_record(...) after the stream terminally
completes or errors (use the stream's completion/finalization handler), passing
the same UsageRecordInput fields (started_at, logical_model:
request.model.clone(), metric_context.clone_static(), account_id, tokens, etc.)
but set stream: true and outcome to "success" on clean completion or to an error
outcome/status and proper token counts on failure; replace the immediate call at
the start with deferred calls inside the stream finalizer (referencing
spawn_usage_record, UsageRecordInput, started_at, logical_model, metric_context,
account_id, and TokenUsageParts::none()) so ledger events reflect actual stream
termination rather than stream open.
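
The deferred-recording idea in the streaming comment above can be illustrated with a synchronous stand-in. The sketch assumes a plain `Iterator` of chunks in place of the real async SSE stream, and `RecordOnFinish` plus the `"stream_incomplete"` label are hypothetical; the PR itself names only the `success` and `upstream_error` outcomes and the terminal `[DONE]` event.

```rust
/// Wraps a chunk source and reports the final outcome exactly once, on drop.
struct RecordOnFinish<I, F: FnMut(&'static str)> {
    inner: I,
    record: F,
    outcome: &'static str,
}

impl<I, F> RecordOnFinish<I, F>
where
    I: Iterator<Item = Result<String, String>>,
    F: FnMut(&'static str),
{
    fn new(inner: I, record: F) -> Self {
        // Until the terminal event is seen, the stream counts as incomplete.
        // "stream_incomplete" is an assumed label, not one from the PR.
        RecordOnFinish { inner, record, outcome: "stream_incomplete" }
    }
}

impl<I, F> Iterator for RecordOnFinish<I, F>
where
    I: Iterator<Item = Result<String, String>>,
    F: FnMut(&'static str),
{
    type Item = Result<String, String>;

    fn next(&mut self) -> Option<Self::Item> {
        let item = self.inner.next();
        match &item {
            // Only the terminal marker upgrades the outcome to success.
            Some(Ok(chunk)) if chunk.trim() == "data: [DONE]" => self.outcome = "success",
            Some(Err(_)) => self.outcome = "upstream_error",
            _ => {}
        }
        item
    }
}

impl<I, F: FnMut(&'static str)> Drop for RecordOnFinish<I, F> {
    fn drop(&mut self) {
        // Runs whether the consumer finished the stream or abandoned it early.
        (self.record)(self.outcome);
    }
}
```

Recording in `Drop` means a client that disconnects mid-stream still produces one ledger entry, just not a `success` one. An async version would apply the same idea inside a `Stream` wrapper.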

In `@clients/rook/src/transport/rate_limit.rs`:
- Line 123: The prune closure is using
now.duration_since(state.window_started_at) which can panic if a newer
window_started_at is inserted before the async mutex is acquired; update the
comparison in pruning() (and any similar uses in check()) to call
now.saturating_duration_since(state.window_started_at) < ttl so late timestamps
won’t cause a panic—ensure you import/use Instant::saturating_duration_since and
replace the duration_since call inside windows.retain(|_, state| ...)
accordingly.
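
A minimal sketch of the pruning predicate suggested above, assuming a hypothetical `WindowState` and a `HashMap` keyed by client identifier; the real limiter's types and locking are not shown in this PR.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-key state; only the timestamp matters for pruning.
struct WindowState {
    window_started_at: Instant,
}

/// Keeps only windows younger than `ttl`, tolerating timestamps later than `now`.
fn prune(windows: &mut HashMap<String, WindowState>, now: Instant, ttl: Duration) {
    // `saturating_duration_since` yields zero when the window started after `now`,
    // so a racing writer with a newer timestamp is simply treated as fresh.
    windows.retain(|_, state| now.saturating_duration_since(state.window_started_at) < ttl);
}
```

Recent standard library versions document `Instant::duration_since` as saturating as well, while noting that a panic may be reintroduced, so the explicit saturating call mainly makes the intent unambiguous across toolchains.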


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 32ace91b-6833-46a1-9b14-85669029de73

📥 Commits

Reviewing files that changed from the base of the PR and between a3da395 and a786870.

📒 Files selected for processing (14)
  • clients/rook/migrations/0007_usage_events.sql
  • clients/rook/src/admin/handlers.rs
  • clients/rook/src/admin/mod.rs
  • clients/rook/src/admin/types.rs
  • clients/rook/src/db/mod.rs
  • clients/rook/src/db/usage.rs
  • clients/rook/src/gateway/handlers.rs
  • clients/rook/src/registry/mod.rs
  • clients/rook/src/server/mod.rs
  • clients/rook/src/services/mod.rs
  • clients/rook/src/services/usage.rs
  • clients/rook/src/transport/rate_limit.rs
  • openspec/specs/gateway/rook-acceptance-regression-matrix.md
  • openspec/specs/gateway/spec.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: sonar
  • GitHub Check: pr-checks
  • GitHub Check: semgrep-cloud-platform/scan
  • GitHub Check: submit-gradle
  • GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs

⚙️ CodeRabbit configuration file

**/*.rs: Focus on Rust idioms, memory safety, and ownership/borrowing correctness.
Flag unnecessary clones, unchecked panics in production paths, and weak error context.
Prioritize unsafe blocks, FFI boundaries, concurrency races, and secret handling.

Files:

  • clients/rook/src/services/mod.rs
  • clients/rook/src/transport/rate_limit.rs
  • clients/rook/src/services/usage.rs
  • clients/rook/src/admin/handlers.rs
  • clients/rook/src/db/mod.rs
  • clients/rook/src/db/usage.rs
  • clients/rook/src/admin/types.rs
  • clients/rook/src/server/mod.rs
  • clients/rook/src/gateway/handlers.rs
  • clients/rook/src/admin/mod.rs
  • clients/rook/src/registry/mod.rs
**/*

⚙️ CodeRabbit configuration file

**/*: Security first, performance second.
Validate input boundaries, auth/authz implications, and secret management.
Look for behavioral regressions, missing tests, and contract breaks across modules.

Files:

  • clients/rook/src/services/mod.rs
  • clients/rook/src/transport/rate_limit.rs
  • clients/rook/src/services/usage.rs
  • clients/rook/migrations/0007_usage_events.sql
  • clients/rook/src/admin/handlers.rs
  • clients/rook/src/db/mod.rs
  • openspec/specs/gateway/spec.md
  • clients/rook/src/db/usage.rs
  • clients/rook/src/admin/types.rs
  • clients/rook/src/server/mod.rs
  • clients/rook/src/gateway/handlers.rs
  • clients/rook/src/admin/mod.rs
  • clients/rook/src/registry/mod.rs
  • openspec/specs/gateway/rook-acceptance-regression-matrix.md
**/*.{md,mdx}

⚙️ CodeRabbit configuration file

**/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes.
For user-facing docs, check EN/ES parity or explicitly note pending translation gaps.

Files:

  • openspec/specs/gateway/spec.md
  • openspec/specs/gateway/rook-acceptance-regression-matrix.md
🔇 Additional comments (8)
openspec/specs/gateway/rook-acceptance-regression-matrix.md (1)

49-49: LGTM — traceability updates are accurate and consistent with the implementation.

The #594 caveat, #599 evidence command list, and the Implemented status for usage accounting all align with the PR's delivered artifacts and caveats (cost deferred, streaming token fields unknown).

Also applies to: 69-69, 75-75

openspec/specs/gateway/spec.md (1)

1137-1181: R25 and UsageSummaryView contract are technically accurate and implementation-aligned.

  • Redaction rules, bounded metadata, and route_rejected/unrouted semantics match the persistence layer.
  • estimated_cost_usd: null rule aligns with Option<f64> always being None today.
  • Group array contract ("same aggregate fields as totals") correctly describes the #[serde(flatten)] output from UsageGroupView.

Also applies to: 3053-3090

clients/rook/src/admin/types.rs (2)

252-259: UsageSummaryPeriod derivations and #[serde(rename_all)] are correct.

"hour" / "day" / "month" snake_case output matches the spec, #[default] on Day satisfies Default for query parameter defaulting, and both Serialize/Deserialize are needed since the type is used as both a query input and a response field.


268-296: Aggregate, group, and summary view types are correctly defined.

  • UsageAggregateView correctly derives only PartialEq (not Eq) because f64 doesn't implement Eq.
  • UsageGroupView with #[serde(flatten)] on aggregate produces { "key": "...", "requests": ..., ... }, matching the spec's "key + same aggregate fields as totals" contract. No field name collision exists since UsageAggregateView has no key field.
  • estimated_cost_usd: Option<f64> serializes as null by default (no skip_serializing_if), which is the correct behavior per spec.
clients/rook/src/services/mod.rs (1)

13-13: LGTM — follows the established module export pattern.

clients/rook/src/db/usage.rs (3)

5-60: LGTM — struct definitions match the schema.

Field types, optionality, and naming all align precisely with the usage_events schema defined in 0007_usage_events.sql.


122-148: LGTM — parameterized query with proper error mapping.


260-329: LGTM — test coverage for the core accounting paths.

Both time-window filtering and limit-clamped group aggregation are covered. Token arithmetic in the event() helper correctly exercises the COALESCE(SUM(...)) paths.

@yacosta738
Contributor Author

Addressed the reviewed findings that apply to the current code. Summary:

  • Replaced single-column usage group indexes with composite grouping-key + occurred_at indexes.
  • Converted usage summary window timestamps to DateTime and added grouped serialization coverage.
  • Routed malformed usage query params through the admin error shape.
  • Removed raw SQL column interpolation by using an allowlisted grouping enum.
  • Made token u64 -> SQLite i64 overflow return an error instead of silently storing NULL.
  • Changed gateway usage writes to await persistence for non-streaming paths and record streaming usage after the SSE stream reaches its terminal [DONE] event.
  • Switched rate-limit window comparisons to saturating_duration_since and added future-window regression coverage.
  • Updated the active gateway spec purpose text to describe GET /api/usage as real persisted accounting.

Validation:

  • cargo fmt --manifest-path "clients/rook/Cargo.toml" --all -- --check
  • cargo test --manifest-path "clients/rook/Cargo.toml" usage
  • cargo test --manifest-path "clients/rook/Cargo.toml" chat_completion
  • cargo test --manifest-path "clients/rook/Cargo.toml" rate_limit
  • cargo clippy --manifest-path "clients/rook/Cargo.toml" --all-targets -- -D warnings

@sonarqubecloud

sonarqubecloud Bot commented May 3, 2026

@yacosta738 yacosta738 merged commit 8381b6b into main May 3, 2026
17 checks passed
@yacosta738 yacosta738 deleted the feat/rook-usage-accounting-685 branch May 3, 2026 14:01