feat: Add Redis-based workspace stream quota for WebRTC sessions#2025

Merged
rafel-roboflow merged 51 commits into main from
feat/dg-232-set-rate-limit-to-10-concurrent-streams-and-update
Mar 19, 2026

@rafel-roboflow
Contributor

@rafel-roboflow rafel-roboflow commented Feb 20, 2026

  • Limit concurrent WebRTC streams per workspace (default: 10)
  • Return HTTP 429 when quota exceeded
  • Add heartbeat endpoint for Modal workers to refresh session TTL

What does this PR do?

Related Issue(s): DG-232

Type of Change

  • New feature (non-breaking change that adds functionality)

Testing

  • I have tested this change locally

Test details:
I set max connections to 3:

  • easy case: one, two, three; one after each.
  • case 1: one, two, wait 2 min, retry 3 ... is blocked; wait 8 minutes, retry 3... is blocked.
  • case 2: one, two, close two, open two, ... 3rd is blocked.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code where necessary, particularly in hard-to-understand areas
  • My changes generate no new warnings or errors
  • I have updated the documentation accordingly (if applicable)

Additional Context


Note

Medium Risk
Adds Redis-backed quota enforcement to WebRTC session startup plus new heartbeat/end endpoints, which can block sessions (HTTP 429) if misconfigured or if Redis/heartbeat traffic behaves unexpectedly.

Overview
Adds an optional per-workspace concurrent WebRTC stream quota enforced via Redis, including new env knobs (WEBRTC_WORKSPACE_STREAM_QUOTA_ENABLED, quota, TTL, and heartbeat URL/interval).

WebRTC worker startup now assigns a session_id/workspace_id, checks the workspace quota, registers the session, and returns HTTP 429 (WorkspaceStreamQuotaError) when exceeded. Modal workers periodically call new HTTP endpoints (/webrtc/session/heartbeat and /webrtc/session/heartbeat/end) via an updated watchdog to refresh or release session slots, and heartbeat callbacks are made nullable-safe.
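The register / heartbeat / release flow described above can be sketched roughly as follows. This is a minimal in-memory model, not the PR's implementation: the class `InMemorySessionRegistry`, its method names, and the default TTL value are hypothetical, and a plain dict stands in for Redis. Only `WorkspaceStreamQuotaError`, the default quota of 10, and the mapping to HTTP 429 come from the description.

```python
import time
import uuid
from typing import Callable, Dict


class WorkspaceStreamQuotaError(Exception):
    """Raised when a workspace already holds the maximum number of live sessions."""


class InMemorySessionRegistry:
    """Hypothetical stand-in for Redis: workspace_id -> {session_id: expiry time}."""

    def __init__(
        self,
        quota: int = 10,              # default quota taken from the PR description
        ttl_seconds: float = 60.0,    # assumed value; the real TTL is configurable
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.quota = quota
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._sessions: Dict[str, Dict[str, float]] = {}

    def _live(self, workspace_id: str) -> Dict[str, float]:
        # Drop sessions whose TTL lapsed because no heartbeat arrived in time.
        now = self._clock()
        sessions = self._sessions.setdefault(workspace_id, {})
        for sid in [s for s, expiry in sessions.items() if expiry <= now]:
            del sessions[sid]
        return sessions

    def register(self, workspace_id: str) -> str:
        # Check the quota, then claim a slot for a new session.
        sessions = self._live(workspace_id)
        if len(sessions) >= self.quota:
            # The HTTP layer would translate this into a 429 response.
            raise WorkspaceStreamQuotaError(workspace_id)
        session_id = uuid.uuid4().hex
        sessions[session_id] = self._clock() + self.ttl_seconds
        return session_id

    def heartbeat(self, workspace_id: str, session_id: str) -> bool:
        # Refresh the TTL; False means the session is unknown or already expired.
        sessions = self._live(workspace_id)
        if session_id not in sessions:
            return False
        sessions[session_id] = self._clock() + self.ttl_seconds
        return True

    def end(self, workspace_id: str, session_id: str) -> bool:
        # Release the slot immediately instead of waiting for the TTL to lapse.
        return self._live(workspace_id).pop(session_id, None) is not None
```

Note that the check-then-register step here is not atomic. With a real Redis backend and several API processes, the count and the insert would presumably need to happen in a single atomic operation (for example a transaction or a server-side script) so that two concurrent requests cannot both claim the last slot.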

Written by Cursor Bugbot for commit 4f0476c. This will update automatically on new commits.

@PawelPeczek-Roboflow
Collaborator

@rafel-roboflow - sorry, this will not be added to today's release

…-set-rate-limit-to-10-concurrent-streams-and-update
…-update' of github.com:roboflow/inference into feat/dg-232-set-rate-limit-to-10-concurrent-streams-and-update
@codeflash-ai
Contributor

codeflash-ai bot commented Feb 20, 2026

⚡️ Codeflash found optimizations for this PR

📄 153% (1.53x) speedup for with_route_exceptions_async in inference/core/interfaces/http/error_handlers.py

⏱️ Runtime : 538 microseconds → 212 microseconds (best of 5 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch feat/dg-232-set-rate-limit-to-10-concurrent-streams-and-update).


@rafel-roboflow rafel-roboflow marked this pull request as ready for review February 24, 2026 08:26
The heartbeat and session-end endpoints were returning plain dicts with
{"status": "error"} on auth failure, which FastAPI serialized as HTTP 200.
This caused the watchdog client to log these as successful heartbeats,
masking authentication failures from operators.

Now raises HTTPException with 401 for unauthorized and 404 for session
not found.
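To illustrate why the implicit HTTP 200 hid the failure, here is a small sketch of a client that judges a heartbeat by status code alone. The function `heartbeat_succeeded` and the example response tuples are hypothetical; the actual watchdog logic is not shown in this PR conversation. The 401 body assumes FastAPI's default behaviour of wrapping an `HTTPException` detail under a `"detail"` key.

```python
from typing import Any, Dict, Tuple

Response = Tuple[int, Dict[str, Any]]  # (status code, parsed JSON body)


def heartbeat_succeeded(response: Response) -> bool:
    """Hypothetical watchdog check: trusts the HTTP status code, ignores the body."""
    status_code, _body = response
    return 200 <= status_code < 300


# Before the fix: the handler returned a plain dict, so the status defaulted to 200.
old_auth_failure: Response = (200, {"status": "error"})

# After the fix: the handler raises HTTPException(status_code=401, detail=...).
new_auth_failure: Response = (
    401,
    {"detail": {"status": "error", "message": "unauthorized"}},
)

print(heartbeat_succeeded(old_auth_failure))  # True: the failure is logged as a success
print(heartbeat_succeeded(new_auth_failure))  # False: the failure is now visible
```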

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

raise HTTPException(
    status_code=401,
    detail={"status": "error", "message": "unauthorized"},
)
Collaborator

Let's raise if workspace_id is empty or None.
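A minimal sketch of the kind of guard this comment asks for. The helper name `require_workspace_id`, the use of `ValueError`, and the treatment of whitespace-only strings as empty are all assumptions; the exception type and placement chosen in the merged code may differ.

```python
from typing import Optional


def require_workspace_id(workspace_id: Optional[str]) -> str:
    """Hypothetical guard: reject a missing or empty workspace_id early."""
    # Treating whitespace-only values as empty is an assumption of this sketch.
    if workspace_id is None or not workspace_id.strip():
        raise ValueError("workspace_id must be a non-empty string")
    return workspace_id
```

Failing early like this keeps sessions without a workspace from being counted under a shared empty key.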

@rafel-roboflow rafel-roboflow enabled auto-merge (squash) March 19, 2026 12:37
@rafel-roboflow rafel-roboflow merged commit bb257f5 into main Mar 19, 2026
47 checks passed
@rafel-roboflow rafel-roboflow deleted the feat/dg-232-set-rate-limit-to-10-concurrent-streams-and-update branch March 19, 2026 12:48
4 participants