Skip to content

Fix run_for blocking indefinitely with outstanding work#220

Merged
sgerbino merged 1 commit intocppalliance:developfrom
sgerbino:fix/run-for-blocks-with-outstanding-work
Mar 30, 2026
Merged

Fix run_for blocking indefinitely with outstanding work#220
sgerbino merged 1 commit intocppalliance:developfrom
sgerbino:fix/run-for-blocks-with-outstanding-work

Conversation

@sgerbino
Copy link
Copy Markdown
Collaborator

@sgerbino sgerbino commented Mar 30, 2026

run_task() had no timeout parameter — the reactor always blocked forever or polled instantly, ignoring the timeout from do_one(). Propagate the timeout to each backend's I/O syscall (epoll_wait, select, kevent) and return after a timed reactor poll so run_one_until can recheck the deadline.

Summary by CodeRabbit

Release Notes

  • Improvements
    • Enhanced timeout handling across asynchronous I/O schedulers for improved scheduling precision and control.

run_task() had no timeout parameter — the reactor always blocked
forever or polled instantly, ignoring the timeout from do_one().
Propagate the timeout to each backend's I/O syscall (epoll_wait,
select, kevent) and return after a timed reactor poll so
run_one_until can recheck the deadline.
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 30, 2026

📝 Walkthrough

Walkthrough

This pull request adds a long timeout_us parameter to the virtual run_task method across all scheduler implementations (epoll, kqueue, reactor, and select). The schedulers now respect caller-provided timeouts rather than using fixed or infinite timeout values. Timeout computation logic is updated in each implementation to properly handle microsecond-to-millisecond conversions and interruption scenarios.

Changes

Cohort / File(s) Summary
Timeout Parameter Addition
include/boost/corosio/native/detail/epoll/epoll_scheduler.hpp, include/boost/corosio/native/detail/kqueue/kqueue_scheduler.hpp, include/boost/corosio/native/detail/select/select_scheduler.hpp
Added long timeout_us parameter to run_task method signature. Timeout computation changed from fixed values to use calculate_timeout(timeout_us) when not interrupted, with implementation-specific handling (epoll converts microseconds to milliseconds with rounding; kqueue and select preserve existing behavior with caller-provided timeout).
Virtual Interface & Control Flow Update
include/boost/corosio/native/detail/reactor/reactor_scheduler.hpp
Updated virtual run_task signature in base class to include long timeout_us parameter. Modified do_one to compute per-cycle task_timeout_us based on handler availability and return early after a complete polling cycle when original timeout exceeds zero.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Epoll #101: Modifies epoll scheduler timeout handling with timerfd and rewrites timeout calculation logic, directly related to this PR's timeout parameter changes.

Poem

🐰 Tick-tock, no more wait,
Timeouts passed, don't hesitate!
Schedulers now respect the call,
Microseconds guide us all.

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Fix run_for blocking indefinitely with outstanding work' directly describes the primary bug being fixed and aligns with the main objective stated in the PR description.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
include/boost/corosio/native/detail/epoll/epoll_scheduler.hpp (1)

301-307: Potential integer overflow for very large timeout_us values.

The expression timeout_us + 999 can overflow when timeout_us is close to LONG_MAX (signed overflow is undefined behavior). Additionally, the cast to int can truncate if the result exceeds INT_MAX (about 24.8 days in milliseconds).

The kqueue_scheduler and select_scheduler backends use calculate_timeout() which caps values appropriately. Consider applying similar bounds checking here for consistency and robustness:

♻️ Suggested fix with overflow protection
 int timeout_ms;
 if (task_interrupted_)
     timeout_ms = 0;
 else if (timeout_us < 0)
     timeout_ms = -1;
 else
-    timeout_ms = static_cast<int>((timeout_us + 999) / 1000);
+{
+    // Cap to INT_MAX milliseconds (~24.8 days) to prevent overflow
+    constexpr long max_timeout_us = static_cast<long>(INT_MAX) * 1000;
+    long capped_us = (std::min)(timeout_us, max_timeout_us);
+    timeout_ms = static_cast<int>((capped_us + 999) / 1000);
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/boost/corosio/native/detail/epoll/epoll_scheduler.hpp` around lines
301 - 307, The timeout conversion can overflow: avoid signed overflow from
`timeout_us + 999` and cap millisecond result to `INT_MAX`. In the
`epoll_scheduler` branch where `timeout_us` and `timeout_ms` are computed
(referencing `timeout_us`, `timeout_ms`, and `task_interrupted_`), replace the
naive `(timeout_us + 999) / 1000` with a safe check that if `timeout_us < 0` set
`timeout_ms = -1`, else if `timeout_us` is large enough that converting to
milliseconds would exceed `INT_MAX` set `timeout_ms = INT_MAX`, otherwise
compute `(timeout_us + 999) / 1000` using a cast to an unsigned or wider type to
avoid overflow; alternatively reuse the existing `calculate_timeout()` helper
used by `kqueue_scheduler`/`select_scheduler` for consistent bounds handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@include/boost/corosio/native/detail/epoll/epoll_scheduler.hpp`:
- Around line 301-307: The timeout conversion can overflow: avoid signed
overflow from `timeout_us + 999` and cap millisecond result to `INT_MAX`. In the
`epoll_scheduler` branch where `timeout_us` and `timeout_ms` are computed
(referencing `timeout_us`, `timeout_ms`, and `task_interrupted_`), replace the
naive `(timeout_us + 999) / 1000` with a safe check that if `timeout_us < 0` set
`timeout_ms = -1`, else if `timeout_us` is large enough that converting to
milliseconds would exceed `INT_MAX` set `timeout_ms = INT_MAX`, otherwise
compute `(timeout_us + 999) / 1000` using a cast to an unsigned or wider type to
avoid overflow; alternatively reuse the existing `calculate_timeout()` helper
used by `kqueue_scheduler`/`select_scheduler` for consistent bounds handling.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d17ef516-16e9-4c80-ba49-5e9e5a7717fe

📥 Commits

Reviewing files that changed from the base of the PR and between adca257 and 133b129.

⛔ Files ignored due to path filters (1)
  • test/unit/io_context.cpp is excluded by !**/test/**
📒 Files selected for processing (4)
  • include/boost/corosio/native/detail/epoll/epoll_scheduler.hpp
  • include/boost/corosio/native/detail/kqueue/kqueue_scheduler.hpp
  • include/boost/corosio/native/detail/reactor/reactor_scheduler.hpp
  • include/boost/corosio/native/detail/select/select_scheduler.hpp

@cppalliance-bot
Copy link
Copy Markdown

An automated preview of the documentation is available at https://220.corosio.prtest3.cppalliance.org/index.html

If more commits are pushed to the pull request, the docs will rebuild at the same URL.

2026-03-30 19:27:13 UTC

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.90%. Comparing base (adca257) to head (133b129).
⚠️ Report is 1 commits behind head on develop.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #220      +/-   ##
===========================================
- Coverage    77.90%   77.90%   -0.01%     
===========================================
  Files           96       96              
  Lines         7397     7395       -2     
  Branches      1806     1805       -1     
===========================================
- Hits          5763     5761       -2     
  Misses        1110     1110              
  Partials       524      524              
Files with missing lines Coverage Δ
...st/corosio/native/detail/epoll/epoll_scheduler.hpp 83.20% <ø> (-0.13%) ⬇️
.../corosio/native/detail/kqueue/kqueue_scheduler.hpp 70.08% <ø> (-0.26%) ⬇️
.../corosio/native/detail/select/select_scheduler.hpp 83.65% <ø> (ø)

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update adca257...133b129. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@cppalliance-bot
Copy link
Copy Markdown

GCOVR code coverage report https://220.corosio.prtest3.cppalliance.org/gcovr/index.html
LCOV code coverage report https://220.corosio.prtest3.cppalliance.org/genhtml/index.html
Coverage Diff Report https://220.corosio.prtest3.cppalliance.org/diff-report/index.html

Build time: 2026-03-30 19:38:08 UTC

@sgerbino sgerbino merged commit f34a1ba into cppalliance:develop Mar 30, 2026
42 checks passed
@sgerbino sgerbino deleted the fix/run-for-blocks-with-outstanding-work branch March 30, 2026 20:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants