[ace6tao2] Linux Proactor Improvements#2524
[ace6tao2] Linux Proactor Improvements#2524simpsont-oci wants to merge 3 commits intoDOCGroup:ace6tao2from
Conversation
…X Proactor Bug Fixes and Improvements
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
ACE/tests/Proactor_Scatter_Gather_Test.cpp (1)
1338-1365:⚠️ Potential issue | 🟡 MinorDocument the new
-toption inprint_usage().The parser accepts backend selection now, but the usage text still omits it, so the error path doesn't tell users how to invoke the new option.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ACE/tests/Proactor_Scatter_Gather_Test.cpp` around lines 1338 - 1365, The usage text is missing documentation for the new -t option; update the print_usage() function to include a concise description of "-t <backend>" (or similar) and explain accepted backend values (as parsed by Proactor_Test_Backend::parse_type) so users see how to select a backend when the parser hits the error path; add the flag and a short example/valid values list to the existing usage string returned/printed by print_usage().
🧹 Nitpick comments (2)
ACE/ace/Uring_Asynch_IO.h (1)
145-148: Consider ACE container alternatives tostd::set.Using
std::setforpending_results_is functional, but ACE traditionally uses its own containers likeACE_Unbounded_Setfor consistency. However,std::set's O(log n) operations for insert/erase/find may be preferable here for tracking pending async operations.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ACE/ace/Uring_Asynch_IO.h` around lines 145 - 148, The declaration of pending_results_ uses std::set<ACE_Uring_Asynch_Result*> which diverges from ACE container conventions; replace it with ACE_Unbounded_Set<ACE_Uring_Asynch_Result*> (and add the corresponding `#include` <ace/Unbounded_Set.h>) and update all usages of pending_results_ (insert/erase/find iterations) to the ACE_Unbounded_Set API (e.g., insert, remove, find/seek as appropriate), or if you intentionally need ordered/O(log n) semantics retain std::set but add a comment near pending_results_ and in ACE_Uring_Asynch_Result usage explaining the performance/order requirement to justify keeping the STL container.ACE/ace/Uring_Asynch_IO.cpp (1)
51-55: Consider adding null-safety tohandler()method.If
handler_proxy_is ever an emptyProxy_Ptr, callingget()returnsnullptrand the subsequent->handler()dereference would crash. While this may follow existing ACE patterns assuming valid proxies, adding a defensive check would improve robustness.Suggested defensive check
ACE_Handler * ACE_Uring_Asynch_Result::handler (void) const { - return this->handler_proxy_.get ()->handler (); + ACE_Handler::Proxy *proxy = this->handler_proxy_.get (); + return proxy ? proxy->handler () : 0; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ACE/ace/Uring_Asynch_IO.cpp` around lines 51 - 55, The ACE_Uring_Asynch_Result::handler() currently dereferences this->handler_proxy_.get() without checking for null; modify ACE_Uring_Asynch_Result::handler() to first obtain a pointer from handler_proxy_.get(), check if it's null, and if so return nullptr (or an appropriate default), otherwise call and return ptr->handler(); reference the method ACE_Uring_Asynch_Result::handler and the member handler_proxy_ (and its get()) when locating the code to update.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ACE/ace/POSIX_Proactor.cpp`:
- Around line 1083-1085: The no-arg handle_events() was changed to use a 1000ms
timeout causing busy polling; restore its original blocking behavior so
ACE_Proactor::proactor_run_event_loop() sleeps until work arrives. Locate the
call to handle_events_i in ACE_Proactor::handle_events() (currently invoked as
handle_events_i(1000)) and change it to use the blocking/infinite-wait variant
(i.e. pass the infinite/blocking timeout or call the blocking overload) so idle
proactor threads do not poll for work.
- Around line 1093-1103: post_completion added lazy init via
ensure_notify_manager(), but cancel_aio() and start_deferred_aio() still call
putq_result() directly so notify_completion() can return success without writing
to the pipe; ensure the notify-manager is created before any enqueue. Modify
cancel_aio() and start_deferred_aio() to call ensure_notify_manager() and handle
non-zero failure before calling putq_result(), or move the
ensure_notify_manager() call into putq_result()/enqueue path so
aiocb_notify_pipe_manager_ is guaranteed initialized whenever a result is
queued; keep notify_completion() behavior consistent and return error if
ensure_notify_manager() fails.
- Around line 545-551: The change replaces the real handler proxy with a
null_handler_proxy and turns the POSIX wakeup completion into a no-op, which
prevents dispatching ACE_Handler::handle_wakeup() and diverges from the WIN32
behavior; restore the original proxy usage by constructing
ACE_POSIX_Wakeup_Completion with the actual handler proxy (not
null_handler_proxy) so wakeup_completion will invoke handle_wakeup as before
(update the same pattern at the other occurrence noted around lines 2119-2121),
i.e., pass the real ACE_Handler::Proxy_Ptr instance used for the target handler
into ACE_POSIX_Wakeup_Completion instead of the null proxy.
In `@ACE/ace/Uring_Asynch_IO.cpp`:
- Around line 1172-1185: The send path sets result->msg()->msg_name to
remote_addr.get_addr() which can point at a temporary/stack buffer and cause a
use-after-free when the io_uring operation completes; modify
ACE_Uring_Asynch_Write_Dgram_Result to own a sockaddr buffer (e.g. add a member
like remote_addr_ and remote_addr_len_), copy remote_addr.get_addr() into that
buffer in the ACE_Uring_Asynch_Write_Dgram_Result constructor, and in
ACE_Uring_Asynch_Write_Dgram::send set result->msg()->msg_name =
result->remote_addr_ and result->msg()->msg_namelen = result->remote_addr_len_
before submitting with submit_result so the async io_uring sendmsg uses the
owned copy instead of the caller's buffer.
- Around line 952-996: The error log prefixes incorrectly reference
ACE_POSIX_Asynch_Connect::connect_i; update the ACELIB_ERROR messages in this
Uring implementation to use ACE_Uring_Asynch_Connect::connect_i instead —
specifically change the three occurrences around the setsockopt (near the
result->set_error(errno) after setsockopt), the bind (after ACE_OS::bind
failure), and the set_flags (after ACE::set_flags failure) so the log lines
reflect the ACE_Uring_Asynch_Connect::connect_i prefix while keeping the
existing message format and errno handling intact.
In `@ACE/ace/Uring_Proactor.cpp`:
- Around line 65-75: The two ACE_Uring_Proactor::handle_events overloads
currently return the raw batch size from process_cqes (e.g., 2..32) which breaks
the documented contract; change both handle_events() and the other call sites
that directly return process_cqes (e.g., the variants at the other noted
locations) to map the process_cqes return value into the documented semantic:
return 1 if process_cqes returns a positive count, 0 if it returns 0 (no
completions), and propagate -1 on error; locate the calls to process_cqes in
ACE_Uring_Proactor::handle_events(Time_Value&),
ACE_Uring_Proactor::handle_events(), and the other occurrences mentioned and
replace the direct return with this normalized mapping so callers receive only
1/0/-1.
- Around line 27-38: The init path leaves a partially initialized
ACE_Uring_Proactor when ::io_uring_queue_init fails; ensure the object is not
usable by either cleaning up and marking it fully uninitialized or by failing
construction: when io_uring_queue_init returns <0, call the appropriate cleanup
for ring_ (e.g., io_uring_queue_exit or equivalent), clear/zero ring_, set
is_initialized_ to false, and propagate the failure (return an error/throw) so
callers (which call process_cqes(), post_wakeup_completions(), etc.) never
receive a half-initialized instance; update
ACE_Uring_Proactor::io_uring_queue_init error path and any factory/creation code
to handle the propagated failure.
- Around line 116-177: The code incorrectly treats two completions from the same
handler as a duplicate and returns before calling io_uring_cqe_seen(), which can
leave a CQE at the head of the ring and starve later completions; change the
duplicate-detection to track ACE_Uring_Asynch_Result* (the actual pending key)
instead of result->handler(), and ensure io_uring_cqe_seen(&this->ring_, cqe) is
always called before any early return or continue. Concretely: replace
dispatched_handlers (std::set<const void *>) with a set of
ACE_Uring_Asynch_Result* (or equivalent name), check
dispatched_results.find(result) instead of dispatched_handlers.find(handler),
call io_uring_cqe_seen(&this->ring_, cqe) before skipping/returning, and remove
the handler-based logic that returns when the same handler is seen so that
multiple CQEs for the same handler are treated as independent completions.
In `@ACE/include/makeinclude/platform_linux.GNU`:
- Around line 72-74: When $(uring) is 1 the makefile currently adds -luring but
does not define the ACE_HAS_IO_URING macro required by guarded code (e.g., files
Uring_Proactor.h and Uring_Asynch_IO.h); modify the conditional that appends
LIBS so that it also adds a preprocessor definition (e.g., append
-DACE_HAS_IO_URING to the compiler flags such as CPPFLAGS or CXXFLAGS) inside
the same ifeq ($(uring),1) block so the guarded symbols are compiled when
linking liburing.
In `@ACE/tests/Proactor_Scatter_Gather_Test.cpp`:
- Around line 875-879: The extra call to initiate_write_file() in the
final-flush branch can fail and leave the writer alive with receiver_count_ == 0
and io_count_ == 0; after calling initiate_write_file() you must check its
return value and, on failure, explicitly perform the same completion/cleanup
path so the writer can terminate. Modify the branch that contains the
initiate_write_file() call to capture its return value and, if non-zero, invoke
the object's completion/cleanup helper (e.g., call the existing
handle_write_complete() or the code path that decrements
receiver_count_/io_count_ and triggers shutdown) so the event loop is not left
without a terminator.
In `@ACE/tests/Proactor_Stress_Test.cpp`:
- Around line 177-190: schedule_one() currently increments pending_ after
calling proactor_.schedule_timer(), which lets a zero-delay timer fire before
bookkeeping and causes wait_for_idle() races; move the bookkeeping ahead of
arming the timer: under the existing lock (ACE_GUARD_RETURN on this->lock_),
increment this->pending_ before calling this->proactor_.schedule_timer(...),
then call schedule_timer; if schedule_timer returns -1, decrement this->pending_
and increment this->schedule_failures_ to roll back, otherwise return success as
before; references: schedule_one(), proactor_.schedule_timer(), pending_,
schedule_failures_, wait_for_idle().
In `@ACE/tests/Proactor_Test_IPV6.cpp`:
- Around line 1733-1735: The current set_proactor_type uses only
Proactor_Test_Backend::parse_type(ptype, proactor_type) which merely parses
names and can allow selecting a backend that isn't build-available, causing
task1.start() to fail and run_main() to hang; restore the previous availability
gate here by checking the backend availability after parsing (e.g., call the
backend availability helper for proactor_type or
Proactor_Test_Backend::is_available/proactor_available(proactor_type)) and
return false (or perform a hard exit) when the chosen backend cannot be
instantiated so tests fail fast instead of waiting forever.
In `@ACE/tests/Proactor_UDP_Test.cpp`:
- Around line 1955-1957: set_proactor_type currently calls
Proactor_Test_Backend::parse_type(ptype, proactor_type) which only validates the
token and allows platform-incompatible backends through; update
set_proactor_type to also check backend availability (reuse the previous
availability check in Proactor_Test_Backend or call a new
is_available()/validate_availability method) and return false if the backend is
unsupported so the test fails fast, or alternatively ensure run_main() detects
startup failure (e.g., task1.start() failure) and returns immediately; reference
Proactor_Test_Backend::parse_type, set_proactor_type, run_main, and task1.start
when applying the change.
---
Outside diff comments:
In `@ACE/tests/Proactor_Scatter_Gather_Test.cpp`:
- Around line 1338-1365: The usage text is missing documentation for the new -t
option; update the print_usage() function to include a concise description of
"-t <backend>" (or similar) and explain accepted backend values (as parsed by
Proactor_Test_Backend::parse_type) so users see how to select a backend when the
parser hits the error path; add the flag and a short example/valid values list
to the existing usage string returned/printed by print_usage().
---
Nitpick comments:
In `@ACE/ace/Uring_Asynch_IO.cpp`:
- Around line 51-55: The ACE_Uring_Asynch_Result::handler() currently
dereferences this->handler_proxy_.get() without checking for null; modify
ACE_Uring_Asynch_Result::handler() to first obtain a pointer from
handler_proxy_.get(), check if it's null, and if so return nullptr (or an
appropriate default), otherwise call and return ptr->handler(); reference the
method ACE_Uring_Asynch_Result::handler and the member handler_proxy_ (and its
get()) when locating the code to update.
In `@ACE/ace/Uring_Asynch_IO.h`:
- Around line 145-148: The declaration of pending_results_ uses
std::set<ACE_Uring_Asynch_Result*> which diverges from ACE container
conventions; replace it with ACE_Unbounded_Set<ACE_Uring_Asynch_Result*> (and
add the corresponding `#include` <ace/Unbounded_Set.h>) and update all usages of
pending_results_ (insert/erase/find iterations) to the ACE_Unbounded_Set API
(e.g., insert, remove, find/seek as appropriate), or if you intentionally need
ordered/O(log n) semantics retain std::set but add a comment near
pending_results_ and in ACE_Uring_Asynch_Result usage explaining the
performance/order requirement to justify keeping the STL container.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 60d270fc-a2dd-4cab-b7ad-bac479ebb317
📒 Files selected for processing (23)
ACE/ace/Asynch_IO.hACE/ace/POSIX_Asynch_IO.cppACE/ace/POSIX_Proactor.cppACE/ace/POSIX_Proactor.hACE/ace/POSIX_Proactor.inlACE/ace/Proactor.cppACE/ace/Uring_Asynch_IO.cppACE/ace/Uring_Asynch_IO.hACE/ace/Uring_Proactor.cppACE/ace/Uring_Proactor.hACE/ace/ace.mpcACE/include/makeinclude/platform_linux.GNUACE/tests/Proactor_Contract_Test.cppACE/tests/Proactor_File_Test.cppACE/tests/Proactor_Scatter_Gather_Test.cppACE/tests/Proactor_Stress_Test.cppACE/tests/Proactor_Test.cppACE/tests/Proactor_Test_Backend.hACE/tests/Proactor_Test_IPV6.cppACE/tests/Proactor_Timer_Test.cppACE/tests/Proactor_UDP_Test.cppACE/tests/run_test.lstACE/tests/tests.mpc
| ACE_Handler::Proxy_Ptr null_handler_proxy; | ||
|
|
||
| for (int ci = 0; ci < how_many; ci++) | ||
| { | ||
| ACE_NEW_RETURN | ||
| (wakeup_completion, | ||
| ACE_POSIX_Wakeup_Completion (this->wakeup_handler_.proxy ()), | ||
| ACE_POSIX_Wakeup_Completion (null_handler_proxy), |
There was a problem hiding this comment.
Don't silently drop the POSIX wakeup handler.
These changes replace the previous proxy with a null proxy and then turn wakeup completion into a no-op. That means the POSIX backend no longer dispatches ACE_Handler::handle_wakeup() at all, which is an observable contract change and now diverges from the WIN32 path.
Also applies to: 2119-2121
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ACE/ace/POSIX_Proactor.cpp` around lines 545 - 551, The change replaces the
real handler proxy with a null_handler_proxy and turns the POSIX wakeup
completion into a no-op, which prevents dispatching ACE_Handler::handle_wakeup()
and diverges from the WIN32 behavior; restore the original proxy usage by
constructing ACE_POSIX_Wakeup_Completion with the actual handler proxy (not
null_handler_proxy) so wakeup_completion will invoke handle_wakeup as before
(update the same pattern at the other occurrence noted around lines 2119-2121),
i.e., pass the real ACE_Handler::Proxy_Ptr instance used for the target handler
into ACE_POSIX_Wakeup_Completion instead of the null proxy.
| // Avoid an unbounded wait here so the event loop can observe | ||
| // end_event_loop() even if the notify-pipe wakeup is lost. | ||
| return this->handle_events_i (1000); |
There was a problem hiding this comment.
Keep the no-arg handle_events() blocking.
ACE_Proactor::proactor_run_event_loop() repeatedly calls this overload expecting it to sleep until work arrives. Changing it to a 1-second timeout turns idle proactor threads into a polling loop and papers over lost-wakeup bugs instead of fixing them.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ACE/ace/POSIX_Proactor.cpp` around lines 1083 - 1085, The no-arg
handle_events() was changed to use a 1000ms timeout causing busy polling;
restore its original blocking behavior so
ACE_Proactor::proactor_run_event_loop() sleeps until work arrives. Locate the
call to handle_events_i in ACE_Proactor::handle_events() (currently invoked as
handle_events_i(1000)) and change it to use the blocking/infinite-wait variant
(i.e. pass the infinite/blocking timeout or call the blocking overload) so idle
proactor threads do not poll for work.
| if (this->aiocb_notify_pipe_manager_ == 0) | ||
| return 0; | ||
|
|
||
| return this->aiocb_notify_pipe_manager_->notify (); | ||
| } | ||
|
|
||
| int | ||
| ACE_POSIX_AIOCB_Proactor::post_completion (ACE_POSIX_Asynch_Result *result) | ||
| { | ||
| if (this->ensure_notify_manager () != 0) | ||
| return -1; |
There was a problem hiding this comment.
Lazy notify-manager setup doesn't cover every queued-completion path.
post_completion() now initializes the pipe, but cancel_aio() and start_deferred_aio() still enqueue directly through putq_result(). If either path runs before the manager exists, notify_completion() returns success at Line 1093 without writing to the pipe, so the queued completion only surfaces because of the new polling fallback.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ACE/ace/POSIX_Proactor.cpp` around lines 1093 - 1103, post_completion added
lazy init via ensure_notify_manager(), but cancel_aio() and start_deferred_aio()
still call putq_result() directly so notify_completion() can return success
without writing to the pipe; ensure the notify-manager is created before any
enqueue. Modify cancel_aio() and start_deferred_aio() to call
ensure_notify_manager() and handle non-zero failure before calling
putq_result(), or move the ensure_notify_manager() call into
putq_result()/enqueue path so aiocb_notify_pipe_manager_ is guaranteed
initialized whenever a result is queued; keep notify_completion() behavior
consistent and return error if ensure_notify_manager() fails.
| ACELIB_ERROR ((LM_ERROR, | ||
| ACE_TEXT("ACE_POSIX_Asynch_Connect::connect_i: %p\n"), | ||
| ACE_TEXT("socket"))); | ||
| } | ||
| else | ||
| { | ||
| // Reuse the address | ||
| int one = 1; | ||
| if (protocol_family != PF_UNIX && | ||
| reuse_addr != 0 && | ||
| ACE_OS::setsockopt (handle, | ||
| SOL_SOCKET, | ||
| SO_REUSEADDR, | ||
| (const char*) &one, | ||
| sizeof one) == -1 ) | ||
| { | ||
| result->set_error (errno); | ||
| ACELIB_ERROR ((LM_ERROR, | ||
| ACE_TEXT("ACE_POSIX_Asynch_Connect::connect_i: %p\n"), | ||
| ACE_TEXT("setsockopt"))); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if (result->error () == 0 && local_sap != ACE_Addr::sap_any) | ||
| { | ||
| sockaddr * laddr = reinterpret_cast<sockaddr *> (local_sap.get_addr ()); | ||
| size_t size = local_sap.get_size (); | ||
|
|
||
| if (ACE_OS::bind (handle, laddr, size) == -1) | ||
| { | ||
| result->set_error (errno); | ||
| ACELIB_ERROR ((LM_ERROR, | ||
| ACE_TEXT("ACE_POSIX_Asynch_Connect::connect_i: %p\n"), | ||
| ACE_TEXT("bind"))); | ||
| } | ||
| } | ||
|
|
||
| // set non blocking mode | ||
| if (result->error () == 0 && ACE::set_flags (handle, ACE_NONBLOCK) != 0) | ||
| { | ||
| result->set_error (errno); | ||
| ACELIB_ERROR ((LM_ERROR, | ||
| ACE_TEXT("ACE_POSIX_Asynch_Connect::connect_i: %p\n") | ||
| ACE_TEXT("set_flags"))); |
There was a problem hiding this comment.
Incorrect error message prefix: should reference Uring, not POSIX.
These error messages were likely copied from the POSIX implementation but retain the wrong class name prefix. This will mislead developers during debugging.
Proposed fix
- ACELIB_ERROR ((LM_ERROR,
- ACE_TEXT("ACE_POSIX_Asynch_Connect::connect_i: %p\n"),
- ACE_TEXT("socket")));
+ ACELIB_ERROR ((LM_ERROR,
+ ACE_TEXT("ACE_Uring_Asynch_Connect::connect: %p\n"),
+ ACE_TEXT("socket")));Apply similar changes to lines 969-971 (setsockopt), 984-986 (bind), and 994-996 (set_flags).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ACE/ace/Uring_Asynch_IO.cpp` around lines 952 - 996, The error log prefixes
incorrectly reference ACE_POSIX_Asynch_Connect::connect_i; update the
ACELIB_ERROR messages in this Uring implementation to use
ACE_Uring_Asynch_Connect::connect_i instead — specifically change the three
occurrences around the setsockopt (near the result->set_error(errno) after
setsockopt), the bind (after ACE_OS::bind failure), and the set_flags (after
ACE::set_flags failure) so the log lines reflect the
ACE_Uring_Asynch_Connect::connect_i prefix while keeping the existing message
format and errno handling intact.
| ssize_t ACE_Uring_Asynch_Write_Dgram::send (ACE_Message_Block *message_block, size_t &/*number_of_bytes_sent*/, int flags, const ACE_Addr &remote_addr, const void *act, int, int) | ||
| { | ||
| ACE_Uring_Asynch_Write_Dgram_Result *result = 0; | ||
| ACE_NEW_RETURN (result, ACE_Uring_Asynch_Write_Dgram_Result (this->handler_proxy_, this->handle_, message_block, message_block->length (), flags, act, this->proactor_), -1); | ||
| result->msg ()->msg_name = (void *)remote_addr.get_addr (); | ||
| result->msg ()->msg_namelen = (socklen_t)remote_addr.get_size (); | ||
|
|
||
| ACE_GUARD_RETURN (ACE_Thread_Mutex, ace_mon, this->uring_proactor_->sq_mutex (), -1); | ||
| struct io_uring_sqe *sqe = this->uring_proactor_->get_sqe (); | ||
| if (!sqe) { delete result; errno = EAGAIN; return -1; } | ||
| ::io_uring_prep_sendmsg (sqe, this->handle_, result->msg (), flags); | ||
| ::io_uring_sqe_set_data (sqe, result); | ||
| return this->submit_result (result); | ||
| } |
There was a problem hiding this comment.
Potential use-after-free: remote_addr pointer may become invalid before async completion.
The msg_name pointer is set to point directly at the caller's remote_addr.get_addr() buffer. Since send() returns immediately while the actual sendmsg operation executes asynchronously via io_uring, if remote_addr is a stack variable or temporary, it may be destroyed before the kernel reads the address.
Compare with ACE_Uring_Asynch_Read_Dgram_Result which correctly stores remote_addr_ as a member field (line 1052). The write result should similarly copy the destination address into an internal buffer.
Proposed fix
Add a member to store the address in ACE_Uring_Asynch_Write_Dgram_Result:
// In header file, add member:
+ struct sockaddr_storage remote_addr_;
// In constructor or send():
- result->msg ()->msg_name = (void *)remote_addr.get_addr ();
- result->msg ()->msg_namelen = (socklen_t)remote_addr.get_size ();
+ ACE_OS::memcpy (&result->remote_addr_, remote_addr.get_addr (), remote_addr.get_size ());
+ result->msg ()->msg_name = &result->remote_addr_;
+ result->msg ()->msg_namelen = (socklen_t)remote_addr.get_size ();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ACE/ace/Uring_Asynch_IO.cpp` around lines 1172 - 1185, The send path sets
result->msg()->msg_name to remote_addr.get_addr() which can point at a
temporary/stack buffer and cause a use-after-free when the io_uring operation
completes; modify ACE_Uring_Asynch_Write_Dgram_Result to own a sockaddr buffer
(e.g. add a member like remote_addr_ and remote_addr_len_), copy
remote_addr.get_addr() into that buffer in the
ACE_Uring_Asynch_Write_Dgram_Result constructor, and in
ACE_Uring_Asynch_Write_Dgram::send set result->msg()->msg_name =
result->remote_addr_ and result->msg()->msg_namelen = result->remote_addr_len_
before submitting with submit_result so the async io_uring sendmsg uses the
owned copy instead of the caller's buffer.
| ifeq ($(uring),1) | ||
| LIBS += -luring | ||
| endif |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if ACE_HAS_IO_URING is defined elsewhere in the build system
rg -n 'ACE_HAS_IO_URING' ACE/include/makeinclude/ ACE/ace/config*.h 2>/dev/null || echo "Not found in makeinclude or config headers"Repository: DOCGroup/ACE_TAO
Length of output: 102
🏁 Script executed:
# Find Uring_Proactor.h and check its contents
fd -name '*[Uu]ring*' ACE/Repository: DOCGroup/ACE_TAO
Length of output: 290
🏁 Script executed:
# Search entire codebase for ACE_HAS_IO_URING
rg 'ACE_HAS_IO_URING' --no-headingRepository: DOCGroup/ACE_TAO
Length of output: 1266
🏁 Script executed:
# Check the context around lines 72-74 in platform_linux.GNU
sed -n '60,90p' ACE/include/makeinclude/platform_linux.GNURepository: DOCGroup/ACE_TAO
Length of output: 666
🏁 Script executed:
# Check if similar patterns in platform_linux.GNU define preprocessor flags
rg -A2 'ifeq.*\$\(' ACE/include/makeinclude/platform_linux.GNU | head -40Repository: DOCGroup/ACE_TAO
Length of output: 1028
🏁 Script executed:
# Search all files for where ACE_HAS_IO_URING is DEFINED (not just used)
rg 'define.*ACE_HAS_IO_URING|ACE_HAS_IO_URING.*=' --no-headingRepository: DOCGroup/ACE_TAO
Length of output: 701
🏁 Script executed:
# Check if there's a configure script or other build scripts that might define it
find ACE -name 'configure*' -o -name '*.m4' -o -name 'config.h*' | head -20Repository: DOCGroup/ACE_TAO
Length of output: 263
🏁 Script executed:
# Look at the broader context: is this a new feature being added? Check git history if available
git log --oneline -20 -- ACE/include/makeinclude/platform_linux.GNU 2>/dev/null || echo "Git history not available"Repository: DOCGroup/ACE_TAO
Length of output: 84
🏁 Script executed:
# Check if ACE_HAS_IO_URING might be defined in generated headers or cmake files
find . -name 'config.h' -o -name 'CMakeLists.txt' -o -name '*.in' | grep -i config | head -10Repository: DOCGroup/ACE_TAO
Length of output: 226
🏁 Script executed:
# Search for how other ACE_HAS_* macros are defined to understand the pattern
rg 'define.*ACE_HAS_AIO_CALLS' --no-headingRepository: DOCGroup/ACE_TAO
Length of output: 11059
🏁 Script executed:
# Check the ACE config header patterns
fd 'config' ACE/ace | head -10Repository: DOCGroup/ACE_TAO
Length of output: 340
🏁 Script executed:
# See if there's documentation about the uring feature
fd -i uring ACE | grep -E '\.(md|txt|rst|doc)$'Repository: DOCGroup/ACE_TAO
Length of output: 42
🏁 Script executed:
# Check config-linux.h to see if ACE_HAS_IO_URING might be defined there
cat ACE/ace/config-linux.h | grep -A5 -B5 'IO_URING\|uring' || echo "Not found in config-linux.h"Repository: DOCGroup/ACE_TAO
Length of output: 87
🏁 Script executed:
# Check all config-*.h files for any reference to uring or IO_URING
rg 'uring|IO_URING' ACE/ace/config-*.hRepository: DOCGroup/ACE_TAO
Length of output: 283
🏁 Script executed:
# Check if there's any cmake or build configuration that might set this
fd 'CMakeLists.txt|MPC|*.cmake' ACE | xargs grep -l uring 2>/dev/null || echo "No cmake files with uring"Repository: DOCGroup/ACE_TAO
Length of output: 427
🏁 Script executed:
# Final check: search the entire codebase for where ACE_HAS_IO_URING is defined with `#define`
rg '#define.*ACE_HAS_IO_URING' --no-headingRepository: DOCGroup/ACE_TAO
Length of output: 42
Missing ACE_HAS_IO_URING preprocessor definition in makefile conditional.
The makefile links -luring when $(uring)=1, but doesn't define the ACE_HAS_IO_URING preprocessor macro. Code in Uring_Proactor.h, Uring_Asynch_IO.h, and other files is guarded by #if defined (ACE_HAS_AIO_CALLS) && defined (ACE_HAS_IO_URING), so without the macro definition, this code won't compile even with the library linked.
Proposed fix to add preprocessor definition
ifeq ($(uring),1)
+ CPPFLAGS += -DACE_HAS_IO_URING
LIBS += -luring
endif📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ifeq ($(uring),1) | |
| LIBS += -luring | |
| endif | |
| ifeq ($(uring),1) | |
| CPPFLAGS += -DACE_HAS_IO_URING | |
| LIBS += -luring | |
| endif |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ACE/include/makeinclude/platform_linux.GNU` around lines 72 - 74, When
$(uring) is 1 the makefile currently adds -luring but does not define the
ACE_HAS_IO_URING macro required by guarded code (e.g., files Uring_Proactor.h
and Uring_Asynch_IO.h); modify the conditional that appends LIBS so that it also
adds a preprocessor definition (e.g., append -DACE_HAS_IO_URING to the compiler
flags such as CPPFLAGS or CXXFLAGS) inside the same ifeq ($(uring),1) block so
the guarded symbols are compiled when linking liburing.
| if (0 != this->odd_chain_ || 0 != this->even_chain_) | ||
| { | ||
| this->initiate_write_file (); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Handle a failed final flush explicitly.
If this extra initiate_write_file() call fails, the callback returns with receiver_count_ == 0, io_count_ == 0, and no more completions pending. That leaves the writer alive and the event loop with nothing left to terminate it.
Suggested fix
if (0 != this->odd_chain_ || 0 != this->even_chain_)
{
- this->initiate_write_file ();
+ if (this->initiate_write_file () != 0)
+ {
+ ACE_Proactor::instance ()->end_event_loop ();
+ delete this;
+ }
return;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (0 != this->odd_chain_ || 0 != this->even_chain_) | |
| { | |
| this->initiate_write_file (); | |
| return; | |
| } | |
| if (0 != this->odd_chain_ || 0 != this->even_chain_) | |
| { | |
| if (this->initiate_write_file () != 0) | |
| { | |
| ACE_Proactor::instance ()->end_event_loop (); | |
| delete this; | |
| } | |
| return; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ACE/tests/Proactor_Scatter_Gather_Test.cpp` around lines 875 - 879, The extra
call to initiate_write_file() in the final-flush branch can fail and leave the
writer alive with receiver_count_ == 0 and io_count_ == 0; after calling
initiate_write_file() you must check its return value and, on failure,
explicitly perform the same completion/cleanup path so the writer can terminate.
Modify the branch that contains the initiate_write_file() call to capture its
return value and, if non-zero, invoke the object's completion/cleanup helper
(e.g., call the existing handle_write_complete() or the code path that
decrements receiver_count_/io_count_ and triggers shutdown) so the event loop is
not left without a terminator.
| int schedule_one () | ||
| { | ||
| long timer_id = this->proactor_.schedule_timer (*this, | ||
| 0, | ||
| ACE_Time_Value::zero); | ||
| ACE_GUARD_RETURN (ACE_Thread_Mutex, guard, this->lock_, -1); | ||
| if (timer_id == -1) | ||
| { | ||
| ++this->schedule_failures_; | ||
| return -1; | ||
| } | ||
|
|
||
| ++this->pending_; | ||
| return 0; |
There was a problem hiding this comment.
Account for the timer before you arm it.
With zero-delay timers and 8 dispatch threads, the callback can run before the later ++pending_. That makes pending_ drift from the real outstanding count and turns wait_for_idle() into a flaky race.
💡 Move the bookkeeping ahead of `schedule_timer()`
int schedule_one ()
{
- long timer_id = this->proactor_.schedule_timer (*this,
- 0,
- ACE_Time_Value::zero);
- ACE_GUARD_RETURN (ACE_Thread_Mutex, guard, this->lock_, -1);
+ {
+ ACE_GUARD_RETURN (ACE_Thread_Mutex, guard, this->lock_, -1);
+ ++this->pending_;
+ }
+
+ long timer_id = this->proactor_.schedule_timer (*this,
+ 0,
+ ACE_Time_Value::zero);
if (timer_id == -1)
{
+ ACE_GUARD_RETURN (ACE_Thread_Mutex, guard, this->lock_, -1);
+ --this->pending_;
++this->schedule_failures_;
+ if (this->dispatch_scale_ == 0 && this->pending_ == 0)
+ this->idle_.broadcast ();
return -1;
}
- ++this->pending_;
return 0;
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ACE/tests/Proactor_Stress_Test.cpp` around lines 177 - 190, schedule_one()
currently increments pending_ after calling proactor_.schedule_timer(), which
lets a zero-delay timer fire before bookkeeping and causes wait_for_idle()
races; move the bookkeeping ahead of arming the timer: under the existing lock
(ACE_GUARD_RETURN on this->lock_), increment this->pending_ before calling
this->proactor_.schedule_timer(...), then call schedule_timer; if schedule_timer
returns -1, decrement this->pending_ and increment this->schedule_failures_ to
roll back, otherwise return success as before; references: schedule_one(),
proactor_.schedule_timer(), pending_, schedule_failures_, wait_for_idle().
| set_proactor_type (const ACE_TCHAR *ptype) | ||
| { | ||
| if (!ptype) | ||
| return 0; | ||
|
|
||
| switch (ACE_OS::ace_toupper (*ptype)) | ||
| { | ||
| case 'D': | ||
| proactor_type = DEFAULT; | ||
| return 1; | ||
| case 'A': | ||
| proactor_type = AIOCB; | ||
| return 1; | ||
| case 'I': | ||
| proactor_type = SIG; | ||
| return 1; | ||
| #if defined (sun) | ||
| case 'S': | ||
| proactor_type = SUN; | ||
| return 1; | ||
| #endif /* sun */ | ||
| #if !defined (ACE_HAS_BROKEN_SIGEVENT_STRUCT) | ||
| case 'C': | ||
| proactor_type = CB; | ||
| return 1; | ||
| #endif /* !ACE_HAS_BROKEN_SIGEVENT_STRUCT */ | ||
| default: | ||
| break; | ||
| } | ||
| return 0; | ||
| return Proactor_Test_Backend::parse_type (ptype, proactor_type) == 0; |
There was a problem hiding this comment.
Keep backend availability validation in this path.
parse_type() only parses names; it doesn't reject backends that this build can't instantiate. In this test that means task1.start() can fail and run_main() still waits forever for sessions that will never start. Please keep the old availability gate here, or add a hard exit on startup failure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ACE/tests/Proactor_Test_IPV6.cpp` around lines 1733 - 1735, The current
set_proactor_type uses only Proactor_Test_Backend::parse_type(ptype,
proactor_type) which merely parses names and can allow selecting a backend that
isn't build-available, causing task1.start() to fail and run_main() to hang;
restore the previous availability gate here by checking the backend availability
after parsing (e.g., call the backend availability helper for proactor_type or
Proactor_Test_Backend::is_available/proactor_available(proactor_type)) and
return false (or perform a hard exit) when the chosen backend cannot be
instantiated so tests fail fast instead of waiting forever.
| set_proactor_type (const ACE_TCHAR *ptype) | ||
| { | ||
| if (!ptype) | ||
| return 0; | ||
|
|
||
| switch (ACE_OS::ace_toupper (*ptype)) | ||
| { | ||
| case 'D': | ||
| proactor_type = DEFAULT; | ||
| return 1; | ||
| case 'A': | ||
| proactor_type = AIOCB; | ||
| return 1; | ||
| case 'I': | ||
| proactor_type = SIG; | ||
| return 1; | ||
| #if defined (sun) | ||
| case 'S': | ||
| proactor_type = SUN; | ||
| return 1; | ||
| #endif /* sun */ | ||
| #if !defined (ACE_HAS_BROKEN_SIGEVENT_STRUCT) | ||
| case 'C': | ||
| proactor_type = CB; | ||
| return 1; | ||
| #endif /* !ACE_HAS_BROKEN_SIGEVENT_STRUCT */ | ||
| default: | ||
| break; | ||
| } | ||
| return 0; | ||
| return Proactor_Test_Backend::parse_type (ptype, proactor_type) == 0; |
There was a problem hiding this comment.
Don't accept unsupported backends here.
parse_type() only validates the token. It now lets platform-incompatible backends through, and when task1.start() later fails this test still drops into the while (!test.testing_done()) wait with zero sessions, so the process can hang instead of failing fast. Please keep the old availability check here, or make run_main() return immediately on startup failure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ACE/tests/Proactor_UDP_Test.cpp` around lines 1955 - 1957, set_proactor_type
currently calls Proactor_Test_Backend::parse_type(ptype, proactor_type) which
only validates the token and allows platform-incompatible backends through;
update set_proactor_type to also check backend availability (reuse the previous
availability check in Proactor_Test_Backend or call a new
is_available()/validate_availability method) and return false if the backend is
unsupported so the test fails fast, or alternatively ensure run_main() detects
startup failure (e.g., task1.start() failure) and returns immediately; reference
Proactor_Test_Backend::parse_type, set_proactor_type, run_main, and task1.start
when applying the change.
| int shutting_down_; | ||
|
|
||
| /// Serialize access to the shutdown flag between destroy() and svc(). | ||
| ACE_Thread_Mutex shutting_down_lock_; |
|
Does this fixed any of the open proactor bugzilla issues? Is #2479 something related? |
jwillemsen
left a comment
There was a problem hiding this comment.
did just a quick scan, not a detailed review, for master there are several places where C++17 features should be used
| ACE_Select_Reactor select_reactor_; | ||
|
|
||
| ACE_Reactor reactor_; | ||
| ACE_Atomic_Op<ACE_SYNCH_MUTEX, int> started_; |
| { | ||
| ACE_SEH_TRY | ||
| { | ||
| this->result_->complete (this->bytes_transferred_, |
|
|
||
| for (ai = 0; ai < aiocb_list_max_size_; ai++) | ||
| { | ||
| if (this->aiocb_list_[ai] == 0 ) // not active operation |
|
|
||
| if (flg_completed == 0) // not completed | ||
| { | ||
| num_pending++; |
| aiocb_list_[i] = result; | ||
| num_deferred_aiocb_ --; | ||
| this->aiocb_list_[deferred_index] = result; | ||
| this->num_deferred_aiocb_ --; |
A few things:
Proactor_Test_Backendparsing and instantiation support and having all Proactor tests make use of it.Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests