Skip to content

connmgr: Convert to synchronous model.#3646

Open
davecgh wants to merge 11 commits intodecred:masterfrom
davecgh:connmgr_concurrent
Open

connmgr: Convert to synchronous model.#3646
davecgh wants to merge 11 commits intodecred:masterfrom
davecgh:connmgr_concurrent

Conversation

@davecgh
Copy link
Member

@davecgh davecgh commented Mar 13, 2026

This converts all of the code related to connection manager handling events to make use of separate mutexes and atomics to protect concurrent access.

The motivation for the change in architecture follows.

Currently, nearly all operations performed by the connection manager are implemented via a single event handler goroutine and a single message channel to protect concurrent access.

Generally speaking, mutexes and atomics are best suited to caches and state tracking, while channels are better suited to distributing units of work and communicating async results.

While the existing implementation has worked well for many years, it has several downsides such as:

  • Adding any new functionality requires additional plumbing to provide access to any related information
  • The state is mostly inaccessible since it is limited to a single goroutine
  • The use of channels significantly hinders dynamically adjusting to changing conditions based on the state due to the previous

There are several other improvements that would be ideal to make, but in the effort of making it easier to follow the changes and assert correctness, this series of commits focuses only on converting it to a synchronous model.

This consists of a series of commits to help ease the review process. Each commit is intended to be a self-contained and logically easy to follow change such that the code continues to compile and works properly at each step.

See the description of each commit for further details.

@davecgh davecgh added this to the 2.2.0 milestone Mar 13, 2026
davecgh added 11 commits March 17, 2026 01:12
This is the first in a series of commits that will ultimately convert
all of the code related to connection manager handling events to make
use of separate mutexes and atomics to protect concurrent access.

The motivation for the change in architecture follows.

Currently, nearly all operations performed by the connection manager are
implemented via a single event handler goroutine and a single message
channel to protect concurrent access.

Generally speaking, mutexes and atomics are best suited to caches and
state tracking, while channels are better suited to distributing units
of work and communicating async results.

While the existing implementation has worked well for many years, it has
several downsides such as:

- Adding any new functionality requires additional plumbing to provide
  access to any related information
- The state is mostly inaccessible since it is limited to a single
  goroutine
- The use of channels significantly hinders dynamically adjusting to
  changing conditions based on the state due to the previous

There are several other improvements that would be ideal to make, but in
the effort of making it easier to follow the changes and assert
correctness, this series of commits focuses only on converting it to a
synchronous model.

With all of the in mind, the commit starts the conversion by introducing
a separate mutex to protect the connection requests and moving the map
that tracks the connection requests out of the event handler to the
connection manager itself.

This will allow future commits to methodically refactor the various
operations without introducing races.
This moves the pending connection requests map out of the event handler
goroutine to the connection manager itself and makes it concurrent safe
by protecting it with the new connection mutex.

This is part of the overall effort to convert the code related to
handling the various connection manager events to synchronous code.
This refactors the logic related to registering a pending connection out
of the event handler since it is now independently concurrent safe.

This is a part of the overall effort to convert the code related to
handling the various connection manager events to synchronous code.
This refactors the logic related to canceling a pending connection out
of the event handler since it is now independently concurrent safe.

This is a part of the overall effort to convert the code related to
handling the various connection manager events to synchronous code.
This refactors the logic related to iterating active connections out of
the event handler since it is now independently concurrent safe.

This is a part of the overall effort to convert the code related to
handling the various connection manager events to synchronous code.
This refactors the logic related to handled established connections out
of the event handler since it is now independently concurrent safe.

This is a part of the overall effort to convert the code related to
handling the various connection manager events to synchronous code.
This refactors the logic related to handled failed connections out of
the event handler since it is now independently concurrent safe.

This is a part of the overall effort to convert the code related to
handling the various connection manager events to synchronous code.
This refactors the logic related to disconnection out of the event
handler since it is now independently concurrent safe.

This is a part of the overall effort to convert the code related to
handling the various connection manager events to synchronous code.
This removes the requests message channel and connection handler since
they are no longer used.

This is a part of the overall effort to convert the code related to
handling the various connection manager events to synchronous code.
This makes TestPassAddrAlongDialAddr more accurate by explicitly
detecting the dailed address directly in the provided dialer as opposed
to from the conn request object of the mock connection.

An invalid address, such as the one that is used in the test, would
never result in a valid connection.
This reworks the TestTargetOutbound test a bit to ensure it fails
if the expected number of connections are not made within a certain
timeout.  The prevents a test timeout in the failure case.
@davecgh davecgh force-pushed the connmgr_concurrent branch from 4d1e97e to 1a4d8a0 Compare March 17, 2026 06:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant