peer: Convert lifecycle to context. #3633

Open
davecgh wants to merge 8 commits into decred:master from davecgh:peer_context_lifecycle

@davecgh davecgh commented Mar 4, 2026

This requires #3632.

This modifies the lifecycle of peers to use the more modern Run pattern that is based on contexts.

In particular, this replaces the Start and WaitForDisconnect methods with a single method named Run and arranges for it to block until the provided context is cancelled or the peer is disconnected. This is more flexible for the caller since it can easily turn blocking code into async code while the reverse is not true.

The new Run method waits for all goroutines that it starts to shut down before returning to help ensure an orderly shutdown.

Since all exported methods that send messages to the various goroutines via channels already select across the quit channel, which is closed when the peer disconnects, the peer is now forcibly disconnected when the context is cancelled.

This approach allows the flexibility for callers to use any combination of manually disconnecting peers via the Disconnect method and allowing them to automatically be disconnected when the context is cancelled.

It also updates the server code accordingly.

@davecgh davecgh added this to the 2.2.0 milestone Mar 4, 2026
@davecgh davecgh force-pushed the peer_context_lifecycle branch 2 times, most recently from 858a00e to d084cc9 Compare March 6, 2026 18:30
@davecgh davecgh force-pushed the peer_context_lifecycle branch 5 times, most recently from be31954 to fff233c Compare March 10, 2026 05:31
@davecgh davecgh force-pushed the peer_context_lifecycle branch from fff233c to e3b8d67 Compare March 14, 2026 16:03
davecgh added 8 commits March 17, 2026 01:07
This corrects the version in README.md to the most recent released
version and brings the documentation in doc.go to more modern standards.

This does some basic test cleanup and modernizes some of the peer tests
as follows:

- Consolidates the mock peer config used throughout the tests
- Consolidates and simplifies the mock pipe creation
- Marks peer state tests as a helper
- Uses t.Fatalf where appropriate
- Removes additional newlines in failure strings

The majority of the tests in TestOutboundPeer are not actually testing
anything because nothing is checked.  This moves the one thing that is
being tested into a separate test func and removes the rest since it is
already tested elsewhere.

This refactors the primary inbound message processing that checks
requirements, updates state, and invokes any configured message handlers
into a separate method.

This is primarily being done to support an upcoming change that will
need to make use of the same logic before the main read loop.

Due to legacy reasons that no longer apply, connections are currently
associated with a peer after the constructors have been called via
AssociateConnection.

This modifies the code to instead accept the connections in the inbound
and outbound constructors and exports the Start method in its place.

Ultimately, the goal is to split the handshake into a separate method
and convert the lifecycle over to use contexts.

The current design where the handshake happens asynchronously when the
async I/O is started is less than ideal and is quite brittle.  It also
significantly complicates everything as evidenced by several minor bugs
over the years that have resulted from faulty assumptions which directly
stem from its asynchronous nature.

For an example of some of the complexity it causes, it means that a
bunch of additional flags are required that are solely related to the
handshake.  Namely, whether or not the version is known, whether the
verack has been received, and whether the handshake is done.  Then,
because it's all happening asynchronously, later code has to be vigilant
about checking that those events have happened.

All of this complexity can entirely be avoided by simply requiring a
successful synchronous handshake to take place prior to starting async
I/O.

With that in mind, this significantly reworks the way the handshake is
handled so that it happens via a separate blocking method and removes async
handlers which are no longer required as a result.

The following is a high level overview of the changes:

- Introduce programmatically detectable errors consistent with other
  code throughout the repository
- Move the handshake code to a separate blocking method named Handshake
  that accepts a callback to invoke with the received version message
  - The new method returns an error that callers can use to reliably
    detect a failed handshake
  - The callback can return an error to cause the handshake to fail
    and pass the error along to the caller
- Make the initial handshake block until both the version and verack
  message are received
- Introduce delayed processing for up to 3 messages sent between the
  version and verack message on old protocol versions
- Any further received version or verack messages in the async I/O
  handlers are now unconditionally an error
- Removes the OnVersion and OnVerAck async listeners that no longer apply
- Updates the calling server code to thread the overall process context
  down to the Handshake and Run methods
- Adds several additional tests for correctness
- Updates the example to clearly show the new semantics
- Includes extra documentation to elucidate the exact requirements for
  establishing a new peer as well as exactly which properties the caller
  can and can't rely on during the handshake
Now that the handshake is required to take place prior to starting
async I/O processing, the version and verack messages are guaranteed to
have been seen for a successful handshake.

Given that, this removes the related fields and methods since they are
no longer needed.

This modifies the lifecycle of peers to use the more modern Run pattern
that is based on contexts.

In particular, this replaces the Start and WaitForDisconnect methods
with a single method named Run and arranges for it to block until the
provided context is cancelled or the peer is disconnected.  This is more
flexible for the caller since it can easily turn blocking code into
async code while the reverse is not true.

The new Run method waits for all goroutines that it starts to shut down
before returning to help ensure an orderly shutdown.

Since all exported methods that send messages to the various goroutines
via channels already select across the quit channel, which is closed when
the peer disconnects, the peer is now forcibly disconnected when the
context is cancelled.

This approach allows the flexibility for callers to use any combination
of manually disconnecting peers via the Disconnect method and allowing
them to automatically be disconnected when the context is cancelled.

It also updates the server code accordingly.
@davecgh davecgh force-pushed the peer_context_lifecycle branch from e3b8d67 to 67005e3 Compare March 17, 2026 06:07