aperture: enable WebSocket-level pings for LNC connections #212

Merged
Roasbeef merged 3 commits into master from enable-ws-pings-lnc on Mar 24, 2026

@Roasbeef (Member) commented Mar 5, 2026

In this PR, we configure the WebSocket reverse proxy to send periodic ping
frames to connected LNC clients. Previously the proxy was initialized with
zero-value ping/pong parameters, which effectively disabled WebSocket-level
keepalive entirely. This left LNC connections vulnerable to being silently
dropped by intermediary proxies and load balancers that tear down idle TCP
connections (often after 60-120s of inactivity).

The new defaults send a WebSocket ping every 30 seconds and expect a pong
response within 10 seconds. Both values are configurable via
`wspinginterval` and `wspongwait` in the aperture config.

This complements the GBN-level keepalive pings that operate at the
application layer: the WebSocket pings keep the underlying transport
connection alive through infrastructure that only inspects L4/L7 frames,
while GBN pings detect application-level liveness. Without the WS-level
pings, the GBN layer would eventually detect the dead connection via its
own timeout, but only after the user experiences a noticeable hang. With
both layers active, the connection stays alive through intermediaries and
the GBN layer serves as a backup liveness check.

This is the aperture-side companion to
lightninglabs/lightning-node-connect#123
which fixes the GBN-level timeouts.
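
The ping/pong cycle described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the proxy's actual code: `pingSender`, `keepAlive`, `silentPeer`, `echoPeer`, and the millisecond-scale demo durations are all hypothetical names and values chosen so the example runs quickly. It shows the two behaviours the description relies on: a zero ping interval disables keepalive, and a missed pong within the pong wait is treated as a dead connection.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical, shortened durations so the demo finishes quickly. The PR's
// real defaults are on the order of tens of seconds.
const (
	demoPingInterval = 20 * time.Millisecond
	demoPongWait     = 10 * time.Millisecond
)

var errPongTimeout = errors.New("no pong within pong wait")

// pingSender is a hypothetical stand-in for a WebSocket connection that can
// emit a ping control frame.
type pingSender interface {
	SendPing() error
}

// keepAlive sends a ping every pingInterval and waits up to pongWait for a
// signal on pongs. A non-positive pingInterval disables keepalive entirely.
// It returns nil when quit is closed, or an error on a failed ping or a
// missed pong.
func keepAlive(conn pingSender, pongs <-chan struct{},
	pingInterval, pongWait time.Duration, quit <-chan struct{}) error {

	if pingInterval <= 0 {
		return nil // Keepalive disabled.
	}

	ticker := time.NewTicker(pingInterval)
	defer ticker.Stop()

	for {
		select {
		case <-ticker.C:
		case <-quit:
			return nil
		}

		if err := conn.SendPing(); err != nil {
			return err
		}

		select {
		case <-pongs:
		case <-time.After(pongWait):
			return errPongTimeout
		case <-quit:
			return nil
		}
	}
}

// silentPeer never answers pings, like a peer behind a silently dropped
// connection.
type silentPeer struct{}

func (silentPeer) SendPing() error { return nil }

// echoPeer answers every ping and closes quit after stopAfter pings.
type echoPeer struct {
	pongs     chan struct{}
	quit      chan struct{}
	pings     int
	stopAfter int
}

func (p *echoPeer) SendPing() error {
	p.pings++
	p.pongs <- struct{}{}
	if p.pings == p.stopAfter {
		close(p.quit)
	}
	return nil
}

func main() {
	err := keepAlive(silentPeer{}, make(chan struct{}),
		demoPingInterval, demoPongWait, make(chan struct{}))
	fmt.Println("silent peer:", err)

	peer := &echoPeer{
		pongs:     make(chan struct{}, 1),
		quit:      make(chan struct{}),
		stopAfter: 3,
	}
	err = keepAlive(peer, peer.pongs, demoPingInterval, demoPongWait, peer.quit)
	fmt.Println("echo peer:", err, "pings sent:", peer.pings)
}
```

In a real proxy the pong signal would come from the WebSocket library's pong handler rather than from a channel fed by the peer itself.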

In this commit, we configure the WebSocket reverse proxy to send
periodic ping frames to connected LNC clients. Previously the proxy
was initialized with zero-value ping/pong parameters, which effectively
disabled WebSocket-level keepalive entirely. This left LNC connections
vulnerable to being silently dropped by intermediary proxies and load
balancers that tear down idle TCP connections (often after 60-120s of
inactivity).

The new defaults send a WebSocket ping every 30 seconds and expect a
pong response within 10 seconds. Both values are configurable via
`wspinginterval` and `wspongwait` in the aperture config.

This complements the GBN-level keepalive pings that operate at the
application layer: the WebSocket pings keep the underlying transport
connection alive through infrastructure that only inspects L4/L7
frames, while GBN pings detect application-level liveness.

@hieblmi self-requested a review on March 19, 2026 09:18

@hieblmi (Collaborator) left a comment

LGTM, pending two suggestions:

  1. Config validation in `Config.validate()`: catch negative durations and `pongWait >= pingInterval` misconfigurations.
  2. Update `sample-conf.yaml`: the new fields are undocumented, while all other timeout fields are present.

In this commit, we add startup validation for the new `wspinginterval`
and `wspongwait` config values. We reject negative durations outright,
and also reject configurations where the pong wait is greater than or
equal to the ping interval (since overlapping ping/pong cycles would
cause spurious connection kills). The constraint is only enforced when
pings are enabled, i.e., `wspinginterval > 0`.

We also add both fields to `sample-conf.yaml` so they're discoverable
alongside the existing timeout knobs, and update the `WsPongWait`
struct comment to call out the "must be less than ping interval"
invariant explicitly.
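
The validation rules described in this commit message could look something like the sketch below. It is an assumption-laden illustration, not the code added to `Config.validate()`: the function name `validateWsKeepalive`, the error wording, and the `examplePingInterval`/`examplePongWait` constants are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical example values matching the defaults discussed at this point
// in the PR (30s ping interval, 10s pong wait).
const (
	examplePingInterval = 30 * time.Second
	examplePongWait     = 10 * time.Second
)

// validateWsKeepalive is a hypothetical helper mirroring the rules in the
// commit message: reject negative durations, and, only when pings are
// enabled (pingInterval > 0), require pongWait < pingInterval.
func validateWsKeepalive(pingInterval, pongWait time.Duration) error {
	if pingInterval < 0 {
		return fmt.Errorf("wspinginterval must not be negative, got %v",
			pingInterval)
	}
	if pongWait < 0 {
		return fmt.Errorf("wspongwait must not be negative, got %v",
			pongWait)
	}

	// A zero ping interval disables pings, so the ordering constraint is
	// skipped in that case.
	if pingInterval > 0 && pongWait >= pingInterval {
		return fmt.Errorf("wspongwait (%v) must be less than "+
			"wspinginterval (%v)", pongWait, pingInterval)
	}

	return nil
}

func main() {
	// Accepted: pong wait is shorter than the ping interval.
	fmt.Println(validateWsKeepalive(examplePingInterval, examplePongWait))

	// Rejected: pong wait is longer than the ping interval.
	fmt.Println(validateWsKeepalive(examplePongWait, examplePingInterval))
}
```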

@claude (bot) commented Mar 24, 2026

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

In this commit, we increase the default pong wait from 10s to 15s. Since
aperture serves as a multi-hop mailbox proxy (client -> mailbox relay ->
aperture), pong frames need to traverse two network hops. Under degraded
conditions or high-latency mobile connections, 10s can be tight. The new
15s default gives a 45s total detection window (pingInterval + pongWait),
which is still well under typical load balancer idle timeouts of 60-120s.
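
The detection-window arithmetic in this commit message can be checked with a few lines of Go. The constant and function names here are hypothetical; the 60s figure is taken as the lower end of the 60-120s load balancer idle timeout range quoted above.

```go
package main

import (
	"fmt"
	"time"
)

// Values taken from the commit message; the names are hypothetical.
const (
	defaultPingInterval = 30 * time.Second
	defaultPongWait     = 15 * time.Second

	// Lower end of the 60-120s idle timeout range mentioned in the PR.
	minIdleTimeout = 60 * time.Second
)

// detectionWindow is the worst-case time to notice a dead peer: one full
// ping interval plus the time allowed for the pong to arrive.
func detectionWindow(pingInterval, pongWait time.Duration) time.Duration {
	return pingInterval + pongWait
}

func main() {
	window := detectionWindow(defaultPingInterval, defaultPongWait)
	fmt.Printf("detection window: %v, headroom before idle timeout: %v\n",
		window, minIdleTimeout-window)
}
```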

@Roasbeef merged commit f808154 into master on Mar 24, 2026
6 checks passed