Fix flaky TestExitDetection in peertracker package #6854

Open
amartinezfayo wants to merge 1 commit into spiffe:main from amartinezfayo:fix-peertracker-test

@amartinezfayo
Member

TestExitDetection could time out when the child process exited before Listener.Accept called NewWatcher to open /proc/<pid>. When that happened, NewWatcher failed, and Accept discarded the connection and then blocked forever waiting for a new one that never arrived.

An example of a test failure is here: https://github.com/spiffe/spire/actions/runs/24145996427/job/70460046788?pr=6847

The child binary now blocks on stdin after writing the grandchild PID, keeping it alive until the test explicitly releases it. The test calls releaseChild() after Accept returns, ensuring the watcher is created while the child is still alive.
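
The hold-then-release pattern can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `startHeldChild` is a hypothetical helper, and `cat` is used only as a convenient stand-in for a child that blocks reading stdin until EOF (so the sketch assumes a Unix-like system with `cat` on the PATH).

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// startHeldChild starts a process that stays alive until its stdin is closed.
// It returns the child's PID and a release function that lets the child exit.
// (Hypothetical helper; "cat" stands in for the test's child binary.)
func startHeldChild() (pid int, release func() error, err error) {
	cmd := exec.Command("cat")
	stdin, err := cmd.StdinPipe()
	if err != nil {
		return 0, nil, err
	}
	if err := cmd.Start(); err != nil {
		return 0, nil, err
	}
	release = func() error {
		// Closing stdin delivers EOF, so the child's blocking read returns
		// and the child exits.
		if err := stdin.Close(); err != nil {
			return err
		}
		return cmd.Wait()
	}
	return cmd.Process.Pid, release, nil
}

func main() {
	pid, release, err := startHeldChild()
	if err != nil {
		fmt.Println("start:", err)
		return
	}

	// Anything that requires the child to be alive goes here. On Linux this
	// could be opening /proc/<pid>, as the watcher does.
	_, statErr := os.Stat(fmt.Sprintf("/proc/%d", pid))
	fmt.Println("child visible in /proc:", statErr == nil)

	// Only now is the child allowed to exit.
	if err := release(); err != nil {
		fmt.Println("release:", err)
	}
}
```

The point of the ordering is that the child's exit is driven by the test rather than by scheduling: the step that depends on the live process always completes before the release.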

The test still validates the same behavior: exit detection works correctly when the original caller dies but a descendant holds the connection open.

Signed-off-by: Agustín Martínez Fayó <amartinezfayo@gmail.com>
Collaborator

@sorindumitru sorindumitru left a comment

I think the grandchild process will now block on that new io.ReadAll, but I'm not sure. Either way, it's probably OK because we send it a SIGKILL at the end of the test.

2 participants