Skip to content

fix: prevent refreshAliasesTask from reverting rotated aliases in aws_kms plugin#6805

Open
angabini wants to merge 1 commit intospiffe:mainfrom
angabini:fix-aws-kms-refresh-alias-race
Open

fix: prevent refreshAliasesTask from reverting rotated aliases in aws_kms plugin#6805
angabini wants to merge 1 commit intospiffe:mainfrom
angabini:fix-aws-kms-refresh-alias-race

Conversation

@angabini
Copy link
Copy Markdown

Pull Request check list

  • Commit conforms to CONTRIBUTING.md?
  • Proper tests/regressions included?
  • Documentation updated?

Affected functionality

aws_kms KeyManager plugin — specifically the refreshAliasesTask background task that periodically refreshes alias timestamps.

Description of change

In multi-replica SPIRE server deployments sharing a key_identifier_value, refreshAliasesTask on non-leader replicas can revert a KMS alias to a stale cached key after the leader rotates it. If the old key is subsequently scheduled for deletion, the alias ends up pointing to a disabled key, causing all future server startups to crash with FailedPrecondition.

This PR adds a DescribeKey check in refreshAliases() before calling UpdateAlias. If the alias's current target differs from the local cache (indicating a rotation by another replica), the refresh is skipped to avoid reverting the rotation.

Which issue this PR fixes

fixes #6804

…_kms plugin

Signed-off-by: angabini <494150+angabini@users.noreply.github.com>
@amartinezfayo amartinezfayo self-assigned this Mar 31, 2026
Copy link
Copy Markdown
Member

@amartinezfayo amartinezfayo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @angabini for this contribution!

}
if describeResp == nil || describeResp.KeyMetadata == nil || describeResp.KeyMetadata.Arn == nil {
p.log.Error("Malformed describe key response during alias refresh", aliasNameTag, entry.AliasName)
continue
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this path should also append to errs, similar to the DescribeKey error path.

require.Equal(t, "key_id_rotated", *aliasAfter.KeyEntry.KeyID, "alias should not have been reverted to the stale cached key")
}

func TestRefreshAliasesDescribeKeyError(t *testing.T) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could also add a test for the malformed response branch (where DescribeKey succeeds but returns nil KeyMetadata or nil Arn)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

aws_kms: refreshAliasesTask can revert rotated aliases in multi-replica deployments

2 participants