Skip to content

Fix the issue where list user throws an error when executed concurrently with create user.#17583

Open
wenyanshi-123 wants to merge 2 commits intoapache:masterfrom
wenyanshi-123:fixV2-568
Open

Fix the issue where list user throws an error when executed concurrently with create user.#17583
wenyanshi-123 wants to merge 2 commits intoapache:masterfrom
wenyanshi-123:fixV2-568

Conversation

@wenyanshi-123
Copy link
Copy Markdown
Contributor

Fix the issue where list user throws an error when executed concurrently with create user.

Copy link
Copy Markdown
Contributor

@CRZbulabula CRZbulabula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Overview

This PR fixes a ConcurrentModificationException that occurs when list user is executed concurrently with create user / drop user. The root cause is that listAllEntities() and listAllEntitiesInfo() in BasicRoleManager iterate entityMap (a HashMap) without locking, while concurrent createUser / deleteEntity calls modify the map under per-entity HashLock. HashMap iterators are fail-fast and throw ConcurrentModificationException on structural modification.

The fix replaces HashMap with ConcurrentHashMap, whose iterators are weakly consistent and never throw CME.

Analysis

The fix is correct. ConcurrentHashMap is the right solution here because:

  • listAllEntities() (line 235) and listAllEntitiesInfo() (line 243) iterate the map without any lock
  • The existing HashLock is per-entity-name, not a global map lock, so it can't protect iteration
  • Adding a global read/write lock would be more invasive and could hurt write throughput
  • Weakly consistent iteration (seeing some but not all concurrent modifications) is acceptable for a LIST USERS operation

Inheritance covers both managers. BasicUserManager extends BasicRoleManager and calls super(accessor), so the ConcurrentHashMap change applies to the user manager too. The unprotected entityMap.entrySet() iteration in BasicUserManager:135 (getEntity(long entityId)) also benefits from this fix.

Issues & Suggestions

1. Test: Missing timeout on barrier.await() (minor risk)

If one thread fails before reaching the barrier, the other thread blocks indefinitely. The 30-second pool termination timeout would eventually unblock, but an explicit barrier timeout is cleaner:

barrier.await(10, TimeUnit.SECONDS);

2. Test: Consider adding a message to assertTrue

The assertTrue(pool.awaitTermination(30, ...)) will cause a generic assertion failure. A descriptive message would help debugging:

assertTrue("Threads did not finish in 30s", pool.awaitTermination(30, TimeUnit.SECONDS));

3. Pre-existing issue (not in scope, but worth noting): reset() is not thread-safe

BasicRoleManager.reset() (line 219-229) does entityMap.clear() then loads entities — even with ConcurrentHashMap, concurrent reads during reset() can see an empty or partial map. This is likely only called during initialization so it's low risk, but it's worth noting.

Summary

Aspect Assessment
Correctness ConcurrentHashMap is the right fix
Scope ✅ Minimal change targeting the root cause
Test coverage ✅ Reproduces the concurrent scenario well
Performance ✅ No concern
Risk ✅ Low — drop-in replacement for the usage patterns

Overall looks good. The core fix is correct and well-scoped. Consider the minor test robustness improvements above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants