Skip to content

CollapsConfigs and Wildcard Matching for Opens (and Paths in Endpoints) while reproducing default storage behavior if not used#316

Open
entlein wants to merge 16 commits intokubescape:mainfrom
k8sstormcenter:upstream-pr/analyzer-zero-alloc
Open

CollapsConfigs and Wildcard Matching for Opens (and Paths in Endpoints) while reproducing default storage behavior if not used#316
entlein wants to merge 16 commits intokubescape:mainfrom
k8sstormcenter:upstream-pr/analyzer-zero-alloc

Conversation

@entlein
Copy link
Copy Markdown

@entlein entlein commented Apr 23, 2026

This is the beginning of the storage pr to bring the opensWildcards into upstream
Testing how to create clean PRs that also build with the respective node-agent version

What I want to achieve with this PR:

  • Have the CollapsConfig defined
  • Have the Defaults all set such that they reproduce the previous behavior
  • Have the Wildcard Comparison Function in there, even if not used

I m also running the benchmark test to check that this doesnt incur too much perf-overhead

Summary by CodeRabbit

  • New Features

    • Per-prefix path-collapse tuning with adjacent dynamic segments compacted into a single wildcard.
    • Order-aware wildcard-port merging for endpoints; SBOM-provided paths remain preserved.
  • Bug Fixes

    • Cleanup now skips resources marked as user-managed via annotations.
  • Chores

    • Adjusted Go module dependency versions.
  • Tests

    • Extensive new and updated unit tests, benchmarks, and profiling for path/endpoint analysis.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds per-prefix collapse configs and wildcard (*) handling to the dynamic path analyzer, folds endpoints across ports into a :0 wildcard, updates processors to use config-driven analyzers, expands tests/benchmarks/profiling, short-circuits cleanup for user-managed annotated resources, and downgrades three Go module dependencies.

Changes

Cohort / File(s) Summary
Dependency Management
go.mod
Downgraded: github.com/anchore/syft (v1.42.3 → v1.32.0), github.com/anchore/stereoscope (v0.1.22 → v0.1.9), github.com/opencontainers/runtime-spec (v1.3.0 → v1.2.1).
Processor callers
pkg/registry/file/applicationprofile_processor.go, pkg/registry/file/containerprofile_processor.go
Processors now construct analyzers via dynamicpathdetector.NewPathAnalyzerWithConfigs(...), using analyzer-exported thresholds/configs rather than local hardcoded thresholds.
Resource Cleanup
pkg/registry/file/cleanup.go, pkg/registry/file/cleanup_test.go
Added isUserManaged check that early-skips deletion for resources annotated as user-managed; added TestIsUserManaged.
Path Detector Types & Config
pkg/registry/file/dynamicpathdetector/types.go
Introduced WildcardIdentifier ("*") and exported thresholds (OpenDynamicThreshold, EndpointDynamicThreshold), CollapseConfig type, DefaultCollapseConfigs() and DefaultCollapseConfig, plus PathAnalyzer fields to hold per-prefix configs and default config.
Path Detector Core Logic
pkg/registry/file/dynamicpathdetector/analyzer.go
Large refactor: NewPathAnalyzerWithConfigs, longest-prefix config selection, per-segment thresholding, pooled buffers, adjacent-dynamic compaction, new recursive wildcard (*) matcher, and APIs FindConfigForPath / CollapseAdjacentDynamicIdentifiers.
Endpoint Analysis
pkg/registry/file/dynamicpathdetector/analyze_endpoints.go, pkg/registry/file/dynamicpathdetector/tests/analyze_endpoints_test.go
Two-pass endpoint processing to preserve real ports during trie build, parse port/path strings, and fold endpoints sharing (path,direction,internal) into a :0 wildcard with order-aware merging; tests expanded for wildcard-port merging and Internal isolation.
Opens Analysis & Tests
pkg/registry/file/dynamicpathdetector/*, pkg/registry/file/dynamicpathdetector/tests/analyze_opens_test.go, .../coverage_test.go
Analyzer and tests migrated to config-driven constructor; extensive tests added for per-prefix thresholds, collapse boundaries, * vs outcomes, SBOM path preservation, idempotence, multi-level collapse, and regressions.
Processor Tests
pkg/registry/file/applicationprofile_processor_test.go
Added tests validating opens collapsing behavior, SBOM-preservation, mixed-prefix collapse, nil-SBOM handling, sorting, and end-to-end PreSave collapse.
Benchmarks & Profiling
pkg/registry/file/dynamicpathdetector/tests/benchmark_test.go, pkg/registry/file/dynamicpathdetector/tests/profile_test.go
Benchmarks updated to use new constructor; one benchmark commented out; added opt-in profiling test and warm/cold benchmarks.

Sequence Diagram(s)

sequenceDiagram
    participant Proc as Processor (app/container)
    participant Analyzer as PathAnalyzer
    participant SBOM as SBOM set
    participant Merger as EndpointMerger
    participant Store as Persist/Save

    Proc->>Analyzer: NewPathAnalyzerWithConfigs(defaultThreshold, configs)
    Proc->>Analyzer: AnalyzeOpens(opens, sbomSet)
    Analyzer->>SBOM: check SBOM-preserved paths
    Analyzer-->>Proc: collapsed opens (dynamic/wildcard entries)
    Proc->>Analyzer: AnalyzeEndpoints(endpoints)
    Analyzer->>Analyzer: build trie using real ports (pass 1)
    Analyzer->>Analyzer: process endpoints with copied loop vars (pass 2)
    Analyzer->>Merger: MergeDuplicateEndpoints(list)
    Merger-->>Analyzer: merged endpoints (may include :0 wildcard)
    Proc->>Store: PreSave(collapsed opens, merged endpoints)
    Store-->>Proc: saved
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐇 I hop through tries where prefixes meet,
I stitch wildcards where long paths fleet,
Ports fold to zero when siblings align,
Tests squeak approval — collapses refine. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 52.58% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title references CollapseConfigs and Wildcard Matching for Opens and Endpoints, which are core features introduced across multiple files in this PR, making it partially related to the main changes rather than fully capturing the primary objective.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@entlein entlein changed the title storage PR 1 - test if it has clean git history CollapsConfigs and Wildcard Matching for Opens (and Paths in Endpoints) while reproducing default storage behavior if not used Apr 23, 2026
@entlein
Copy link
Copy Markdown
Author

entlein commented Apr 23, 2026

The first commit had the following baseline test

go test -bench='^BenchmarkAnalyzePath(|Warm|Cold)$' -benchmem -benchtime=3s ./pkg/registry/file/dynamicpathdetector/tests/
? github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector [no test files]
ok github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector/tests 2.359s
goos: darwin
goarch: amd64
pkg: github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector/tests
cpu: VirtualApple @ 2.50GHz
BenchmarkAnalyzePath-14 14158999 247.8 ns/op 39 B/op 1 allocs/op
BenchmarkAnalyzePathWarm-14 14414613 246.6 ns/op 39 B/op 1 allocs/op
BenchmarkAnalyzePathCold-14 4350681 831.9 ns/op 1913 B/op 21 allocs/op

@entlein
Copy link
Copy Markdown
Author

entlein commented Apr 23, 2026

go test -count=1 -timeout=2m ./pkg/registry/file/dynamicpathdetector/...
go test -bench='^BenchmarkAnalyzePathWarm$' -benchmem -benchtime=3s -run=^$ ./pkg/registry/file/dynamicpathdetector/tests/
? github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector [no test files]
ok github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector/tests 2.442s
goos: darwin
goarch: amd64
pkg: github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector/tests
cpu: VirtualApple @ 2.50GHz
BenchmarkAnalyzePathWarm-14 13851844 257.3 ns/op 39 B/op 1 allocs/op
PASS
ok github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector/tests 6.065s

@entlein
Copy link
Copy Markdown
Author

entlein commented Apr 23, 2026

pkg: github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector/tests
cpu: VirtualApple @ 2.50GHz
BenchmarkAnalyzePathWarm-14 11753203 307.5 ns/op 39 B/op 1 allocs/op

@entlein
Copy link
Copy Markdown
Author

entlein commented Apr 23, 2026

pkg: github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector/tests
cpu: VirtualApple @ 2.50GHz
BenchmarkAnalyzePathWarm-14 10592332 334.0 ns/op 39 B/op 1 allocs/op

@entlein
Copy link
Copy Markdown
Author

entlein commented Apr 24, 2026

go test -count=1 -timeout=2m ./pkg/registry/file/dynamicpathdetector/...
go test -bench='^BenchmarkAnalyzePathWarm$' -benchmem -benchtime=3s -run=^$ ./pkg/registry/file/dynamicpathdetector/tests/
? github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector [no test files]
ok github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector/tests 2.665s
goos: darwin
goarch: amd64
pkg: github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector/tests
cpu: VirtualApple @ 2.50GHz
BenchmarkAnalyzePathWarm-14 10105309 339.7 ns/op 39 B/op 1 allocs/op
PASS
ok github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector/tests 6.023s

@entlein
Copy link
Copy Markdown
Author

entlein commented Apr 24, 2026

Component tests green here https://github.com/k8sstormcenter/node-agent/actions/runs/24909769622/job/72949695747 . The randomx-test is red for totally unrelated reasons.

@entlein entlein marked this pull request as ready for review April 24, 2026 20:29
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/registry/file/dynamicpathdetector/analyze_endpoints.go (1)

128-156: ⚠️ Potential issue | 🟡 Minor

Order-sensitive wildcard merging in MergeDuplicateEndpoints.

The wildcard-absorbs-specific logic at lines 140–149 only fires for a non-wildcard endpoint when a wildcard variant is already in seen. In the reverse order — specific endpoint encountered first, then wildcard — the wildcard is simply inserted as a new entry, and the two are not merged.

Example (as if called directly):

input: [":80/api/data" POST, ":0/api/data" GET]  →  output keeps BOTH entries
input: [":0/api/data" GET, ":80/api/data" POST]  →  output merges to one ":0/api/data"

The test TestMergeDuplicateEndpointsWildcardPort only covers the wildcard-first ordering. In practice, callers within AnalyzeEndpoints are safe because ports are rewritten uniformly before MergeDuplicateEndpoints is invoked, so the asymmetry is unreachable via that path. However, the function is exported, and no guard prevents direct callers from encountering this ordering issue.

Consider either (a) making the merge symmetric by also checking, when the current endpoint has a wildcard port, whether any specific-port variant already in seen should be folded into it, or (b) documenting on MergeDuplicateEndpoints that wildcard-port endpoints must precede their specific-port siblings, and adding a reverse-order test to lock that contract in.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/analyze_endpoints.go` around lines 128
- 156, MergeDuplicateEndpoints currently only folds specific endpoints into an
existing wildcard but not the reverse; fix by making merging symmetric: inside
MergeDuplicateEndpoints, after computing port and pathPart via
splitEndpointPortAndPath(endpoint.Endpoint), if isWildcardPort(port) is true
then scan the seen map for keys matching the same pathPart and
endpoint.Direction but with specific ports (non-wildcard), for each match call
existing.Methods = MergeStrings(existing.Methods, endpoint.Methods) and
mergeHeaders(existing, endpoint), remove the specific entry from seen and from
newEndpoints, then insert the wildcard (or merge into the wildcard entry) to
ensure one combined entry; keep using getEndpointKey, mergeHeaders, seen and
newEndpoints as the identifiers and update/add a unit test (e.g.,
TestMergeDuplicateEndpointsWildcardPort) to cover specific-first and
wildcard-first orders.
🧹 Nitpick comments (4)
pkg/registry/file/dynamicpathdetector/tests/benchmark_test.go (1)

75-82: Remove commented-out BenchmarkCompareDynamic rather than leaving dead code.

Commented-out benchmark functions accumulate over time and obscure git history. If CompareDynamic no longer exists or is being retired, delete this block; if it still exists and benchmarking it matters, re-enable the function.

🧹 Proposed cleanup
-// func BenchmarkCompareDynamic(b *testing.B) {
-// 	dynamicPath := "/api/\u22ef/\u22ef"
-// 	regularPath := "/api/users/123"
-// 	for i := 0; i < b.N; i++ {
-// 		_ = dynamicpathdetector.CompareDynamic(dynamicPath, regularPath)
-// 	}
-// 	b.ReportAllocs()
-// }
-
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/tests/benchmark_test.go` around lines
75 - 82, Remove the dead commented-out benchmark block for
BenchmarkCompareDynamic: either delete the entire commented function or restore
it as an active benchmark if you intend to keep measuring CompareDynamic;
specifically remove or re-enable the block that references
BenchmarkCompareDynamic and the call to dynamicpathdetector.CompareDynamic so
the test file no longer contains commented-out code.
pkg/registry/file/dynamicpathdetector/tests/analyze_endpoints_test.go (1)

259-263: Brittle port extraction — prefer strings.HasPrefix with a delimiter.

ep.Endpoint[:len(":0")] takes the first two bytes and compares them to :0. It relies on a hard-coded length, panics if an endpoint is ever shorter than 2 chars, and only checks a prefix with no delimiter (e.g. a hypothetical :0x/... would match). Using strings.HasPrefix(ep.Endpoint, ":0/") is safer and expresses intent directly.

🔧 Proposed fix
 	for _, ep := range result {
-		port := ep.Endpoint[:len(":0")]
-		assert.Equal(t, ":0", port, "endpoint %s should have wildcard port", ep.Endpoint)
+		assert.True(t, strings.HasPrefix(ep.Endpoint, ":0/"),
+			"endpoint %s should have wildcard port", ep.Endpoint)
 	}

(Applies to both TestAnalyzeEndpointsWildcardPortAbsorbsSpecificPort and TestAnalyzeEndpointsWildcardPortAfterSpecificPorts; requires importing "strings".)

Also applies to: 283-287

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/tests/analyze_endpoints_test.go` around
lines 259 - 263, The tests TestAnalyzeEndpointsWildcardPortAbsorbsSpecificPort
and TestAnalyzeEndpointsWildcardPortAfterSpecificPorts use brittle slicing
ep.Endpoint[:len(":0")] to check the wildcard port; replace those checks with a
proper prefix check using strings.HasPrefix(ep.Endpoint, ":0/") (or
strings.HasPrefix(ep.Endpoint, ":0") plus an explicit delimiter check) to avoid
panics and false matches, update the assertion accordingly (e.g., assert.True(t,
strings.HasPrefix(...), "...") or assert.Equal for the intended prefix), and add
an import for "strings".
pkg/registry/file/dynamicpathdetector/tests/profile_test.go (1)

83-167: Optional cleanup: unchecked Close errors and the dead fmt keep-alive.

Two small housekeeping items:

  1. cpuFile.Close() (lines 83, 89), memFile.Close() (line 102), and goFile.Close() (line 113) silently drop their errors — that's what golangci-lint's errcheck is flagging. In a test this is benign, but a truncated profile would fail silently. Either log the error or explicitly discard it so intent is clear.
  2. var _ = fmt.Sprint on line 167 keeps the fmt import alive for "future debugging prints" — at present it's dead code. Simpler to drop both the import and the anchor until a caller lands.
♻️ Proposed cleanup
 	if err := pprof.StartCPUProfile(cpuFile); err != nil {
 		t.Fatalf("start cpu profile: %v", err)
 	}
-
+	defer func() {
+		if err := cpuFile.Close(); err != nil {
+			t.Logf("close cpu profile: %v", err)
+		}
+	}()

And at line 167, drop the fmt import and the anchor:

-	"path/filepath"
 	"runtime"
@@
-// Ensure fmt is kept imported when future debugging prints land here.
-var _ = fmt.Sprint
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/tests/profile_test.go` around lines 83
- 167, Fix the unchecked Close errors and remove the dead fmt anchor: check and
handle the returned errors from cpuFile.Close(), memFile.Close(), and
goFile.Close() in the profiling code (e.g., log or t.Fatalf on error in
TestAnalyzePath/profile helpers) so file-close failures don’t get dropped, and
remove the no-op "var _ = fmt.Sprint" and the unused fmt import (or reintroduce
a real use) to avoid keeping fmt alive unnecessarily; locate these changes
around the TestAnalyzePath profiling block and the file-scope anchor in
profile_test.go.
pkg/registry/file/dynamicpathdetector/analyzer.go (1)

51-62: Minor inconsistency between effectiveThreshold and FindConfigForPath tiebreaking.

Both functions implement "longest-prefix wins" but disagree on the details:

  • effectiveThreshold (line 56): len(c.Prefix) >= bestLen, bestLen starts at 0.
  • FindConfigForPath (line 381): len(cfg.Prefix) > bestLen, bestLen starts at -1.

With >=, if two configs share the same prefix length the later one wins in effectiveThreshold, while the earlier one wins in FindConfigForPath. In practice DefaultCollapseConfigs has unique prefixes so this is inert today, but FindConfigForPath is documented as "so callers and tests can introspect which threshold will apply" — if the two diverge, introspection will lie.

Worth aligning on > and the same initial sentinel (or, better, extracting a shared func (ua *PathAnalyzer) longestMatch(path string) *CollapseConfig and having both entry points call it).

Also applies to: 376-393

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/analyzer.go` around lines 51 - 62,
effectiveThreshold currently uses len(c.Prefix) >= bestLen with bestLen starting
at 0, which ties differently from FindConfigForPath (which uses > and bestLen =
-1); align their tie-breaking by extracting a shared helper (e.g., longestMatch
or findBestConfig) that iterates ua.configs and returns the best *CollapseConfig
for a given path using strict ">" comparison and a sentinel bestLen of -1, then
have both effectiveThreshold and FindConfigForPath call that helper to obtain
the chosen config and derive the threshold or return value accordingly so both
functions behave identically.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@go.mod`:
- Line 9: The go.mod downgrade to github.com/anchore/syft v1.32.0 reintroduces
CVE-2026-33481; revert or upgrade the syft dependency to at least v1.42.3 (or
newer) so the improper temporary file cleanup fix is included, update the go.mod
entry for github.com/anchore/syft accordingly, run go mod tidy and CI tests, and
re-run tests touching pkg/apis/softwarecomposition/v1beta1/syfttypes_test.go and
code paths using syft’s sbom.SBOM, pkg.* and syftfile.* types to verify no
regressions.

In `@pkg/registry/file/cleanup.go`:
- Around line 188-191: The check is reading the marker from metadata.Labels but
the marker is actually set on annotations; update the conditional to use
metadata.Annotations[helpersv1.ManagedByMetadataKey] ==
helpersv1.ManagedByUserValue (using the existing metadata variable and the
constants helpersv1.ManagedByMetadataKey and helpersv1.ManagedByUserValue) so
user-managed resources are correctly skipped; ensure you replace the Labels
access with Annotations access in the cleanup logic where this conditional
appears.

In `@pkg/registry/file/dynamicpathdetector/analyzer.go`:
- Around line 342-370: The wildcard handling in compareSegments incorrectly
peeks at nextDynamic and uses a match guard that prevents recursion for
consecutive wildcards and disallows zero-segment wildcard consumption; update
compareSegments (the block where dynamic[0] == WildcardIdentifier) to remove the
optimistic `match` check and instead try recursing unconditionally for every
possible split point by iterating i from 0 through len(regular) (inclusive) and
calling compareSegments(dynamic[1:], regular[i:]) — return true on the first
success; keep the existing base-case when len(dynamic) == 1. This change
(affecting compareSegments, and related to WildcardIdentifier/DynamicIdentifier
handling and user-authored patterns like /*/*) will allow consecutive wildcards
and zero-length wildcard matches without altering collapseAdjacentDynamic.

In `@pkg/registry/file/dynamicpathdetector/tests/analyze_opens_test.go`:
- Around line 739-747: Remove the unused test helper function assertPathIsOneOf
from the file: locate the function declaration named assertPathIsOneOf and
delete it, ensuring no other code references it (the existing
assertContainsOneOfPaths covers the same behavior); run tests / golangci-lint to
confirm the warning is gone.

In `@pkg/registry/file/dynamicpathdetector/tests/profile_test.go`:
- Around line 86-91: The memstats snapshot is taken after pprof.StopCPUProfile()
and cpuFile.Close(), which lets their internal allocations inflate the measured
delta; move the runtime.ReadMemStats(&after) call to immediately after the
benchmark loop (before calling pprof.StopCPUProfile() and cpuFile.Close()) so
that the after TotalAlloc/Mallocs reflect only the loop work — update the
placement of runtime.ReadMemStats(&after) in the test that uses the before/after
variables and the pprof.StopCPUProfile and cpuFile.Close calls.

---

Outside diff comments:
In `@pkg/registry/file/dynamicpathdetector/analyze_endpoints.go`:
- Around line 128-156: MergeDuplicateEndpoints currently only folds specific
endpoints into an existing wildcard but not the reverse; fix by making merging
symmetric: inside MergeDuplicateEndpoints, after computing port and pathPart via
splitEndpointPortAndPath(endpoint.Endpoint), if isWildcardPort(port) is true
then scan the seen map for keys matching the same pathPart and
endpoint.Direction but with specific ports (non-wildcard), for each match call
existing.Methods = MergeStrings(existing.Methods, endpoint.Methods) and
mergeHeaders(existing, endpoint), remove the specific entry from seen and from
newEndpoints, then insert the wildcard (or merge into the wildcard entry) to
ensure one combined entry; keep using getEndpointKey, mergeHeaders, seen and
newEndpoints as the identifiers and update/add a unit test (e.g.,
TestMergeDuplicateEndpointsWildcardPort) to cover specific-first and
wildcard-first orders.

---

Nitpick comments:
In `@pkg/registry/file/dynamicpathdetector/analyzer.go`:
- Around line 51-62: effectiveThreshold currently uses len(c.Prefix) >= bestLen
with bestLen starting at 0, which ties differently from FindConfigForPath (which
uses > and bestLen = -1); align their tie-breaking by extracting a shared helper
(e.g., longestMatch or findBestConfig) that iterates ua.configs and returns the
best *CollapseConfig for a given path using strict ">" comparison and a sentinel
bestLen of -1, then have both effectiveThreshold and FindConfigForPath call that
helper to obtain the chosen config and derive the threshold or return value
accordingly so both functions behave identically.

In `@pkg/registry/file/dynamicpathdetector/tests/analyze_endpoints_test.go`:
- Around line 259-263: The tests
TestAnalyzeEndpointsWildcardPortAbsorbsSpecificPort and
TestAnalyzeEndpointsWildcardPortAfterSpecificPorts use brittle slicing
ep.Endpoint[:len(":0")] to check the wildcard port; replace those checks with a
proper prefix check using strings.HasPrefix(ep.Endpoint, ":0/") (or
strings.HasPrefix(ep.Endpoint, ":0") plus an explicit delimiter check) to avoid
panics and false matches, update the assertion accordingly (e.g., assert.True(t,
strings.HasPrefix(...), "...") or assert.Equal for the intended prefix), and add
an import for "strings".

In `@pkg/registry/file/dynamicpathdetector/tests/benchmark_test.go`:
- Around line 75-82: Remove the dead commented-out benchmark block for
BenchmarkCompareDynamic: either delete the entire commented function or restore
it as an active benchmark if you intend to keep measuring CompareDynamic;
specifically remove or re-enable the block that references
BenchmarkCompareDynamic and the call to dynamicpathdetector.CompareDynamic so
the test file no longer contains commented-out code.

In `@pkg/registry/file/dynamicpathdetector/tests/profile_test.go`:
- Around line 83-167: Fix the unchecked Close errors and remove the dead fmt
anchor: check and handle the returned errors from cpuFile.Close(),
memFile.Close(), and goFile.Close() in the profiling code (e.g., log or t.Fatalf
on error in TestAnalyzePath/profile helpers) so file-close failures don’t get
dropped, and remove the no-op "var _ = fmt.Sprint" and the unused fmt import (or
reintroduce a real use) to avoid keeping fmt alive unnecessarily; locate these
changes around the TestAnalyzePath profiling block and the file-scope anchor in
profile_test.go.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7e3121a6-6b0b-43fa-bf9b-452b7f5c2b45

📥 Commits

Reviewing files that changed from the base of the PR and between 2dbea36 and 435e924.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (13)
  • go.mod
  • pkg/registry/file/applicationprofile_processor.go
  • pkg/registry/file/applicationprofile_processor_test.go
  • pkg/registry/file/cleanup.go
  • pkg/registry/file/containerprofile_processor.go
  • pkg/registry/file/dynamicpathdetector/analyze_endpoints.go
  • pkg/registry/file/dynamicpathdetector/analyzer.go
  • pkg/registry/file/dynamicpathdetector/tests/analyze_endpoints_test.go
  • pkg/registry/file/dynamicpathdetector/tests/analyze_opens_test.go
  • pkg/registry/file/dynamicpathdetector/tests/benchmark_test.go
  • pkg/registry/file/dynamicpathdetector/tests/coverage_test.go
  • pkg/registry/file/dynamicpathdetector/tests/profile_test.go
  • pkg/registry/file/dynamicpathdetector/types.go

Comment thread go.mod
Comment thread pkg/registry/file/cleanup.go Outdated
Comment thread pkg/registry/file/dynamicpathdetector/analyzer.go
Comment thread pkg/registry/file/dynamicpathdetector/tests/analyze_opens_test.go Outdated
Comment thread pkg/registry/file/dynamicpathdetector/tests/profile_test.go
@entlein entlein marked this pull request as draft April 24, 2026 22:29
@entlein entlein marked this pull request as ready for review April 25, 2026 16:24
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
pkg/registry/file/dynamicpathdetector/tests/profile_test.go (2)

21-31: Unused fmt import propped up by var _ = fmt.Sprint.

fmt isn't actually referenced anywhere in the file; line 173 exists solely to satisfy the unused-import check. If you want fmt available for future debugging, just delete both the import and the placeholder — adding the import back is a one-keystroke editor action and the current state pollutes test binary symbol tables and confuses future readers.

♻️ Suggested cleanup
 import (
 	"flag"
-	"fmt"
 	"os"
 	"path/filepath"
 	"runtime"
 	"runtime/pprof"
 	"testing"
 
 	"github.com/kubescape/storage/pkg/registry/file/dynamicpathdetector"
 )
-// Ensure fmt is kept imported when future debugging prints land here.
-var _ = fmt.Sprint

Also applies to: 173-173

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/tests/profile_test.go` around lines 21
- 31, Remove the unused fmt import and its placeholder var to avoid polluting
test symbols: delete the "fmt" entry in the import block and remove the dummy
statement var _ = fmt.Sprint (the placeholder) from profile_test.go so fmt is no
longer referenced; if you later need fmt for debugging you can re-add the import
and use it directly.

83-83: golangci-lint flags four unchecked Close() errors (errcheck).

In a profile-output test, a silently-failed close on cpuFile could yield a truncated/invalid cpu.out that's then consumed by go tool pprof without anyone noticing. Checking these errors keeps the lint clean and surfaces I/O failures early. The bail path at line 83 is the most important — if AnalyzePath fails mid-loop and the Close() then also fails, the original t.Fatalf argument is what the user wants to see, but the file should still be flushed/closed cleanly.

♻️ Suggested fix
 	for i := 0; i < *profileIters; i++ {
 		if _, err := analyzer.AnalyzePath(paths[i%len(paths)], identifier); err != nil {
 			pprof.StopCPUProfile()
-			cpuFile.Close()
+			_ = cpuFile.Close()
 			t.Fatalf("AnalyzePath iter %d: %v", i, err)
 		}
 	}
@@
 	pprof.StopCPUProfile()
-	cpuFile.Close()
+	if err := cpuFile.Close(); err != nil {
+		t.Fatalf("close cpu profile: %v", err)
+	}
@@
-	memFile.Close()
+	if err := memFile.Close(); err != nil {
+		t.Fatalf("close mem profile: %v", err)
+	}
@@
-	goFile.Close()
+	if err := goFile.Close(); err != nil {
+		t.Fatalf("close goroutine profile: %v", err)
+	}

Also applies to: 97-97, 108-108, 119-119

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/tests/profile_test.go` at line 83,
Unchecked Close() calls on the test files (cpuFile and the other profile files)
can hide I/O errors; replace each naked cpuFile.Close() (and the Close()s at the
other occurrences) with an error check like: if err := cpuFile.Close(); err !=
nil { t.Fatalf("closing cpuFile: %v", err) } so failures to flush/close are
reported. Ensure you apply the same pattern for the Close() calls referenced at
the other occurrences (lines noted around AnalyzePath usage) so
AnalyzePath-related failures do not mask a subsequent Close() error.
pkg/registry/file/dynamicpathdetector/tests/coverage_test.go (1)

222-243: Test passes but the loop count is misleading.

appThreshold == 1, and (per the analyzer's threshold-1 short-circuit) /api/users/0 already returns /api/* on the very first call — the loop iterating appThreshold+1 times suggests collapse needs two inserts when in fact one suffices. The final AnalyzePath call at line 239 is the assertion-bearing one, and that's correct, but the loop reads as "drive until threshold is exceeded" semantics that don't apply at threshold 1.

Optional: drop the loop or comment the intent (e.g., "ensure idempotence when the threshold-1 wildcard is hit repeatedly").

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/tests/coverage_test.go` around lines
222 - 243, The test TestCollapseConfig is misleading because appThreshold == 1
and analyzer.AnalyzePath short-circuits to the wildcard on the first call;
update the test to reflect that by removing the loop and calling
analyzer.AnalyzePath exactly once for the pre-collapse hits (or replace the loop
with a single call) and add a brief comment near appThreshold and the call to
analyzer.AnalyzePath explaining that a single call triggers the threshold-1
wildcard behavior; locate these in TestCollapseConfig and the variables
appThreshold, analyzer, and the CollapseConfig passed to
dynamicpathdetector.NewPathAnalyzerWithConfigs to make the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/registry/file/dynamicpathdetector/analyzer.go`:
- Around line 67-78: The hasPrefixAtBoundary function currently rejects a
user-supplied Prefix="/" for non-root paths; update hasPrefixAtBoundary to
special-case prefix == "/" and return true when pathPrefix begins with '/'
(i.e., treat "/" as a valid catch-all boundary) while keeping the existing
length/prefix checks for other values; this change ensures explicit {Prefix:"/"}
configs can override ua.defaultCfg/effectiveThreshold and will be picked up by
FindConfigForPath instead of being silently ignored.
- Around line 51-62: effectiveThreshold currently uses "len(c.Prefix) >=
bestLen" which causes last-equal-length config to win, while FindConfigForPath
uses "len(cfg.Prefix) > bestLen" (first-equal-length wins); make the tie-break
consistent by changing effectiveThreshold to use "len(c.Prefix) > bestLen" so it
mirrors FindConfigForPath and returns the same config/threshold for identical
Prefix values (refer to functions effectiveThreshold and FindConfigForPath and
update the comparison accordingly).
- Around line 245-264: The doc comment for createWildcardNode contradicts its
behavior: it claims the subtree collapses on the "second unique value" but the
code (the call site in PathAnalyzer that invokes createWildcardNode) triggers
immediate wildcarding on the first unique segment; update the comment on
createWildcardNode (and adjust any docstring on PathAnalyzer if present) to
state that it replaces all existing children with a wildcard immediately when
invoked, reflecting the current immediate-wildcard semantics tested by
TestAnalyzeOpensThreshold1ImmediateWildcard; alternatively, if you want the
original "allow one then collapse on second" behavior, change the call-site
condition to only call createWildcardNode when len(node.Children) >= 1, but
prefer updating the comment to match the implemented behavior.

---

Nitpick comments:
In `@pkg/registry/file/dynamicpathdetector/tests/coverage_test.go`:
- Around line 222-243: The test TestCollapseConfig is misleading because
appThreshold == 1 and analyzer.AnalyzePath short-circuits to the wildcard on the
first call; update the test to reflect that by removing the loop and calling
analyzer.AnalyzePath exactly once for the pre-collapse hits (or replace the loop
with a single call) and add a brief comment near appThreshold and the call to
analyzer.AnalyzePath explaining that a single call triggers the threshold-1
wildcard behavior; locate these in TestCollapseConfig and the variables
appThreshold, analyzer, and the CollapseConfig passed to
dynamicpathdetector.NewPathAnalyzerWithConfigs to make the change.

In `@pkg/registry/file/dynamicpathdetector/tests/profile_test.go`:
- Around line 21-31: Remove the unused fmt import and its placeholder var to
avoid polluting test symbols: delete the "fmt" entry in the import block and
remove the dummy statement var _ = fmt.Sprint (the placeholder) from
profile_test.go so fmt is no longer referenced; if you later need fmt for
debugging you can re-add the import and use it directly.
- Line 83: Unchecked Close() calls on the test files (cpuFile and the other
profile files) can hide I/O errors; replace each naked cpuFile.Close() (and the
Close()s at the other occurrences) with an error check like: if err :=
cpuFile.Close(); err != nil { t.Fatalf("closing cpuFile: %v", err) } so failures
to flush/close are reported. Ensure you apply the same pattern for the Close()
calls referenced at the other occurrences (lines noted around AnalyzePath usage)
so AnalyzePath-related failures do not mask a subsequent Close() error.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f0c03bac-0e90-4505-8f54-99f5702de801

📥 Commits

Reviewing files that changed from the base of the PR and between 435e924 and 814025e.

📒 Files selected for processing (6)
  • pkg/registry/file/cleanup.go
  • pkg/registry/file/cleanup_test.go
  • pkg/registry/file/dynamicpathdetector/analyzer.go
  • pkg/registry/file/dynamicpathdetector/tests/analyze_opens_test.go
  • pkg/registry/file/dynamicpathdetector/tests/coverage_test.go
  • pkg/registry/file/dynamicpathdetector/tests/profile_test.go

Comment thread pkg/registry/file/dynamicpathdetector/analyzer.go
Comment thread pkg/registry/file/dynamicpathdetector/analyzer.go
Comment thread pkg/registry/file/dynamicpathdetector/analyzer.go
@matthyx matthyx moved this from WIP to Needs Reviewer in KS PRs tracking Apr 27, 2026
Comment thread pkg/registry/file/dynamicpathdetector/analyze_endpoints.go Outdated
entlein added a commit to k8sstormcenter/storage that referenced this pull request Apr 29, 2026
… longer contaminates unrelated ports (#21)

A single :0 (wildcard-port) entry was forcing every other endpoint in
the slice through `rewritePort` and being analyzed under the wildcard
sub-tree, even when its concrete port was unrelated. Result: input like
[":0/health", ":443/login"] collapsed to [":0/health", ":0/login"],
silently widening the captured surface area.

Fix: drop the global wildcardPort detection. Each endpoint is now
analyzed under its own port; cross-port folding is the explicit job of
MergeDuplicateEndpoints, which only merges same-(path, direction)
specific-port siblings into an explicit :0 entry. The merge is now
symmetric — wildcard-after-specific sweeps `seen` and absorbs prior
specific-port entries; specific-after-wildcard folds in as before.

Tests:
  - 3 new regression tests for the contamination + reverse-order paths
  - existing "Test with 0 port" updated to encode correct behavior

Flagged in upstream review on kubescape#316.

Co-authored-by: Entlein <eineintlein@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
entlein added 10 commits April 29, 2026 09:46
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
…ldcardSegment, I want to test this one e2e against actual component test before merge

Signed-off-by: entlein <einentlein@gmail.com>
…ard, this requires functional requirements as it could defeat the intention of a user, if they really want just two subpath elements, to be discussed

Signed-off-by: entlein <einentlein@gmail.com>
… /etc higher to test the functionality

Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
entlein and others added 2 commits April 29, 2026 09:46
Signed-off-by: entlein <einentlein@gmail.com>
Also benchmarking it:
 Local benchmark (AnalyzeEndpoints, realistic shape: 1 wildcard out of N, 3 runs each, median ns/op)

  ┌──────┬────────────────┬────────────┬─────────┬───────────────────────┐
  │  N   │ Pre (cdbf491) │ Post (fix) │ Δ ns/op │  allocs/op pre→post   │
  ├──────┼────────────────┼────────────┼─────────┼───────────────────────┤
  │ 8    │ 8 832          │ 9 083      │ +2.8%   │ 125 → 125             │
  ├──────┼────────────────┼────────────┼─────────┼───────────────────────┤
  │ 64   │ 78 050         │ 85 192     │ +9.2%   │ 980 → 980             │
  ├──────┼────────────────┼────────────┼─────────┼───────────────────────┤
  │ 256  │ 291 932        │ 347 453    │ +19.0%  │ 3869 → 3869           │
  ├──────┼────────────────┼────────────┼─────────┼───────────────────────┤
  │ 1024 │ 1 503 729      │ 1 535 156  │ +2.1%   │ 29 674 → 27 634       │
  └──────┴────────────────┴────────────┴─────────┴───────────────────────┘

Signed-off-by: entlein <einentlein@gmail.com>
@entlein entlein force-pushed the upstream-pr/analyzer-zero-alloc branch from d110a20 to cc2a2ec Compare April 29, 2026 07:47
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/registry/file/dynamicpathdetector/tests/analyze_endpoints_test.go`:
- Line 15: The test shares one PathAnalyzer created with
NewPathAnalyzerWithConfigs across table-driven subtests, but AnalyzeEndpoints
mutates the analyzer trie making tests order-dependent; update the test so each
subtest constructs a fresh analyzer (call NewPathAnalyzerWithConfigs inside each
t.Run/loop iteration) before calling AnalyzeEndpoints, and apply the same change
to the other table block noted (the cases around lines 176-185) so every subtest
uses its own PathAnalyzer instance.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 265f9a79-5800-45e3-befc-cde925b8e6a7

📥 Commits

Reviewing files that changed from the base of the PR and between 814025e and d110a20.

📒 Files selected for processing (3)
  • pkg/registry/file/dynamicpathdetector/analyze_endpoints.go
  • pkg/registry/file/dynamicpathdetector/analyzer.go
  • pkg/registry/file/dynamicpathdetector/tests/analyze_endpoints_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/registry/file/dynamicpathdetector/analyze_endpoints.go
  • pkg/registry/file/dynamicpathdetector/analyzer.go

Comment thread pkg/registry/file/dynamicpathdetector/tests/analyze_endpoints_test.go Outdated
Comment thread pkg/registry/file/dynamicpathdetector/analyze_endpoints.go Outdated
Comment thread pkg/registry/file/dynamicpathdetector/tests/analyze_endpoints_test.go Outdated
entlein added a commit to k8sstormcenter/storage that referenced this pull request Apr 29, 2026
HTTPEndpoint.Equal distinguishes endpoints by (Endpoint, Direction,
Internal), but MergeDuplicateEndpoints was keying only on
(Endpoint, Direction). Three places leaked the bug:

  1. getEndpointKey — duplicate-detection at the top of the loop
     would collapse two endpoints that differ only in Internal.
  2. The wildcard-after-specific sweep — a :0/x Internal=false
     wildcard could absorb a previously-recorded :443/x Internal=true
     sibling because the sweep only checked path + direction.
  3. The wildcardKey lookup for specific-after-wildcard ordering —
     same omission, so a specific-port entry could fold into a
     wildcard with a different Internal.

Result: profile entries that the runtime semantically considers
distinct (Internal=true vs false) were silently merged into one,
losing the internal/external distinction for that path.

Fix: include Internal in getEndpointKey, the sweep predicate, and
wildcardKey, all keyed identically so the map lookups stay
consistent.

4 new regression tests pin all three spots plus a positive sanity
check that matching-Internal merges still happen. Existing tests
unchanged.

Flagged on upstream review of kubescape#316.

Co-authored-by: Entlein <eineintlein@gmail.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (7)
pkg/registry/file/dynamicpathdetector/tests/benchmark_test.go (1)

75-82: Remove or restore the commented-out benchmark.

Commented-out code adds noise and can become stale. Either delete BenchmarkCompareDynamic if it's no longer needed, or uncomment it with a TODO if it's temporarily disabled for a known reason.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/tests/benchmark_test.go` around lines
75 - 82, Remove or restore the commented-out benchmark function
BenchmarkCompareDynamic: either delete the entire commented block if the
benchmark is no longer needed, or uncomment it and add a short TODO comment
explaining why it was temporarily disabled and when to re-enable; ensure the
restored benchmark runs correctly by keeping the call to
dynamicpathdetector.CompareDynamic(dynamicPath, regularPath) and the
b.ReportAllocs() invocation so the test harness behaves as intended.
pkg/registry/file/dynamicpathdetector/tests/profile_test.go (3)

172-173: Remove the fmt.Sprint import-keeper hack.

If fmt is only used for future debugging, remove both the import and this line. When debugging is needed, the import can be added back. This pattern adds noise and signals incomplete cleanup.

🧹 Suggested cleanup
-// Ensure fmt is kept imported when future debugging prints land here.
-var _ = fmt.Sprint

And remove "fmt" from imports if not used elsewhere.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/tests/profile_test.go` around lines 172
- 173, Remove the unused fmt import-keeper hack by deleting the placeholder
reference `var _ = fmt.Sprint` from the test file and then remove `"fmt"` from
the import block in `profile_test.go` (or re-add the import only when real
debugging prints are introduced); ensure there are no remaining references to
fmt so the file compiles cleanly.

52-52: Use NewPathAnalyzerWithConfigs for API consistency.

This file uses the old NewPathAnalyzer(100) constructor (lines 52, 138, 165), while the rest of the test suite has been updated to use NewPathAnalyzerWithConfigs. For consistent behavior and to exercise the same code paths as production, consider updating these calls.

♻️ Suggested change
-	analyzer := dynamicpathdetector.NewPathAnalyzer(100)
+	analyzer := dynamicpathdetector.NewPathAnalyzerWithConfigs(dynamicpathdetector.OpenDynamicThreshold, nil)

Apply similarly to lines 138 and 165.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/tests/profile_test.go` at line 52,
Replace usages of NewPathAnalyzer(100) with NewPathAnalyzerWithConfigs(...) to
match the updated API: locate calls to dynamicpathdetector.NewPathAnalyzer in
profile_test.go (currently at three places) and change them to
dynamicpathdetector.NewPathAnalyzerWithConfigs passing the equivalent
configuration used previously (e.g., the same numeric limit or wrapped in the
PathAnalyzer config struct expected by NewPathAnalyzerWithConfigs) so the tests
exercise the same code path as production.

97-97: Handle Close() errors per static analysis.

Multiple Close() calls have unchecked return values (lines 83, 97, 108, 119). While unlikely to fail, ignoring file close errors can mask I/O issues, especially when writing profiles to disk.

♻️ Suggested fix pattern
-	cpuFile.Close()
+	if err := cpuFile.Close(); err != nil {
+		t.Errorf("close cpu profile: %v", err)
+	}

Or use defer with a named error variable if preferred.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/tests/profile_test.go` at line 97, The
bare cpuFile.Close() (and the other unchecked Close() calls in profile_test.go)
should check and handle the returned error; replace calls like cpuFile.Close()
with error-checked versions such as if err := cpuFile.Close(); err != nil {
t.Fatalf("closing cpuFile: %v", err) } or use require.NoError(t, err) from the
test helper to fail the test on I/O errors, or convert these to deferred closes
using a named error check pattern if preferred.
pkg/registry/file/applicationprofile_processor_test.go (1)

347-354: Hardcoded /etc threshold may drift from production defaults.

The test hardcodes etcThreshold := 100 but production uses dynamicpathdetector.DefaultCollapseConfigs. If the production config for /etc changes, this test won't catch the behavioral change.

Consider deriving the threshold from the production config or explicitly documenting the test's intent.

♻️ Alternative: derive from config
// Option 1: Use a helper to find the /etc config threshold
etcConfig := dynamicpathdetector.FindConfigForPathFromSlice("/etc/file", dynamicpathdetector.DefaultCollapseConfigs, dynamicpathdetector.DefaultCollapseConfig)
etcThreshold := etcConfig.Threshold

// Option 2: Document the intentional decoupling
// etcThreshold is intentionally hardcoded to test a specific boundary,
// independent of production DefaultCollapseConfigs
etcThreshold := 100
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/applicationprofile_processor_test.go` around lines 347 -
354, The test currently hardcodes etcThreshold := 100 which can drift from
production; change the test to derive the threshold from the production collapse
configs by calling dynamicpathdetector.FindConfigForPathFromSlice("/etc/file",
dynamicpathdetector.DefaultCollapseConfigs,
dynamicpathdetector.DefaultCollapseConfig) and use the returned config's
Threshold when building opens (replace etcThreshold usage in
applicationprofile_processor_test.go), or if you intentionally want a fixed
boundary, add a short comment explaining the deliberate decoupling and why 100
is chosen; reference the etcThreshold variable,
dynamicpathdetector.DefaultCollapseConfigs,
dynamicpathdetector.DefaultCollapseConfig and
dynamicpathdetector.FindConfigForPathFromSlice in your change.
pkg/registry/file/dynamicpathdetector/analyze_endpoints.go (1)

221-225: Replace fmt.Println with structured logging or silent failure.

Using fmt.Println in library code writes to stdout unconditionally, which can pollute logs in production. Consider using a logger or returning the error (even if ignored upstream).

♻️ Suggested change
 	rawJSON, err := json.Marshal(existingHeaders)
 	if err != nil {
-		fmt.Println("Error marshaling JSON:", err)
 		return
 	}

If logging is desired, inject a logger or use the project's standard logging approach.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/analyze_endpoints.go` around lines 221
- 225, The use of fmt.Println in the json.Marshal error branch should be
replaced with the project's logger or an error return instead of printing to
stdout; in the function containing rawJSON := json.Marshal(existingHeaders) (in
analyze_endpoints.go) remove fmt.Println("Error marshaling JSON:", err) and
either call the injected logger (e.g., logger.Error or processLogger.Error) with
a contextual message plus err, or propagate the error back to the caller (return
err) so callers can decide how to handle it—update the function signature if you
choose to return the error and ensure callers handle it accordingly.
pkg/registry/file/dynamicpathdetector/types.go (1)

38-38: Clarify or correct the comment for /app threshold.

The comment says "any variation under /app collapses immediately" but the threshold is 50, not 1. With threshold 50, collapse only happens after 50+ unique children are seen at a node level. If "immediately" is the intent, the threshold should be 1 (as seen in testCollapseConfigs in the test file).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/registry/file/dynamicpathdetector/types.go` at line 38, The comment for
the entry {Prefix: "/app", Threshold: 50} is incorrect: either update the
comment to state that collapse occurs only after 50+ unique children at that
node, or change Threshold from 50 to 1 if the intent is to "collapse
immediately" (as used in testCollapseConfigs); locate the struct/element
{Prefix: "/app", Threshold: 50} and make the corresponding fix so comment and
threshold behavior match.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/registry/file/dynamicpathdetector/analyze_endpoints.go`:
- Around line 102-109: splitEndpointPortAndPath currently assumes inputs start
with ":" (as done by AnalyzeURL) but lacks defensive checks; add an explicit
guard at the top of splitEndpointPortAndPath to validate the input: if
!strings.HasPrefix(endpoint, ":") then treat the whole input as a path and
return "" for the port and the original endpoint (or ensure it begins with "/"
before returning), or alternatively document in the function comment that the
contract requires a ":" prefix and leave the call-sites to enforce it
(referencing AnalyzeURL) — implement the former (guard clause) so callers that
pass "/health" get ("", "/health") deterministically.

---

Nitpick comments:
In `@pkg/registry/file/applicationprofile_processor_test.go`:
- Around line 347-354: The test currently hardcodes etcThreshold := 100 which
can drift from production; change the test to derive the threshold from the
production collapse configs by calling
dynamicpathdetector.FindConfigForPathFromSlice("/etc/file",
dynamicpathdetector.DefaultCollapseConfigs,
dynamicpathdetector.DefaultCollapseConfig) and use the returned config's
Threshold when building opens (replace etcThreshold usage in
applicationprofile_processor_test.go), or if you intentionally want a fixed
boundary, add a short comment explaining the deliberate decoupling and why 100
is chosen; reference the etcThreshold variable,
dynamicpathdetector.DefaultCollapseConfigs,
dynamicpathdetector.DefaultCollapseConfig and
dynamicpathdetector.FindConfigForPathFromSlice in your change.

In `@pkg/registry/file/dynamicpathdetector/analyze_endpoints.go`:
- Around line 221-225: The use of fmt.Println in the json.Marshal error branch
should be replaced with the project's logger or an error return instead of
printing to stdout; in the function containing rawJSON :=
json.Marshal(existingHeaders) (in analyze_endpoints.go) remove
fmt.Println("Error marshaling JSON:", err) and either call the injected logger
(e.g., logger.Error or processLogger.Error) with a contextual message plus err,
or propagate the error back to the caller (return err) so callers can decide how
to handle it—update the function signature if you choose to return the error and
ensure callers handle it accordingly.

In `@pkg/registry/file/dynamicpathdetector/tests/benchmark_test.go`:
- Around line 75-82: Remove or restore the commented-out benchmark function
BenchmarkCompareDynamic: either delete the entire commented block if the
benchmark is no longer needed, or uncomment it and add a short TODO comment
explaining why it was temporarily disabled and when to re-enable; ensure the
restored benchmark runs correctly by keeping the call to
dynamicpathdetector.CompareDynamic(dynamicPath, regularPath) and the
b.ReportAllocs() invocation so the test harness behaves as intended.

In `@pkg/registry/file/dynamicpathdetector/tests/profile_test.go`:
- Around line 172-173: Remove the unused fmt import-keeper hack by deleting the
placeholder reference `var _ = fmt.Sprint` from the test file and then remove
`"fmt"` from the import block in `profile_test.go` (or re-add the import only
when real debugging prints are introduced); ensure there are no remaining
references to fmt so the file compiles cleanly.
- Line 52: Replace usages of NewPathAnalyzer(100) with
NewPathAnalyzerWithConfigs(...) to match the updated API: locate calls to
dynamicpathdetector.NewPathAnalyzer in profile_test.go (currently at three
places) and change them to dynamicpathdetector.NewPathAnalyzerWithConfigs
passing the equivalent configuration used previously (e.g., the same numeric
limit or wrapped in the PathAnalyzer config struct expected by
NewPathAnalyzerWithConfigs) so the tests exercise the same code path as
production.
- Line 97: The bare cpuFile.Close() (and the other unchecked Close() calls in
profile_test.go) should check and handle the returned error; replace calls like
cpuFile.Close() with error-checked versions such as if err := cpuFile.Close();
err != nil { t.Fatalf("closing cpuFile: %v", err) } or use require.NoError(t,
err) from the test helper to fail the test on I/O errors, or convert these to
deferred closes using a named error check pattern if preferred.

In `@pkg/registry/file/dynamicpathdetector/types.go`:
- Line 38: The comment for the entry {Prefix: "/app", Threshold: 50} is
incorrect: either update the comment to state that collapse occurs only after
50+ unique children at that node, or change Threshold from 50 to 1 if the intent
is to "collapse immediately" (as used in testCollapseConfigs); locate the
struct/element {Prefix: "/app", Threshold: 50} and make the corresponding fix so
comment and threshold behavior match.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 571be628-6cdf-4c4b-857d-4dcd79de8c70

📥 Commits

Reviewing files that changed from the base of the PR and between d110a20 and dd028cf.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (14)
  • go.mod
  • pkg/registry/file/applicationprofile_processor.go
  • pkg/registry/file/applicationprofile_processor_test.go
  • pkg/registry/file/cleanup.go
  • pkg/registry/file/cleanup_test.go
  • pkg/registry/file/containerprofile_processor.go
  • pkg/registry/file/dynamicpathdetector/analyze_endpoints.go
  • pkg/registry/file/dynamicpathdetector/analyzer.go
  • pkg/registry/file/dynamicpathdetector/tests/analyze_endpoints_test.go
  • pkg/registry/file/dynamicpathdetector/tests/analyze_opens_test.go
  • pkg/registry/file/dynamicpathdetector/tests/benchmark_test.go
  • pkg/registry/file/dynamicpathdetector/tests/coverage_test.go
  • pkg/registry/file/dynamicpathdetector/tests/profile_test.go
  • pkg/registry/file/dynamicpathdetector/types.go
✅ Files skipped from review due to trivial changes (2)
  • go.mod
  • pkg/registry/file/containerprofile_processor.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/registry/file/applicationprofile_processor.go
  • pkg/registry/file/dynamicpathdetector/tests/coverage_test.go
  • pkg/registry/file/cleanup_test.go
  • pkg/registry/file/dynamicpathdetector/analyzer.go

Comment thread pkg/registry/file/dynamicpathdetector/analyze_endpoints.go
Copy link
Copy Markdown
Contributor

@matthyx matthyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 regressions and some nits

Comment thread pkg/registry/file/dynamicpathdetector/analyzer.go Outdated
dynamicIndex++
}
dynamicSegment := dynamicPath[dynamicSegmentStart:dynamicIndex]
dynamicSegments := strings.Split(dynamicPath, "/")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance regression: the old implementation was a manual index scan — zero heap allocations per call. The new code allocates two []string slices on every invocation. And this is called on every file-open per R0002.
Solution: port the old two-pointer approach to handle *.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@entlein I'd like to fix this one

Comment thread pkg/registry/file/dynamicpathdetector/types.go Outdated
Comment thread pkg/registry/file/dynamicpathdetector/types.go Outdated
entlein pushed a commit to k8sstormcenter/storage that referenced this pull request May 1, 2026
…faultCollapseConfigs API

Address Matthias's review feedback on PR kubescape#316:

1. Behavior regression in CompareDynamic — trailing WildcardIdentifier (*)
   used to match zero-or-more segments, so '/etc/*' returned true against
   the bare '/etc' path. That widens R0002's blind spot to include
   directory-entry tampering of the parent itself. Standard glob semantics
   require trailing * to consume one or more segments. Mid-path wildcards
   keep zero-or-more semantics (preserves /a/*/b matching /a/b).

   Added TestCompareDynamic_TrailingWildcardSemantics covering /etc/*,
   /var/log/*, /* and bare '/' edges to pin the new boundary.

2. Unexport defaultCollapseConfigs and add a DefaultCollapseConfigs()
   accessor that returns a defensive copy. Prevents callers from
   accidentally mutating the package-level slice. Updates the two
   internal callers (applicationprofile_processor.go,
   containerprofile_processor.go).

3. Comment fix on the /app entry — collapse happens at 50 unique
   children, not 'immediately'.

Performance regression (allocation in CompareDynamic via strings.Split)
deferred to a separate commit — needs a benchmark to prove the rewrite
matches the pre-wildcard zero-alloc baseline.
entlein pushed a commit to k8sstormcenter/storage that referenced this pull request May 1, 2026
…faultCollapseConfigs API

Address Matthias's review feedback on PR kubescape#316:

1. Behavior regression in CompareDynamic — trailing WildcardIdentifier (*)
   used to match zero-or-more segments, so '/etc/*' returned true against
   the bare '/etc' path. That widens R0002's blind spot to include
   directory-entry tampering of the parent itself. Standard glob semantics
   require trailing * to consume one or more segments. Mid-path wildcards
   keep zero-or-more semantics (preserves /a/*/b matching /a/b).

   Added TestCompareDynamic_TrailingWildcardSemantics covering /etc/*,
   /var/log/*, /* and bare '/' edges to pin the new boundary.

2. Unexport defaultCollapseConfigs and add a DefaultCollapseConfigs()
   accessor that returns a defensive copy. Prevents callers from
   accidentally mutating the package-level slice. Updates the two
   internal callers (applicationprofile_processor.go,
   containerprofile_processor.go).

3. Comment fix on the /app entry — collapse happens at 50 unique
   children, not 'immediately'.

Performance regression (allocation in CompareDynamic via strings.Split)
deferred to a separate commit — needs a benchmark to prove the rewrite
matches the pre-wildcard zero-alloc baseline.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/registry/file/dynamicpathdetector/analyzer.go`:
- Around line 406-422: FindConfigForPath currently returns a pointer to internal
data (ua.configs / ua.defaultCfg) which lets callers mutate analyzer state;
change it to return a value copy instead: update the function signature from
FindConfigForPath(path string) *CollapseConfig to return CollapseConfig (value)
and return a copy of the selected config (e.g., dereference the pointer before
returning or return ua.defaultCfg when best==nil). Also update all callers of
FindConfigForPath to accept a CollapseConfig value rather than a *CollapseConfig
so the analyzer's internal configs remain immutable from callers.
- Around line 357-360: The CompareDynamic function and its helper
compareSegments incorrectly treat unnormalized paths with trailing slashes as
matching (e.g., CompareDynamic("/etc/*", "/etc/") returns true); fix by
normalizing inputs inside CompareDynamic—trim trailing slashes and remove empty
segments (or explicitly handle empty-string segments) before calling
compareSegments so a trailing slash does not create a bogus empty segment, and
add the unit test case CompareDynamic("/etc/*", "/etc/", false) to cover this
scenario; alternatively, if you prefer not to normalize here, document in
CompareDynamic that callers must pass cleaned paths and add the test to ensure
callers conform.

In `@pkg/registry/file/dynamicpathdetector/types.go`:
- Around line 49-54: DefaultCollapseConfig is a misleading exported constant
because NewPathAnalyzerWithConfigs uses an instance defaultThreshold (not always
OpenDynamicThreshold); replace the exported var with an instance-aware provider:
either make the var unexported (defaultCollapseConfig) and update internal
callers, or add a constructor helper DefaultCollapseConfigFor(threshold int)
that returns CollapseConfig{Prefix: "/", Threshold: threshold} and update usages
(including where endpoints are built in applicationprofile_processor.go which
currently expects EndpointDynamicThreshold) to call the new helper or the
analyzer method so the fallback config matches the analyzer's threshold.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d11aa814-3fb2-413e-85cc-8f8935b60688

📥 Commits

Reviewing files that changed from the base of the PR and between dd028cf and 7380d7b.

📒 Files selected for processing (5)
  • pkg/registry/file/applicationprofile_processor.go
  • pkg/registry/file/containerprofile_processor.go
  • pkg/registry/file/dynamicpathdetector/analyzer.go
  • pkg/registry/file/dynamicpathdetector/tests/coverage_test.go
  • pkg/registry/file/dynamicpathdetector/types.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/registry/file/containerprofile_processor.go

Comment thread pkg/registry/file/dynamicpathdetector/analyzer.go Outdated
Comment thread pkg/registry/file/dynamicpathdetector/analyzer.go Outdated
Comment thread pkg/registry/file/dynamicpathdetector/types.go
Comment thread pkg/registry/file/dynamicpathdetector/analyzer.go
@entlein entlein force-pushed the upstream-pr/analyzer-zero-alloc branch from 7380d7b to dd028cf Compare May 1, 2026 15:17
entlein added 2 commits May 1, 2026 18:43
…xplicit tests and behavior for empty versus wildcard, restores the benchmark test and addresses the nitpick comments - does not include the performance issue re the zero-alloc

Signed-off-by: entlein <einentlein@gmail.com>
…ed by the Rabbit

Signed-off-by: entlein <einentlein@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Needs Reviewer

Development

Successfully merging this pull request may close these issues.

2 participants