Skip to content

netlink List implementation#38

Open
aojea wants to merge 2 commits intokubernetes-sigs:masterfrom
aojea:netlink
Open

netlink List implementation#38
aojea wants to merge 2 commits intokubernetes-sigs:masterfrom
aojea:netlink

Conversation

@aojea
Copy link
Copy Markdown
Contributor

@aojea aojea commented Feb 24, 2026

Implement List() method using netlink directly to avoid the overhead of
calling the nft user space progam and parsing its output. Also, reducing
the exposure to bugs or problems caused by skew versions of the binary
when running inside containers.

@aojea aojea requested a review from danwinship February 24, 2026 18:31
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 24, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: aojea
Once this PR has been reviewed and has the lgtm label, please assign danwinship for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 24, 2026
@aojea aojea force-pushed the netlink branch 2 times, most recently from fcd74dc to 93119c2 Compare February 25, 2026 08:42
Comment thread nftables.go
@aojea aojea force-pushed the netlink branch 2 times, most recently from 5dd8893 to f66b4f6 Compare February 25, 2026 11:26
@aojea aojea changed the title [WIP] netlink List implementation netlink List implementation Feb 25, 2026
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 25, 2026
@danwinship
Copy link
Copy Markdown
Collaborator

To get further improvements, we would need to reduce pointers, and
change knftables.Rule to use values instead of pointers for Handle, Index, etc.
This would save millions of allocations at scale but is a breaking change.

In general if you are listing large numbers of rules, you're probably doing something wrong anyway, in terms of good nftables architecture. (kube-proxy never calls ListRules. Calico does, and I haven't looked into exactly what they're doing, but I suspect it's mostly because they were trying to make the nftables backend work roughly the same way as the iptables backend.)

We could optimize the Handle allocation without breaking API by allocating a single []int along with the []*Rule and allocating all of the handles out of that. (Though you'd want to fix the code to pre-allocate those arrays to the right length first in that case to avoid blocking GC of every intermediate version of the handle array.)

This would be less memory-efficient than the current system in the case where someone lists a lot of rules and then keeps a handle to just one of them, but again, "probably doing something wrong anyway".

(Despite what the comment says, neither the old nor the new code sets Index from ListRules)

Comment thread netlink.go Outdated
Comment thread netlink.go Outdated
Comment thread netlink.go Outdated
Comment thread netlink.go Outdated
Comment thread netlink.go Outdated
Comment thread netlink.go Outdated
Comment thread netlink.go
Comment thread netlink.go Outdated
Comment thread netlink.go Outdated
Comment thread nftables.go Outdated
Comment thread netlink.go Outdated
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 4, 2026
@aojea aojea force-pushed the netlink branch 4 times, most recently from 87ac32d to 45d427b Compare March 6, 2026 09:07
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 6, 2026
@aojea aojea force-pushed the netlink branch 3 times, most recently from 41f4670 to 94c2b8e Compare March 6, 2026 15:15
aojea added 2 commits March 17, 2026 07:22
Implement List() method using netlink directly to avoid the overhead of
calling the nft user space progam and parsing its output. Also, reducing
the exposure to bugs or problems caused by skew versions of the binary
when running inside containers.

benchmark netlink vs nft

/hack/benchmark.sh
Running benchmarks in a new network namespace...
goos: linux
goarch: amd64
pkg: sigs.k8s.io/knftables
cpu: Intel(R) Xeon(R) CPU @ 2.60GHz
BenchmarkListChains_NFT_10
BenchmarkListChains_NFT_10-48                        435           2727261 ns/op           72902 B/op        369 allocs/op
BenchmarkListChains_NFT_100
BenchmarkListChains_NFT_100-48                       252           4751280 ns/op          238012 B/op       2545 allocs/op
BenchmarkListChains_NFT_1000
BenchmarkListChains_NFT_1000-48                       45          26322812 ns/op         1771818 B/op      24162 allocs/op
BenchmarkListChains_NFT_10000
BenchmarkListChains_NFT_10000-48                       5         249445099 ns/op        16737220 B/op     240197 allocs/op
BenchmarkListChains_Netlink_10
BenchmarkListChains_Netlink_10-48                   6103            180090 ns/op           24133 B/op        188 allocs/op
:x
BenchmarkListChains_Netlink_100-48                  2169            526953 ns/op           88200 B/op       1062 allocs/op
BenchmarkListChains_Netlink_1000
BenchmarkListChains_Netlink_1000-48                  312           3816398 ns/op          766180 B/op       9731 allocs/op
BenchmarkListChains_Netlink_10000
BenchmarkListChains_Netlink_10000-48                  18          63964912 ns/op         8220568 B/op      96582 allocs/op
PASS"

* Speed: 4x to 15x faster
  - 10 chains: ~0.18ms vs ~2.72ms (15x faster)
  - 10k chains: ~63.9ms vs ~249.4ms (4x faster)
* Memory: ~50% to ~66% less memory used per operation
  - 10 chains: ~24KB vs ~72KB
  - 10k chains: ~8.2MB vs ~16.7MB
* Garbage Collection: ~50% to ~60% reduction in object allocations
  - 10k chains: ~96.5k allocs vs ~240k allocs
@aojea
Copy link
Copy Markdown
Contributor Author

aojea commented Mar 17, 2026

@danwinship what do we want to do with this? do we wait?

@danwinship
Copy link
Copy Markdown
Collaborator

I think we should backport the --terse fix to 1.35 and 1.34, and then work on netlink List() support (and maybe more) for 1.37.

@aojea
Copy link
Copy Markdown
Contributor Author

aojea commented Mar 17, 2026

I see you already tagged itv0.0.21 with the terse fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants