
feat: support agent governance#11446

Open
kylewanginchina wants to merge 44 commits into main from support-agent-governance

Conversation

@kylewanginchina
Contributor

This PR is for:

  • Agent

Support agent governance

Checklist

  • Added unit test.

Backport to branches

@kylewanginchina kylewanginchina force-pushed the support-agent-governance branch 5 times, most recently from 6ea8ccf to 8980494 Compare March 11, 2026 15:55
@deepflowio deepflowio deleted a comment from claude bot Mar 12, 2026
@deepflowio deepflowio deleted a comment from claude bot Mar 12, 2026
@deepflowio deepflowio deleted a comment from claude bot Mar 12, 2026
@kylewanginchina kylewanginchina marked this pull request as ready for review March 12, 2026 03:19
lzf575
lzf575 previously approved these changes Mar 12, 2026
Contributor

@lzf575 lzf575 left a comment

The ingest part looks fine.

@kylewanginchina kylewanginchina force-pushed the support-agent-governance branch 2 times, most recently from 9d6be57 to 4b22b71 Compare March 12, 2026 08:09
@xiaochaoren1
Contributor

xiaochaoren1 commented Mar 12, 2026

For translation.go, you only need to add it to INT_ENUM_PEER_TAG; nothing else needs to change.

@kylewanginchina kylewanginchina force-pushed the support-agent-governance branch from d74d5d1 to eed5377 Compare March 12, 2026 15:25
kylewanginchina and others added 10 commits March 13, 2026 00:05
Add inputs.proc.ai_agent config section with http_endpoints
(default: /v1/chat/completions, /v1/embeddings), max_payload_size
(default: 1MB), and file_io_enabled. Forward to LogParserConfig.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
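
A minimal sketch of how the `inputs.proc.ai_agent` defaults from the commit above could be modelled. The struct name `AiAgentConfig` and the field types are hypothetical, the `file_io_enabled` default is not stated in the commit and is assumed to be `false` here, and "1MB" is assumed to mean 1 MiB.

```rust
/// Hypothetical mirror of the `inputs.proc.ai_agent` config section.
/// Field names follow the commit message; the types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct AiAgentConfig {
    /// URL paths treated as LLM API endpoints.
    pub http_endpoints: Vec<String>,
    /// Maximum payload size kept for AI Agent flows, in bytes.
    pub max_payload_size: usize,
    /// Whether file I/O collection for AI Agent processes is enabled.
    pub file_io_enabled: bool,
}

impl Default for AiAgentConfig {
    fn default() -> Self {
        Self {
            http_endpoints: vec![
                "/v1/chat/completions".to_string(),
                "/v1/embeddings".to_string(),
            ],
            // Assumption: "1MB" is interpreted as 1 MiB.
            max_payload_size: 1024 * 1024,
            // Assumption: the commit does not state this default.
            file_io_enabled: false,
        }
    }
}
```
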
Add BIZ_TYPE_DEFAULT (0) and BIZ_TYPE_AI_AGENT (1) constants for
process classification in AI agent governance.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Stub module for AI Agent governance. Returns no-ops in open source.
The real implementation is provided by the enterprise build of the enterprise-utils crate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enterprise-gated hook calls enterprise_utils::ai_agent::match_ai_agent_endpoint
to detect LLM API URLs. Sets endpoint and biz_type=AI_AGENT on match.
Priority: WASM/biz_field > AI Agent detection > http_endpoint config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
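
The priority order stated above (WASM/biz_field, then AI Agent detection, then http_endpoint config) can be sketched roughly as follows. This is an illustration only: the real `match_ai_agent_endpoint` lives in the enterprise crate and its signature is not shown in this PR, so the prefix-matching stand-in, the `Resolved` struct, and the `resolve` function are all hypothetical.

```rust
pub const BIZ_TYPE_DEFAULT: u8 = 0;
pub const BIZ_TYPE_AI_AGENT: u8 = 1;

/// Hypothetical result of endpoint resolution.
#[derive(Debug, PartialEq)]
pub struct Resolved {
    pub endpoint: Option<String>,
    pub biz_type: u8,
}

/// Hypothetical stand-in for the enterprise matcher.
/// Assumption: a request path matches when it starts with a configured endpoint.
pub fn match_ai_agent_endpoint(path: &str, endpoints: &[&str]) -> Option<String> {
    endpoints
        .iter()
        .find(|e| path.starts_with(**e))
        .map(|e| (*e).to_string())
}

/// Apply the priority: WASM/biz_field > AI Agent detection > http_endpoint config.
/// `custom` carries an endpoint and biz_type already set by WASM or biz_field.
pub fn resolve(
    custom: Option<(&str, u8)>,
    path: &str,
    ai_endpoints: &[&str],
    configured: Option<&str>,
) -> Resolved {
    if let Some((ep, biz_type)) = custom {
        return Resolved { endpoint: Some(ep.to_string()), biz_type };
    }
    if let Some(ep) = match_ai_agent_endpoint(path, ai_endpoints) {
        return Resolved { endpoint: Some(ep), biz_type: BIZ_TYPE_AI_AGENT };
    }
    Resolved {
        endpoint: configured.map(str::to_string),
        biz_type: BIZ_TYPE_DEFAULT,
    }
}
```
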
AI Agent processes will be synced to controller with biz_type=1 (AI_AGENT).
Field plumbing only — registry integration in a later task.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AI Agent flows use ai_agent_max_payload_size (1MB default) instead of
l7_log_packet_size to preserve full LLM request/response bodies for
governance audit.

Changes:
- Add is_ai_agent flag to FlowLog (enterprise-gated) to track flows
  identified as AI Agent traffic via biz_type detection
- In l7_parse_log, use ai_agent_max_payload_size for payload truncation
  when the flow is marked as AI Agent
- After parse_payload returns, check parsed result for BIZ_TYPE_AI_AGENT
  and set the flag for subsequent packets in the flow
- Add L7ParseResult::has_biz_type() helper to check parsed results
- Saturate ParseParam::buf_size to u16::MAX to avoid overflow with
  larger AI Agent payload sizes

Enterprise feature only. Original behavior preserved for non-AI-Agent
flows and non-enterprise builds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
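
A rough sketch of the two ideas in the commit above: choosing the truncation limit per flow, and saturating a larger limit into a `u16` buffer size. The function names are hypothetical; only `ai_agent_max_payload_size`, `l7_log_packet_size`, and the saturation to `u16::MAX` come from the commit message.

```rust
/// Pick the truncation limit for a flow: AI Agent flows use the larger
/// `ai_agent_max_payload_size`, all other flows keep `l7_log_packet_size`.
pub fn payload_limit(
    is_ai_agent: bool,
    l7_log_packet_size: usize,
    ai_agent_max_payload_size: usize,
) -> usize {
    if is_ai_agent {
        ai_agent_max_payload_size
    } else {
        l7_log_packet_size
    }
}

/// Convert a limit to u16, clamping to u16::MAX instead of wrapping.
pub fn saturating_buf_size(limit: usize) -> u16 {
    u16::try_from(limit).unwrap_or(u16::MAX)
}

/// Truncate a payload to at most `limit` bytes.
pub fn truncate_payload(payload: &[u8], limit: usize) -> &[u8] {
    &payload[..payload.len().min(limit)]
}
```

With a 1 MiB limit, a plain `as u16` cast would wrap to 0, which is why the clamp matters for AI Agent flows.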
Add access_permission (__u16) to __io_event_buffer struct for exposing
file permission bits (inode->i_mode & 0xFFF) in I/O events.

Add #ifdef EXTENDED_AI_AGENT_FILE_IO hook in trace_io_event_common()
that allows enterprise extensions to bypass the latency filter for
AI agent processes and populate access_permission from the inode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
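
The permission mask mentioned above (`inode->i_mode & 0xFFF`) keeps the low 12 bits of the mode: the rwx bits for user, group, and other, plus setuid, setgid, and sticky. The actual change is in the eBPF C code; the following is only an illustrative sketch with hypothetical function names.

```rust
/// Keep only the low 12 permission bits of a file mode, dropping the
/// file-type bits, in the same way as `i_mode & 0xFFF`.
pub fn access_permission(i_mode: u32) -> u16 {
    (i_mode & 0xFFF) as u16
}

/// Render permission bits in the familiar octal form, e.g. "0644".
pub fn format_permission(bits: u16) -> String {
    format!("{:04o}", bits)
}
```

For example, a regular file with mode `0o100644` yields `0o644`, and a setuid executable with mode `0o104755` yields `0o4755`.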
- Add global registry accessors (init_global_registry, global_registry)
  to enterprise-utils ai_agent module (stub returns None in open source)
- Initialize registry at startup in trident.rs (enterprise only)
- Register AI Agent PIDs in perf/mod.rs when biz_type detection fires
- proc_scan_hook checks registry to set biz_type=AI_AGENT on ProcessData
Enterprise feature only.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
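
One way a process-wide registry with these accessors might look is sketched below. Only the names `init_global_registry`, `global_registry`, and `register` appear in the commits; the signatures, the `AiAgentRegistry` internals, and the `biz_type_for` helper are assumptions made for illustration. The open-source stub, per the commit, simply returns `None`.

```rust
use std::collections::HashSet;
use std::sync::{Mutex, OnceLock};

/// Hypothetical registry of PIDs identified as AI Agent processes.
#[derive(Default)]
pub struct AiAgentRegistry {
    pids: Mutex<HashSet<u32>>,
}

impl AiAgentRegistry {
    /// Record a PID; returns true if it was not registered before.
    pub fn register(&self, pid: u32) -> bool {
        self.pids.lock().unwrap().insert(pid)
    }

    pub fn is_ai_agent(&self, pid: u32) -> bool {
        self.pids.lock().unwrap().contains(&pid)
    }
}

static REGISTRY: OnceLock<AiAgentRegistry> = OnceLock::new();

/// Create the registry once at startup; later calls return the same instance.
pub fn init_global_registry() -> &'static AiAgentRegistry {
    REGISTRY.get_or_init(AiAgentRegistry::default)
}

/// Returns None until `init_global_registry` has been called.
pub fn global_registry() -> Option<&'static AiAgentRegistry> {
    REGISTRY.get()
}

/// What a process-scan hook might do: 1 (AI_AGENT) for registered PIDs, else 0.
pub fn biz_type_for(pid: u32) -> u8 {
    match global_registry() {
        Some(r) if r.is_ai_agent(pid) => 1,
        _ => 0,
    }
}
```
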
… var

- Import L7ProtocolInfoInterface trait for get_biz_type() in l7_protocol_log.rs
- Prefix process_datas with underscore in proc_scan_hook.rs to suppress
  unused variable warning in non-enterprise builds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In C, a label must be followed by a statement, not a declaration.
The struct declaration after skip_latency_filter: causes a compile
error when EXTENDED_AI_AGENT_FILE_IO is defined. Add a null statement
(;) to satisfy the grammar requirement.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kylewanginchina kylewanginchina force-pushed the support-agent-governance branch 3 times, most recently from c161d36 to 931bed0 Compare March 14, 2026 08:35
@kylewanginchina
Contributor Author

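For translation.go, you only need to add it to INT_ENUM_PEER_TAG; nothing else needs to change.

@xiaochaoren1 Do you mean that the part in the red box in the image below is not needed, and I only need to add biz_type to INT_ENUM_PEER_TAG?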
image

@kylewanginchina kylewanginchina force-pushed the support-agent-governance branch from 931bed0 to 8307b66 Compare March 16, 2026 03:46
kylewanginchina and others added 24 commits March 16, 2026 12:20
- proc_scan_hook: inject AI agent PIDs not matched by process_matcher
  so they appear in MySQL process table (not just l7_flow_log)
- handler.rs: add /v1/responses to default ai_agent_endpoints
- perf/mod.rs: remove redundant register() with empty endpoint
- http.rs: borrow path instead of cloning on every HTTP parse
- socket.c: change __set_ai_agent_data_limit_max param to unsigned int
  to fix dead code branch (limit_size > INT_MAX unreachable with int)
- server: decode access_permission from IoEventData into ClickHouse
  file_event table (column constant, EventStore field, column block)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kylewanginchina kylewanginchina force-pushed the support-agent-governance branch from 8307b66 to d0ab245 Compare March 16, 2026 04:21
kylewanginchina and others added 2 commits March 16, 2026 23:44
Remove the record_endpoint_hit() method from the open-source AiAgentRegistry stub,
consistent with the enterprise edition's removal of the endpoint uniqueness constraint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
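**Steps to reproduce**

For AI Agent child processes (created via fork/exec), the gprocess_id in
file_event and proc_lifecycle_event is 0, because these child processes emit
events before the controller has synced them to the process table.

**Cause and solution**

A child process's gprocess_id depends on the server looking it up through
QueryProcessInfo, but a newly forked child process may not yet have been
synced to the process table.

Solution: the agent maintains a root_pid (the PID of the root AI Agent process
originally identified via endpoint matching) and passes it to the server
through the protobuf field ai_agent_root_pid. When both the cache lookup and
the direct PID query fail, the server uses ai_agent_root_pid as a fallback to
look up the gprocess_id.

Changes:
- metric.proto: add the ai_agent_root_pid field (tag=14) to ProcEvent
- proc_event/linux.rs: add the ai_agent_root_pid field to the ProcEvent struct
- ebpf_dispatcher.rs: add fill_ai_agent_root_pid(), which looks up root_pid
  in the registry and fills it into the event
- decoder.go: add the ai_agent_root_pid fallback to resolveGProcessID()
- enterprise-utils/lib.rs: add get_root_pid/register_child to the open-source stub

**Scope of impact**

Only affects gprocess_id resolution for AI Agent governance data collection.

**Verification**

- Unit test: TestResolveGProcessIDAiAgentRootPidFallback
- After deployment, verify that gprocess_id is no longer 0 for fork events

**Affected branches**

* support-agent-governance

**Checklist**

- [x] Dependencies need updating
- [ ] This is a common issue (similar problems exist elsewhere in the code)
- [ ] Compiles successfully
- [ ] Unit tests pass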

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
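
The fallback order described in the commit above (cache, then direct PID query, then ai_agent_root_pid) can be sketched as follows. The real logic lives in the Go function resolveGProcessID() in decoder.go; this version is written in Rust purely for illustration, and the signature, the map-based cache, and the treatment of 0 as "no root PID set" are assumptions.

```rust
use std::collections::HashMap;

/// Resolve a gprocess_id for `pid`, returning 0 when nothing is found.
/// `cache` maps PID to gprocess_id; `query` stands in for a direct lookup
/// against the process table.
pub fn resolve_gprocess_id<F>(
    cache: &HashMap<u32, u32>,
    query: F,
    pid: u32,
    ai_agent_root_pid: u32,
) -> u32
where
    F: Fn(u32) -> Option<u32>,
{
    // Cache first, then the direct query.
    let lookup = |p: u32| cache.get(&p).copied().or_else(|| query(p));

    if let Some(id) = lookup(pid) {
        return id;
    }
    // Fallback: a freshly forked child may not be synced yet, so try the
    // root AI Agent PID. Assumption: 0 means the field was not set.
    if ai_agent_root_pid != 0 && ai_agent_root_pid != pid {
        if let Some(id) = lookup(ai_agent_root_pid) {
            return id;
        }
    }
    0
}
```

With this shape, an event from an unsynced child process is attributed to the root AI Agent process instead of carrying a gprocess_id of 0.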
@kylewanginchina kylewanginchina force-pushed the support-agent-governance branch from 8e93928 to 94e4ea5 Compare March 17, 2026 11:13
kylewanginchina and others added 2 commits March 18, 2026 13:25
Strip "FileOp" prefix from event_type (fileopcreate→create),
split full file path into file_dir + file_name, and populate
access_permission for chmod events.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
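
A small sketch of the two normalizations named in the commit above. The function names are hypothetical. The lowercasing follows the `fileopcreate→create` example; whether `file_dir` keeps its trailing slash is not stated in the commit, and this sketch assumes it does.

```rust
/// Strip the "FileOp" prefix (case-insensitively) and lowercase the rest,
/// e.g. "FileOpCreate" or "fileopcreate" becomes "create".
pub fn normalize_event_type(raw: &str) -> String {
    let lower = raw.to_ascii_lowercase();
    if let Some(rest) = lower.strip_prefix("fileop") {
        return rest.to_string();
    }
    lower
}

/// Split a full path into (file_dir, file_name) at the last '/'.
/// Assumption: the directory part keeps its trailing slash.
pub fn split_path(full: &str) -> (String, String) {
    match full.rfind('/') {
        Some(idx) => (full[..=idx].to_string(), full[idx + 1..].to_string()),
        None => (String::new(), full.to_string()),
    }
}
```
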
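Previously, file read/write events from AI Agent processes bypassed only the
latency filter but were still filtered by io_event_collect_mode (the default
mode=1 requires trace association). Child processes that are forked and then
exec perform independent file operations without a trace_id, so their events
were dropped. AI Agent processes now bypass both the collect_mode and latency
filters.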

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
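
The filtering decision described in the commit above might be modelled like this. The real check runs in the eBPF C code; the `IoEvent` struct, the `should_collect` function, and the threshold parameter are hypothetical, and only mode 1 (which requires trace association) is modelled because it is the only mode the commit describes.

```rust
/// Hypothetical view of a file I/O event at filtering time.
pub struct IoEvent {
    pub latency_ns: u64,
    pub has_trace_id: bool,
}

/// Decide whether an I/O event is kept.
/// AI Agent processes bypass both the collect_mode and latency filters.
pub fn should_collect(
    ev: &IoEvent,
    is_ai_agent: bool,
    collect_mode: u8,
    min_latency_ns: u64,
) -> bool {
    if is_ai_agent {
        return true;
    }
    // Mode 1 requires the event to be associated with a trace.
    if collect_mode == 1 && !ev.has_trace_id {
        return false;
    }
    // Assumption: events below the latency threshold are dropped.
    ev.latency_ns >= min_latency_ns
}
```
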
3 participants