feat: add content security scanning and apm audit command #313
danielmeppiel wants to merge 9 commits into main
Pull request overview
Adds supply-chain style content integrity scanning to APM to detect hidden/invisible Unicode characters in prompt/rules files, exposed both via a new apm audit command and as install-time diagnostics.
Changes:
- Introduces a dependency-free ContentScanner plus install-time wiring through BaseIntegrator.scan_deployed_files() and DiagnosticCollector’s new security category.
- Adds apm audit command with --file, --fix, and --verbose modes and corresponding unit tests.
- Updates docs and changelog to document the new security scanning and command behavior.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| src/apm_cli/security/content_scanner.py | Implements the hidden-Unicode scanner and non-critical stripping helper. |
| src/apm_cli/security/__init__.py | Exposes scanner types for import ergonomics. |
| src/apm_cli/integration/base_integrator.py | Adds install-time scanning hook (scan_deployed_files). |
| src/apm_cli/utils/diagnostics.py | Adds security category, counting/flags, and rendering group. |
| src/apm_cli/commands/install.py | Wires scanning after integration and tags findings with package_name. |
| src/apm_cli/commands/audit.py | Implements new apm audit CLI command with lockfile and --file modes. |
| src/apm_cli/cli.py | Registers the new audit command. |
| tests/unit/test_content_scanner.py | Unit tests for scanner detection, positioning, and stripping behavior. |
| tests/unit/test_audit_command.py | Unit tests for apm audit modes, exit codes, and --fix. |
| tests/unit/test_install_scanning.py | Unit tests for install-time scanning/diagnostics integration. |
| docs/src/content/docs/reference/cli-commands.md | Documents apm audit usage, options, and exit codes. |
| docs/src/content/docs/enterprise/security.md | Documents content scanning as a supply-chain mitigation. |
| docs/src/content/docs/enterprise/governance.md | Adds governance guidance for apm audit and install-time scanning. |
| CHANGELOG.md | Adds Unreleased entries describing audit/scanning features. |
Comments suppressed due to low confidence (1)
src/apm_cli/commands/audit.py:347
- Exit-code selection treats any findings (including info-level) as non-clean (sys.exit(2) when no critical findings). If info-level findings are intended to be non-failing, the exit logic should distinguish warnings from info. If info-level findings should still return 2, the messaging should clearly state that exit code 2 includes info-only cases (and _render_summary() should reflect that).
# -- Exit code --
if not findings_by_file:
    sys.exit(0)
all_findings = [f for ff in findings_by_file.values() for f in ff]
if ContentScanner.has_critical(all_findings):
    sys.exit(1)
sys.exit(2)
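One way to make the behavior in question explicit is to isolate the mapping in a pure function that can be tested on its own. This is a minimal sketch, assuming each finding carries a severity string of critical, warning, or info; exit_code_for and the info_is_clean switch are hypothetical names for illustration, not part of this PR.

```python
from typing import Iterable


def exit_code_for(severities: Iterable[str], info_is_clean: bool = False) -> int:
    """Map finding severities to a process exit code.

    0 = clean, 1 = at least one critical finding, 2 = other findings.
    With info_is_clean=True (hypothetical switch), info-only results return 0.
    """
    seen = set(severities)
    if "critical" in seen:
        return 1
    if info_is_clean:
        # Treat info-level findings as non-failing.
        seen.discard("info")
    return 2 if seen else 0
```

With the default arguments this mirrors the snippet above: any non-critical finding, including info-only results, yields exit code 2.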
Add supply chain integrity scanning for prompt files: detects hidden Unicode characters (tag characters, bidi overrides, zero-width chars) that can embed invisible instructions in shared rules files.

New features:
- apm audit: scan installed packages for hidden Unicode characters
- apm audit --file: scan arbitrary files (gateway for non-APM users)
- apm audit --fix: auto-strip non-critical characters
- Install-time scanning: apm install surfaces findings in diagnostics

Architecture:
- Pure stateless ContentScanner with O(1) per-character lookup
- Three severity levels: critical, warning, info
- Security category in DiagnosticCollector
- BaseIntegrator.scan_deployed_files() bridge method

72 new tests across 3 test files (scanner, audit command, install scanning).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
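The architecture in this commit message (a stateless scanner, O(1) per-character lookup, three severity levels) could look roughly like the sketch below. Finding, _LOOKUP, and the severity assigned to each character class are assumptions made for illustration; the real ContentScanner table is not shown on this page.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Finding:
    line: int        # 1-based line number
    column: int      # 1-based column within the line
    codepoint: str   # e.g. "U+200B"
    severity: str    # "critical", "warning", or "info"


def _build_lookup() -> Dict[str, str]:
    """Build the char -> severity table once; lookups are O(1) afterwards."""
    table: Dict[str, str] = {}
    # Unicode tag characters (U+E0001-U+E007F): assumed critical.
    for cp in range(0xE0001, 0xE0080):
        table[chr(cp)] = "critical"
    # Bidi embeddings/overrides (U+202A-U+202E) and isolates (U+2066-U+2069):
    # assumed critical.
    for cp in list(range(0x202A, 0x202F)) + list(range(0x2066, 0x206A)):
        table[chr(cp)] = "critical"
    # Zero-width space / non-joiner / joiner: assumed warning.
    for cp in (0x200B, 0x200C, 0x200D):
        table[chr(cp)] = "warning"
    # Non-breaking space: assumed info.
    table["\u00a0"] = "info"
    return table


_LOOKUP = _build_lookup()


def scan_text(text: str) -> List[Finding]:
    """Return one Finding per suspicious character, with its position."""
    findings: List[Finding] = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            severity = _LOOKUP.get(ch)
            if severity is not None:
                findings.append(
                    Finding(line_no, col_no, f"U+{ord(ch):04X}", severity)
                )
    return findings
```

Because the function takes text and returns a list, it holds no state between calls, which keeps it easy to reuse from both a CLI command and an install hook.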
Pull request overview
Adds hidden-Unicode content integrity scanning to APM, exposing it via a new apm audit command and integrating a pre-deployment scan into apm install, with findings surfaced through diagnostics and documented in the site docs.
Changes:
- Introduce a dependency-free ContentScanner that detects suspicious Unicode characters and can optionally strip non-critical ones.
- Add apm audit (lockfile-mode + --file, plus --strip and verbose output) and wire pre-deploy scanning into apm install.
- Extend diagnostics with a new security category (rendered first), plus tests and documentation updates.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| src/apm_cli/security/content_scanner.py | Implements the core scanner, findings model, and strip helper. |
| src/apm_cli/security/__init__.py | Exposes scanner types from the new security package. |
| src/apm_cli/commands/audit.py | Adds the apm audit CLI command and related helpers for scanning/stripping. |
| src/apm_cli/commands/install.py | Adds a pre-deployment security gate to block critical hidden characters unless --force. |
| src/apm_cli/integration/base_integrator.py | Adds scan_deployed_files() helper for post-integration scanning into diagnostics. |
| src/apm_cli/utils/diagnostics.py | Adds CATEGORY_SECURITY, security() recording, and security-first rendering. |
| src/apm_cli/cli.py | Registers the new audit command. |
| tests/unit/test_content_scanner.py | Unit tests for scanner detection, positions, and stripping. |
| tests/unit/test_audit_command.py | CLI tests for apm audit modes, exit codes, filtering, and --strip. |
| tests/unit/test_install_scanning.py | Tests pre-deploy scan gating + deployed-file scan diagnostics behavior. |
| docs/src/content/docs/reference/cli-commands.md | Documents the new apm audit command and install scanning behavior. |
| docs/src/content/docs/enterprise/security.md | Expands enterprise security model to include content scanning and threat model. |
| docs/src/content/docs/enterprise/governance.md | Adds governance workflow section for apm audit and planned CI mode. |
| CHANGELOG.md | Adds an Unreleased changelog entry for content security scanning/audit. |
Comments suppressed due to low confidence (1)
src/apm_cli/commands/install.py:589
- PR description mentions install-time scanning of deployed files via BaseIntegrator.scan_deployed_files(), but the install flow shown here doesn’t call it after integrating primitives (only the pre-deploy scan runs). If per-deployed-file diagnostics are intended, wire scan_deployed_files() into the integration pipeline.
def _integrate_package_primitives(
    package_info,
    project_root,
    *,
    integrate_vscode,
- Add str.isascii() fast-path to scan_text() (~5000x faster for ASCII files)
- Early termination in _pre_deploy_security_scan() when critical found (not force)
- Add ContentScanner.classify() combining has_critical + summarize in one pass
- Fix symlink following in _pre_deploy_security_scan and scan_deployed_files
- Fix Rich markup parsing of [i]/[!] in _rich_echo (markup=False)
- Fix exit code docs wording in governance.md and cli-commands.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
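The isascii() fast-path works because every character the scanner cares about lies outside the ASCII range, so a pure-ASCII string can be declared clean without a per-character loop. A minimal sketch of the idea follows; find_suspicious and the small _SUSPICIOUS set are hypothetical stand-ins for scan_text() and its lookup table.

```python
# Hypothetical stand-in for the scanner's lookup table.
_SUSPICIOUS = {"\u200b", "\u200c", "\u200d", "\u202e", "\ufeff"}


def find_suspicious(text: str) -> list:
    """Return (index, char) pairs for suspicious characters in text."""
    # Fast path: every suspicious character is non-ASCII, so a pure-ASCII
    # string cannot contain one and the per-character loop can be skipped.
    if text.isascii():
        return []
    return [(i, ch) for i, ch in enumerate(text) if ch in _SUSPICIOUS]
```

str.isascii() is available from Python 3.7 onward. Non-ASCII text that contains no suspicious characters still goes through the slow path and returns an empty list.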
Pull request overview
Adds a new content security scanning capability to APM to detect hidden Unicode characters in prompt/rules files, exposed via a new apm audit command and enforced as an install-time pre-deploy gate.
Changes:
- Introduces a dependency-free ContentScanner engine with severity classification and optional stripping of non-critical characters.
- Adds apm audit (lockfile scanning + --file, --strip, --verbose) and wires install-time scanning to block critical findings unless --force.
- Extends diagnostics to support a first-class security category with typed severity and updated rendering, plus docs/tests coverage.
Reviewed changes
Copilot reviewed 16 out of 17 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| src/apm_cli/security/content_scanner.py | New scanning engine for suspicious Unicode characters + strip support |
| src/apm_cli/security/__init__.py | Exposes scanner API from the security package |
| src/apm_cli/commands/audit.py | New apm audit command (lockfile + file modes, strip, rendering, exit codes) |
| src/apm_cli/commands/install.py | Adds _pre_deploy_security_scan() gate and wires it into install flows |
| src/apm_cli/integration/base_integrator.py | Adds scan_deployed_files() to push security findings into diagnostics |
| src/apm_cli/utils/diagnostics.py | Adds CATEGORY_SECURITY, per-item severity, and security rendering/query helpers |
| src/apm_cli/utils/console.py | Adjusts Rich printing to avoid unintended markup parsing |
| src/apm_cli/cli.py | Registers the new audit command |
| tests/unit/test_content_scanner.py | Unit tests for scanner detection, positioning, summarize/classify, stripping |
| tests/unit/test_audit_command.py | Unit tests for CLI behavior, exit codes, strip behavior, lockfile scanning |
| tests/unit/test_install_scanning.py | Unit tests for install-time gate + deployed-file scanning diagnostics |
| docs/src/content/docs/reference/cli-commands.md | Documents apm audit and install-time scanning / --force behavior |
| docs/src/content/docs/enterprise/security.md | Updates threat model + security posture explanation for content scanning |
| docs/src/content/docs/enterprise/governance.md | Adds governance guidance for apm audit usage + exit codes |
| CHANGELOG.md | Adds Unreleased entry for the feature |
| .copilot/session-state/.../plan.md | Adds a Copilot session plan document (likely unintended repo artifact) |
- Remove dead scan_deployed_files() from BaseIntegrator (orphaned by pre-deploy shift, zero callers)
- Add content scanning to apm compile output (AGENTS.md, CLAUDE.md, commands) before writing to disk: defense-in-depth using isascii() fast-path
- Add content scanning to apm pack before bundling: publishing-side check warns on hidden characters
- Update security docs: document three scanning gates (install, compile, pack) and planned hardening roadmap
- Update CLI commands docs: compile and pack scanning behavior
- Replace dead-code tests with ContentScanner-based equivalents

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…lution
- Replace rglob('*') with os.walk(followlinks=False) in both
_pre_deploy_security_scan() and _scan_files_in_dir() to prevent
traversal into symlinked directories outside the package tree
- Reword early-termination diagnostic to 'at least N critical' since
the scan may have stopped before counting all findings
- Resolve --file paths to absolute before keying findings so that
--strip works correctly with relative paths like ../some.md
- Add test proving symlinked directories are not followed
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
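The switch from rglob('*') to os.walk(followlinks=False) described in this commit can be sketched as below. iter_package_files is a hypothetical helper name; the PR's own functions are _pre_deploy_security_scan() and _scan_files_in_dir(), whose bodies are not shown on this page.

```python
import os
from pathlib import Path
from typing import Iterator


def iter_package_files(root: Path) -> Iterator[Path]:
    """Yield regular files under root without leaving the tree via symlinks."""
    # followlinks=False (also the default) keeps os.walk from descending
    # into symlinked directories.
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            path = Path(dirpath) / name
            # Symlinked files still show up in filenames, so skip them explicitly.
            if path.is_symlink():
                continue
            yield path
```

A symlinked directory is still listed among the directory names that os.walk reports, but it is never entered, so files outside the package tree are not scanned.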
Pull request overview
Adds a content security scanning layer to APM to detect hidden/invisible Unicode characters in prompt/rules files, with enforcement during install and on-demand via a new apm audit command.
Changes:
- Introduces a dependency-free ContentScanner engine plus an apm audit CLI command (file-mode and lockfile-mode, optional --strip, exit codes).
- Wires scanning into apm install as a pre-deploy gate (block on critical unless --force) and adds defense-in-depth scanning in compile/pack flows.
- Extends DiagnosticCollector with a security category + severity support, and updates docs/changelog accordingly.
Reviewed changes
Copilot reviewed 19 out of 20 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/test_install_scanning.py | Unit tests for install-time pre-deploy scanning gate + diagnostics behavior |
| tests/unit/test_content_scanner.py | Unit tests for scanner character classification, positioning, and stripping |
| tests/unit/test_audit_command.py | Unit tests for apm audit behavior, exit codes, lockfile scanning, and --strip |
| src/apm_cli/utils/diagnostics.py | Adds security diagnostics category with typed severity + rendering |
| src/apm_cli/utils/console.py | Adjusts Rich printing to avoid unintended markup parsing |
| src/apm_cli/security/content_scanner.py | New scanning engine for hidden Unicode detection + stripping |
| src/apm_cli/security/__init__.py | Exposes scanner symbols via package exports |
| src/apm_cli/compilation/claude_formatter.py | Scans generated Claude commands before writing |
| src/apm_cli/compilation/agents_compiler.py | Scans generated CLAUDE.md output before writing |
| src/apm_cli/commands/install.py | Adds _pre_deploy_security_scan() gate and wires it into install paths |
| src/apm_cli/commands/compile.py | Scans compiled outputs before writing to disk |
| src/apm_cli/commands/audit.py | New apm audit command implementation + helpers |
| src/apm_cli/cli.py | Registers the new audit command |
| src/apm_cli/bundle/packer.py | Scans files before bundling during apm pack |
| docs/src/content/docs/reference/cli-commands.md | Documents apm audit and scanning behavior in install/compile/pack |
| docs/src/content/docs/enterprise/security.md | Updates enterprise security model with scanning gates + threat model |
| docs/src/content/docs/enterprise/governance.md | Adds governance guidance for apm audit workflows/exit codes |
| README.md | Mentions content security + apm audit in feature list |
| CHANGELOG.md | Adds Unreleased entry for the feature set |
| .gitignore | Ignores .copilot/ directory |
- packer.py: Replace logging.warning with _rich_warning so pack-time scan findings are visible to users
- agents_compiler.py: Replace logging.warning with all_warnings.append so CLAUDE.md findings surface through CompilationFormatter
- claude_formatter.py: Hoist ContentScanner import outside write loop
- compile.py: Remove unused has_crit variable from classify() call
- install.py: Remove duplicate installed_packages.append in blocked local-package path (was already appended at line 1291)
- test_install_scanning.py: Add try/except OSError pytest.skip guard for symlink test on platforms that don't support symlinks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- agents_compiler.py: Move ContentScanner import before the content_map write loop (was inside loop, flagged pattern)
- test_content_scanner.py: Add latin-1 encoding and BOM+critical combo edge case tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Content Security Scanner: apm audit + Install-Time Pre-Deploy Gate

Closes #312
Problem
Shared prompt/rules files (.cursorrules, .github/prompts/, etc.) are becoming a de facto supply chain, but without integrity guarantees. Hidden Unicode characters, particularly tag characters (U+E0001–E007F) and bidi overrides, can embed invisible instructions that LLMs tokenize and follow but humans can't see. Unlike npm packages where code sits inert until executed, prompt files are read by IDE agents the moment they land on disk. File presence IS execution.

Solution
Three layers of defense:
- Pre-deploy gate during apm install: scans all source files in apm_modules/{pkg}/ BEFORE integrators copy them to targets (.github/, .claude/, etc.). Critical findings block deployment unless --force.
- Compile-time scanning: scans compiled output (AGENTS.md, CLAUDE.md, .claude/commands/) before writing to disk. Defense-in-depth: source files were already scanned at install, but the compiled output is what agents actually read.
- Pack-time scanning: scans files before bundling with apm pack. Publishing-side gate prevents authors from distributing tainted content.

In addition, the apm audit command provides on-demand scanning of installed packages or arbitrary files.

Features
- apm audit: scan all installed packages
- apm audit <package>: scan a specific package
- apm audit --file .cursorrules: scan any file
- apm audit --strip: remove non-critical chars (zero-width spaces, unusual whitespace)
- apm audit --verbose: show info-level findings

Severity Levels
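- critical: blocks deployment during apm install (unless --force)
- warning: reported, does not block deployment
- info: shown only with --verbose

Exit Codes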
Security Hardening
- is_symlink() checks prevent traversal outside package directory
- _is_safe_lockfile_path() validates all lockfile paths in apm audit
- _apply_strip() path validation: ensures strip only writes within project root

Performance
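- str.isascii() fast-path: ~5000x faster for pure-ASCII files (90%+ of prompts)
- Early termination on the first critical finding when --force not set
- ContentScanner.classify(): combined has_critical + summarize in single pass
- O(1) per-character lookup via _CHAR_LOOKUP dict (156 entries)

Files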
New:
- src/apm_cli/security/__init__.py + content_scanner.py: pure/stateless scanning engine
- src/apm_cli/commands/audit.py: apm audit command
- tests/unit/test_content_scanner.py: 39 scanner tests
- tests/unit/test_audit_command.py: 33 audit command tests
- tests/unit/test_install_scanning.py: install scanning tests

Modified:
- src/apm_cli/commands/install.py: _pre_deploy_security_scan() wired into all 3 install paths
- src/apm_cli/commands/compile.py: compile-time output scanning
- src/apm_cli/compilation/agents_compiler.py: CLAUDE.md output scanning
- src/apm_cli/compilation/claude_formatter.py: commands output scanning
- src/apm_cli/bundle/packer.py: pack-time input scanning
- src/apm_cli/utils/diagnostics.py: severity field, security() method, has_critical_security
- src/apm_cli/utils/console.py: markup=False fix for Rich [i] parsing
- docs/src/content/docs/enterprise/security.md: rewritten with threat model, three scanning gates
- docs/src/content/docs/enterprise/governance.md: trimmed duplicates, cross-linked
- docs/src/content/docs/reference/cli-commands.md: apm audit, compile/pack scanning docs

Future work (tracked)
All 1853 unit tests passing.
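For readers curious how the --strip behavior could work, here is a minimal sketch that deletes only an assumed set of non-critical characters and leaves everything else, including critical ones, in place so they remain visible to a later scan. strip_non_critical and the _NON_CRITICAL set are hypothetical; the PR's real strip helper and its character table are not shown on this page.

```python
# Assumed non-critical set for illustration: zero-width space,
# zero-width non-joiner, zero-width joiner.
_NON_CRITICAL = "\u200b\u200c\u200d"

# str.translate deletes any character whose code point maps to None.
_STRIP_TABLE = {ord(ch): None for ch in _NON_CRITICAL}


def strip_non_critical(text: str) -> str:
    """Remove non-critical hidden characters; leave all other text untouched."""
    return text.translate(_STRIP_TABLE)
```

Leaving critical characters untouched means a stripped file that still contains, say, a bidi override keeps failing the scan instead of being silently rewritten.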