docs: add cognitive security manifesto and threat taxonomy v1 #23637
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| --- | ||
| version: 1.0.0 | ||
| date: 2026-03-31 | ||
| status: draft | ||
| --- | ||
|
|
||
| # Cognitive Security Manifesto | ||
|
|
||
| ## The Category Thesis | ||
|
|
||
| Artificial Intelligence has fundamentally altered the security landscape. Traditional cybersecurity focused on protecting systems from unauthorized access, but it is ill-equipped to handle systems that can reason, synthesize, and act autonomously. AI safety is broken because it relies on probabilistic alignment rather than structural guarantees. We do not need better vibes; we need an epistemic immune system. Cognitive security is not just about stopping prompt injection—it is about ensuring the structural integrity of machine reasoning. | ||
|
|
||
| ## Core Category Sentence | ||
|
|
||
| We secure the structural integrity of AI reasoning to prevent cognitive failure, enforcing admissibility and determinism over probabilistic alignment. | ||
|
|
||
| ## The Paradigm Shift: Admissibility Over Alignment | ||
|
|
||
| The prevailing paradigm of AI safety attempts to teach models to behave well through reinforcement learning from human feedback (RLHF). This is fundamentally flawed. It treats symptoms rather than causes, attempting to patch a leaky boat with polite suggestions. | ||
|
|
||
| We advocate a shift from probabilistic alignment to deterministic admissibility. | ||
|
|
||
| 1. **Verification Before Execution:** AI outputs must not be trusted implicitly. They must be validated against deterministic cognitive structures. | ||
|
|
||
| 2. **Epistemic Integrity:** The provenance, lineage, and structural soundness of information entering and leaving an AI system must be unbroken. | ||
|
|
||
| 3. **Graph-Based Grounding:** Reality is relational. Our security models must enforce relational invariants through Reality Graphs, Belief Graphs, and Narrative Graphs. | ||
|
|
||
| ## The Messaging Hierarchy | ||
|
|
||
| To build a true Cognitive Security posture, organizations must adopt three core pillars: | ||
|
|
||
| ### 1. Structural Admissibility Gates | ||
|
|
||
| AI systems must operate behind Admissibility Gates that enforce cryptographic and structural validation of all context and outputs. If a cognitive packet cannot be verified against the Reality Graph, it is quarantined. | ||
|
|
||
| ### 2. The Cognitive Security Protocol (CSP) | ||
|
|
||
| A standardized, machine-readable schema for defining what constitutes valid cognition within an enterprise. The CSP acts as the foundational constitution that all AI agents must strictly adhere to. | ||
|
|
||
| ### 3. Quarantine and Subsumption | ||
|
|
||
| When cognitive failure occurs—whether through external attack or internal stochastic drift—the system must isolate the anomaly in a Quarantine Graph. From there, human and automated operators analyze, subsume, and integrate the failure to inoculate the broader Epistemic Immune System. | ||
|
|
||
| ## The Path Forward | ||
|
|
||
| We stand at the precipice of cognitive automation. The systems we build today will define the epistemics of tomorrow. We must stop relying on the black-box promises of model providers and start engineering deterministic cognitive constraints. Security is no longer just about the network; it is about the mind of the machine. |
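The manifesto describes Admissibility Gates and a Quarantine Graph only in prose. The following is a minimal sketch of how such a gate might behave, not the project's implementation. The names `Claim`, `AdmissibilityGate`, `verified`, and `evidence_id` are hypothetical, and the "Reality Graph" is reduced here to a plain set of verified triples.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Claim:
    """A (subject, relation, object) assertion emitted by an AI agent."""
    subject: str
    relation: str
    obj: str
    evidence_id: Optional[str] = None  # hypothetical provenance pointer


@dataclass
class AdmissibilityGate:
    # Stand-in for the "Reality Graph": a set of verified triples (assumption).
    verified: set[tuple[str, str, str]]
    # Stand-in for the "Quarantine Graph": rejected claims plus a reason.
    quarantine: list[tuple[Claim, str]] = field(default_factory=list)

    def admit(self, claim: Claim) -> bool:
        """Return True only if the claim passes every deterministic check."""
        if not claim.evidence_id:
            return self._reject(claim, "missing evidence_id")
        if (claim.subject, claim.relation, claim.obj) not in self.verified:
            return self._reject(claim, "not verifiable against reality graph")
        return True

    def _reject(self, claim: Claim, reason: str) -> bool:
        # Rejected claims are kept for later analysis rather than discarded.
        self.quarantine.append((claim, reason))
        return False


gate = AdmissibilityGate(verified={("acme", "owns", "widgetco")})
print(gate.admit(Claim("acme", "owns", "widgetco", "ev-1")))
print(gate.admit(Claim("acme", "owns", "globex", "ev-2")))
print(gate.quarantine)
```

The gate is deterministic by construction: the same claim checked against the same verified set always yields the same decision, which is the property the manifesto contrasts with probabilistic alignment.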
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,77 @@ | ||
| --- | ||
| version: 1.0.0 | ||
| date: 2026-03-31 | ||
| status: draft | ||
| --- | ||
|
|
||
| # Cognitive Security Threat Taxonomy v1 | ||
|
|
||
| This taxonomy defines the four canonical failure classes in Cognitive Security. | ||
|
|
||
| ## 1. Corrupted Cognition | ||
|
|
||
| **Definition:** | ||
| Occurs when the input context, prompt structures, or reasoning chains are maliciously manipulated or poisoned, leading the AI system to process invalid or hijacked epistemics. | ||
|
|
||
| **Sub-types:** | ||
|
|
||
| * Context Poisoning | ||
| * Prompt Injection | ||
| * Instruction Hijacking | ||
|
|
||
| **Examples:** | ||
|
|
||
| * An attacker embeds a hidden prompt injection payload inside a seemingly benign PDF resume, causing the HR parsing AI to output a recommendation for hire regardless of qualifications. | ||
| * A user adds invisible text to a webpage that instructs a summarization agent to exfiltrate private session tokens via markdown image links. | ||
| * A third-party API dependency returns deliberately malformed JSON that exploits the parser's loose schema, causing downstream logic errors in an autonomous agent. | ||
|
|
||
| ## 2. Non-Compliant Cognition | ||
|
|
||
| **Definition:** | ||
| Occurs when the AI system generates outputs or takes actions that violate established enterprise policies, guardrails, or regulatory frameworks, despite operating on uncorrupted inputs. | ||
|
|
||
| **Sub-types:** | ||
|
|
||
| * Guardrail Evasion | ||
| * Policy Bypass | ||
| * Regulatory Infraction | ||
|
|
||
| **Examples:** | ||
|
|
||
| * A financial advisory agent provides specific, actionable stock trading advice despite a strict system prompt forbidding financial recommendations, due to a highly persuasive conversational turn. | ||
| * A customer service bot reveals a hidden discount code meant only for internal employees because the user asked it to roleplay as a developer testing the system. | ||
| * An AI tool generates code that includes a hardcoded secret or vulnerability, violating the organization's secure coding standards. | ||
|
|
||
| ## 3. Non-Reproducible Cognition | ||
|
|
||
| **Definition:** | ||
| Occurs when an AI system produces non-deterministic, heavily drifted, or hallucinated outputs that cannot be reliably reproduced or traced back to grounded facts. | ||
|
|
||
| **Sub-types:** | ||
|
|
||
| * Stochastic Drift | ||
| * Hallucination Cascades | ||
| * Contextual Amnesia | ||
|
|
||
| **Examples:** | ||
|
|
||
| * A legal analysis bot invents a non-existent legal precedent (hallucination) and uses it as the foundational argument for all subsequent case analysis in the session. | ||
| * An agent running the same evaluation task on the same data returns three wildly different summarization metrics across three separate runs. | ||
| * A multi-agent system progressively loses track of its original objective over a long context window, leading to an emergent, off-topic loop. | ||
|
|
||
| ## 4. Non-Admissible Cognition | ||
|
|
||
| **Definition:** | ||
| Occurs when the output fails structural, schema, or relational validation checks defined by the Cognitive Security Protocol (CSP), resulting in the rejection of the data packet by the Admissibility Gates. | ||
|
|
||
| **Sub-types:** | ||
|
|
||
| * Schema Violation | ||
| * Relational Inconsistency | ||
| * Unverified Epistemics | ||
|
|
||
| **Examples:** | ||
|
|
||
| * An agent outputs a JSON response missing a required evidence ID field, causing the data to be rejected by the WriteSet firewall. | ||
|
Undefined control term:
|
||
| * A knowledge graph generator asserts a relationship between two entities that explicitly contradicts a verified invariant in the central Reality Graph. | ||
| * A data extraction pipeline submits an event with a timestamp that predates the creation of the system, violating temporal constraints in the Admissibility Gate. | ||
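The three Non-Admissible Cognition examples above (a missing evidence ID, a contradicted invariant, a timestamp predating the system) can be sketched as deterministic checks. This is an illustration under stated assumptions, not the CSP schema itself: the field names, `SYSTEM_CREATED_AT`, and the `INVARIANTS` table are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical system creation time; events before it are inadmissible.
SYSTEM_CREATED_AT = datetime(2026, 1, 1, tzinfo=timezone.utc)

# Hypothetical required fields for a packet.
REQUIRED_FIELDS = ("evidence_id", "subject", "relation", "object", "timestamp")

# Hypothetical verified invariants: (subject, relation) -> only valid object.
INVARIANTS = {("paris", "capital_of"): "france"}


def violations(packet: dict) -> list[str]:
    """Return the list of admissibility violations for one packet."""
    found = []

    # Schema Violation: every required field must be present.
    missing = [name for name in REQUIRED_FIELDS if name not in packet]
    if missing:
        found.append("schema_violation: missing " + ", ".join(missing))
        return found  # the remaining checks need these fields

    # Relational Inconsistency: the claim must not contradict an invariant.
    expected = INVARIANTS.get((packet["subject"], packet["relation"]))
    if expected is not None and packet["object"] != expected:
        found.append("relational_inconsistency")

    # Temporal constraint: assumes an ISO 8601 timestamp with a UTC offset.
    event_time = datetime.fromisoformat(packet["timestamp"])
    if event_time < SYSTEM_CREATED_AT:
        found.append("temporal_violation")

    return found


packet = {
    "evidence_id": "ev-1",
    "subject": "paris",
    "relation": "capital_of",
    "object": "germany",
    "timestamp": "2020-01-01T00:00:00+00:00",
}
print(violations(packet))
```

Returning every violation instead of stopping at the first makes the rejection reason auditable, at the cost of running each check on every packet.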
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1 +1 @@ | ||
| { "status": "done" } | ||
| {"branch_name":"jules-5346770267403756260-84071d5c","message":"docs: add cognitive security manifesto and threat taxonomy v1"} |
🧩 Analysis chain
🏁 Script executed:
Repository: BrianCLong/summit
Length of output: 481
🏁 Script executed:
Repository: BrianCLong/summit
Length of output: 3520
🏁 Script executed:
Repository: BrianCLong/summit
Length of output: 439
🏁 Script executed:
Repository: BrianCLong/summit
Length of output: 50373
🏁 Script executed:
Pin the attestation action to an immutable commit SHA (not `@v1`).

On Line 30, using a moving tag allows unreviewed upstream changes to alter provenance behavior over time. For this security-sensitive step, pin to a full commit SHA.
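One way to catch this class of issue mechanically is to scan workflow text for `uses:` references that are not pinned to a full 40-character commit SHA. The sketch below is a simple line-based check, not a YAML parser, and the action name `example-org/attest` is a hypothetical placeholder rather than the action used in this repository.

```python
import re

# Matches a workflow step line such as "- uses: owner/repo@ref".
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([^\s#]+)")
# A full Git commit SHA is 40 lowercase hex characters.
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return action references that are not pinned to a full commit SHA."""
    unpinned = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if not match:
            continue
        ref = match.group(1).strip("'\"")
        # Local actions and container images carry no Git ref to pin.
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            unpinned.append(ref)
    return unpinned


workflow = """
steps:
  - uses: example-org/attest@v1
  - name: pinned step
    uses: example-org/attest@0123456789abcdef0123456789abcdef01234567
  - uses: ./local-action
"""
print(unpinned_actions(workflow))
```

A check like this could run in CI so that a moving tag such as `@v1` fails review before it reaches a provenance step.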
Suggested fix