XortexAI/System-prompt
Coding Agent System Prompt Analysis

A deep comparison of Claude Code, Cursor, and Gemini CLI system prompts

At a Glance

Agent        Tokens   Characters   Relative Token Cost
Claude Code  5,865    27,000+      Highest (+72% vs Cursor)
Gemini CLI   4,122    19,000+      Medium (+21% vs Cursor)
Cursor       3,410    14,000+      Lowest (baseline)
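The relative-cost column follows directly from the token counts. A quick Python check (the counts themselves are taken from the table above):

```python
# Token counts from the table above; percentages are computed
# against Cursor as the baseline.
tokens = {"Claude Code": 5865, "Gemini CLI": 4122, "Cursor": 3410}

baseline = tokens["Cursor"]
relative = {
    agent: round((count / baseline - 1) * 100)
    for agent, count in tokens.items()
}
# relative -> {"Claude Code": 72, "Gemini CLI": 21, "Cursor": 0}
```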

Step 1: What Token Count Actually Reveals

Claude Code (5,865 tokens)

Claude Code's prompt is 72% larger than Cursor's and 42% larger than Gemini CLI's. This is not bloat. Every major section carries institutional weight:

  • A full risk categorization framework for destructive vs. irreversible vs. shared-state actions
  • A typed, file-based memory system with four categories, a two-file index architecture, staleness warnings, and explicit rules about what NOT to save
  • Detailed git workflow rules covering status, diff, log, commit message style, and push confirmation
  • Prompt injection awareness: the model is explicitly told to flag suspicious tool results

Gemini CLI (4,122 tokens)

Gemini CLI's prompt is 21% larger than Cursor's, but its tokens are spent differently, on procedural scaffolding:

  • 6-step numbered workflow for software engineering tasks
  • 6-step numbered workflow for new application builds
  • Explicit technology defaults (React, FastAPI, Flutter, Three.js, etc.)
  • Sub-agent delegation mandates (ALWAYS use delegate_to_agent)
  • Shell output verbosity controls to manage token consumption (ironic given its own token cost)

Cursor (3,410 tokens)

The leanest of the three. Cursor's prompt reads like a senior engineer's style guide: opinionated, tight, and trusting. It doesn't over-specify because it doesn't need to. It is notable for what it doesn't include: no memory system, no sub-agent framework, no git workflow, no numbered step sequences. This is maximum model trust expressed as minimum prompt overhead.


Step 2: Structure & Organization

Claude Code

Uses flowing prose with bold headers. Mixes rule types across sections: safety, git, memory, and code quality are interwoven rather than cleanly separated. Readable but dense. The memory system section alone is longer than Cursor's entire prompt.

Verdict: Well organized at the section level, dense within sections.

Cursor

Uses XML-style tags (<tool_calling>, <making_code_changes>, <mode_selection>) to create clean, scannable, modular sections. Each tag is a self-contained behavioral domain. Easy to maintain and update individual sections independently.

Verdict: Best structural organization of the three.

Gemini CLI

Uses a strict hierarchy: Core Mandates → Sub-Agents → Skills → Primary Workflows → Operational Guidelines. Numbered steps within workflows create clear sequential logic. It is the most "engineered" structure of the three; it feels like it was designed by a team with a formal spec process.

Verdict: Best workflow structure, most process-oriented.


Step 3: Safety & Guardrails

Claude Code 🥇

The most mature safety model. Actions are explicitly categorized into three tiers:

  • Destructive: deleting files/branches, rm -rf, overwriting uncommitted changes
  • Hard-to-reverse: force-pushing, git reset --hard, amending published commits
  • Shared-state: pushing code, commenting on PRs, posting to external services

Additionally includes prompt injection awareness: the model is told to flag suspicious tool call results before acting on them. The "measure twice, cut once" principle is explicitly named.

Gemini CLI 🥈

Good safety rules (explain before acting, security-first, never commit secrets) but expressed as a checklist rather than a coherent risk framework. The sandbox reminder for commands that modify the system outside the project directory is a thoughtful addition.

Cursor 🥉

Light on safety. Relies on model judgment and mentions linter checks. No explicit risky-action categorization framework. Thinner than the others.


Step 4: Code Quality Philosophy

Gemini CLI 🥇

The most rigorous engineering discipline of the three. Explicitly covers:

  • Mimic existing style, naming, and architectural patterns
  • Verify library availability before using it (check package.json, Cargo.toml, requirements.txt)
  • Comments should explain why, never what
  • Never talk to the user through comments
  • No speculative abstractions, no half-finished implementations
  • Test files are permanent artifacts

Cursor 🥈

Sharp and opinionated: no unnecessary abstractions, no speculative features, no backwards-compat hacks, three similar lines beats a premature abstraction. Everything it says is correct; it just covers slightly less ground than Gemini CLI.

Claude Code 🥉

Shares the same values but expresses them less crisply. The no-redundant-comments rule is present but framed more loosely. Loses points for being less explicit about the "verify library exists" rule.


Step 5: Agent & Tool Architecture

Gemini CLI 🥇

The only prompt with a true multi-agent architecture:

  • codebase_investigator - architectural mapping, root-cause analysis
  • cli_help - documentation and runtime configuration questions
  • skill-creator - extensible skill system via activate_skill tool
  • Explicit mandate: "ALWAYS use delegate_to_agent if a relevant agent exists"

Claude Code 🥈

No formal sub-agent delegation, but has MCP server integration and web search tooling as extensibility vectors. The tool design philosophy (tightly scoped, structured returns) is excellent even without the orchestration layer.

Cursor 🥉

Single-agent model. Parallel tool calls are the primary optimization mechanism. No sub-agent delegation, no extensible skill system. Appropriate for an IDE context where the task scope per turn is typically narrow.
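Cursor's parallel-tool-call optimization is easy to picture in code. A minimal sketch using `asyncio` (the tool name and its simulated latency are made up for illustration; real agents dispatch tool calls through their own runtime):

```python
import asyncio

async def read_file(path: str) -> str:
    # Stand-in for a real tool call; the sleep simulates I/O latency.
    await asyncio.sleep(0.01)
    return f"contents of {path}"

async def main() -> list[str]:
    # Independent reads are issued concurrently rather than one at a
    # time, which is the whole optimization: total latency is the max
    # of the calls, not the sum.
    return await asyncio.gather(
        read_file("src/app.py"),
        read_file("src/utils.py"),
    )

results = asyncio.run(main())
```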


Our Recommendation: Building the Ideal Coding Agent Prompt

If you are building a coding agent today, don't copy any one of these prompts wholesale. Each has been shaped by the constraints of its team, its model, and its product surface. What you should do is cherry-pick the best idea from each and compose them deliberately.

Here is the blueprint, section by section.


1. Structure: Borrow from Cursor

Use XML-style tags to separate behavioral domains. Each tag should be self-contained, independently editable, and scoped to exactly one concern.

<identity>...</identity>
<tool_usage>...</tool_usage>
<code_quality>...</code_quality>
<safety>...</safety>
<git_workflow>...</git_workflow>
<tone_and_style>...</tone_and_style>

Why: Modular structure lets you A/B test individual sections, update one rule without risking regressions in others, and onboard new team members to the prompt logic quickly. Flat prose prompts become unmaintainable at scale.
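The modular payoff is concrete: if each behavioral domain lives in its own tagged section, the final prompt is just a join, and any single section can be swapped out for an A/B test without touching the rest. A minimal sketch (section contents are placeholders):

```python
sections = {
    "identity": "You are a coding agent...",
    "tool_usage": "Prefer parallel tool calls...",
    "safety": "Classify every action before running it...",
}

def render_prompt(sections: dict[str, str]) -> str:
    # Each domain becomes one self-contained XML-style block, so
    # sections can be edited, versioned, or A/B-tested independently.
    return "\n".join(
        f"<{tag}>{body}</{tag}>" for tag, body in sections.items()
    )

prompt = render_prompt(sections)
```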


2. Safety: Borrow from Claude Code

Copy the three-tier risk classification framework and adapt it to your toolset.

Before any action, classify it:
 
DESTRUCTIVE (always confirm first):
- Deleting files, branches, or database records
- Overwriting uncommitted changes
- Any rm -rf or DROP TABLE equivalent
 
HARD-TO-REVERSE (confirm unless explicitly pre-authorized):
- Force pushes, hard resets, amending published commits
- Removing or downgrading dependencies
- Modifying CI/CD pipelines
 
SHARED-STATE (confirm, these affect others):
- Pushing code to remote
- Opening, closing, or commenting on PRs/issues
- Posting to external services or APIs

Also add prompt injection awareness: instruct your model to flag suspicious content in tool results before acting on it.

Why: This is the section most builders skip and most regret skipping. A single accidental git push --force or dropped table destroys user trust permanently. The cost of adding this is ~150 tokens. The cost of not having it is unbounded.
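As an illustration only (keyword matching is far too crude for production, and real classification must reflect your actual toolset), the tiering can be sketched as:

```python
# Illustrative only: substring matching is too crude for real use.
# Tiers are checked most-severe first, matching the framework above.
DESTRUCTIVE = ("rm -rf", "drop table", "delete branch")
HARD_TO_REVERSE = ("push --force", "reset --hard", "commit --amend")
SHARED_STATE = ("git push", "gh pr", "curl -x post")

def classify(command: str) -> str:
    cmd = command.lower()
    if any(marker in cmd for marker in DESTRUCTIVE):
        return "destructive"      # always confirm first
    if any(marker in cmd for marker in HARD_TO_REVERSE):
        return "hard-to-reverse"  # confirm unless pre-authorized
    if any(marker in cmd for marker in SHARED_STATE):
        return "shared-state"     # confirm, affects others
    return "routine"
```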


3. Code Quality: Borrow from Gemini CLI

These rules are the sharpest expression of senior engineering discipline across all three prompts. Use them as-is:

- Never assume a library is available. Check package.json / requirements.txt /
  Cargo.toml / build.gradle before using any dependency.
 
- Mimic the style, naming conventions, and architectural patterns of existing
  code in the project. Read before you write.
 
- Comments explain WHY, never WHAT. If the code is clear, no comment is needed.
  Never describe your changes in comments — that's what commit messages are for.
 
- Do not add features, refactor, or "improve" beyond what was asked.
  A bug fix does not need surrounding code cleaned up.
 
- No speculative abstractions. Three similar lines of code is better than
  a premature abstraction. Build for the task, not for hypothetical futures.
 
- Test files are permanent artifacts. Do not delete them after verification.

Why: These rules prevent the most common failure modes of AI-generated code: dependency hallucination, style drift, over-engineering, and scope creep. They cost ~200 tokens and save enormous review overhead.
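The "never assume a library is available" rule is mechanically checkable before the agent writes an import. A hedged sketch for the package.json case (the manifest path is a parameter; other ecosystems would read Cargo.toml, requirements.txt, etc.):

```python
import json
from pathlib import Path

def is_declared(dep: str, manifest: str = "package.json") -> bool:
    # Check both runtime and dev dependency sections before letting
    # generated code import the package; a missing manifest means
    # nothing can be assumed available.
    path = Path(manifest)
    if not path.exists():
        return False
    data = json.loads(path.read_text())
    declared = {**data.get("dependencies", {}),
                **data.get("devDependencies", {})}
    return dep in declared
```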


4. Workflow: Borrow from Gemini CLI (but trim it)

Gemini CLI's numbered workflow is its strongest structural idea. Adapt it, but cut the verbosity:

For every engineering task:
1. READ first. Understand the file, its conventions, and its dependencies
   before writing a single line.
2. PLAN. State your approach in one sentence before acting. If the task
   is complex, share the plan with the user.
3. IMPLEMENT. Match existing style. Use only verified dependencies.
4. VERIFY. Run the project's test command. If unknown, check README or
   package.json — never assume a standard command.
5. CHECK STANDARDS. Run lint and type-check commands after every change.

Why: Numbered steps force sequential reasoning and prevent the most common agentic failure: acting before understanding. The key is keeping each step tight. Gemini CLI's version runs to paragraphs per step; yours should be two lines each.


5. Git Workflow: Borrow from Claude Code + Gemini CLI

Combine the best of both:

NEVER stage or commit unless explicitly instructed.
 
When asked to commit:
1. Run: git status && git diff HEAD && git log -n 3
2. Review all changes and match the style of recent commit messages.
3. Propose a draft commit message focused on WHY, not WHAT.
4. Wait for confirmation before committing.
5. After commit, run git status to confirm success.
6. NEVER push to remote without explicit user instruction.
 
If a commit fails, report it. Never attempt to work around
commit hooks or validation checks.

Why: Git mistakes are the most irreversible mistakes an agent can make. Force pushes overwrite history. Accidental commits to main break teams. This section is ~100 tokens and prevents catastrophic outcomes.
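The confirmation gate above can be expressed as a tiny decision function: inspection commands are always safe to propose, but the commit itself is only emitted after explicit approval, and push is never included. A sketch (nothing here shells out; it just builds the command list the rules describe):

```python
INSPECT = ["git status", "git diff HEAD", "git log -n 3"]

def commit_plan(message: str, user_confirmed: bool) -> list[str]:
    # Inspection is always allowed; the commit itself is gated behind
    # explicit user confirmation. Pushing never appears in the plan.
    plan = list(INSPECT)
    if user_confirmed:
        plan.append(f'git commit -m "{message}"')
        plan.append("git status")  # confirm the commit landed
    return plan
```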


6. Tone & Token Discipline: Write Your Own

Neither Cursor nor Claude Code nor Gemini CLI gets this fully right. Write explicit rules for your own context:

- Respond in plain sentences. No bullet points unless the content
  is genuinely list-structured.
- Lead with the action or answer, not the reasoning.
- Do not summarize what you just did; the user can see the diff.
- Do not use colons before tool calls.
- Never use emojis unless the user uses them first.
- If you can say it in one sentence, don't use three.

Why: AI agents are notoriously verbose by default. Explicit tone rules shape output quality more than people expect. The self-referential token discipline rule is a useful forcing function: it makes the model aware of its own cost footprint.


7. What to Leave Out

As important as what you include is what you deliberately omit.

Don't include a full sub-agent architecture unless your model reliably supports multi-agent tool use. Gemini CLI's sub-agent mandates are impressive on paper but unreliable in practice on weaker models. Adding orchestration instructions to a model that can't honor them creates confusion, not capability.

Don't include a new application workflow unless building greenfield apps is a core use case. It's 600+ tokens of Gemini CLI's prompt that most agents will never trigger.

Don't over-specify what your model already knows. If you're running on Claude or GPT-4-class models, you don't need to explain what a library import is, what a linter does, or that tests should pass before merging. Trust the model's training. Every token you spend teaching the model what it already knows is a token stolen from instructions it actually needs.


The Composite Prompt Architecture

Putting it all together, the ideal coding agent system prompt looks like this:

Total target: 3,500 – 4,000 tokens
(Lean enough to be economical, complete enough to be safe)
 
<identity>          ~100 tokens  — who you are, what you do
<tool_usage>        ~200 tokens  — tool priority, parallelism, no-terminal rules
<code_quality>      ~250 tokens  — Gemini CLI's engineering discipline rules
<safety>            ~200 tokens  — Claude Code's three-tier risk framework
<workflow>          ~200 tokens  — Gemini CLI's numbered steps, condensed
<git_workflow>      ~150 tokens  — combined Claude Code + Gemini CLI git rules
<memory>            ~300 tokens  — Claude Code's typed memory system (if needed)
<tone_and_style>    ~150 tokens  — your own rules + token discipline

Total: ~1,550 tokens of structure + your project-specific context.

That leaves roughly 1,950–2,450 tokens of headroom for injected context (file trees, CLAUDE.md files, active task state) within the 3,500–4,000 token target.
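The budget arithmetic is worth checking mechanically: the per-section estimates above sum to the quoted structural total, and the headroom falls out by subtraction from the target ceiling.

```python
# Per-section estimates from the architecture table above.
budget = {
    "identity": 100, "tool_usage": 200, "code_quality": 250,
    "safety": 200, "workflow": 200, "git_workflow": 150,
    "memory": 300, "tone_and_style": 150,
}

structure = sum(budget.values())  # -> 1550 tokens of structure
headroom = 4000 - structure       # -> 2450 under the 4,000 ceiling
```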


Conclusion

Gemini CLI wrote the most architecturally sophisticated prompt for a model that can't fully honor it.

Cursor wrote the leanest, most economically efficient prompt for a model that doesn't need more.

Claude Code wrote the densest prompt at the highest token cost, encoding genuine organizational philosophy rather than compensating for model gaps, and running on a model capable of absorbing that density.


Token counts measured at time of extraction. Prompts may be updated by their respective teams at any time.

About

System Prompts for all Providers - Cursor, Claude Code, Gemini CLI, OpenCode
