[Phase 0.2.4] Create e2e test framework #34
Open
richard-devbot wants to merge 4 commits into CursorTouch:main
…rsorTouch#10]
Creates tests/e2e/ with conftest.py (mock_llm_provider, test_agent, test_gateway, test_agent_with_echo_tool, mock_llm_with_tool_call fixtures), helpers.py (send_message, assert_tool_called, assert_response_contains), and test_smoke.py with 4 passing smoke tests. No real LLM API keys required.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
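The tool-call path that this commit's fixtures and helpers target can be illustrated with a stdlib-only sketch. Everything below is an assumption made for illustration: `make_tool_call_llm`, `run_turn`, the `invoke` method name, and the dictionary message shape are hypothetical and are not taken from the PR's code. The `assert_tool_called` helper reuses the PR's helper name, but its body and signature are guesses.

```python
from unittest.mock import MagicMock


def make_tool_call_llm(tool_name: str, tool_args: dict, final_text: str) -> MagicMock:
    """Hypothetical factory: the first LLM call requests a tool, the second returns text."""
    llm = MagicMock()
    # side_effect with a list yields one item per call, in order.
    llm.invoke.side_effect = [
        {"type": "tool_call", "name": tool_name, "args": tool_args},
        {"type": "text", "content": final_text},
    ]
    return llm


def run_turn(llm, tools: dict, user_text: str):
    """Hypothetical agent loop: call the LLM, run any requested tool, return text and call log."""
    calls = []
    messages = [user_text]
    while True:
        out = llm.invoke(list(messages))
        if out["type"] == "text":
            return out["content"], calls
        result = tools[out["name"]](**out["args"])
        calls.append((out["name"], out["args"]))
        messages.append(f"tool:{out['name']} -> {result}")


def assert_tool_called(calls, name: str) -> None:
    """Assumed helper body: fail unless a tool with this name appears in the call log."""
    assert any(n == name for n, _ in calls), f"tool {name!r} was never called"


def echo(text: str) -> str:
    return text


llm = make_tool_call_llm("echo", {"text": "hi"}, "echoed: hi")
response, calls = run_turn(llm, {"echo": echo}, "please echo hi")
assert_tool_called(calls, "echo")
```

Because the mock's replies are scripted, the same sequence (tool request, then final text) plays out on every run, which is what lets such a test pass without an API key.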
Wires mock LLM + Agent + Bus + Orchestrator into a single fixture so tests can exercise the full pipeline coordinator layer via process_direct() without requiring real API keys or a live bus loop.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…[ci]
- BrowserPlugin/ComputerPlugin: wire _state_hook (and _wait_for_ui_hook)
into register_hooks/unregister_hooks/enable/disable so hooks are
registered with the Hooks service when the plugin is enabled
- Both plugins: add <perception>, <tool_use>, <execution_principles>
XML sections to SYSTEM_PROMPT as expected by the test suite
- BrowserPlugin.unregister_tools: also unset_extension("browser")
so registry.get("browser") returns None after unregistration
- ToolRegistry.get: fall back to _extensions when no tool matches
by name, enabling tests to discover the browser extension via get()
- control_center: pass graceful_fn=kwargs.get("_graceful_restart_fn")
to _do_restart so graceful restart fn is correctly threaded through
- ruff --fix: remove unused imports across several files
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
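The registry lookup fallback described in this commit might look roughly like the sketch below. The class here is a hypothetical stand-in, not the project's ToolRegistry: only `get`, `unset_extension`, and the `_extensions` attribute are named in the commit message, while `register`, `set_extension`, and `_tools` are assumed for illustration.

```python
from typing import Any, Dict, Optional


class SketchToolRegistry:
    """Hypothetical stand-in showing a tool lookup that falls back to extensions."""

    def __init__(self) -> None:
        self._tools: Dict[str, Any] = {}       # assumed storage for tools
        self._extensions: Dict[str, Any] = {}  # attribute name taken from the commit message

    def register(self, name: str, tool: Any) -> None:  # assumed method
        self._tools[name] = tool

    def set_extension(self, name: str, ext: Any) -> None:  # assumed method
        self._extensions[name] = ext

    def unset_extension(self, name: str) -> None:
        # pop with a default so unsetting a missing extension is a no-op
        self._extensions.pop(name, None)

    def get(self, name: str) -> Optional[Any]:
        # Prefer a tool registered under this name; otherwise try extensions.
        tool = self._tools.get(name)
        if tool is not None:
            return tool
        return self._extensions.get(name)


registry = SketchToolRegistry()
browser = object()
registry.set_extension("browser", browser)
found_before = registry.get("browser")
registry.unset_extension("browser")
found_after = registry.get("browser")
```

Under these assumptions, `get("browser")` finds the extension while it is set and returns None once `unset_extension("browser")` has run, matching the behavior the commit message says the tests expect.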
e60fa03 to 56635b2
- Add TYPE_CHECKING guard in cli/start.py so MCPManager forward ref resolves (F821)
- Remove unused mock_session variable in test_mcp_manager.py (F841)
- Suppress unused tool_names via _ assignment in test_mcp_manager.py (F841)
- Apply ruff format across all 215 files (E702 semicolons and style)
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
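For readers unfamiliar with the first item, a TYPE_CHECKING guard generally follows the pattern sketched here. The module path `hypothetical_pkg.mcp` and the function `describe_start` are invented for illustration; the actual contents of cli/start.py are not shown in this PR text.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Seen by static analysis (ruff, mypy) only; never executed at runtime,
    # so the hypothetical module does not need to be importable here.
    from hypothetical_pkg.mcp import MCPManager


def describe_start(manager: Optional["MCPManager"] = None) -> str:
    """The quoted annotation is a forward reference that the guarded import resolves."""
    return "started" if manager is None else "started with MCP"


status = describe_start()
```

The guard gives the linter a definition for the name used in the annotation (which is what an undefined-name report such as F821 complains about) without adding a runtime import.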
Summary
Closes #10
Creates the tests/e2e/ end-to-end test framework:
- __init__.py: package marker
- conftest.py: five pytest fixtures:
  - mock_llm_provider (deterministic MagicMock satisfying BaseChatLLM protocol)
  - test_agent (Agent wired to mock LLM + temp workspace)
  - test_gateway (Gateway + stub channel for integration tests)
  - test_agent_with_echo_tool (agent pre-loaded with echo tool)
  - mock_llm_with_tool_call (factory that exercises the tool-call → response path)
- helpers.py: send_message(), assert_tool_called(), assert_response_contains()
- test_smoke.py: 4 smoke tests: send message → response, response content assertion, tool call recording, session isolation
Why
Unit tests verify components in isolation. This framework verifies the full pipeline: message → agent → tools → response. CI runs without real LLM keys: the mock LLM returns "pong" deterministically.
Test run
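As an illustration only (this is neither the PR's code nor its test output), the deterministic "pong" approach described under Why can be sketched with the standard library. `make_mock_llm`, `TinyAgent`, and the `invoke` method name are hypothetical; `send_message` and `assert_response_contains` reuse the PR's helper names, but their signatures and bodies are assumptions. In a real conftest.py the factory would presumably be wrapped as a pytest fixture.

```python
from typing import Dict, List
from unittest.mock import MagicMock


def make_mock_llm(reply: str = "pong") -> MagicMock:
    """Build a deterministic LLM stand-in; `invoke` is an assumed method name."""
    llm = MagicMock()
    llm.invoke.return_value = reply
    return llm


class TinyAgent:
    """Hypothetical minimal agent that keeps one message history per session."""

    def __init__(self, llm) -> None:
        self.llm = llm
        self.sessions: Dict[str, List[str]] = {}

    def handle(self, session_id: str, text: str) -> str:
        history = self.sessions.setdefault(session_id, [])
        history.append(text)
        # Pass a copy so later appends do not alter what the mock recorded.
        return self.llm.invoke(list(history))


def send_message(agent: TinyAgent, text: str, session_id: str = "default") -> str:
    """Assumed helper shape: route one message through the agent and return the reply."""
    return agent.handle(session_id, text)


def assert_response_contains(response: str, expected: str) -> None:
    """Assumed helper shape: substring check with a readable failure message."""
    assert expected in response, f"{expected!r} not found in {response!r}"


agent = TinyAgent(make_mock_llm())
reply = send_message(agent, "ping", session_id="a")
assert_response_contains(reply, "pong")
send_message(agent, "hello", session_id="b")
```

The sketch touches three of the four smoke-test themes listed in the summary: a message produces a response, the response content can be asserted, and two sessions keep separate histories.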