Changes from 4 commits
12 changes: 6 additions & 6 deletions e2e-tests/chat_tabs.spec.ts
@@ -3,7 +3,7 @@ import { expect } from "@playwright/test";

test("tabs appear after navigating between chats", async ({ po }) => {
await po.setUp({ autoApprove: true });
- await po.importApp("minimal");
+ await po.importApp("minimal-with-ai-rules");

🟡 MEDIUM | unexplained-change

Fixture swap from minimal → minimal-with-ai-rules is not explained

All 6 tests in this file switched their fixture. The only difference between the two fixtures is the presence of AI_RULES.md:

$ diff -rq e2e-tests/fixtures/import-app/minimal e2e-tests/fixtures/import-app/minimal-with-ai-rules
Only in minimal-with-ai-rules: AI_RULES.md

~15 other spec files still use minimal, so this is now an outlier, and the PR description doesn't mention the switch at all. Two problems:

  1. Coverage drift: the tab-UI paths are no longer exercised against the baseline minimal fixture. If tab behavior regresses when AI rules are absent (different banners, setup flow, focus stealing), none of these tests will catch it.
  2. Cargo-culting risk: a future maintainer won't know whether the swap was load-bearing for deflaking, or an accidental change that should be reverted.

💡 Suggestion: If the swap is needed to avoid a focus/race condition caused by the no-AI-rules onboarding path, call that out in the PR description and add a one-line // comment next to the first importApp("minimal-with-ai-rules") explaining why. Otherwise, revert to minimal to stay consistent with the rest of the e2e suite.


// Chat 1
await po.sendPrompt("[dump] build a todo app");
@@ -24,7 +24,7 @@ test("tabs appear after navigating between chats", async ({ po }) => {

test("clicking a tab switches to that chat", async ({ po }) => {
await po.setUp({ autoApprove: true });
- await po.importApp("minimal");
+ await po.importApp("minimal-with-ai-rules");

// Chat 1 - send a unique message
await po.sendPrompt("First chat unique message alpha");
@@ -57,7 +57,7 @@ test("clicking a tab switches to that chat", async ({ po }) => {

test("closing a tab removes it and selects adjacent tab", async ({ po }) => {
await po.setUp({ autoApprove: true });
- await po.importApp("minimal");
+ await po.importApp("minimal-with-ai-rules");

// Chat 1
await po.sendPrompt("First chat message gamma");
@@ -99,7 +99,7 @@ test("closing a tab removes it and selects adjacent tab", async ({ po }) => {

test("right-click context menu: Close other tabs", async ({ po }) => {
await po.setUp({ autoApprove: true });
- await po.importApp("minimal");
+ await po.importApp("minimal-with-ai-rules");

// Chat 1
await po.sendPrompt("[dump] Chat one context menu");
@@ -138,7 +138,7 @@ test("right-click context menu: Close other tabs", async ({ po }) => {

test("right-click context menu: Close tabs to the right", async ({ po }) => {
await po.setUp({ autoApprove: true });
- await po.importApp("minimal");
+ await po.importApp("minimal-with-ai-rules");

// Chat 1
await po.sendPrompt("[dump] Left tab one");
@@ -182,7 +182,7 @@ test("right-click context menu: Close tabs to the right", async ({ po }) => {

test("only shows tabs for chats opened in current session", async ({ po }) => {
await po.setUp({ autoApprove: true });
- await po.importApp("minimal");
+ await po.importApp("minimal-with-ai-rules");

// Initially no tabs should be visible (no chats opened yet in this session)
const closeButtons = po.page.getByLabel(/^Close tab:/);
15 changes: 12 additions & 3 deletions e2e-tests/helpers/page-objects/components/ChatActions.ts
@@ -88,9 +88,18 @@ export class ChatActions {
timeout,
}: { skipWaitForCompletion?: boolean; timeout?: number } = {},
) {
- await this.getChatInput().click();
- await this.getChatInput().fill(prompt);
- await this.page.getByRole("button", { name: "Send message" }).click();
+ const chatInput = this.getChatInput();
+ const sendButton = this.page.getByRole("button", { name: "Send message" });
+
+ await expect(chatInput).toBeVisible();
cursor[bot] marked this conversation as resolved.

P2: Apply Timeout.MEDIUM to the initial chat-input visibility wait

The new expect(chatInput).toBeVisible() pre-check uses Playwright’s default assertion timeout (~5s), so on slower CI runs after importApp() it can fail before the toPass({ timeout: Timeout.MEDIUM }) retry logic even starts. Before this commit, click()/fill() used the longer action/test timeout budget, so this change can reintroduce prompt-entry flakes in the same paths this helper is meant to stabilize. Use timeout: Timeout.MEDIUM for this visibility check (or move it inside the retry block) so waits are consistent.

Visibility assertion uses short default timeout before retry

Low Severity

The new await expect(chatInput).toBeVisible() assertion runs before the toPass retry block and uses Playwright's default 5-second assertion timeout. The subsequent retry block uses Timeout.MEDIUM (15–30s on CI). On slow CI machines, this creates a bottleneck: if the chat input takes more than 5 seconds to render (e.g., after clickNewChat() triggers a page transition), the method throws before the more resilient retry block ever runs. The old code had no such gate—click() would wait using the much longer action/test timeout. Other assertions in this codebase pass explicit timeouts when longer waits are needed (e.g., toBeVisible({ timeout: Timeout.MEDIUM })). For a PR aimed at deflaking, this could introduce a new flake source.

Reviewed by Cursor Bugbot for commit c2d660a.

+ await expect(async () => {
+ await chatInput.click();
+ await chatInput.fill(prompt);

🟡 MEDIUM | retry-correctness

Retry loop can silently produce accumulated text in the Lexical editor

Inside the toPass() callback, each retry unconditionally calls:

await chatInput.click();
await chatInput.fill(prompt);
await expect(chatInput).toContainText(prompt);
await expect(sendButton).toBeEnabled();

The newly-added rule in rules/e2e-testing.md already warns that fill() is unreliable on Lexical (fill("") doesn't clear). If a retry fires — e.g., because toBeEnabled() hadn't updated yet on the first pass — the second fill(prompt) may append to the existing content instead of replacing it, leaving the editor with "First chat unique message alphaFirst chat unique message alpha".

toContainText(prompt) is a substring check, so the assertion still passes on the accumulated text. The wrong prompt is then clicked through to the LLM and the test quietly reports success. This is the classic "deflake masks a real bug" failure mode — the retry turns a loud flake into a silent correctness bug.

💡 Suggestion: Use toHaveText(prompt) instead of toContainText(prompt) so accumulation forces another retry, or explicitly clear the editor (ControlOrMeta+a / Backspace, as clearChatInput() does) at the top of each retry attempt before calling fill().
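
To illustrate the difference, here is a minimal sketch in plain TypeScript with no Playwright involved. The helper names `containsPrompt` and `equalsPrompt` are hypothetical stand-ins for a substring-style check and an exact-style check, and the accumulated string assumes a retried fill() appended instead of replacing:

```typescript
// Substring check, analogous in spirit to toContainText(prompt).
function containsPrompt(editorText: string, expected: string): boolean {
  return editorText.includes(expected);
}

// Exact check after trimming, a rough analogue of toHaveText(prompt).
function equalsPrompt(editorText: string, expected: string): boolean {
  return editorText.trim() === expected.trim();
}

const samplePrompt = "First chat unique message alpha";
// Assumed editor state if a second fill() appended to the first.
const accumulated = samplePrompt + samplePrompt;

console.log(containsPrompt(accumulated, samplePrompt)); // true: duplication goes unnoticed
console.log(equalsPrompt(accumulated, samplePrompt)); // false: duplication forces another retry
```

Only the exact-style check distinguishes a clean editor from a doubled one, which is why it is the safer gate inside a retry loop.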

+ await expect(chatInput).toContainText(prompt);

P1: Drop raw-input equality check for mention prompts

sendPrompt() now gates submission on expect(chatInput).toContainText(prompt), but the Lexical editor rewrites internal mention formats to display text (e.g. @app:name becomes @name, similarly for other mention types) before render. In flows that use mention syntax (for example po.sendPrompt("[dump] @app:minimal-with-ai-rules hi")), this assertion never becomes true, so the helper retries until timeout and never clicks Send despite a valid prompt.
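
If the check is kept rather than dropped, one option might be to compare against the expected display text instead of the raw prompt. The sketch below is an assumption-heavy illustration: `toDisplayText` is a hypothetical helper, and it models only the `@app:name` to `@name` rewrite mentioned above; other mention types would need their own rules.

```typescript
// Hypothetical helper: maps a raw prompt to the text the editor is expected to render.
// Only the "@app:<name>" -> "@<name>" rewrite is modeled here (an assumption
// based on the example in this comment, not on the editor's actual code).
function toDisplayText(rawPrompt: string): string {
  return rawPrompt.replace(/@app:([\w-]+)/g, "@$1");
}

const rawPrompt = "[dump] @app:minimal-with-ai-rules hi";
const displayed = toDisplayText(rawPrompt);

console.log(displayed); // "[dump] @minimal-with-ai-rules hi"
// The raw prompt is not a substring of the rendered text, so a raw-text
// containment check would never pass for this prompt.
console.log(displayed.includes(rawPrompt)); // false
```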

+ await expect(sendButton).toBeEnabled();
Comment on lines +98 to +99
Copilot AI Apr 9, 2026

Inside expect(...).toPass(), the nested Playwright expect(chatInput).toContainText(...) and expect(sendButton).toBeEnabled() each use their own default auto-wait timeout, so a single attempt can block for several seconds, which largely defeats the retry loop. Prefer asserting via textContent() / isEnabled() inside the callback, or set explicit short/zero timeouts on the nested expect calls so toPass can retry quickly.

Suggested change
- await expect(chatInput).toContainText(prompt);
- await expect(sendButton).toBeEnabled();
+ const chatInputText = await chatInput.textContent();
+ expect(chatInputText ?? "").toContain(prompt);
+ expect(await sendButton.isEnabled()).toBe(true);

Comment on lines +98 to +99
@cubic-dev-ai cubic-dev-ai bot Apr 9, 2026

P2: The nested web-first assertions (toContainText, toBeEnabled) inside the toPass() callback each carry their own default auto-wait timeout (~5s each). A single retry attempt can block for up to 10s on these assertions, largely defeating the retry loop. Use instant checks (e.g., await chatInput.textContent() + synchronous expect, and await sendButton.isEnabled()) or pass { timeout: 0 } to the nested expects so toPass retries quickly.

Suggested change
- await expect(chatInput).toContainText(prompt);
- await expect(sendButton).toBeEnabled();
+ const chatInputText = await chatInput.textContent();
+ expect(chatInputText ?? "").toContain(prompt);
+ expect(await sendButton.isEnabled()).toBe(true);
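
As a back-of-the-envelope illustration of why blocking attempts matter, the sketch below counts how many attempts can start within a retry budget. All numbers are assumptions: 15 s for Timeout.MEDIUM, 10 s for an attempt that waits out both nested assertions, 50 ms for an attempt using instant checks, and a fixed 100 ms pause between attempts (the real toPass() interval schedule differs). `attemptsWithinBudget` is a hypothetical helper, not a Playwright API.

```typescript
// Rough model: number of attempts that can start before the budget runs out,
// when each attempt may block for perAttemptMs and is followed by pauseMs.
function attemptsWithinBudget(
  budgetMs: number,
  perAttemptMs: number,
  pauseMs: number,
): number {
  let started = 0;
  for (let t = 0; t < budgetMs; t += perAttemptMs + pauseMs) {
    started++;
  }
  return started;
}

const mediumBudgetMs = 15000; // assumed value for Timeout.MEDIUM

// Nested auto-waiting assertions: each failing attempt may block ~10 s.
console.log(attemptsWithinBudget(mediumBudgetMs, 10000, 100)); // 2
// Instant checks: each failing attempt returns almost immediately.
console.log(attemptsWithinBudget(mediumBudgetMs, 50, 100)); // 100
```

Under these assumptions the blocking variant gets about two chances to succeed, while the instant-check variant gets on the order of a hundred.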

+ }).toPass({ timeout: Timeout.MEDIUM });
+
+ await sendButton.click();

🟢 LOW | race-condition

sendButton.click() is outside the retry loop

Once toPass() resolves, the final click is a single-shot call:

}).toPass({ timeout: Timeout.MEDIUM });

await sendButton.click();

If the send button flips back to disabled between the final retry check and the actual click (e.g., React commits a state change while re-validating the Lexical content), the click will either miss or be silently swallowed — putting us right back in the flake class this PR is trying to fix.

💡 Suggestion: After sendButton.click(), assert that the chat is actually in a sending state (spinner visible, send button disabled, or the message appears in the history) so any lost click surfaces as a loud failure rather than a timeout further downstream in waitForChatCompletion().

if (!skipWaitForCompletion) {
await this.waitForChatCompletion({ timeout });
}
1 change: 1 addition & 0 deletions rules/e2e-testing.md
@@ -61,6 +61,7 @@ The chat input uses a Lexical editor (contenteditable). Standard Playwright meth

- **Clearing input**: `fill("")` doesn't reliably clear Lexical. Use keyboard shortcuts instead: `Meta+a` then `Backspace`.
- **Timing issues**: Lexical may need time to update its internal state. Use `toPass()` with retries for resilient tests.
+ - **Avoid locator drift**: When both home/chat inputs may exist, scope the editor locator to the specific container (for example `chat-input-container`) and reuse one locator instance for click/fill/assertions.

🟡 MEDIUM | documentation-code-mismatch

New rule contradicts sendPrompt() implementation and PR description

This new rule instructs readers to "scope the editor locator to the specific container (for example chat-input-container) and reuse one locator instance". But the actual sendPrompt() in this PR uses this.getChatInput(), which is a global unscoped locator:

getChatInput() {
  return this.page.locator(
    '[data-lexical-editor="true"][aria-placeholder^="Ask Dyad to build"]',
  );
}

That means the same placeholder under home-chat-input-container also matches. The PR description compounds this — it says "stabilize sendPrompt() by scoping to the chat-input container" and "add a short settle delay after creating a new chat in chat_tabs.spec.ts", but neither of those changes is in the current diff (the scoping was reverted after breaking home-page tests; no sleep is present in chat_tabs.spec.ts).

Future contributors will follow this rule expecting it to describe sendPrompt(), then be confused when the code does the opposite. Either:

  • Update the rule to describe the actual pattern (retry-based toPass over a global locator, with container scoping reserved for tests that need to disambiguate), and update the PR description to match, or
  • Refactor sendPrompt() to pick getChatInputContainer() vs getHomeChatInputContainer() at runtime and scope the editor lookup inside it.

💡 Suggestion: At minimum, fix the PR description so reviewers and git blame readers aren't misled about what changed.
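
For the second option, the scoping could look roughly like the sketch below, which only builds the selector string and does not touch Playwright. The container ids and the editor selector come from this comment; the `data-testid` attribute name, the `InputContext` type, and the `scopedEditorSelector` helper are assumptions for illustration, not the project's actual code.

```typescript
// Editor selector as quoted from getChatInput() above.
const EDITOR_SELECTOR =
  '[data-lexical-editor="true"][aria-placeholder^="Ask Dyad to build"]';

// Hypothetical discriminator for which input the caller wants.
type InputContext = "chat" | "home";

// Builds a selector scoped to one container so the home and chat editors
// cannot both match. Assumes containers are tagged via data-testid.
function scopedEditorSelector(context: InputContext): string {
  const containerId =
    context === "chat" ? "chat-input-container" : "home-chat-input-container";
  return `[data-testid="${containerId}"] ${EDITOR_SELECTOR}`;
}

console.log(scopedEditorSelector("chat"));
console.log(scopedEditorSelector("home"));
```

How the context is chosen at runtime (for example by checking which container is visible) is left open here, since the earlier scoping attempt reportedly broke the home-page tests.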

- **Helper methods**: Use `po.clearChatInput()` and `po.openChatHistoryMenu()` from test_helper.ts for reliable Lexical interactions.

```ts