@BugBot run
wwwillchen left a comment
Overall, I think this is a pretty good plan. A few suggestions:
- I'd just use Dyad Engine + a Dyad Pro key. This will be much easier than managing 3 API keys, and it'll be using the same models (Dyad Engine basically proxies all the models).
- I'd have more eval cases, around 10-12, and include more complex ones (e.g., refactoring a giant 700-line React component into 3 smaller components).
- Just because a search-replace applies without error doesn't mean it's correct. You'll need to either spot-check the results or use another model (I'd probably use GPT 5.4) to judge the output. Basically, feed in the prompt + original file + output file and ask: does the output file look correct given the prompt + original file?
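To make the last suggestion concrete, here is a minimal sketch of a judge step in TypeScript. Everything in it is an assumption for illustration: the `EvalCase` shape, the `VERDICT:` reply convention, and the injected `callModel` function (which stands in for whatever call actually goes through Dyad Engine) are hypothetical names, not part of the plan or of any existing API.

```typescript
// Hypothetical shape of one eval case; field names are illustrative only.
interface EvalCase {
  prompt: string;       // the instruction given to the model under test
  originalFile: string; // file contents before search_replace ran
  outputFile: string;   // file contents after search_replace ran
}

type Verdict = "correct" | "incorrect" | "unclear";

// Assemble the judge input: prompt + original file + output file,
// followed by the question the judge model should answer.
function buildJudgePrompt(c: EvalCase): string {
  return [
    "You are reviewing the result of an automated file edit.",
    "## Prompt",
    c.prompt,
    "## Original file",
    c.originalFile,
    "## Output file",
    c.outputFile,
    "Does the output file look correct given the prompt and the original file?",
    'Start your reply with "VERDICT: correct" or "VERDICT: incorrect", then give a short reason.',
  ].join("\n\n");
}

// Parse the judge reply. Anything that does not follow the assumed
// "VERDICT:" convention is reported as "unclear" so it can be spot-checked.
function parseVerdict(reply: string): Verdict {
  const word = /^\s*VERDICT:\s*(correct|incorrect)\b/i.exec(reply)?.[1]?.toLowerCase();
  return word === "correct" || word === "incorrect" ? word : "unclear";
}

// The model call is injected so this sketch does not assume any specific client.
async function judgeCase(
  c: EvalCase,
  callModel: (prompt: string) => Promise<string>,
): Promise<Verdict> {
  return parseVerdict(await callModel(buildJudgePrompt(c)));
}
```

Treating unparseable replies as `"unclear"` rather than as failures keeps the automated judge and the manual spot check complementary: only the ambiguous cases need a human look.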
🎭 Playwright Test Results
❌ Some tests failed
Summary: 811 passed, 12 failed, 7 flaky, 258 skipped
Failed Tests
🍎 macOS
🪟 Windows
📋 Re-run Failing Tests (macOS)
Copy and paste to re-run all failing spec files locally:
npm run e2e \
e2e-tests/chat_input.spec.ts \
e2e-tests/queued_message.spec.ts \
e2e-tests/setup_flow.spec.ts
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e1f2d72fd5
Re-requesting a review because I've made two large changes at this point:
I've also started implementing this, so I'll likely open another PR soon. I can still address any other changes that are needed, though.
Like most of the files in the `plans` folder, this file is AI-generated. However, I've checked it over and revised it, and to me it appears sound.

Essentially, it is a plan to test the `search_replace` tool in order to make sure that it's reliable, and to fix it if it's not.