fix(testing): Fix Kyutai Speech-To-Text, LLaVA-OneVision, and LongCatFlash test failures on main CI #44695
Conversation
|
[For maintainers] Suggested jobs to run (before merge) run-slow: kyutai_speech_to_text, llava_onevision, longcat_flash |
|
cc: @Rocketknight1 @zucchini-nlp Just a gentle ping :) |
zucchini-nlp
left a comment
There was a problem hiding this comment.
left one comment, otherwise lgmt
| local_videos = [ | ||
| os.path.join(repo_root, "Big_Buck_Bunny_720_10s_10MB.mp4"), | ||
| os.path.join(repo_root, "sample_demo_1.mp4"), | ||
| ] |
There was a problem hiding this comment.
we also load local images above from the same root path. IIRC we made sure these artifacts are cached when loading from hub so I dont know why they are being created here
cc @ydshieh for this
There was a problem hiding this comment.
++, I didn't notice this previously but now that I've read into it a bit more, I guess even this image creation block isn't needed either for the same reason (ref: 05c0e1d); just the local_tiny_video logic staying intact should suffice but I'd love to know if I'm missing something.
What does this PR do?
The following failing tests were identified and fixed in this PR:
→ Kyutai Speech-To-Text: The PR [processors] Unbloating simple processors, refactored ProcessorMixin.call to use explicit keyword-only params instead of accepting positional arguments; but the KyutaiSTT integration tests were still calling
processor(samples)positionally; the audio samples in the current state mapped to theimagesparam.→ LLaVA-OneVision: The PR Load a tiny video to make CI faster introduced local video file path mappings. LlavaOnevision's
setUpClasswas still building paths toBig_Buck_Bunny_720_10s_10MB.mp4andsample_demo_1.mp4in the repo root.→ LongCatFlash: The PR [V5] Return a BatchEncoding dict from apply_chat_template by default again changed apply_chat_template to return BatchEncoding dict instead of a tensor. The test was passing this dict directly to
model.generateand tried to access.shapeon the dict; this fixes that :)Note: The test still fails with an
AssertionError, I'm not too sure and it could be flaky, but the crash should be resolved :)cc: @Rocketknight1 @zucchini-nlp
CI Failures
Before the fix (feel free to cross-check; these errors are reproducible):
After the fix (feel free to cross-check):
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.