[infer] Add progress_callback to inference, judge, and synthesis APIs#2335
Draft
rlehman221 wants to merge 2 commits into main
Conversation
Add an optional `progress_callback` parameter that fires `(completed, total)` after each item is processed. This enables callers to report granular progress during bulk operations instead of only at start/end.

Changes:
- `BaseInferenceEngine.infer()` and `_infer_online()`: new `progress_callback` param
- `RemoteInferenceEngine._infer()`: fires the callback per async task completion
- `BaseJudge.judge()` and `_infer()`: thread the callback through to the inference engine
- `AttributeSynthesizer.synthesize()`: thread the callback through to the inference engine
- All other engine subclasses: accept the param in the `_infer_online()` signature
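The per-item firing described above can be sketched roughly as follows. This is a minimal stand-in, not the PR's actual code: `run_bulk` and `report_progress` are hypothetical names illustrating the `(completed, total)` contract.

```python
from typing import Callable, List, Optional, Tuple

# Assumed callback shape from the PR: (completed_count, total_count) -> None.
ProgressCallback = Optional[Callable[[int, int], None]]

progress_log: List[Tuple[int, int]] = []

def report_progress(completed: int, total: int) -> None:
    # Example callback: record each (completed, total) pair as it fires.
    progress_log.append((completed, total))

def run_bulk(items: List[str], progress_callback: ProgressCallback = None) -> List[str]:
    # Stand-in for a bulk operation like infer(): process items one by one
    # and fire the callback after each item, as the PR describes.
    results = []
    total = len(items)
    for i, item in enumerate(items, start=1):
        results.append(item.upper())
        if progress_callback is not None:
            progress_callback(i, total)
    return results

run_bulk(["a", "b", "c"], progress_callback=report_progress)
# progress_log is now [(1, 3), (2, 3), (3, 3)]
```

Because the callback receives both counts on every invocation, a caller can compute a percentage without knowing the batch size up front.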
Tests:
- `BaseInferenceEngine`: test callback forwarding, `None` default, exception safety
- `RemoteInferenceEngine`: test callback fires per conversation, `None` works, an exception in the callback doesn't crash inference
- `BaseJudge`: test callback forwarded to engine, `None` default
- `AttributeSynthesizer`: test callback forwarded to engine, `None` default
- Fix existing tests for the updated `_infer_online` signature (3rd arg)
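The exception-safety cases above hinge on guarding each callback invocation. A sketch of that idea, under the assumption that the engines swallow callback errors (`safe_fire` and `process_all` are illustrative helpers, not the PR's code):

```python
from typing import Callable, List, Optional

def safe_fire(cb: Optional[Callable[[int, int], None]],
              completed: int, total: int) -> None:
    # Guard the callback so a buggy callback can't abort the bulk run.
    if cb is None:
        return
    try:
        cb(completed, total)
    except Exception:
        pass  # assumed policy: progress reporting is best-effort

def process_all(items: List[int],
                cb: Optional[Callable[[int, int], None]] = None) -> List[int]:
    out = []
    for i, x in enumerate(items, start=1):
        out.append(x * 2)
        safe_fire(cb, i, len(items))
    return out

def exploding(_completed: int, _total: int) -> None:
    raise RuntimeError("buggy callback")

# Processing completes even though every callback invocation raises,
# and the None default also works.
assert process_all([1, 2, 3], exploding) == [2, 4, 6]
assert process_all([1, 2, 3]) == [2, 4, 6]
```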
Summary
Adds an optional `progress_callback: Callable[[int, int], None]` parameter to `BaseInferenceEngine.infer()`, `BaseJudge.judge()`, and `AttributeSynthesizer.synthesize()`. The callback fires with `(completed_count, total_count)` after each item finishes, enabling callers to report granular progress during bulk operations. `RemoteInferenceEngine` fires the callback per async task completion in the gather loop; other engines accept the parameter but don't use it yet.

Motivation
When the Oumi Enterprise worker calls bulk inference/judge/synthesis, it passes all rows at once. Without this callback, the worker can only update heartbeat progress at the start and end of the call — leaving users with no progress updates for potentially long-running operations.
Changes
- `base_inference_engine.py`: `progress_callback` on `infer()` and `_infer_online()`
- `remote_inference_engine.py`: fires the callback per async task completion
- `native_text_inference_engine.py`: accepts the param in `_infer_online()`
- `vllm_inference_engine.py`: accepts the param in `_infer_online()`
- `llama_cpp_inference_engine.py`: accepts the param in `_infer_online()`
- `base_judge.py`: `progress_callback` on `judge()` and `_infer()`, forwarded to engine
- `attribute_synthesizer.py`: `progress_callback` on `synthesize()`, forwarded to engine

Design
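The judge and synthesizer changes are pure forwarding: they do not fire the callback themselves, they thread the parameter through to the engine that does the counting. A hypothetical sketch of that pattern (`FakeEngine`/`FakeJudge` are stand-ins, not the real classes):

```python
from typing import Callable, List, Optional, Tuple

ProgressCallback = Optional[Callable[[int, int], None]]

class FakeEngine:
    # Stand-in for an inference engine: it owns the per-item counting.
    def infer(self, items: List[str],
              progress_callback: ProgressCallback = None) -> List[str]:
        total = len(items)
        results = []
        for i, item in enumerate(items, start=1):
            results.append(f"judged:{item}")
            if progress_callback is not None:
                progress_callback(i, total)
        return results

class FakeJudge:
    # Stand-in for BaseJudge: forwards the callback unchanged.
    def __init__(self, engine: FakeEngine) -> None:
        self._engine = engine

    def judge(self, items: List[str],
              progress_callback: ProgressCallback = None) -> List[str]:
        return self._engine.infer(items, progress_callback=progress_callback)

seen: List[Tuple[int, int]] = []
FakeJudge(FakeEngine()).judge(["x", "y"],
                              progress_callback=lambda c, t: seen.append((c, t)))
# seen == [(1, 2), (2, 2)]
```

Keeping the counting in one place (the engine) means the judge and synthesizer stay agnostic to how many underlying inference calls an item expands into.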
- Optional parameter (`None` default), so the change is fully backwards compatible
- `asyncio.gather()` runs on a single event loop thread, so the nonlocal counter won't race
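The single-thread argument above can be sketched as follows. This is an illustrative reduction of the gather-loop design, not the PR's actual `_infer()`: all coroutines run cooperatively on one event loop thread, so `completed += 1` never interleaves mid-update.

```python
import asyncio
from typing import Callable, List, Optional

async def infer_all(items: List[str],
                    progress_callback: Optional[Callable[[int, int], None]] = None
                    ) -> List[str]:
    total = len(items)
    completed = 0

    async def one(item: str) -> str:
        nonlocal completed
        await asyncio.sleep(0)   # stand-in for the remote API call
        completed += 1           # safe: single event loop thread, no lock needed
        if progress_callback is not None:
            progress_callback(completed, total)
        return item.upper()

    # gather() preserves input order in its results regardless of
    # which task finishes first.
    return await asyncio.gather(*(one(i) for i in items))

log: List[tuple] = []
results = asyncio.run(infer_all(["a", "b"],
                                progress_callback=lambda c, t: log.append((c, t))))
```

Each task reports the counter value it just produced, so `log` contains every count from 1 to `total` exactly once, though not necessarily in completion order of the original inputs.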