⚡️ Speed up function construct_simd_step_input by 37% in PR #1504 (feature/try-to-beat-the-limitation-of-ee-in-terms-of-singular-elements-pushed-into-batch-inputs) #1509
Merged
The optimized code achieves a **36% speedup** by adding a single conditional guard in the `prepare_parameters` function.
**Key Optimization:**
The improvement comes from an `if empty_indices:` check placed before the list comprehension and the `remove_indices` call, so both are skipped when there is nothing to remove:
```python
# Original: always runs both operations, even when empty_indices is empty
indices = [e for e in indices if e not in empty_indices]
result = remove_indices(value=result, indices=empty_indices)

# Optimized: runs them only when empty_indices is non-empty
if empty_indices:
    indices = [e for e in indices if e not in empty_indices]
    result = remove_indices(value=result, indices=empty_indices)
```
**Why this optimization works:**
- In many test cases, `empty_indices` is an empty set, so the filtering has nothing to remove
- The list comprehension `[e for e in indices if e not in empty_indices]` still scans every element and allocates a new list: O(n) with n=len(indices) when `empty_indices` is a set (it would be O(n*m), with m=len(empty_indices), only if `empty_indices` were a list)
- `remove_indices()` recursively processes nested data structures, which is expensive even when the removal set is empty
- Skipping both operations when `empty_indices` is empty eliminates this overhead
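The reasoning above can be checked with a minimal sketch. `remove_indices_sketch` is a hypothetical stand-in for the real `remove_indices`, whose implementation is not shown here, and the tuple-shaped indices are likewise an assumption made for illustration. The sketch only demonstrates that the guarded and unguarded variants return equal values.

```python
from typing import Any, List, Set, Tuple

Index = Tuple[int, ...]  # assumed index shape, for illustration only


def remove_indices_sketch(value: Any, indices: Set[Index], prefix: Index = ()) -> Any:
    """Hypothetical stand-in for remove_indices: rebuilds nested lists,
    dropping every position whose index path is in `indices`."""
    if isinstance(value, list):
        return [
            remove_indices_sketch(item, indices, prefix + (i,))
            for i, item in enumerate(value)
            if prefix + (i,) not in indices
        ]
    return value


def filter_unguarded(indices: List[Index], result: Any, empty_indices: Set[Index]):
    # Always rebuilds both structures, even when there is nothing to remove.
    indices = [e for e in indices if e not in empty_indices]
    result = remove_indices_sketch(result, empty_indices)
    return indices, result


def filter_guarded(indices: List[Index], result: Any, empty_indices: Set[Index]):
    # Skips both rebuilds when empty_indices is empty (an empty set is falsy).
    if empty_indices:
        indices = [e for e in indices if e not in empty_indices]
        result = remove_indices_sketch(result, empty_indices)
    return indices, result


batch_indices = [(0,), (1,), (2,)]
batch_values = ["a", "b", "c"]

# Both variants agree when nothing needs removing...
assert filter_guarded(batch_indices, batch_values, set()) == filter_unguarded(
    batch_indices, batch_values, set()
)
# ...and when one element is marked as empty.
assert filter_guarded(batch_indices, batch_values, {(1,)}) == ([(0,), (2,)], ["a", "c"])
```

One behavioural difference is visible in this sketch: when `empty_indices` is empty, the guarded variant returns the original `indices` and `result` objects rather than fresh copies. That only matters if a caller mutates them afterwards.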
**Performance impact by test case type:**
- **Large batch inputs** see the biggest gains (43-107% faster) because they avoid expensive O(n) operations on large datasets when no filtering is needed
- **Basic test cases** show consistent 15-25% improvements from avoiding unnecessary operations
- **Edge cases with actual empty elements** see little change or a slight slowdown (0.5% slower) from the extra conditional check, which is negligible next to the gains in the common case
This optimization is particularly effective because most workflow executions don't have empty batch elements that need filtering, making the conditional check a highly beneficial guard against unnecessary work.
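To reproduce the effect locally, a small `timeit` micro-benchmark along the following lines could be used. It is a sketch under stated assumptions: `drop_positions` is a hypothetical, flat stand-in for `remove_indices`, the batch shape is invented, and the timings will differ from the figures reported in this PR.

```python
import timeit
from typing import Dict, List, Set, Tuple

Index = Tuple[int, ...]  # assumed index shape, for illustration only


def drop_positions(values: List[int], empty_indices: Set[Index]) -> List[int]:
    """Hypothetical, flat stand-in for remove_indices."""
    return [v for i, v in enumerate(values) if (i,) not in empty_indices]


def unguarded(indices: List[Index], result: List[int], empty_indices: Set[Index]):
    indices = [e for e in indices if e not in empty_indices]
    result = drop_positions(result, empty_indices)
    return indices, result


def guarded(indices: List[Index], result: List[int], empty_indices: Set[Index]):
    if empty_indices:  # skip all work when nothing needs removing
        indices = [e for e in indices if e not in empty_indices]
        result = drop_positions(result, empty_indices)
    return indices, result


def time_variants(batch_size: int, repeats: int = 5, number: int = 200) -> Dict[str, float]:
    """Return the best total time in seconds (over `repeats` runs of `number`
    calls each) for both variants, using an empty `empty_indices` set."""
    indices = [(i,) for i in range(batch_size)]
    result = list(range(batch_size))
    empty: Set[Index] = set()
    timings: Dict[str, float] = {}
    for name, fn in (("unguarded", unguarded), ("guarded", guarded)):
        timer = timeit.Timer(lambda: fn(indices, result, empty))
        timings[name] = min(timer.repeat(repeat=repeats, number=number))
    return timings


if __name__ == "__main__":
    for size in (10, 1_000, 10_000):
        print(size, time_variants(size))
```

Taking the minimum over several repeats reduces noise from other processes. The gap between the two variants should widen as `batch_size` grows, because the unguarded path does work proportional to the batch size while the guarded path does a single truthiness check.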
Commit 37c120b merged into `feature/try-to-beat-the-limitation-of-ee-in-terms-of-singular-elements-pushed-into-batch-inputs`
2 checks passed
⚡️ This pull request contains optimizations for PR #1504.
If you approve this dependent PR, these changes will be merged into the original PR branch `feature/try-to-beat-the-limitation-of-ee-in-terms-of-singular-elements-pushed-into-batch-inputs`.
📄 37% (0.37x) speedup for `construct_simd_step_input` in `inference/core/workflows/execution_engine/v1/executor/execution_data_manager/step_input_assembler.py`
⏱️ Runtime: 1.99 milliseconds → 1.46 milliseconds (best of 40 runs)
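As a quick sanity check on the runtime figures reported above, the speedup can be recomputed from the two runtimes. The helper name `speedup_percent` is hypothetical, and the formula (original time divided by optimized time, minus one) is an assumption about how the percentage is defined. With the rounded runtimes shown it gives roughly 36%, in line with the explanation text; the 37% headline presumably reflects unrounded measurements.

```python
def speedup_percent(original_ms: float, optimized_ms: float) -> float:
    """Speedup as a percentage: how much faster the optimized run is
    relative to its own runtime (assumed definition)."""
    return (original_ms / optimized_ms - 1.0) * 100.0


# Runtimes as reported in this PR (rounded to two decimals on the page).
reported = speedup_percent(1.99, 1.46)
print(f"{reported:.1f}% speedup")
```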
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-pr1504-2025-08-25T10.24.18` and push.