perf: improve histogram fold plan #7937
Conversation
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 22c656efff
Code Review
This pull request introduces an optimistic execution path for the HistogramFold plan to improve performance, including new benchmarks and refactored logic for histogram evaluation. I have identified a critical issue in the new `evaluate_row_fast_validated` function where it fails to validate the minimum number of buckets, which could lead to an out-of-bounds panic. Please incorporate the suggested check for `bucket.len() < 2` to ensure robustness.
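A minimal sketch of the guard the review asks for. The function name follows the PR, but the signature and surrounding types are simplified assumptions for illustration, not the actual implementation:

```rust
// Hypothetical, simplified signature; the real function in the PR operates
// on the plan's internal state rather than plain slices.
fn evaluate_row_fast_validated(bucket: &[f64], counter: &[f64], _quantile: f64) -> Option<f64> {
    // Suggested fix: a histogram needs at least two buckets (at least one
    // finite upper bound plus the +Inf bucket), otherwise indexing
    // bucket[bucket.len() - 2] below would panic.
    if bucket.len() < 2 || bucket.len() != counter.len() {
        return None; // caller falls back to the safe path
    }
    // ... interpolation proceeds; bucket[bucket.len() - 2] is now in bounds.
    Some(bucket[bucket.len() - 2])
}
```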
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
```rust
let remaining_input_batch = batch.slice(cursor, remaining_rows);
self.switch_to_safe_mode(remaining_input_batch)?;
return Ok(());

if !Self::is_positive_infinity(le_array, cursor + bucket_num - 1) {
```
Should the optimistic path still use validate_optimistic_group to validate, just in case the input is malformed?
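The guard pattern quoted above can be sketched in isolation. This is an illustrative simplification, assuming the `le` column is a plain `f64` slice and each group ends with the +Inf bucket; the real plan works on Arrow arrays and slices the remaining `RecordBatch` before switching to safe mode:

```rust
// Simplified sketch of the optimistic fast path: walk fixed-size groups and
// bail out to the (hypothetical) safe mode as soon as the layout looks wrong.
fn fold_groups(le: &[f64], bucket_num: usize) -> Result<(), String> {
    let mut cursor = 0;
    while cursor + bucket_num <= le.len() {
        // Fast-path assumption: the last bucket of every group is +Inf.
        // A malformed input breaks the assumption, so hand the remaining
        // rows to the slower, fully validated path instead of panicking.
        if le[cursor + bucket_num - 1] != f64::INFINITY {
            return Err(format!("switch to safe mode at row {cursor}"));
        }
        cursor += bucket_num;
    }
    Ok(())
}
```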
```rust
let expected_pos = total * quantile;
let fit_bucket_pos = counter.partition_point(|&c| c < expected_pos);
if fit_bucket_pos >= bucket.len() - 1 {
    bucket[bucket.len() - 2]
} else {
    let upper_bound = bucket[fit_bucket_pos];
    let upper_count = counter[fit_bucket_pos];
    let mut lower_bound = bucket[0].min(0.0);
    let mut lower_count = 0.0;
    if fit_bucket_pos > 0 {
        lower_bound = bucket[fit_bucket_pos - 1];
        lower_count = counter[fit_bucket_pos - 1];
    }
    if (upper_count - lower_count).abs() < 1e-10 {
        f64::NAN
    } else {
        lower_bound
            + (upper_bound - lower_bound) / (upper_count - lower_count)
                * (expected_pos - lower_count)
    }
}
```
Is it almost the same as the part in evaluate_row()? Can we reuse it?
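One way to share the logic the reviewer points at is to lift the quoted interpolation into a standalone helper. This sketch reproduces the quoted code over plain slices (assumed to hold bucket upper bounds and cumulative counts), not the PR's actual refactor:

```rust
// Linear interpolation of a quantile within cumulative histogram buckets,
// mirroring the quoted snippet; `counter` is assumed cumulative and the
// last bucket is the +Inf bucket.
fn interpolate_quantile(bucket: &[f64], counter: &[f64], quantile: f64) -> f64 {
    let total = *counter.last().unwrap();
    let expected_pos = total * quantile;
    // First bucket whose cumulative count reaches the target rank.
    let fit_bucket_pos = counter.partition_point(|&c| c < expected_pos);
    if fit_bucket_pos >= bucket.len() - 1 {
        // Target falls in the +Inf bucket: clamp to the last finite bound.
        bucket[bucket.len() - 2]
    } else {
        let upper_bound = bucket[fit_bucket_pos];
        let upper_count = counter[fit_bucket_pos];
        let mut lower_bound = bucket[0].min(0.0);
        let mut lower_count = 0.0;
        if fit_bucket_pos > 0 {
            lower_bound = bucket[fit_bucket_pos - 1];
            lower_count = counter[fit_bucket_pos - 1];
        }
        if (upper_count - lower_count).abs() < 1e-10 {
            f64::NAN // empty bucket: no way to interpolate
        } else {
            lower_bound
                + (upper_bound - lower_bound) / (upper_count - lower_count)
                    * (expected_pos - lower_count)
        }
    }
}
```

With buckets `[1.0, 2.0, +Inf]` and cumulative counts `[10, 20, 20]`, the 0.75 quantile lands halfway through the second bucket, interpolating to 1.5.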
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
What's changed and what's your intention?
Reduce overheads with optimizations such as reusing parsed content and buffers, removing redundant checks on the happy path, storing bucket positions, and building output in batches.
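The buffer-reuse idea mentioned above can be illustrated with a small sketch. `FoldState` and `fold_group` are hypothetical names invented for illustration, not the PR's actual types:

```rust
// Illustrative only: keep one scratch buffer alive across groups so the
// per-row hot loop performs no allocation.
struct FoldState {
    counters: Vec<f64>, // reused cumulative-count scratch buffer
}

impl FoldState {
    fn fold_group(&mut self, values: &[f64]) -> f64 {
        self.counters.clear(); // drops contents but keeps the allocation
        self.counters.extend_from_slice(values);
        // Placeholder computation: return the group's total count.
        *self.counters.last().unwrap_or(&f64::NAN)
    }
}
```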
Details
PR Checklist
Please convert it to a draft if some of the following conditions are not met.