test(relationships): add test for standalone file yielding no extra c…#289
test(relationships): add test for standalone file yielding no extra c…#289Ahmath-Gadji wants to merge 1 commit intodevfrom
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThe changes fix a behavior in Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Suggested labels
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
openrag/components/indexer/vectordb/utils.py (1)
571-582:⚠️ Potential issue | 🔴 CriticalPreserve the target row for standalone files in
get_file_ancestors().At Line 581, filtering only on
row.relationship_id is not Nonedrops the depth-0 target row when the file is standalone, so this method returns[]instead of returning the file itself.💡 Proposed fix
return [ { "file_id": row.file_id, "partition": row.partition_name, "parent_id": row.parent_id, "relationship_id": row.relationship_id, "depth": row.depth, **(row.file_metadata or {}), } for row in result - if row.relationship_id is not None # Only include files that are part of a relationship + if row.depth == 0 or row.relationship_id is not None ]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openrag/components/indexer/vectordb/utils.py` around lines 571 - 582, get_file_ancestors() currently filters out rows with relationship_id == None which drops the depth-0 target row for standalone files; update the list comprehension filter to include the target row by allowing rows where row.relationship_id is not None OR row.depth == 0 (i.e., change the condition to "if row.relationship_id is not None or row.depth == 0") so standalone files at depth 0 are preserved in the returned list.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@openrag/components/test_relationships.py`:
- Around line 587-624: The test
test_no_extra_chunks_for_file_without_relationship_id relies on
PartitionFileManagerHelper.get_file_ancestors which does not implement the new
ancestor filtering added in openrag/components/indexer/vectordb/utils.py; update
the test to exercise the same production logic by calling the actual function
that contains the new filter (or by invoking the production helper in
openrag/components/indexer/vectordb/utils.py that performs ancestor filtering)
instead of the helper stub, or modify
PartitionFileManagerHelper.get_file_ancestors to replicate the exact filter
logic used in the vectordb utils so the test validates the real ancestor
behavior introduced in this PR.
---
Outside diff comments:
In `@openrag/components/indexer/vectordb/utils.py`:
- Around line 571-582: get_file_ancestors() currently filters out rows with
relationship_id == None which drops the depth-0 target row for standalone files;
update the list comprehension filter to include the target row by allowing rows
where row.relationship_id is not None OR row.depth == 0 (i.e., change the
condition to "if row.relationship_id is not None or row.depth == 0") so
standalone files at depth 0 are preserved in the returned list.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 717f6ad6-a889-4384-879f-3075c0a57df2
📒 Files selected for processing (2)
openrag/components/indexer/vectordb/utils.pyopenrag/components/test_relationships.py
…hunks on expansion
a2920bf to
5cfd4f5
Compare
Fix: exclude standalone files from relationship expansion
Ensures files without a
relationship_idare not incorrectly expanded with related or ancestor chunks.Changes:
get_file_ancestorsinutils.pynow filters out rows whererelationship_idisNoneTestStandaloneFileNoExpansiontest class covering: no related files returned, only the file itself as ancestor, and correct storage of falsyrelationship_idSummary by CodeRabbit
Bug Fixes
Tests