
Feat/add OpenAI reranking #288

Open
Ahmath-Gadji wants to merge 2 commits into dev from feat/add_openai_reranking

Conversation


@Ahmath-Gadji Ahmath-Gadji commented Mar 17, 2026

Refactor reranker into a multi-provider architecture

Replaces the single-file reranker with a factory pattern supporting multiple backends (Infinity and OpenAI-compatible endpoints). The provider is selected at runtime via configuration.

Changes:

  • Introduced BaseReranker, InfinityReranker, and OpenAIReranker classes under openrag/components/reranker/
  • Updated pipeline to use the new factory with improved debug logging
  • Added dedicated YAML configs per provider (.hydra_config/reranker/)
  • Updated docker-compose.yaml for dynamic provider selection; added extern/reranker/openai.yaml with GPU/CPU support
  • Moved RRF test to align with new module structure
  • Updated docs to reflect new env vars and provider options
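The factory-based architecture described above can be sketched roughly as follows. Class and mapping names (RerankerFactory, RERANKER_MAPPING, BaseReranker, InfinityReranker, OpenAIReranker) come from this PR's description; the constructor and method details are illustrative assumptions, not the actual implementation:

```python
# Minimal sketch of the multi-provider factory pattern. Names follow the PR
# description; signatures and config keys are illustrative assumptions.
from abc import ABC, abstractmethod


class BaseReranker(ABC):
    def __init__(self, config: dict):
        self.model_name = config.get("model_name")
        self.top_k = config.get("top_k", 10)

    @abstractmethod
    async def rerank(self, query: str, documents: list, top_k: int) -> list:
        """Return the top_k documents most relevant to the query."""


class InfinityReranker(BaseReranker):
    async def rerank(self, query, documents, top_k):
        return documents[:top_k]  # stand-in for the Infinity API call


class OpenAIReranker(BaseReranker):
    async def rerank(self, query, documents, top_k):
        return documents[:top_k]  # stand-in for the OpenAI-compatible call


RERANKER_MAPPING = {"infinity": InfinityReranker, "openai": OpenAIReranker}


class RerankerFactory:
    @staticmethod
    def get_reranker(config: dict) -> BaseReranker:
        # Provider is selected at runtime from configuration.
        provider = config.get("provider", "infinity")
        if provider not in RERANKER_MAPPING:
            raise ValueError(f"Unknown reranker provider: {provider}")
        return RERANKER_MAPPING[provider](config)
```

Adding a new backend then amounts to subclassing BaseReranker and registering it in the mapping, with no changes to the pipeline call site.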

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for multiple reranker providers (Infinity and OpenAI-compatible backends).
    • Introduced new configuration variables: RERANKER_PROVIDER, RERANKER_API_KEY, and RERANKER_SEMAPHORE.
    • Updated RERANKER_TOP_K default from 5 to 10.
    • Set "openrag-all" as the default chat profile selection.
  • Chores

    • Updated reranker configuration system and Docker Compose setup for provider flexibility.
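Taken together, the new variables might be set along these lines in a .env file (all values below are illustrative placeholders, not project defaults beyond what the release notes state):

```shell
# Example settings for the new reranker variables (illustrative values)
RERANKER_PROVIDER=openai     # "infinity" (default) or "openai"
RERANKER_API_KEY=changeme    # only needed for OpenAI-compatible backends
RERANKER_SEMAPHORE=10        # cap on concurrent rerank requests
RERANKER_TOP_K=10            # new default (was 5)
```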


coderabbitai bot commented Mar 17, 2026

Caution

Review failed

Failed to post review comments

📝 Walkthrough


This PR refactors the reranker system to support multiple providers (Infinity and OpenAI-compatible) using a factory pattern. It reorganizes reranker code into a modular structure with provider-specific implementations, updates Hydra configurations to be composable, and adjusts the pipeline to instantiate rerankers dynamically based on configuration.

Changes

  • Configuration: Hydra Defaults (.hydra_config/config.yaml): Replaced inline reranker configuration with a defaults entry referencing the RERANKER_PROVIDER environment variable (default: infinity).
  • Configuration: Reranker Providers (.hydra_config/reranker/base.yaml, .hydra_config/reranker/infinity.yaml, .hydra_config/reranker/openai.yaml): Added three new modular reranker configuration files: base defines shared defaults (model_name, top_k, base_url, semaphore, enabled); infinity adds Infinity-specific provider settings; openai adds OpenAI-compatible provider settings with api_key interpolation.
  • Configuration: Docker Integration (docker-compose.yaml, extern/reranker/openai.yaml): Made the docker-compose reranker include dynamic based on RERANKER_PROVIDER; added an OpenAI reranker service definition with GPU and CPU variants using vllm/vllm-openai images.
  • Reranker Module: Core Refactoring (openrag/components/reranker.py, deleted): Removed the original monolithic reranker.py containing the BaseReranker and Reranker classes; logic relocated into provider-specific modules.
  • Reranker Module: New Architecture (openrag/components/reranker/__init__.py, base.py, infinity.py, openai.py): Introduced RerankerFactory with RERANKER_MAPPING to instantiate providers; BaseReranker defines the async rerank() interface and RRF reranking logic; InfinityReranker and OpenAIReranker implement provider-specific reranking via semaphore-gated external API calls.
  • Pipeline Integration (openrag/components/pipeline.py): Updated to use RerankerFactory.get_reranker(config) instead of direct Reranker instantiation; changed the config key from "enable" to "enabled"; added provider logging; updated the type annotation to BaseReranker.
  • Documentation & Tests (docs/content/docs/documentation/env_vars.md, openrag/components/reranker/test_rrf_reranking.py, openrag/app_front.py): Expanded env_vars.md with RERANKER_PROVIDER, RERANKER_API_KEY, and RERANKER_SEMAPHORE details and a provider table; updated the test import path; added a default flag to the openrag-all chat profile.
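The RRF reranking logic that lives in BaseReranker (and that the relocated test exercises) is Reciprocal Rank Fusion: items from several ranked lists are scored by summing 1/(k + rank) across lists. A generic stdlib sketch of the technique, not the project's actual function (its name and signature here are assumptions):

```python
# Reciprocal Rank Fusion: fuse multiple rankings by summing 1 / (k + rank)
# for each item. Generic sketch; the real signature in base.py may differ.
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)
```

With rankings [["a", "b", "c"], ["b", "c", "a"]], "b" wins because it places first and second, while "a" places first and last.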

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly related PRs

  • Feat/chunking #165: Modifies openrag/components/pipeline.py to compute max_context_tokens from config.reranker.top_k, creating interaction with this PR's RerankerFactory and config key changes.
  • Merge for release v1.1.5 #175: Updates reranker configuration defaults and Hydra config entries, directly aligned with this PR's configuration restructuring.
  • Feat/add ruff linting #209: Makes runtime changes to the reranker implementation that this PR substantially replaces with a modular provider-based architecture.

Suggested reviewers

  • paultranvan
  • dodekapod

Poem

🐰 The reranker hops with newfound grace,
Two providers race in parallel space!
From monolith to factory clean,
Infinity and OpenAI now convene,
Modular config makes refactoring sweet! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

  • Title check (⚠️ Warning): The title "Feat/add OpenAI reranking" is partially related to the changeset but does not highlight the main change, which is a comprehensive multi-provider architecture refactor with a factory pattern. Resolution: consider revising the title to emphasize the key architectural change, such as "Refactor reranker to support multiple providers with factory pattern" or "Add multi-provider reranker architecture with Infinity and OpenAI support".
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 18.18%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)
  • Description Check (✅ Passed): Check skipped because CodeRabbit's high-level summary is enabled.


@Ahmath-Gadji Ahmath-Gadji added the breaking-change (Change of behavior after upgrade) and feat (Add a new feature) labels Mar 17, 2026
BREAKING CHANGE: docker-compose.yaml now includes
`extern/reranker/${RERANKER_PROVIDER:-infinity}.yaml` instead of
`extern/infinity.yaml`. Set RERANKER_PROVIDER=infinity or leave unset
to preserve existing behavior.

Add OpenAI-compatible reranker provider selectable via RERANKER_PROVIDER
(values: `infinity`, `openai`). New env vars: RERANKER_API_KEY, RERANKER_SEMAPHORE.
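The breaking change above amounts to swapping a static include for a variable-interpolated one in docker-compose.yaml, roughly as follows (a sketch reconstructed from the commit message; the surrounding keys are assumptions, not the file's actual contents):

```yaml
# Before: the Infinity reranker was always included
# include:
#   - extern/infinity.yaml

# After: the provider file is selected at compose time; unset
# RERANKER_PROVIDER falls back to "infinity", preserving old behavior
include:
  - extern/reranker/${RERANKER_PROVIDER:-infinity}.yaml
```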
@Ahmath-Gadji Ahmath-Gadji force-pushed the feat/add_openai_reranking branch from b9a9bbd to c4c3346 Compare March 17, 2026 14:44
@Ahmath-Gadji Ahmath-Gadji marked this pull request as ready for review March 17, 2026 14:53

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (2)
openrag/components/reranker/openai.py (1)

48-55: Prefer bare raise to preserve the original traceback.

Using raise e can subtly alter the traceback. Use bare raise instead.

-                raise e
+                raise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openrag/components/reranker/openai.py` around lines 48 - 55, The except block
in the reranker method logs the exception but re-raises using "raise e", which
can alter the traceback; update the except handler that references logger,
self.model_name and documents (the block that logs "Reranking failed") to
re-raise the caught exception with a bare "raise" instead of "raise e" so the
original traceback is preserved.
openrag/components/reranker/infinity.py (1)

45-52: Prefer bare raise to preserve the original traceback.

-                raise e
+                raise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openrag/components/reranker/infinity.py` around lines 45 - 52, In the except
Exception as e block that logs reranking failures (the block referencing
logger.error with model_name=self.model_name and
documents_count=len(documents)), replace the current "raise e" with a bare
"raise" so the original traceback is preserved; keep the logger.error call and
exception variable for logging, but re-raise using "raise" instead of "raise e".
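The difference both nitpicks describe is visible directly in the traceback: `raise e` records the re-raise line as an extra frame, while a bare `raise` leaves only the original frames. A stdlib-only demonstration, separate from the PR's code:

```python
# Why bare `raise` is preferred: `raise e` adds the re-raise site to the
# traceback (the function appears twice), bare `raise` does not.
import traceback


def original_failure():
    raise ValueError("boom")


def reraise_bare():
    try:
        original_failure()
    except ValueError:
        raise  # traceback still points only at the original frames


def reraise_bound():
    try:
        original_failure()
    except ValueError as e:
        raise e  # traceback also records this line as an extra frame


def frame_names(fn):
    try:
        fn()
    except ValueError as e:
        return [f.name for f in traceback.extract_tb(e.__traceback__)]
```

Calling `frame_names(reraise_bound)` yields the re-raising function twice, which is the subtle traceback alteration the review flags.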

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 30e4cae1-728d-4d4c-a208-50bebaee0198

📥 Commits

Reviewing files that changed from the base of the PR and between 896764e and c4c3346.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (16)
  • .hydra_config/config.yaml
  • .hydra_config/reranker/base.yaml
  • .hydra_config/reranker/infinity.yaml
  • .hydra_config/reranker/openai.yaml
  • docker-compose.yaml
  • docs/content/docs/documentation/env_vars.md
  • extern/reranker/infinity.yaml
  • extern/reranker/openai.yaml
  • openrag/app_front.py
  • openrag/components/pipeline.py
  • openrag/components/reranker.py
  • openrag/components/reranker/__init__.py
  • openrag/components/reranker/base.py
  • openrag/components/reranker/infinity.py
  • openrag/components/reranker/openai.py
  • openrag/components/reranker/test_rrf_reranking.py
💤 Files with no reviewable changes (1)
  • openrag/components/reranker.py

Comment on lines +17 to +20
command: >
--model ${RERANKER_MODEL:-BAAI/bge-reranker-v2-m3}
--trust-remote-code
--gpu_memory_utilization 0.3

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Inspect RERANKER_PORT usage and vLLM listen-port flags"
rg -n 'RERANKER_PORT|--port|ports:' extern/reranker/openai.yaml .hydra_config/reranker/openai.yaml

Repository: linagora/openrag

Length of output: 343


🏁 Script executed:

cat -n extern/reranker/openai.yaml

Repository: linagora/openrag

Length of output: 1962


🏁 Script executed:

# Check vLLM documentation or code comments in the repo
rg -i 'port|listen' extern/reranker/ --context 3

Repository: linagora/openrag

Length of output: 1628


🏁 Script executed:

# Check how RERANKER_PORT is used in configs and environment
rg -n 'RERANKER_PORT' --context 2

Repository: linagora/openrag

Length of output: 3180


Overriding RERANKER_PORT breaks service connectivity.

The openai.yaml vLLM service lacks a --port flag in its commands (lines 17-20 and 56-59), so it always listens on port 8000 internally. However, the port mapping on line 28 (${RERANKER_PORT:-8003}:8000) changes when RERANKER_PORT is set to a non-default value. When overridden, the port mapping remaps the container's port 8000 to the host, but the Hydra configuration constructs URLs using the overridden port number, causing a mismatch. Compare this to the infinity.yaml reranker variant, which correctly passes --port ${RERANKER_PORT:-7997} to the service.

Add the --port flag to both reranker command definitions and update the port mapping to maintain port alignment across the stack.

Proposed fix
   command: >
     --model ${RERANKER_MODEL:-BAAI/bge-reranker-v2-m3}
     --trust-remote-code
     --gpu_memory_utilization 0.3
+    --port ${RERANKER_PORT:-8000}
   ports:
-      - ${RERANKER_PORT:-8003}:8000
+      - ${RERANKER_PORT:-8003}:${RERANKER_PORT:-8000}

Apply the same change to the reranker-cpu command block (lines 56-59).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extern/reranker/openai.yaml` around lines 17 - 20, The service listens on
container port 8000 but the Docker port mapping uses ${RERANKER_PORT:-8003},
causing mismatch when RERANKER_PORT is overridden; update the OpenAI vLLM
service command blocks in openai.yaml to include a --port flag using the same
variable (e.g., add --port ${RERANKER_PORT:-8003}) in both the main reranker
command and the reranker-cpu command so the container binds to the same
overridable port used in the mapping and Hydra URLs, ensuring port alignment
across RERANKER_PORT, the command lines, and the port mapping.

from infinity_client.api.default import rerank
from infinity_client.models import RerankInput, ReRankResult
from langchain_core.documents.base import Document
from utils.logger import get_logger

⚠️ Potential issue | 🟡 Minor

Use absolute import from openrag/ directory.

Per coding guidelines, imports should use absolute paths from the openrag/ directory.

-from utils.logger import get_logger
+from openrag.utils.logger import get_logger
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openrag/components/reranker/infinity.py` at line 7, Change the relative-style
import to an absolute import from the openrag package: replace the current
import of get_logger in infinity.py with an absolute import that references
openrag.utils.logger (e.g., import get_logger from openrag.utils.logger) so the
symbol get_logger is imported via the project's top-level package name per
coding guidelines.


import httpx
from langchain_core.documents.base import Document
from utils.logger import get_logger

⚠️ Potential issue | 🟡 Minor

Use absolute import from openrag/ directory.

Per coding guidelines, imports should use absolute paths from the openrag/ directory.

-from utils.logger import get_logger
+from openrag.utils.logger import get_logger
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openrag/components/reranker/openai.py` at line 5, Replace the relative import
of the logger used in openai.py: instead of importing get_logger from a
local/relative utils module, change it to use the project-root absolute package
import so get_logger is imported from the top-level utils.logger package (i.e.,
the absolute openrag package path) to comply with the coding guideline; update
the import statement that currently references utils.logger to the absolute
package import for get_logger.

Comment on lines +27 to +38
async with httpx.AsyncClient() as client:
response = await client.post(
self.rerank_url,
headers={"Authorization": f"Bearer {self.api_key}"},
json={
"model": self.model_name,
"query": query,
"documents": [doc.page_content for doc in documents],
"top_n": top_k,
},
)
response.raise_for_status()

⚠️ Potential issue | 🟠 Major

Add timeout to HTTP request to prevent indefinite hangs.

The httpx.AsyncClient.post() call has no timeout configured. If the reranker service is slow or unresponsive, this will block indefinitely, potentially exhausting the semaphore.

🛠️ Proposed fix: add timeout parameter
             async with httpx.AsyncClient() as client:
                 response = await client.post(
                     self.rerank_url,
                     headers={"Authorization": f"Bearer {self.api_key}"},
                     json={
                         "model": self.model_name,
                         "query": query,
                         "documents": [doc.page_content for doc in documents],
                         "top_n": top_k,
                     },
+                    timeout=60.0,  # Consider making this configurable
                 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openrag/components/reranker/openai.py` around lines 27 - 38, The Async HTTP
call to self.rerank_url using httpx.AsyncClient.post has no timeout and can
hang; update the reranker to set a request timeout (either by adding a
configurable attribute like self.timeout on the reranker class and passing
timeout=self.timeout to client.post, or by constructing
httpx.AsyncClient(timeout=...) / using httpx.Timeout) so the post call to
self.rerank_url will fail fast on slow/unresponsive services; ensure the timeout
value is used in the call site that invokes httpx.AsyncClient().post and
consider catching httpx.TimeoutException where appropriate.
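The failure mode this comment prevents, a hung request pinning a semaphore slot indefinitely, can be illustrated with stdlib asyncio alone (asyncio.wait_for stands in for httpx's timeout= parameter; all names below are illustrative, not the PR's code):

```python
# A bounded wait releases the semaphore slot even when the backend hangs;
# without it, the slow call below would hold its slot forever.
import asyncio

semaphore = asyncio.Semaphore(2)


async def call_reranker(delay: float, timeout: float) -> str:
    async with semaphore:  # slot is released on exit, timeout or not
        try:
            # Stand-in for the HTTP POST to the reranker service
            await asyncio.wait_for(asyncio.sleep(delay), timeout=timeout)
            return "ok"
        except asyncio.TimeoutError:
            return "timed out"


async def main() -> list[str]:
    return await asyncio.gather(
        call_reranker(delay=0.01, timeout=0.5),   # healthy backend
        call_reranker(delay=10.0, timeout=0.05),  # unresponsive backend
    )


results = asyncio.run(main())
```

With httpx the same effect comes from passing a timeout to the client or the post call, plus handling httpx.TimeoutException, as the review suggests.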

from components.reranker import BaseReranker
from langchain_core.documents.base import Document

from .base import BaseReranker

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Verify relative imports in the touched test file"
rg -nP '^from \.' openrag/components/reranker/test_rrf_reranking.py

Repository: linagora/openrag

Length of output: 141


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check if base.py exists and contains BaseReranker ==="
if [ -f "openrag/components/reranker/base.py" ]; then
    echo "✓ base.py found"
    rg -A 2 "class BaseReranker" openrag/components/reranker/base.py | head -5
else
    echo "✗ base.py not found"
fi

echo ""
echo "=== Verify the proposed absolute import would work ==="
python3 << 'EOF'
import sys
sys.path.insert(0, '.')
try:
    from openrag.components.reranker.base import BaseReranker
    print(f"✓ Absolute import works, BaseReranker is: {BaseReranker}")
except Exception as e:
    print(f"✗ Absolute import failed: {e}")
EOF

Repository: linagora/openrag

Length of output: 469


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check all imports in test_rrf_reranking.py ==="
rg -n "^from |^import " openrag/components/reranker/test_rrf_reranking.py

Repository: linagora/openrag

Length of output: 196


Use absolute import for BaseReranker.

Line 5 uses a relative import; this must be absolute from the openrag/ root per coding guidelines.

🔧 Proposed fix
-from .base import BaseReranker
+from openrag.components.reranker.base import BaseReranker
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openrag/components/reranker/test_rrf_reranking.py` at line 5, The test file
uses a relative import for BaseReranker; change the relative import to an
absolute one so it imports BaseReranker from the package root (use the full
module path, e.g. import BaseReranker from openrag.components.reranker.base) to
comply with project import guidelines and avoid relative import issues.


Labels

breaking-change (Change of behavior after upgrade) · feat (Add a new feature)
