
fix dim() (embedding dimensions) method for qdrant cloud dense encoders#3079

Open
shanbady wants to merge 14 commits into main from shanbady/cloud-dense-encoder-dense-fix

Conversation

@shanbady
Contributor

What are the relevant tickets?

Closes https://github.com/mitodl/hq/issues/10629

Description (What does it do?)

This PR resolves a bug that occurs when attempting to use the new Qdrant Cloud encoder with a dense model.

How can this be tested?

  1. Check out the shanbady/qdrant-upgrade branch.
  2. Set settings.QDRANT_DENSE_MODEL = "openai/text-embedding-3-small" and settings.QDRANT_ENCODER = "vector_search.encoders.qdrant_cloud.QdrantCloudEncoder".
  3. Run the following and see it fail:

```python
from vector_search.utils import dense_encoder

encoder = dense_encoder()
encoder.dim()
```

  4. Check out this branch, re-run the above, and see it succeed.

Additional Context

We have the option to use the cloud encoder for OpenAI embeddings once we have migrated, but we will keep the legacy encoder for now until we have a chance to fully test it (it seems to be working without issues, AFAIK).

shanbady and others added 9 commits March 2, 2026 13:39
* unify key generation for point ids

* fix tests

* adding platform to vector key

* fix tests

* fixing other methods requiring point key

* fix point key

* fixing test

* account for platform=None
* adding sparse encoder util

* adding sparse encoder setting

* add sparse enc

* adding sparse hash encoder

* adding scikit-learn

* fix sparse encoder

* fix topic embedding

* fix default vectorizer name

* adding cloud inference capability

* adding openai api key to options dict

* fix limits

* docstring updates

* adding test

* some optimizations

* fixing limit for prefetch queries

* hide hybrid search behind posthog feature flag

* scale prefetch with offset

* fix yield return

* fix sparse hash threshold calculation

* switching hybrid search to be a url param

* remove search params from groupby

* adding cache decorator to sparse encoder

* fix test

* fix test

* add default encoding name

* fix tests

* fix stop_words param

* adding test for hybrid flag and group_by

* pinning tokenizer to None for tests

* fix sparse embedding when searching
@shanbady shanbady marked this pull request as ready for review March 23, 2026 18:16
@shanbady shanbady changed the title fixing dim() method for qdrant cloud encoder fix dim() (embedding dimensions) method for qdrant cloud dense encoders Mar 24, 2026
Base automatically changed from shanbady/qdrant-upgrade to main March 25, 2026 13:30
@mbertrand mbertrand self-assigned this Mar 25, 2026
Copilot AI review requested due to automatic review settings March 26, 2026 20:05
Contributor

Copilot AI left a comment


Pull request overview

Fixes a runtime error when using the Qdrant Cloud encoder with dense embedding models by making dim() return the correct embedding vector size (needed for Qdrant collection configuration).

Changes:

  • Update QdrantCloudEncoder to compute embedding dimensions via litellm.get_model_info(...).
  • Adjust tiktoken model lookup to use the encoder’s model_short_name() (supports provider-prefixed model names like openai/...).
  • Minor whitespace tweak in vector_search().

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| vector_search/utils.py | Minor formatting change (blank line) in the vector_search() flow. |
| vector_search/encoders/qdrant_cloud.py | Fixes the model token encoding lookup and adds a dim() implementation for Qdrant Cloud dense encoders. |
Comments suppressed due to low confidence (1)

vector_search/encoders/qdrant_cloud.py:55

  • Add a unit test for QdrantCloudEncoder.dim() that mocks litellm.get_model_info and verifies it returns the expected embedding dimension (especially for provider-prefixed model names like openai/text-embedding-3-small). This will prevent regressions in collection creation where encoder_dense.dim() is required.
    def dim(self):
        """
        Return the dimension of the embeddings
        """
        info = litellm.get_model_info(self.model_short_name())
        return info["output_vector_size"]

shanbady and others added 3 commits March 26, 2026 17:24
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Member

@mbertrand mbertrand left a comment


👍

Comment on lines +52 to +79
```python
def dim(self):
    """
    Return the dimension of the embeddings
    """
    info = litellm.get_model_info(self.model_short_name())
    if not isinstance(info, dict):
        msg = (
            f"Could not determine embedding dimension: litellm.get_model_info("
            f"{self.model_short_name()!r}) returned {type(info).__name__}, "
            "expected a dict with an 'output_vector_size' field."
        )
        raise TypeError(msg)
    if "output_vector_size" not in info:
        msg = (
            "Could not determine embedding dimension: 'output_vector_size' "
            f"missing from litellm.get_model_info({self.model_short_name()!r}) "
            "response."
        )
        raise ValueError(msg)
    dim = info["output_vector_size"]
    if not isinstance(dim, int):
        msg = (
            "Could not determine embedding dimension: 'output_vector_size' "
            f"from litellm.get_model_info({self.model_short_name()!r}) is of "
            f"type {type(dim).__name__}, expected int."
        )
        raise TypeError(msg)
    return dim
```
Member


Would be good to have a parametrized unit test for this function but otherwise everything works, LGTM
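Such a parametrized test could look like the sketch below. It uses a minimal stand-in class and a stubbed lookup table in place of the repo's actual QdrantCloudEncoder and litellm.get_model_info, so the class name, constructor, and lookup here are illustrative assumptions (the embedding sizes match OpenAI's published dimensions for these models):

```python
# Sketch of a parametrized dim() test. CloudEncoderStub and get_model_info
# are stand-ins for QdrantCloudEncoder and litellm.get_model_info.


class CloudEncoderStub:
    """Minimal stand-in mirroring the dim() logic under review."""

    def __init__(self, model_name):
        self.model_name = model_name

    def model_short_name(self):
        # Strip a provider prefix like "openai/", as the real encoder does.
        return self.model_name.split("/", 1)[-1]

    def dim(self):
        """Return the embedding dimension reported for the model."""
        info = get_model_info(self.model_short_name())
        return info["output_vector_size"]


def get_model_info(model):
    """Stub lookup table standing in for litellm.get_model_info."""
    return {
        "text-embedding-3-small": {"output_vector_size": 1536},
        "text-embedding-3-large": {"output_vector_size": 3072},
    }[model]


# In the repo this loop would be a pytest.mark.parametrize over the cases,
# with mock.patch applied to litellm.get_model_info instead of a stub.
for model, expected in [
    ("openai/text-embedding-3-small", 1536),
    ("openai/text-embedding-3-large", 3072),
]:
    assert CloudEncoderStub(model).dim() == expected
```

In the real test module, one would likely patch litellm.get_model_info where the encoder imports it and also assert that it was called with the provider-stripped model_short_name().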


3 participants