feat: Support Vector Search in Valkey #354
…batch functionality
- Add explicit type annotation for schema_fields to support both TagField and VectorField
- Encode project string to bytes for consistency with other hash values
- Decode doc_key bytes to string for hmget compatibility
- Fix code formatting: break long lines and remove extra blank lines
- Remove tests for multiple vector fields (Feast enforces one vector per feature view)
- Fix config type: use 'eg-valkey' (hyphen) not 'eg_valkey' (underscore)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…h-support-reads
Resolves merge conflicts and incorporates Option A implementation:
- One index per vector field with feature name in index name
- Float64 to Float32 conversion (Valkey limitation)
- Vector fields use original name for hset keys

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
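The Float64 to Float32 conversion mentioned in the commit above could look roughly like the sketch below. This is an illustration only, not the code from the PR: `to_float32_bytes` is a hypothetical helper, and it assumes the stored vector is a flat blob of native-endian 32-bit floats (little-endian on common hardware).

```python
from array import array
from typing import Sequence


def to_float32_bytes(values: Sequence[float]) -> bytes:
    """Pack floats as 32-bit values (hypothetical helper, not from the PR)."""
    # array("f", ...) stores each element as a C float, narrowing Python's
    # 64-bit floats; any precision beyond float32 is lost in the process.
    return array("f", values).tobytes()


embedding = [0.25, 0.5, 1.0]
blob = to_float32_bytes(embedding)
print(len(blob))  # 4 bytes per dimension
```

Because the narrowing is lossy, values read back from the store may differ slightly from the Float64 values that were written.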
| """ | ||
| # Build KNN query with project filter | ||
| # Format: "(@__project__:{project})=>[KNN {top_k} @{field} $vec AS distance]" | ||
| query_str = ( |
Do we allow hyphens in project names? If yes, then you would have to escape them. RediSearch interprets a hyphen as negation.
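One way to handle the escaping concern is a small helper that backslash-escapes anything outside a conservative safe set before the project name is placed in the tag filter. The sketch below is an assumption-laden illustration: `escape_tag_value` and `build_knn_query` are hypothetical names, the set of characters treated as special is a guess that should be checked against the search engine's tokenization rules, and the query format is copied from the comment in the snippet under review.

```python
import re

# Assumption: everything except ASCII letters, digits, and underscore is
# treated as special. The real set depends on the search engine.
_TAG_SPECIAL = re.compile(r"([^A-Za-z0-9_])")


def escape_tag_value(value: str) -> str:
    """Backslash-escape special characters (hypothetical helper)."""
    return _TAG_SPECIAL.sub(r"\\\1", value)


def build_knn_query(project: str, field: str, top_k: int) -> str:
    """Build the KNN query string in the format quoted in the snippet."""
    tag = escape_tag_value(project)
    return f"(@__project__:{{{tag}}})=>[KNN {top_k} @{field} $vec AS distance]"


print(build_knn_query("my-project", "embedding", 5))
# (@__project__:{my\-project})=>[KNN 5 @embedding $vec AS distance]
```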
| search_results = [] | ||
| for doc in results.docs: | ||
| doc_key = doc.id.encode() if isinstance(doc.id, str) else doc.id | ||
| distance = float(getattr(doc, "__distance__", 0.0)) |
A distance of 0.0 means it's a perfect match, right? I think we should use something else as the default.
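A possible alternative is to return `None` when the attribute is missing, so that an absent distance cannot be mistaken for a perfect match. This is only a sketch of the idea: `read_distance` is a hypothetical helper, and `SimpleNamespace` stands in for the search result document object.

```python
from types import SimpleNamespace
from typing import Any, Optional


def read_distance(doc: Any, field: str = "__distance__") -> Optional[float]:
    """Return the distance as a float, or None if the field is absent."""
    raw = getattr(doc, field, None)
    if raw is None:
        # No distance was returned; do not pretend it is a perfect match.
        return None
    return float(raw)


print(read_distance(SimpleNamespace(__distance__="0.25")))  # 0.25
print(read_distance(SimpleNamespace()))  # None
```

Callers then have to decide explicitly how to treat a missing distance, for example by skipping the result or raising an error.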
| query = ( | ||
| Query(query_str) | ||
| .return_fields("__distance__") | ||
| .sort_by("__distance__") |
Would sort_by("distance") be ascending or descending?
sort_by() defaults to ascending order; I'll make it explicit.
I changed it a bit to infer the sort order from the distance metric.
| table: FeatureView, | ||
| requested_features: List[str], | ||
| search_results: List[Tuple[bytes, float]], | ||
| vector_field: Field, |
Where is this being used?
| embedding: Query embedding vector | ||
| top_k: Number of results to return | ||
| distance_metric: Optional override for distance metric (COSINE, L2, IP) | ||
| query_string: Not supported in V1 (reserved for future BM25 search) |
| query_string: Not supported in V1 (reserved for future BM25 search) | |
| query_string: Not supported in V2 (reserved for future BM25 search) |
Add the third argument (vector_field.name) to _get_vector_index_name call to match the updated function signature. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| entity_hset = dict() | ||
| entity_hset[ts_key] = ts.SerializeToString() | ||
| # Store project and entity key for vector search | ||
| entity_hset["__project__"] = project.encode() |
This is producing bytes but line 1138 expects a string. Can you test it out?
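A bytes/str mismatch like this can be guarded against by normalizing the value at the point where a string is required. The helper below is a hypothetical sketch, not code from the PR, and it assumes UTF-8 is the encoding used when the project name was written.

```python
from typing import Union


def as_text(value: Union[bytes, str], encoding: str = "utf-8") -> str:
    """Return value as str, decoding only when it arrives as bytes."""
    return value.decode(encoding) if isinstance(value, bytes) else value


print(as_text(b"my_project"))  # my_project
print(as_text("my_project"))   # my_project
```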
Resolved conflicts:
- eg_valkey.py: Keep both deserialize_entity_key import (for reads) and Query import (for FT.SEARCH)
- test_valkey.py: Keep read/search test classes from HEAD
* Adding support for Valkey Search, adding changes to the online_write_batch functionality
* Addressing PR comments
* addressing linting error
* Adding changes to support search in valkey
* fix tests
* adding unit tests
* reformatting files and adding checks and more tests
* reformatting files and adding checks and more tests
* reformatting files and adding checks and more tests
* Fix linter errors: type annotations and code formatting
  - Add explicit type annotation for schema_fields to support both TagField and VectorField
  - Encode project string to bytes for consistency with other hash values
  - Decode doc_key bytes to string for hmget compatibility
  - Fix code formatting: break long lines and remove extra blank lines
  - Remove tests for multiple vector fields (Feast enforces one vector per feature view)
  - Fix config type: use 'eg-valkey' (hyphen) not 'eg_valkey' (underscore)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* addressing PR comments
* addressing PR comments
* fixing linting
* Fix missing feature_name argument in retrieve_online_documents_v2

  Add the third argument (vector_field.name) to _get_vector_index_name call to match the updated function signature.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* addressing comments, PR changes for some fixes and merge conflicts
* fixing tests
* fixing tests
* fixing linting
* fixing linting

---------

Co-authored-by: Manisha4 <Manisha4@github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: Valkey Online Write Batch Vector Search Support (#351)
  * Adding support for Valkey Search, adding changes to the online_write_batch functionality
  * Addressing PR comments
  * addressing linting error
  * fix tests
  * addressing PR comments
  * addressing PR comments
  * fixing linting

  ---------

  Co-authored-by: Manisha4 <Manisha4@github.com>
* feat: Support Vector Search in Valkey (#354)
  * Adding support for Valkey Search, adding changes to the online_write_batch functionality
  * Addressing PR comments
  * addressing linting error
  * Adding changes to support search in valkey
  * fix tests
  * adding unit tests
  * reformatting files and adding checks and more tests
  * reformatting files and adding checks and more tests
  * reformatting files and adding checks and more tests
  * Fix linter errors: type annotations and code formatting
    - Add explicit type annotation for schema_fields to support both TagField and VectorField
    - Encode project string to bytes for consistency with other hash values
    - Decode doc_key bytes to string for hmget compatibility
    - Fix code formatting: break long lines and remove extra blank lines
    - Remove tests for multiple vector fields (Feast enforces one vector per feature view)
    - Fix config type: use 'eg-valkey' (hyphen) not 'eg_valkey' (underscore)

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  * addressing PR comments
  * addressing PR comments
  * fixing linting
  * Fix missing feature_name argument in retrieve_online_documents_v2

    Add the third argument (vector_field.name) to _get_vector_index_name call to match the updated function signature.

    Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
  * addressing comments, PR changes for some fixes and merge conflicts
  * fixing tests
  * fixing tests
  * fixing linting
  * fixing linting

  ---------

  Co-authored-by: Manisha4 <Manisha4@github.com>
  Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: Valkey vector search - remove unsupported SORTBY (#356)
  * fix: Valkey vector search - remove unsupported SORTBY and fix tag filter syntax

    Valkey Search KNN queries return results pre-sorted by distance, so explicit SORTBY is not supported and causes a ResponseError. This removes the .sort_by() call from the query builder.

    Additionally, fixes the project tag filter to use unquoted syntax with backslash escaping for special characters (e.g. hyphens, dots) instead of the quoted syntax which was returning empty results.

    Updates unit tests to reflect both changes: replaces three metric-specific sort order tests with a single test asserting no SORTBY is set, and updates escaping assertions to match the new backslash-escape approach.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  * style: apply ruff format to eg_valkey.py and test_valkey.py

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

  ---------

  Co-authored-by: Manisha4 <Manisha4@github.com>
  Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Manisha4 <Manisha4@github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
What this PR does / why we need it:
Adds vector similarity search support to the EG Valkey online store, enabling semantic search use cases for ML features.
Which issue(s) this PR fixes:
Misc