feat: add new falkordb integration by ghassenzaara · Pull Request #3158 · deepset-ai/haystack-core-integrations

ghassenzaara · 2026-04-13T15:19:27Z

Related Issues

Partially addresses #3007 (Add New FalkorDB Integration).

Proposed Changes:

Added FalkorDBDocumentStore to connect Haystack with FalkorDB graph databases.
Added FalkorDBEmbeddingRetriever for standard vector searches.
Added FalkorDBCypherRetriever for running custom GraphRAG Cypher queries.
Ensured document metadata is flattened and stored directly on the graph nodes.
Fixed vector insertion by casting arrays with vecf32() in Cypher queries.

How did you test it?

Unit tests: Added basic component and serialization tests (hatch run test:unit).
Integration tests: Verified writes, vector searches, and duplicate policies against a live database (hatch run test:integration).
Linters: Passed all type-checking and formatting checks (hatch run test:types, hatch run fmt).

Notes for the reviewer

Note the vecf32() explicit cast in the UNWIND cypher queries. This is specifically required by FalkorDB to parse vector embeddings correctly.
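As a rough illustration of the cast described in this note, the sketch below builds a batched write query in which the embedding is wrapped in `vecf32()`. The helper name `build_write_query`, the default `Document` label, and the `content` / `embedding` property names are assumptions made for this example; it is not the query used in this PR.

```python
def build_write_query(node_label: str = "Document") -> str:
    """Return a Cypher query that upserts a batch of documents passed as $docs."""
    # Node labels cannot be supplied as query parameters, so the label is
    # interpolated into the string; validate it before doing so.
    if not node_label.isidentifier():
        raise ValueError(f"Invalid node label: {node_label!r}")
    return (
        "UNWIND $docs AS doc "
        f"MERGE (d:{node_label} {{id: doc.id}}) "
        # The explicit vecf32() cast stores the embedding as a vector
        # instead of a plain list of floats.
        "SET d.content = doc.content, d.embedding = vecf32(doc.embedding)"
    )


query = build_write_query()
print(query)
```

The documents themselves travel in the `$docs` parameter, so only the label is ever interpolated into the query text.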

Checklist

I have read the contributors guidelines and the code of conduct
I have updated the related issue with new insights and changes
I added unit tests and updated the docstrings
I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:.

CLAassistant · 2026-04-13T15:19:36Z

All committers have signed the CLA.

bogdankostic

Thanks so much for this PR, @ghassenzaara! Great work so far. You'll see quite a few comments below, but they are mostly just minor formatting improvements for the docstrings.

bogdankostic · 2026-04-15T12:53:39Z

      - "Test / dspy"
      - "Test / elasticsearch"
      - "Test / faiss"
+      - "Test / falkor_db"


For consistency (for example with arcadedb) let's use falkordb throughout this integration instead of falkor_db, so changing for example also integrations/falkor_db -> integrations/falkordb.

bogdankostic · 2026-04-15T12:56:36Z

      - any-glob-to-any-file: ".github/workflows/faiss.yml"


+integration:falkor-db:


Suggested change

-integration:falkor-db:
+integration:falkordb:

bogdankostic · 2026-04-15T12:58:50Z

+    documented_only: true
+    skip_empty_modules: true
+renderer:
+  description: FalkorDB integration for Haystack — GraphRAG document store, embedding retriever, and Cypher retriever


Suggested change

-  description: FalkorDB integration for Haystack — GraphRAG document store, embedding retriever, and Cypher retriever
+  description: FalkorDB integration for Haystack

bogdankostic · 2026-04-15T13:35:22Z

+    def to_dict(self) -> dict[str, Any]:
+        """
+        Serialise this component to a dictionary.
+
+        :returns: Dictionary representation of this retriever's configuration.
+        """
+        data = default_to_dict(
+            self,
+            custom_cypher_query=self._custom_cypher_query,
+        )
+        data["init_parameters"]["document_store"] = self._document_store.to_dict()
+        return data
+
+    @classmethod
+    def from_dict(cls, data: dict[str, Any]) -> "FalkorDBCypherRetriever":
+        """
+        Deserialise this component from a dictionary.
+
+        :param data: Dictionary previously produced by :meth:`to_dict`.
+        :returns: A new :class:`FalkorDBCypherRetriever` instance.
+        """
+        init_params = data.get("init_parameters", {})
+        if "document_store" in init_params:
+            init_params["document_store"] = FalkorDBDocumentStore.from_dict(init_params["document_store"])
+        return default_from_dict(cls, data)


These methods shouldn't be needed, see our documentation on default serialization behavior.

bogdankostic · 2026-04-15T13:36:37Z

+        """
+        Retrieve documents by executing an OpenCypher query.
+
+        If a ``query`` is provided here, it overrides the ``custom_cypher_query``


We use single backticks inside docstrings for inline code.

Suggested change

-        If a ``query`` is provided here, it overrides the ``custom_cypher_query``
+        If a `query` is provided here, it overrides the `custom_cypher_query`

bogdankostic · 2026-04-15T19:30:41Z

+    Translate a Haystack filter dict into an OpenCypher ``WHERE`` sub-expression.
+
+    Supports the full Haystack filter DSL:
+
+    - Logical: ``AND``, ``OR``, ``NOT``
+    - Comparison: ``==``, ``!=``, ``>``, ``>=``, ``<``, ``<=``
+    - Membership: ``in``, ``not in``
+
+    All values are passed as named query parameters to prevent injection.
+
+    :param filters: A Haystack filter dictionary.
+    :returns: Tuple of ``(where_clause_string, params_dict)``.
+    :raises ValueError: If an unsupported operator or malformed filter is provided.


Suggested change

-    Translate a Haystack filter dict into an OpenCypher ``WHERE`` sub-expression.
-
-    Supports the full Haystack filter DSL:
-
-    - Logical: ``AND``, ``OR``, ``NOT``
-    - Comparison: ``==``, ``!=``, ``>``, ``>=``, ``<``, ``<=``
-    - Membership: ``in``, ``not in``
-
-    All values are passed as named query parameters to prevent injection.
-
-    :param filters: A Haystack filter dictionary.
-    :returns: Tuple of ``(where_clause_string, params_dict)``.
-    :raises ValueError: If an unsupported operator or malformed filter is provided.
+    Translate a Haystack filter dict into an OpenCypher `WHERE` sub-expression.
+
+    Supports the full Haystack filter DSL:
+
+    - Logical: `AND`, `OR`, `NOT`
+    - Comparison: `==`, `!=`, `>`, `>=`, `<`, `<=`
+    - Membership: `in`, `not in`
+
+    All values are passed as named query parameters to prevent injection.
+
+    :param filters: A Haystack filter dictionary.
+    :returns: Tuple of `(where_clause_string, params_dict)`.
+    :raises ValueError: If an unsupported operator or malformed filter is provided.
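To make the docstring's contract concrete, here is a minimal sketch of how such a translation could work. The function name `filters_to_where`, the `d` node alias, the `p0`, `p1`, ... parameter naming, and the handling of the `meta.` prefix are all assumptions for illustration; the PR's actual helper may differ. `NOT` is treated here as "not all of the conditions hold".

```python
from typing import Any

# Haystack comparison operators mapped to their Cypher spelling.
COMPARISON_OPS = {"==": "=", "!=": "<>", ">": ">", ">=": ">=", "<": "<", "<=": "<="}


def filters_to_where(filters: dict[str, Any], alias: str = "d") -> tuple[str, dict[str, Any]]:
    """Translate a Haystack-style filter dict into (where_clause, params)."""
    params: dict[str, Any] = {}

    def visit(node: dict[str, Any]) -> str:
        op = node.get("operator")
        if op in ("AND", "OR", "NOT"):
            conditions = node.get("conditions")
            if not conditions:
                raise ValueError(f"Logical operator {op!r} needs a non-empty 'conditions' list")
            parts = [visit(child) for child in conditions]
            if op == "NOT":
                return "NOT (" + " AND ".join(parts) + ")"
            return "(" + f" {op} ".join(parts) + ")"
        if "field" not in node or "value" not in node:
            raise ValueError(f"Malformed filter: {node!r}")
        # Metadata is stored flat on the node, so drop a leading 'meta.' prefix.
        field = node["field"].removeprefix("meta.")
        # Property names cannot be parameterised, so reject anything unusual.
        if not field.isidentifier():
            raise ValueError(f"Unsupported field name: {field!r}")
        name = f"p{len(params)}"
        if op in COMPARISON_OPS:
            clause = f"{alias}.{field} {COMPARISON_OPS[op]} ${name}"
        elif op == "in":
            clause = f"{alias}.{field} IN ${name}"
        elif op == "not in":
            clause = f"NOT ({alias}.{field} IN ${name})"
        else:
            raise ValueError(f"Unsupported operator: {op!r}")
        # The value travels as a named parameter and never enters the query text.
        params[name] = node["value"]
        return clause

    return visit(filters), params


where, params = filters_to_where(
    {
        "operator": "AND",
        "conditions": [
            {"field": "meta.year", "operator": ">=", "value": 2020},
            {"field": "meta.lang", "operator": "in", "value": ["en", "de"]},
        ],
    }
)
print(where)   # (d.year >= $p0 AND d.lang IN $p1)
print(params)  # {'p0': 2020, 'p1': ['en', 'de']}
```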

bogdankostic · 2026-04-15T19:34:02Z

+build-backend = "hatchling.build"
+
+[project]
+name = "falkor-db-haystack"


Suggested change

-name = "falkor-db-haystack"
+name = "falkordb-haystack"

bogdankostic · 2026-04-15T19:36:34Z

These changes should be reverted.

bogdankostic · 2026-04-15T19:43:08Z

Let's use our DocumentStoreBaseTests for testing the document store as described in our docs.

bogdankostic · 2026-04-15T19:46:02Z

Let's add a sentence here saying that a Docker container needs to be running in order to run the integration tests, similar to how we do it for opensearch, for example.

ghassenzaara · 2026-04-15T19:59:47Z

I re-requested a review by accident.
Thanks for the review; it's my first time contributing to open-source. I will pay attention to the documentation carefully and work on the requested changes.

bogdankostic · 2026-04-16T09:13:56Z

I re-requested a review by accident. Thanks for the review; it's my first time contributing to open-source. I will pay attention to the documentation carefully and work on the requested changes.

No worries, let me know if there's anything you're unsure about.

ghassenzaara · 2026-04-21T21:55:58Z

Hi @bogdankostic, I've addressed all the feedback from the previous review. The changes should now align with Haystack's integration conventions and requirements. Please let me know if anything else needs to be adjusted. Happy to iterate further!

socket-security · 2026-04-23T11:54:06Z

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff	Package	Supply Chain Security	Vulnerability	Quality	Maintenance	License
	numpy@2.4.4
	certifi@2026.4.22 ⏵ 2026.2.25
	falkordb@1.0.1
	redis@5.3.1
	posthog@7.13.0
	distro@1.9.0
	typing-inspection@0.4.2
	attrs@26.1.0
	httpcore@1.0.9
	falkordb@1.6.0
	pydantic-core@2.46.3
	rpds-py@0.30.0
	jiter@0.14.0
	annotated-types@0.7.0
	jsonschema-specifications@2025.9.1
	backoff@2.2.1
	colorama@0.4.6
	h11@0.16.0
	pyjwt@2.12.1
	referencing@0.37.0
	haystack-ai@2.26.1
	idna@3.13 ⏵ 3.12

davidsbatista · 2026-04-23T13:28:14Z

I've created a follow-up issue, #3219, to add the extended operations to FalkorDBDocumentStore.

ghassenzaara · 2026-04-25T11:30:49Z

Hey @davidsbatista,
I've pushed the requested changes and this is ready for another look! Regarding the follow-up issue #3219, I'd love to take that on next. Should I wait for this PR to be merged into main so I can start from a fresh branch, or would you prefer I start working on it now by branching off of this one?

bogdankostic

Thanks for addressing the comments @ghassenzaara! I think the PR is on the right track; we should just remove files that shouldn't be included in the PR and fix the sorting and scaling of scores.

Regarding #3219, I'd say let's wait for this PR to be merged so that we can be sure there won't be any major changes.

bogdankostic · 2026-04-27T09:23:25Z

This file is probably a residue from testing the integration and can be removed.

bogdankostic · 2026-04-27T09:25:12Z

+    def to_dict(self) -> dict[str, Any]:
+        """Serialize this retriever to a dictionary."""
+        return default_to_dict(
+            self,
+            document_store=self.document_store.to_dict(),
+            custom_cypher_query=self.custom_cypher_query,
+        )
+
+    @classmethod
+    def from_dict(cls, data: dict[str, Any]) -> "FalkorDBCypherRetriever":
+        """Deserialize a retriever from a dictionary."""
+        return default_from_dict(cls, data)


You can remove these methods entirely and directly use default_to_dict / default_from_dict in the serialization tests to test if serialization works as expected.

bogdankostic · 2026-04-27T09:26:13Z

+    def to_dict(self) -> dict[str, Any]:
+        """Serialize this retriever to a dictionary."""
+        return default_to_dict(
+            self,
+            document_store=self.document_store.to_dict(),
+            filters=self.filters,
+            top_k=self.top_k,
+            filter_policy=self.filter_policy.value,
+        )
+
+    @classmethod
+    def from_dict(cls, data: dict[str, Any]) -> "FalkorDBEmbeddingRetriever":
+        """Deserialize a retriever from a dictionary."""
+        return default_from_dict(cls, data)


You can remove these methods entirely and directly use default_to_dict / default_from_dict in the serialization tests to test if serialization works as expected.

bogdankostic · 2026-04-27T09:27:41Z

+
+from haystack_integrations.document_stores.falkordb.document_store import (
+    FalkorDBDocumentStore,
+    SimilarityFunction,


I don't think we need to expose SimilarityFunction here.

bogdankostic · 2026-04-27T09:29:21Z

+    def to_dict(self) -> dict[str, Any]:
+        """Serialize this document store to a dictionary."""
+        return default_to_dict(
+            self,
+            host=self.host,
+            port=self.port,
+            graph_name=self.graph_name,
+            username=self.username,
+            password=self.password,
+            node_label=self.node_label,
+            embedding_dim=self.embedding_dim,
+            embedding_field=self.embedding_field,
+            similarity=self.similarity,
+            write_batch_size=self.write_batch_size,
+            recreate_graph=self.recreate_graph,
+            verify_connectivity=self.verify_connectivity,
+        )
+
+    @classmethod
+    def from_dict(cls, data: dict[str, Any]) -> FalkorDBDocumentStore:
+        """Deserialize a document store from a dictionary."""
+        return default_from_dict(cls, data)


You can remove these methods entirely and directly use default_to_dict / default_from_dict in the serialization tests to test if serialization works as expected.

bogdankostic · 2026-04-27T09:45:00Z

+UNWIND $docs AS doc
+MERGE (d:{self.node_label} {{id: doc.id}})
+ON CREATE SET d += doc
+ON MATCH SET d += doc


Using += here keeps the properties from the overwritten document that are not present in the new document with the same ID, so we should use = here instead.

Suggested change

-ON MATCH SET d += doc
+ON MATCH SET d = doc
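The difference between the two operators can be modelled with plain Python dictionaries. This is only an analogy for the property semantics described above, with made-up property values; it does not run against FalkorDB.

```python
def set_merge(existing: dict, incoming: dict) -> dict:
    """Model of `SET d += doc`: incoming keys win, other existing keys survive."""
    return {**existing, **incoming}


def set_replace(existing: dict, incoming: dict) -> dict:
    """Model of `SET d = doc`: the node ends up with exactly the incoming properties."""
    return dict(incoming)


old = {"id": "1", "content": "old text", "author": "alice"}
new = {"id": "1", "content": "new text"}

print(set_merge(old, new))    # {'id': '1', 'content': 'new text', 'author': 'alice'}
print(set_replace(old, new))  # {'id': '1', 'content': 'new text'}
```

With the merge semantics, the stale `author` property lingers on the overwritten document, which is the behaviour the comment above wants to avoid.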

bogdankostic · 2026-04-27T09:48:12Z

+    record = {
+        "id": doc.id,
+        "content": doc.content,
+        "embedding": doc.embedding,
+    }
+    if doc.meta:
+        record.update(doc.meta)


Because of the update order, meta fields can silently overwrite the standard Document fields (id, content, embedding). Let's make sure that these are not overwritten by meta fields.

Suggested change

-    record = {
-        "id": doc.id,
-        "content": doc.content,
-        "embedding": doc.embedding,
-    }
-    if doc.meta:
-        record.update(doc.meta)
+    record = {}
+    if doc.meta:
+        record.update(doc.meta)
+    record["id"] = doc.id
+    record["content"] = doc.content
+    record["embedding"] = doc.embedding
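The effect of the reordering can be checked in isolation. The sketch below uses a hypothetical standalone helper, `to_record`, that takes the document fields as plain arguments instead of a Haystack `Document`, so it runs without any dependencies.

```python
from typing import Any, Optional


def to_record(
    doc_id: str,
    content: Optional[str],
    embedding: Optional[list[float]],
    meta: dict[str, Any],
) -> dict[str, Any]:
    """Flatten a document into one property map, letting core fields win."""
    record: dict[str, Any] = {}
    # Meta goes in first ...
    record.update(meta)
    # ... so the core fields written afterwards cannot be shadowed by it.
    record["id"] = doc_id
    record["content"] = content
    record["embedding"] = embedding
    return record


record = to_record("doc-1", "hello", [0.1, 0.2], {"id": "spoofed", "source": "wiki"})
print(record)  # {'id': 'doc-1', 'source': 'wiki', 'content': 'hello', 'embedding': [0.1, 0.2]}
```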

bogdankostic · 2026-04-27T10:23:17Z

+YIELD node AS d, score
+WHERE {where_clause}
+RETURN d, score
+ORDER BY score DESC


I just looked deeper into FalkorDB and it seems that the database is not returning similarity scores but embedding distances, so we should ORDER BY score ASC here - sorry for the wrong comment earlier.
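Since a smaller distance means a closer match, ascending order puts the best hit first. A tiny illustration with made-up `(id, distance)` rows:

```python
# (document id, distance) pairs; the values are invented for this example.
rows = [("doc-a", 0.42), ("doc-b", 0.05), ("doc-c", 0.90)]

# Equivalent of ORDER BY score ASC: the closest document comes first.
ranked = sorted(rows, key=lambda row: row[1])
print([doc_id for doc_id, _ in ranked])  # ['doc-b', 'doc-a', 'doc-c']
```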

bogdankostic · 2026-04-27T10:31:55Z

+        :returns: Scaled score in `[0, 1]`.
+        """
+        if self.similarity == "cosine":
+            return (score + 1) / 2


This formula assumes the raw score is cosine similarity in the range [-1, 1], but the raw score is cosine distance (1 - cos_sim, range [0, 2]). We should therefore adapt the scaling:

Suggested change

-            return (score + 1) / 2
+            return 1 - (score / 2)
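A quick check of both formulas at the ends of the cosine-distance range shows why the change is needed. The function names below are illustrative only.

```python
def old_scale(distance: float) -> float:
    """Original formula, which only makes sense for similarity in [-1, 1]."""
    return (distance + 1) / 2


def new_scale(distance: float) -> float:
    """Suggested formula for cosine distance in [0, 2]."""
    return 1 - (distance / 2)


for distance in (0.0, 1.0, 2.0):
    print(distance, old_scale(distance), new_scale(distance))
# 0.0 0.5 1.0
# 1.0 1.0 0.5
# 2.0 1.5 0.0
```

The old formula gives an identical vector (distance 0) a score of only 0.5 and leaves the [0, 1] range for distance 2, while the new one maps the same inputs to 1.0 and 0.0.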

bogdankostic · 2026-04-27T10:36:19Z

+        """
+        if self.similarity == "cosine":
+            return (score + 1) / 2
+        return float(1 / (1 + math.exp(-score / 100)))


The raw score is euclidean distance in range [0, ∞). Plugging into this sigmoid:

distance=0 (perfect match) → 0.5,

distance→∞ (terrible) → ~1.0

Bad matches get higher scaled scores than good ones, so the mapping is inverted.

Let's replace with a monotonically decreasing transform:

Suggested change

-        return float(1 / (1 + math.exp(-score / 100)))
+        return 1 / (1 + score)
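The inversion is easy to confirm numerically. The sketch below compares the two transforms over a few sample Euclidean distances; the function names and sample values are chosen for illustration.

```python
import math


def old_scale(distance: float) -> float:
    """Original sigmoid: grows with distance, so worse matches score higher."""
    return 1 / (1 + math.exp(-distance / 100))


def new_scale(distance: float) -> float:
    """Suggested transform: 1.0 at distance 0, falling toward 0 as distance grows."""
    return 1 / (1 + distance)


distances = [0.0, 1.0, 10.0, 1000.0]
old_scores = [old_scale(d) for d in distances]
new_scores = [new_scale(d) for d in distances]

print(old_scores == sorted(old_scores))                # True: increasing with distance
print(new_scores == sorted(new_scores, reverse=True))  # True: decreasing with distance
print(old_scale(0.0), new_scale(0.0))                  # 0.5 1.0
```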

ghassenzaara added 10 commits April 13, 2026 12:24

Establish the package skeleton, dependencies, and a developer-runnabl…

7389656

…e environment.

feat: implement FalkorDBDocumentStore for graph-based document storag…

f4b0c4d

…e and native vector search

feat: add FalkorDB retrievers and align document store architecture w…

47d908f

…ith neo4j-haystack

feat: complete FalkorDB integrations and test suites

ea68351

feat: Bug fixes

ffcb725

feat: Bug fixes deepset-ai#2

698cad4

feat: Bug Fix 3

a804f21

fix: use correct falkordb vector index creation syntax

a4a4dc0

fix: assert vector native typing on batch write

74bf446

feat: restore CLAUDE.md

f93b0e2

ghassenzaara requested a review from a team as a code owner April 13, 2026 15:19

ghassenzaara requested review from davidsbatista and removed request for a team April 13, 2026 15:19

github-actions Bot added topic:CI type:documentation Improvements or additions to documentation labels Apr 13, 2026

ghassenzaara added 3 commits April 13, 2026 19:24

feat: fix: resolve CI failures on Windows/macOS and lowest-deps runs

3b45d65

fix: correct pydoc output filename and revert invalid runner context …

04ab2ae

…in CI workflow

fix: quote to pass shellcheck

d380366

ghassenzaara force-pushed the feature/falkordb-integration branch from c580421 to d380366 Compare April 14, 2026 09:18

Merge branch 'main' into feature/falkordb-integration

0e79918

davidsbatista requested a review from bogdankostic April 15, 2026 12:34

julian-risch removed the request for review from davidsbatista April 15, 2026 13:29

bogdankostic requested changes Apr 15, 2026

ghassenzaara requested a review from bogdankostic April 15, 2026 19:54

ghassenzaara added 3 commits April 16, 2026 19:53

feat: fix double backticks issue, change every occurence of falkor_db…

701351a

… or falkor-db to falkordb for consistency, remove useless implementation, fix other small issues

fix: resolve more issues

ae8d5f4

fix: final changes

3bb3705

ghassenzaara added 5 commits April 21, 2026 17:18

Merge branch 'main' into feature/falkordb-integration

13c9933

Merge branch 'feature/falkordb-integration' of https://github.com/gha…

28d2184

…ssenzaara/haystack-core-integrations into feature/falkordb-integration

fix: resolve all ruff lint and mypy errors blocking CI

cfe4563

fix: correct workflow quoting and pydoc config for CI

0e4f4a6

fix: restrict CI to Linux and pin haystack-ai>=2.26.1 to resolve matr…

b24cb06

…ix and compatibility failures

Merge branch 'main' into feature/falkordb-integration

8e2f79e

Merge branch 'main' into feature/falkordb-integration

1fa17f9

Merge branch 'main' into feature/falkordb-integration

25c70eb

bogdankostic added the integration:falkordb label Apr 27, 2026

bogdankostic requested changes Apr 27, 2026

bogdankostic self-assigned this Apr 27, 2026

4 participants
