Memgraph is a relatively young graph database. Because it supports the Cypher query language and the Bolt protocol, just like Neo4j, it is usually possible to use Neo4j client libraries (called “drivers”) with Memgraph. In fact, according to the Memgraph Drivers documentation, using the Neo4j drivers is the recommended way to communicate with Memgraph from several languages like Go, Java and C#.
Memgraph Python Client Libraries
For Python, there are actually a number of options to choose from:
- pymgclient: I discovered this recently and haven’t used it, but it seems to be a lower-level client that works well.
- gqlalchemy: Memgraph’s recommended client for Python, which uses pymgclient underneath. Perhaps named after Python ORM sqlalchemy, it provides three different ways to query Memgraph, none of which I found to be very practical:
  - Basic execute() and execute_and_fetch(): this seems simple enough, but I haven’t found any way to pass parameters to queries, making it useless for my use case.
  - OGM: This is a graph equivalent of an ORM. It’s no secret that ORMs are one of the things I avoid like the plague – I’ve already written some of my thoughts on the subject in “ADO .NET Part 1: Introduction“, and time-permitting it will also be the subject of a future article. In a nutshell: I just want to write Cypher queries and execute them, not have to translate them to some library’s arbitrary API.
  - Query builder: A fluent query builder, similar in approach to what Elasticsearch provides for .NET. I’m not a fan for the same reasons that apply to ORMs (see previous point above).
- Neo4j Bolt Driver for Python: This doesn’t work with Memgraph out of the box, but we’ll talk more about this.
It’s unfortunate that the Neo4j Bolt Driver for Python doesn’t work with Memgraph by default, because if you already have Python code that works with Neo4j, you could otherwise use Memgraph as a drop-in replacement for Neo4j with minimal changes (e.g. fixing incompatible Cypher).
For the rest of this article, I will be focusing on the Neo4j Bolt Driver for Python, to understand why we can’t use it with Memgraph and explain how to get around the problem.
Update 21st November 2022: TL;DR: if you need a quick solution, go to the end of this article.
Why the Neo4j Driver Fails with Memgraph
Let’s make a first attempt to use the Neo4j Bolt Driver for Python with Memgraph.
First, we need to have an instance of Memgraph running. The easiest way is to run it with Docker, e.g. as follows (assuming you’re on Linux):
sudo docker run --rm -it -p 7687:7687 -p 3000:3000 memgraph/memgraph-platform
If this works, it will start a Memgraph shell, and you can also access Memgraph Lab (Memgraph’s web user interface) by visiting http://localhost:3000/.
Next, create a folder for your Python code. Run the following to install the Neo4j driver:
pip3 install neo4j
At the time of writing this article, the version of the Neo4j Python driver is 5.2.1. With earlier versions, it’s possible you might run into errors such as:
neobolt.exceptions.SecurityError: Failed to establish secure connection to '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1131)'
In this case, update the driver as follows:
pip3 install neo4j --upgrade
At this point, we can steal some example code from the Neo4j Bolt Driver for Python, as follows, and put it in a file called main.py:
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j://localhost:7687",
                              auth=("neo4j", "password"))

def add_friend(tx, name, friend_name):
    tx.run("MERGE (a:Person {name: $name}) "
           "MERGE (a)-[:KNOWS]->(friend:Person {name: $friend_name})",
           name=name, friend_name=friend_name)

def print_friends(tx, name):
    query = ("MATCH (a:Person)-[:KNOWS]->(friend) WHERE a.name = $name "
             "RETURN friend.name ORDER BY friend.name")
    for record in tx.run(query, name=name):
        print(record["friend.name"])

with driver.session(database="neo4j") as session:
    session.execute_write(add_friend, "Arthur", "Guinevere")
    session.execute_write(add_friend, "Arthur", "Lancelot")
    session.execute_write(add_friend, "Arthur", "Merlin")
    session.execute_read(print_friends, "Arthur")

driver.close()
Once we run this with python3 main.py, we get a nice big error:
$ python3 main.py
Traceback (most recent call last):
File "main.py", line 18, in <module>
session.execute_write(add_friend, "Arthur", "Guinevere")
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/work/session.py", line 712, in execute_write
return self._run_transaction(
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/work/session.py", line 484, in _run_transaction
self._open_transaction(
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/work/session.py", line 396, in _open_transaction
self._connect(access_mode=access_mode)
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/work/session.py", line 123, in _connect
super()._connect(access_mode, **access_kwargs)
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/work/workspace.py", line 198, in _connect
self._connection = self._pool.acquire(**acquire_kwargs_)
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/io/_pool.py", line 778, in acquire
self.ensure_routing_table_is_fresh(
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/io/_pool.py", line 721, in ensure_routing_table_is_fresh
self.update_routing_table(
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/io/_pool.py", line 648, in update_routing_table
if self._update_routing_table_from(
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/io/_pool.py", line 596, in _update_routing_table_from
new_routing_table = self.fetch_routing_table(
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/io/_pool.py", line 534, in fetch_routing_table
new_routing_info = self.fetch_routing_info(
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/io/_pool.py", line 504, in fetch_routing_info
cx = self._acquire(address, deadline, None)
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/io/_pool.py", line 221, in _acquire
return connection_creator()
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/io/_pool.py", line 138, in connection_creator
connection = self.opener(
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/io/_pool.py", line 441, in opener
return Bolt.open(
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/io/_bolt.py", line 377, in open
connection.hello()
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/io/_bolt4.py", line 450, in hello
check_supported_server_product(self.server_info.agent)
File "/home/daniel/.local/lib/python3.8/site-packages/neo4j/_sync/io/_common.py", line 283, in check_supported_server_product
raise UnsupportedServerProduct(agent)
neo4j.exceptions.UnsupportedServerProduct: None
The last three lines indicate that the problem seems to be a simple validation, which we can confirm by looking up the offending function in the Neo4j driver’s source code:
def check_supported_server_product(agent):
    """ Checks that a server product is supported by the driver by
    looking at the server agent string.

    :param agent: server agent string to check for validity
    :raises UnsupportedServerProduct: if the product is not supported
    """
    if not agent.startswith("Neo4j/"):
        raise UnsupportedServerProduct(agent)
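To see how this check treats different servers, here is a minimal standalone sketch that mirrors the logic above without importing the driver. The local exception class, the is_supported helper and the sample agent strings are all assumptions for illustration. The None case mirrors the agent value reported in the traceback, and the explicit None guard is my own addition rather than the driver's code.

```python
class UnsupportedServerProduct(Exception):
    """Local stand-in for the driver's exception of the same name."""


def check_supported_server_product(agent):
    # Same prefix test as the driver function quoted above, plus an explicit
    # None guard (an assumption, so that a missing agent is rejected too).
    if agent is None or not agent.startswith("Neo4j/"):
        raise UnsupportedServerProduct(agent)


def is_supported(agent):
    """Hypothetical helper: True if the agent string would pass the check."""
    try:
        check_supported_server_product(agent)
    except UnsupportedServerProduct:
        return False
    return True


if __name__ == "__main__":
    # Sample agent strings are made up for illustration.
    for agent in ("Neo4j/5.2.0", "Memgraph", None):
        print(repr(agent), "->", is_supported(agent))
```

Only an agent string starting with Neo4j/ gets through, which is why a server that reports anything else (or nothing at all) is rejected before any query is sent.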
What would happen if we simply disable this check? Let’s find out.
Tweaking the Neo4j Driver to Work with Memgraph
First, let’s clone the Neo4j driver’s repo:
git clone https://github.com/neo4j/neo4j-python-driver.git
A quick search shows that there are two places where the server product check is done; the traceback above points to one of them, in _sync/io/_common.py.
We can disable the validation by replacing the implementation of each function with just pass:
def check_supported_server_product(agent):
    pass
Next, we build this modified version of the Neo4j driver as follows:
python3 setup.py sdist
This creates a file called neo4j-5.2.dev0.tar.gz in a dist subfolder. Take note of the path of this file.
Back in the folder with our Python test code (where we were attempting to communicate with Memgraph), install the package we just built:
$ pip3 install /home/daniel/Desktop/neo4j-python-driver/dist/neo4j-5.2.dev0.tar.gz
Processing /home/daniel/Desktop/neo4j-python-driver/dist/neo4j-5.2.dev0.tar.gz
Requirement already satisfied: pytz in /home/daniel/.local/lib/python3.8/site-packages (from neo4j==5.2.dev0) (2022.1)
Building wheels for collected packages: neo4j
Building wheel for neo4j (setup.py) ... done
Created wheel for neo4j: filename=neo4j-5.2.dev0-py3-none-any.whl size=244857 sha256=ec2951ea1fecf2ae1aacced4d93c66b1b5d90bc3710746ff3814b9b62a96a9af
Stored in directory: /home/daniel/.cache/pip/wheels/0d/4c/55/2486d65ebf98105bc54a490ebd91cea4ba538268a32ffc91f0
Successfully built neo4j
Installing collected packages: neo4j
Attempting uninstall: neo4j
Found existing installation: neo4j 5.2.1
Uninstalling neo4j-5.2.1:
Successfully uninstalled neo4j-5.2.1
Successfully installed neo4j-5.2.dev0
Run the Python code again…
$ python3 main.py
Unable to retrieve routing information
Transaction failed and will be retried in 0.9256931081701124s (Unable to retrieve routing information)
Unable to retrieve routing information
Transaction failed and will be retried in 2.0779915720272504s (Unable to retrieve routing information)
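We still have a failure, but this is a simple connectivity issue. The neo4j scheme makes the driver ask the server for routing information, which is what fails here, whereas the bolt scheme connects directly to the given address. It is easily fixed by changing the scheme in the URI from neo4j to bolt: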
driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))
Running it again, we see that it now works!
$ python3 main.py
Guinevere
Lancelot
Merlin
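If the URI comes from configuration rather than being hard-coded, the same scheme change can be applied in code. The following is a minimal sketch using only the standard library. to_direct_bolt_uri is a hypothetical helper name, not part of the Neo4j driver, and it only rewrites the plain neo4j scheme.

```python
from urllib.parse import urlsplit, urlunsplit


def to_direct_bolt_uri(uri):
    """Hypothetical helper: swap the routing scheme neo4j:// for bolt://.

    URIs with any other scheme are returned unchanged.
    """
    parts = urlsplit(uri)
    if parts.scheme == "neo4j":
        parts = parts._replace(scheme="bolt")
    return urlunsplit(parts)


if __name__ == "__main__":
    print(to_direct_bolt_uri("neo4j://localhost:7687"))  # prints bolt://localhost:7687
```

The result can then be passed to GraphDatabase.driver() in place of the original URI.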
We can also view the created data in Memgraph Lab to double-check that it really worked.
Conclusion
In this article, we’ve confirmed that, at a basic level, the only thing preventing the Neo4j Bolt Driver for Python from being used with Memgraph is a simple check against a response from the server. We saw that queries could be executed once this check was disabled.
As a result, it’s not clear why Memgraph built their own Python clients instead of simply addressing this check (e.g. by sending the same response as Neo4j, or forking the driver and eliminating the check as I did). I will refrain from speculating on possible reasons, but I found this interesting to investigate and hope it saves time for other people in the same situation.
P.S.: There’s an Easier Way
This section was added on 21st November 2022.
I learned from the Memgraph team that they do provide a way to deal with the server check – it’s just not documented at the time of writing this article. Basically, all you have to do is run Memgraph using a --bolt-server-name-for-init switch that sets the missing server response. So if you run Memgraph in Docker, you’d need to run it as follows:
sudo docker run --rm -it -p 7687:7687 -p 3000:3000 -e MEMGRAPH="--bolt-server-name-for-init=Neo4j/" memgraph/memgraph-platform
If you run the example code with the bolt:// scheme using the unmodified Neo4j Bolt Driver for Python, it should work just as well.
Update 19th September 2023: as of Memgraph v2.11, --bolt-server-name-for-init has a default value compatible with the Neo4j Bolt Driver, and therefore no longer needs to be provided.
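The version rule above can be captured in a small check, for example in a script that decides how to start Memgraph. This is only a sketch: needs_server_name_flag is a hypothetical helper, it assumes version strings of the form 2.10, v2.11 or 2.11.0, and it does not know whether a given older version supports the switch at all.

```python
def needs_server_name_flag(memgraph_version):
    """Hypothetical helper: True if --bolt-server-name-for-init should be set.

    Per the update above, Memgraph v2.11 and later ship a default value
    that is compatible with the Neo4j Bolt Driver.
    """
    # Assumes versions look like "2.10", "v2.11" or "2.11.0".
    numbers = memgraph_version.lstrip("v").split(".")
    major, minor = int(numbers[0]), int(numbers[1])
    return (major, minor) < (2, 11)


if __name__ == "__main__":
    for version in ("2.10", "v2.11", "2.11.0"):
        print(version, "->", needs_server_name_flag(version))
```

Comparing (major, minor) tuples rather than raw strings avoids the trap where "2.4" would sort after "2.11" alphabetically.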