My journey began with an inspiring blog post: Implementing advanced RAG strategies with Neo4j. For the last few months I’ve been working to grasp RAG, LLMs, and LangChain, while also nurturing an idea in HRTech. Currently, we’re preparing mockups, and if things align with potential customers’ needs, we aim to develop an MVP by Q1 2024. That’s when a robust RAG method will be crucial for the project.

My experience in coding chat applications using LLMs and LangChain includes a project utilising a vectorstore. While embedding data into a vectorstore is straightforward, Neo4j presents a bit more of a learning curve. It’s not particularly hard, but it does require understanding the basics of Cypher, Neo4j’s query language, which plays a role similar to SQL in relational databases.

What is Neo4j?

Neo4j is a graph database that stores data in nodes and relationships, rather than in tables or documents. This format is akin to sketching ideas on a whiteboard, offering a flexible approach to data management. For more information, visit their website.
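
To make the node-and-relationship idea concrete, here is a minimal sketch in plain Python. It does not use Neo4j at all: the Person and Company labels, the WORKS_AT relationship, and all names and properties are made up for illustration. The Cypher in the comments shows roughly how the same data and question might look in Neo4j.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Roughly equivalent Cypher (hypothetical data):
#   CREATE (a:Person {name: 'Ada'})-[:WORKS_AT {since: 2021}]->(c:Company {name: 'Acme'})
#   CREATE (b:Person {name: 'Bob'})-[:WORKS_AT {since: 2023}]->(c)
ada = Node("Person", {"name": "Ada"})
bob = Node("Person", {"name": "Bob"})
acme = Node("Company", {"name": "Acme"})

relationships = [
    Relationship(ada, "WORKS_AT", acme, {"since": 2021}),
    Relationship(bob, "WORKS_AT", acme, {"since": 2023}),
]


def colleagues_at(company_name: str) -> list[str]:
    """Names of people with a WORKS_AT relationship to the given company.

    Roughly: MATCH (p:Person)-[:WORKS_AT]->(:Company {name: $name}) RETURN p.name
    """
    return sorted(
        rel.start.properties["name"]
        for rel in relationships
        if rel.type == "WORKS_AT"
        and rel.start.label == "Person"
        and rel.end.label == "Company"
        and rel.end.properties.get("name") == company_name
    )


print(colleagues_at("Acme"))
```

The point of the sketch is that the relationship is a first-class piece of data with its own type and properties, rather than a foreign key hidden in a table.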

Having always been in tune with Mind Maps and the concept of Systems Thinking, I found that working with a graph database like Neo4j felt incredibly natural and intuitive. I appreciate methods that can be visualised and easily explained.

Training in Neo4j

Neo4j offers extensive training material. When you sign up for their free AuraDB instance, you’ll find manuals and training links. There’s also the Graph Academy for free, self-paced, hands-on training. Training modules range from Neo4j fundamentals to Data Science, Developer languages, and advanced Cypher techniques.

O’Reilly also offers a free PDF book on Neo4j: O’Reilly Graph Databases.

Additionally, Neo4j’s Sandbox environment allows you to experiment with Cypher using various datasets. Although some manuals dive quickly into advanced data extraction techniques, they’re generally helpful.

My Experience Prepping for the Professional Certificate

I started with the Neo4j Certified Professional exam, which consists of 80 questions to be answered in 60 minutes. It’s free, with one attempt allowed every 24 hours. I’ll write separately about the Data Science certificates and LLM training.

I found two different lists of trainings recommended for passing the certificate, so here are the ones I completed myself:

Neo4j property graph model
Cypher queries
Graph data modeling
Importing data
Application development concepts (Python one)
Intermediate Cypher Queries – this one seems optional, like the others available in the Academy curriculum

Preparing for the certificate required more time than anticipated, especially for the ‘Intermediate Cypher Queries’ module. I restarted this module midway so that I could practise in the sandbox in parallel. This hands-on approach proved beneficial, though time-consuming.

I generally spent about 50% more time on each training than its estimated duration, and I also skimmed through almost all of the trainings just before the certificate.

Failing & Passing the Certificate

Upon taking the test, I scored 78%, just shy of the 80% passing mark. I noticed that as a non-native English speaker, some questions seemed straightforward at first but required careful reading. The Developer section and questions about data importing were particularly challenging for me.

Each incorrect answer in the test comes with an explanation or a link to documentation, which was quite helpful for understanding my mistakes. Motivated by this, I revisited all my incorrect answers and took the second test the following day. This time, I decided to proceed more cautiously with the questions. Despite my intention to go slower, I completed the test in under 40 minutes, double-checking my responses as I went along.

I observed some changes in the test: a few questions were different, others had the order of their multiple-choice answers shuffled, while some remained the same. Still, already being familiar with the format and pace pays off. I scored 93.4% and successfully passed the certificate.

Some Post-Study Notes

As I went through the training materials, I saved some pieces of advice for later, when I build my first Neo4j database.

Maximum 4 labels per node as a recommendation
Eliminate duplicates
Refactor
Keep meta model simple as there is little value in building rich and expressive features which are not used
Embrace just-enough semantics when creating new organising principles
Careful use of EXPLAIN and PROFILE can significantly boost query performance
Be aware of so-called Eager operators. Eager operators pull in all data immediately and often create a choke point
For imports with the admin import tool: you don’t have to make sure the CSV files are perfect, just in good enough condition
Performing one MATCH clause will perform better than multiple MATCH clauses, since fewer relationships are traversed
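
Two of the notes above (EXPLAIN/PROFILE and one MATCH clause versus several) are easier to remember with a concrete query in hand. The sketch below only builds Cypher strings in Python and does not connect to a database. The Person, Company, and City labels, the WORKS_AT and LIVES_IN relationships, and the `with_plan` helper are all hypothetical names chosen for illustration. EXPLAIN shows the planned execution without running the query, while PROFILE runs it and reports what each operator actually did.

```python
# Hypothetical schema: (:Person)-[:WORKS_AT]->(:Company), (:Person)-[:LIVES_IN]->(:City)

# Two separate MATCH clauses ...
MULTI_MATCH = """
MATCH (p:Person)-[:WORKS_AT]->(c:Company {name: $company})
MATCH (p)-[:LIVES_IN]->(city:City)
RETURN p.name, city.name
""".strip()

# ... versus one MATCH clause with a single connected pattern.
SINGLE_MATCH = """
MATCH (city:City)<-[:LIVES_IN]-(p:Person)-[:WORKS_AT]->(c:Company {name: $company})
RETURN p.name, city.name
""".strip()


def with_plan(query: str, mode: str = "EXPLAIN") -> str:
    """Prefix a Cypher query with EXPLAIN or PROFILE."""
    mode = mode.upper()
    if mode not in ("EXPLAIN", "PROFILE"):
        raise ValueError("mode must be EXPLAIN or PROFILE")
    return f"{mode} {query}"


print(with_plan(SINGLE_MATCH, "profile"))
```

Running both variants with PROFILE in the Sandbox (after setting the `$company` parameter) and comparing the reported plans is one way to check on your own data whether the single-pattern version really does less work.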

As I plan to complete the LLM and Data Science trainings, as well as part of the O’Reilly book, this list will get longer 🙂

I hope those of you who were considering training in Neo4j feel encouraged by my post.

Cheers,

Ali
