Why Graphs Matter More Than You Think

Written by Craig Shallahamer | Jan 23, 2025 1:56:33 PM

As I was learning about "graphs" from the ground up and using them to supercharge RAG applications, it became obvious that the terminology is thrown around carelessly and can mean just about anything. In essence, it all became a big blur.

This article is about focus—turning that big blur into correct and useful definitions so we can get some work done without confusing everyone, including myself.

Let's get started!

Graphs Of All Kinds

Graphs have been around like... forever, and there are a ton of different kinds. There is the simple X-Y graph we used in algebra class, but there are also network, property, and knowledge graphs.

Think of an X-Y graph as a map of numbers, like how high a ball bounces (Y) at different throws (X). Or, how much money we made (Y) for different types of trades (X). We can easily do statistics like averages and standard deviations.
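As a minimal sketch of that kind of number crunching, here is how an average and a standard deviation might be computed with Python's standard library. The profit figures are made up purely for illustration.

```python
import statistics

# Hypothetical profit (Y) per trade, in dollars; invented for illustration only.
profits = [120.0, -45.0, 80.0, 200.0, -30.0]

average = statistics.mean(profits)
spread = statistics.stdev(profits)  # sample standard deviation

print(f"Average profit: {average:.2f}")
print(f"Standard deviation: {spread:.2f}")
```

Notice that nothing here says who made the trade or through which broker. That is the gap the relationship-oriented graphs fill.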

In contrast, network, property, and knowledge graphs create webs of relationships, showing how a ball is connected to its owner, the playground, or a game rule. Or, showing the relationship between the trader (Craig), the broker (Tradier), and the type of trade (iron condor option).

It's true that these relationships are based on additional attributes, but it's the connection or the relationship that begins to bring meaning—more than a simple profit or loss.

The blur appears when working with network, property, and knowledge graphs. But they are truly distinct: network graphs focus on connections, property graphs add extra details, and knowledge graphs make everything meaningful and smart!

And in the age of AI, combining property graphs, LLMs, and an advanced RAG application can result in a jaw-droppingly powerful AI application.

Let's get some clarity!

The Pink Panther Diamond

Graphs can be super boring, but their application is incredibly widespread. To make this fun yet educational, I decided to base all the graphs in this article on the same iconic mystery.

So, for this article, all the examples are from the awesome 1975 movie, "The Return of the Pink Panther," starring Peter Sellers as the iconic Inspector Jacques Clouseau. In this film, the Pink Panther diamond is stolen by a thief who appears to be the notorious "Phantom" (Sir Charles Litton). However, it is later revealed that the diamond was actually stolen by Litton's wife, Lady Claudine Litton.

All data and visualizations have been created from a single Python program, which you can download HERE. I will also show snippets of code from this program.

Network Graph

Imagine you have a map with dots and lines. Each dot is something (like a person, a place, or a thing), and each line shows how the dots are connected. This is a network graph—it's like a spiderweb of connections where we care about how things are linked, like who's friends with whom or which roads lead where.

From this point on, each dot will be called a node, and each connecting line will be called an edge.

Network, property, and knowledge graphs all contain nodes and edges, and the nodes are usually grouped or classified. Looking at the snippet of Python code below, you can see how each network graph node is defined and categorized, and then how the nodes are connected.


import networkx as nx

# Create the graph. (Assumed: the original snippet does not show how G is created.)
G = nx.Graph()

# Define node names by category
suspects = ['The Phantom', 'Inside Man']
locations = ['Casino Royale', 'Safe Room', 'Hotel Lobby']
investigators = ['Inspector Clouseau', 'Security Chief']

# Add nodes, tagging each with its category (the tag can later drive the plot colors)
for suspect in suspects:
    G.add_node(suspect, node_type='suspect')
for location in locations:
    G.add_node(location, node_type='location')
for investigator in investigators:
    G.add_node(investigator, node_type='investigator')

# Add edges (connections)
edges = [
    ('The Phantom', 'Casino Royale'),
    ('The Phantom', 'Inside Man'),
    ('Inside Man', 'Safe Room'),
    ('Inspector Clouseau', 'Casino Royale'),
    ('Security Chief', 'Safe Room'),
    ('Casino Royale', 'Safe Room'),
    ('Hotel Lobby', 'Casino Royale')
]
G.add_edges_from(edges)

Once the raw data (nodes, connections, categories) has been defined, we can check our work and also gain understanding by printing the raw data and creating a graph.

Here is the raw data in a simple textual format.


=== Network Graph Example ===
Demonstrating simple connections between entities in the heist

Network Graph Data:

Nodes by Category:
  Suspects: ['The Phantom', 'Inside Man']
  Locations: ['Casino Royale', 'Safe Room', 'Hotel Lobby']
  Investigators: ['Inspector Clouseau', 'Security Chief']

Connections (Edges):
  The Phantom --> Casino Royale
  The Phantom --> Inside Man
  Inside Man --> Safe Room
  Inspector Clouseau --> Casino Royale
  Security Chief --> Safe Room
  Casino Royale --> Safe Room
  Hotel Lobby --> Casino Royale

One way to verbalize some of the data would be, "The locations in this mystery are the Casino Royale, the Safe Room, and the Hotel Lobby," and "The Phantom, who is a suspect, is associated with the Casino Royale." The edges in a network graph usually carry no label describing the relationship, so I typically say, "is associated with."
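That verbalization is mechanical enough to sketch in a few lines of plain Python. The snippet below reuses the same categories and edges as the network graph code; the `verbalize` helper and the sentence wording are my own illustration, not part of the article's program.

```python
# Same categories and edges as in the network graph snippet above.
categories = {
    'suspect': ['The Phantom', 'Inside Man'],
    'location': ['Casino Royale', 'Safe Room', 'Hotel Lobby'],
    'investigator': ['Inspector Clouseau', 'Security Chief'],
}
edges = [
    ('The Phantom', 'Casino Royale'),
    ('The Phantom', 'Inside Man'),
    ('Inside Man', 'Safe Room'),
    ('Inspector Clouseau', 'Casino Royale'),
    ('Security Chief', 'Safe Room'),
    ('Casino Royale', 'Safe Room'),
    ('Hotel Lobby', 'Casino Royale'),
]

# Reverse lookup: node name -> category.
node_type = {name: cat for cat, names in categories.items() for name in names}


def verbalize(source, target):
    """Turn one unlabeled edge into an 'is associated with' sentence."""
    return (
        f"{source} ({node_type[source]}) is associated with "
        f"{target} ({node_type[target]})."
    )


sentences = [verbalize(source, target) for source, target in edges]
for sentence in sentences:
    print(sentence)
```

Every sentence comes out with the same verb, because an unlabeled edge gives us nothing more specific to say.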

What I like about graphs is that we can see the data, verbalize it, and also plot it! Below is the plot of our network graph.

Since you probably know quite a bit about this mystery, you are aware this network graph is lacking information. Read on!

Property Graph

Property graphs provide more details to bring clarity and deeper insight into our dataset.

For each node (a person, place, or thing in this story), we might have more information, like a person's favorite color or a road's speed limit. Even the edges (i.e., the lines connecting nodes) can have extra facts, like how long a friendship has lasted or the distance between two places.

With these new facts, our network graph got smarter, turning into a property graph. Now we can store extra facts and use them to figure out more, like the fastest way to get somewhere, who has the most friends, or who stole the Pink Panther Diamond!
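To make "the fastest way to get somewhere" and "who has the most friends" concrete, here is a small sketch in plain Python. It assumes a hypothetical edge property, walking time in minutes, plus a made-up 'Parking Garage' location; none of these numbers come from the article's dataset. The `fastest_route` helper is my own name for a textbook Dijkstra shortest-path search.

```python
import heapq

# Hypothetical edge property: walking time in minutes between locations.
# The numbers and the 'Parking Garage' node are invented for illustration.
walk_minutes = {
    ('Hotel Lobby', 'Casino Royale'): 4,
    ('Casino Royale', 'Safe Room'): 6,
    ('Hotel Lobby', 'Safe Room'): 15,
    ('Casino Royale', 'Parking Garage'): 3,
}

# Build an undirected adjacency map: node -> {neighbor: minutes}.
adjacency = {}
for (a, b), minutes in walk_minutes.items():
    adjacency.setdefault(a, {})[b] = minutes
    adjacency.setdefault(b, {})[a] = minutes


def fastest_route(start, goal):
    """Dijkstra's algorithm: return (total_minutes, path), or None if unreachable."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, minutes in adjacency[node].items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + minutes, neighbor, path + [neighbor]))
    return None


total, route = fastest_route('Hotel Lobby', 'Safe Room')
print(total, route)

# "Most friends" is simply the node with the most edges.
most_connected = max(adjacency, key=lambda node: len(adjacency[node]))
print(most_connected)
```

Without the minutes stored on each edge, the direct Lobby-to-Safe-Room edge would look like the obvious route. With them, the detour through the casino wins.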

Previously, we saw that network graphs contain nodes and edges and are usually grouped or classified. Property graphs take this a step further by defining node details like name (e.g., The Phantom, Diamond) and type (e.g., person, item). Also, the connections between nodes become more of a relationship, not just a simple association. For example, the Phantom and the casino are related because the Phantom visited the casino!

We can also add additional properties to both the nodes and the relationships. Think of these as additional clues in our mystery. The more clues we have, the richer the node description becomes, and the more intimate their relationship becomes.

Looking at the snippet of Python code below, you can see how each property graph node is defined and then how the nodes are relationally connected. The nodes are stored in a Python dictionary, and the relationships are stored in a list of dictionaries.


# Define nodes with properties
nodes = {
    'phantom': {
        'id': 'phantom',
        'type': 'person',
        'name': 'The Phantom',
        'properties': {
            'expertise': ['safe_cracking', 'disguise'],
            'last_known_location': 'Monaco'
        }
    },
    'casino': {
        'id': 'casino',
        'type': 'location',
        'name': 'Casino Royale',
        'properties': {
            'security_level': 'Maximum',
            'surveillance_coverage': '90%'
        }
    },
    'diamond': {
        'id': 'diamond',
        'type': 'item',
        'name': 'Pink Panther Diamond',
        'properties': {
            'value': '50 million',
            'security': 'Level 10'
        }
    }
}

# Define relationships with properties
relationships = [
    {
        'from': 'phantom',
        'to': 'casino',
        'type': 'VISITED',
        'properties': {
            'timestamp': '2024-03-15 23:00',
            'duration': '45min',
            'detected': False
        }
    },
    {
        'from': 'phantom',
        'to': 'diamond',
        'type': 'STOLE',
        'properties': {
            'timestamp': '2024-03-15 23:30',
            'method': 'Safe cracking'
        }
    }
]

Once the raw data (nodes, relationships) has been defined, we can check our work and gain a better understanding by printing the raw data and creating a graph.

Here is the raw data in a simple textual format.

=== Property Graph Example ===
Demonstrating relationships with properties

Detailed Properties:

Nodes with Properties:
{
  "phantom": {
    "id": "phantom",
    "type": "person",
    "name": "The Phantom",
    "properties": {
      "expertise": [
        "safe_cracking",
        "disguise"
      ],
      "last_known_location": "Monaco"
    }
  },
  "casino": {
    "id": "casino",
    "type": "location",
    "name": "Casino Royale",
    "properties": {
      "security_level": "Maximum",
      "surveillance_coverage": "90%"
    }
  },
  "diamond": {
    "id": "diamond",
    "type": "item",
    "name": "Pink Panther Diamond",
    "properties": {
      "value": "50 million",
      "security": "Level 10"
    }
  }
}

Relationships with Properties:
[
  {
    "from": "phantom",
    "to": "casino",
    "type": "VISITED",
    "properties": {
      "timestamp": "2024-03-15 23:00",
      "duration": "45min",
      "detected": false
    }
  },
  {
    "from": "phantom",
    "to": "diamond",
    "type": "STOLE",
    "properties": {
      "timestamp": "2024-03-15 23:30",
      "method": "Safe cracking"
    }
  }
]

Looking at the figure below, we can say much more than simply, "The Phantom is associated with the Casino Royale." One simple example would be, "The Phantom visited the casino on March 15 at 23:00 and stayed there for 45 minutes, but our security system did not detect them."
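A sentence like that can be assembled directly from the relationship's properties. The sketch below copies the VISITED relationship from the data above; the `describe` helper and the exact wording are hypothetical, just one way it might be done.

```python
from datetime import datetime

# The first relationship and the names of the two nodes it connects,
# copied from the property graph data above.
names = {'phantom': 'The Phantom', 'casino': 'Casino Royale'}
visit = {
    'from': 'phantom',
    'to': 'casino',
    'type': 'VISITED',
    'properties': {
        'timestamp': '2024-03-15 23:00',
        'duration': '45min',
        'detected': False,
    },
}


def describe(rel):
    """Build an English sentence from one relationship dictionary."""
    props = rel['properties']
    when = datetime.strptime(props['timestamp'], '%Y-%m-%d %H:%M')
    outcome = 'was detected' if props['detected'] else 'was not detected'
    return (
        f"{names[rel['from']]} {rel['type'].lower()} {names[rel['to']]} "
        f"on {when.strftime('%B %d at %H:%M')} for {props['duration']} "
        f"and {outcome}."
    )


sentence = describe(visit)
print(sentence)
```

The relationship type supplies the verb, and each property adds one more clue to the sentence.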

Property Graph Queries

This is a big deal—much more than I ever thought. Property graph queries have come to the rescue of LLMs because they can be used to answer some of the questions LLMs cannot answer. Don't believe me? Just ask ChatGPT:

Prompt:
Can property graphs be used to answer questions that an LLM cannot answer?

Response:
Yes, property graphs can be used to answer questions that large language models (LLMs) cannot answer effectively, particularly when dealing with highly structured, interconnected, or complex data that requires precise, context-aware retrieval and inference. Here's how property graphs complement LLMs and address certain limitations: ...

First, let's be clear. So far, I have only defined the data using the Python dictionary structure, dumped the data in JSON, and have NOT queried the data. Instead, I have focused on the visualization to answer questions.

Querying property graph data is a big deal and not at all trivial. It is not a topic—it is a career. It is mathematically deep, and its implementation is crazy complicated.

With that said, I hope you understand that this is a huge topic. If I stepped into it even lightly here, I would leave you with so many unanswered questions that you would wish you had not read this article.

This is why addressing this topic requires another article—one that will just touch the surface of actually performing queries, not diving into the underlying technology.

This potential is why Oracle is deeply involved with property graphs. In fact, from what I can tell, Oracle Spatial and Graph has supported property graphs since Oracle Database 12c Release 2. Since then, Oracle has invested heavily in this technology and its integration into the Oracle ecosystem.

Stay tuned for my next article. Let's move on to knowledge graphs.

Knowledge Graphs

If you compare the property graph and knowledge graph visualizations, they look very similar. In fact, with the data I have put into the knowledge and property graphs, the property graph looks much more useful!

However, there are subtle but significant differences between property and knowledge graphs. It's the difference between saying, "The sky is blue" and knowing, "It's a good time for a walk."

If you get the difference between "The sky is blue" and "It's a good time for a walk," and you never really enjoyed learning grammar or simply don't want to dig into knowledge graphs, by all means... skip to the next section of this article!

Flexibility and Structure

Property graphs are flexible and do not require a predefined structure. They are very key-value focused, which offers huge advantages when you don't know what you're working with... until you have worked with it. For example, say you're working with a book. Diagramming a cookbook versus a murder mystery could be very different. Yes, you could generalize the keys, but that's not the point. The point is flexibility.

In contrast, knowledge graphs are structured. In fact, they adhere to what's called an ontology. An ontology is a domain-specific structured framework that defines concepts, entities, and relationships. This implies the framework is predefined and accepted. The Resource Description Framework (RDF) is used to express relationships. Each relationship is always directional and has three parts. Together, the three parts are called a triple, taking the form of:

(Subject) --> (Predicate) --> (Object)

Here are some triple examples:

- The diamond is stored in the vault (not vice versa)
- The event has a location (not vice versa)
- The Phantom is a criminal (type relationship)
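The directionality is easy to see if each triple is written as a Python tuple. This is a minimal sketch in plain Python; the identifiers (`storedIn`, `hasLocation`, and so on) are illustrative names of my own, not taken from a published ontology.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The three examples above, written as (subject, predicate, object).
facts = {
    Triple('Diamond', 'storedIn', 'Vault'),
    Triple('Event', 'hasLocation', 'Casino'),
    Triple('ThePhantom', 'type', 'Criminal'),
}

forward = Triple('Diamond', 'storedIn', 'Vault')
backward = Triple('Vault', 'storedIn', 'Diamond')

print(forward in facts)   # the stated direction is a known fact
print(backward in facts)  # the reversed triple is a different statement
```

Swapping subject and object produces a different triple, which is exactly what "not vice versa" means.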

Facts and Understanding

While both graphs convey relationships, knowledge graphs by their design are about semantic relationships. If I need relationship facts, then property graphs are great. But if I need relationship semantics, then I want to use a knowledge graph. This can be confusing, so next up are some examples.

Examples Contrasting The Two Graphs

Examples are a great way to understand something. Here are a few examples contrasting the differences between property graphs and knowledge graphs:

Property Graph	Knowledge Graph
The sky is blue.	It's a good time for a walk.
The book is 300 pages.	This book is great for a weekend read.
The car's fuel tank is 80% full.	You can drive 200 miles before refueling.
Person A knows Person B.	Person A can introduce you to Person C through Person B.
The temperature is 75°F.	It's warm enough to have a picnic outside.
Alice has completed a training course.	Alice is qualified for the job opening.
This recipe contains milk.	This recipe is unsuitable for someone who is lactose intolerant.

I think of this as Data versus Action. That may not be technically correct, but it has helped me.
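One way to picture the jump from data to action is a rule applied on top of stored facts. The sketch below uses the lactose row from the table. It is a hand-written rule in plain Python, not an ontology reasoner, and every name in it (`PancakeRecipe`, `intolerantOf`, `unsuitable_for`, and so on) is hypothetical.

```python
# Facts: plain data, written as (subject, predicate, object).
facts = {
    ('PancakeRecipe', 'contains', 'Milk'),
    ('Milk', 'isA', 'DairyProduct'),
    ('DairyProduct', 'contains', 'Lactose'),
    ('Alice', 'intolerantOf', 'Lactose'),
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is a known fact."""
    return {o for s, p, o in facts if s == subject and p == predicate}


def contains_transitively(item, substance, seen=None):
    """True if item contains substance directly, via an ingredient, or via its class."""
    if seen is None:
        seen = set()
    if item in seen:
        return False
    seen.add(item)
    direct = objects(item, 'contains')
    if substance in direct:
        return True
    related = direct | objects(item, 'isA')
    return any(contains_transitively(r, substance, seen) for r in related)


def unsuitable_for(recipe, person):
    """Rule: a recipe is unsuitable if it contains anything the person cannot tolerate."""
    return any(
        contains_transitively(recipe, substance)
        for substance in objects(person, 'intolerantOf')
    )


print(unsuitable_for('PancakeRecipe', 'Alice'))
```

No single fact says the recipe is a problem for Alice. The conclusion only appears once the facts are chained together through what "milk" means.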

Let me put this another way: pattern matching works great with known facts and their relationships. But understanding something goes beyond pattern matching into the world of nuance.

Continuing Our Pink Panther Mystery

With network and property graphs, we started by defining the graph nodes and then their relationships. But with knowledge graphs, it's common (but not always possible) to first define the namespace, a base URI that identifies the vocabulary our structure comes from. Then, with the structure known, we define our triples.

Below is the actual Python code used in this article's program.

from rdflib import Graph, Literal, Namespace, RDF

# Create namespace
PP = Namespace("http://pp.org/")

# Define all triples
triples = [
    (PP.PinkPantherDiamond, RDF.type, PP.Jewel),
    (PP.PinkPantherDiamond, PP.hasValue, Literal("50000000")),
    (PP.PinkPantherDiamond, PP.storedIn, PP.CasinoRoyaleVault),
    (PP.ThePhantom, RDF.type, PP.Criminal),
    (PP.HeistEvent_001, PP.hasTarget, PP.PinkPantherDiamond),
    (PP.HeistEvent_001, PP.hasPerpetrator, PP.ThePhantom),
    (PP.HeistEvent_001, PP.occuredAt, PP.CasinoRoyale),
    (PP.HeistEvent_001, PP.occuredOn, Literal("2024-03-15")),
    (PP.HeistEvent_001, RDF.type, PP.Event),
    (PP.CasinoRoyaleVault, RDF.type, PP.Location),
    (PP.CasinoRoyale, RDF.type, PP.Location)
]

# Load the triples into a graph. (Assumed: the original snippet does not show this step.)
g = Graph()
for triple in triples:
    g.add(triple)

With our structure defined and our data set up, let's simply dump the data.

 
=== Knowledge Graph Example ===
Demonstrating semantic relationships and inferences

Knowledge Graph Triples:
<http://pp.org/HeistEvent_001> <http://pp.org/occuredAt> <http://pp.org/CasinoRoyale>
<http://pp.org/HeistEvent_001> <http://pp.org/hasPerpetrator> <http://pp.org/ThePhantom>
<http://pp.org/ThePhantom> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://pp.org/Criminal>
<http://pp.org/CasinoRoyale> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://pp.org/Location>
<http://pp.org/PinkPantherDiamond> <http://pp.org/storedIn> <http://pp.org/CasinoRoyaleVault>
<http://pp.org/HeistEvent_001> <http://pp.org/occuredOn> "2024-03-15"
<http://pp.org/CasinoRoyaleVault> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://pp.org/Location>
<http://pp.org/HeistEvent_001> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://pp.org/Event>
<http://pp.org/PinkPantherDiamond> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://pp.org/Jewel>
<http://pp.org/HeistEvent_001> <http://pp.org/hasTarget> <http://pp.org/PinkPantherDiamond>
<http://pp.org/PinkPantherDiamond> <http://pp.org/hasValue> "50000000"

Combining the structure, the data, and a visualization, we can reason. If we add a knowledge graph query, we can perform reasoning that would be impossible for a human to do by hand. Does this sound a little bit like LLM-speak or perhaps highlight the limitations of LLMs? More about this in the concluding section.

While the image is not perfect, it is very simple to understand. By the way, I chose a simpler Python library, and unfortunately, it does not support dragging the nodes.

Querying Knowledge Graph Data

As I mentioned above, combining the structure, the data, and a query language makes possible reasoning that a human could not do by hand. Since I am not going to write a separate article on querying knowledge graph data, I felt it was necessary to introduce you to this topic.

SPARQL is one of the ways used to access knowledge graph data. For example, my request was: Find all events involving the Pink Panther Diamond. Here is the code.

 
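qres = g.query(
    """
    SELECT ?event ?perpetrator ?date
    WHERE {
        # Each line is a triple pattern; ?event must match all three.
        # (A catch-all "?event ?p ?o ." pattern is omitted here because it
        # would return one duplicate row per triple the event has.)
        ?event pp:hasTarget pp:PinkPantherDiamond .
        ?event pp:hasPerpetrator ?perpetrator .
        ?event pp:occuredOn ?date .
    }
    """,
    initNs={"pp": PP}
)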

for row in qres:
    print(f"Event: {row.event}, Perpetrator: {row.perpetrator}, Date: {row.date}")

Here are the results:

 

Event: http://pp.org/HeistEvent_001, Perpetrator: http://pp.org/ThePhantom, Date: 2024-03-15

This is just a taste of what can be done, but I think you get the idea.
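If you have not used SPARQL before, it may help to see roughly what the WHERE clause is doing. Each line is a triple pattern, and the `?variables` are wildcards that must bind to the same value across lines. The sketch below imitates that in plain Python over simplified string triples. It is a conceptual illustration only, not how rdflib evaluates queries, and `match` is a hypothetical helper.

```python
# Simplified copies of the heist triples (plain strings instead of rdflib terms).
triples = [
    ('HeistEvent_001', 'hasTarget', 'PinkPantherDiamond'),
    ('HeistEvent_001', 'hasPerpetrator', 'ThePhantom'),
    ('HeistEvent_001', 'occuredOn', '2024-03-15'),
    ('HeistEvent_001', 'type', 'Event'),
    ('ThePhantom', 'type', 'Criminal'),
]


def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


results = []
# ?event pp:hasTarget pp:PinkPantherDiamond
for event, _, _ in match(p='hasTarget', o='PinkPantherDiamond'):
    # ?event pp:hasPerpetrator ?perpetrator
    for _, _, perpetrator in match(s=event, p='hasPerpetrator'):
        # ?event pp:occuredOn ?date
        for _, _, date in match(s=event, p='occuredOn'):
            results.append((event, perpetrator, date))

for event, perpetrator, date in results:
    print(f"Event: {event}, Perpetrator: {perpetrator}, Date: {date}")
```

A real SPARQL engine plans and optimizes these joins for you, which is what makes the approach usable on far larger graphs.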

Conclusion: What About LLMs and AI?

As I discussed above, LLMs cannot answer all questions correctly, even if they contain all the data and the relationships about the data. However, because LLMs are pretty good at language, they are also pretty good at identifying nodes and relationships. This provides us with the data we can feed into a property graph query to answer the questions an LLM alone can't!
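Here is a rough sketch of that hand-off. The JSON string stands in for what an LLM might return when asked to extract relationships from text; it is hard-coded, no model is called, and the `targets` helper is a hypothetical stand-in for a real property graph query.

```python
import json

# Stand-in for LLM output: relationships extracted from a text description.
# This string is hard-coded for illustration; no model is called here.
llm_output = """
[
  {"from": "phantom", "to": "casino", "type": "VISITED"},
  {"from": "phantom", "to": "diamond", "type": "STOLE"}
]
"""

relationships = json.loads(llm_output)


def targets(rels, source, rel_type):
    """Hypothetical helper: what is `source` connected to by `rel_type`?"""
    return [
        rel['to'] for rel in rels
        if rel['from'] == source and rel['type'] == rel_type
    ]


stolen = targets(relationships, 'phantom', 'STOLE')
print(stolen)
```

The LLM does the language work of finding nodes and relationships, and the structured lookup does the exact, repeatable part of answering the question.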

Now, consider this scenario: What if we combined LLMs, property graphs, and a RAG application? The result is a jaw-droppingly powerful AI application.

That is really the goal of this whole article series. But I'm taking this one step at a time.

This article was necessary to provide you with just enough information so you're ready for the next one. So, if you've made it this far, you should really enjoy the next post.

All the best in your life's journey,

Craig.
