How to deploy your knowledge graph in a graph database with Ontop
by Benjamin Cogrel, Deni Jegeni, last update: 17 July 2023 (10 min read)


Many people build a knowledge graph by moving and transforming existing data into a graph database. This approach is also known as knowledge graph materialization.

During the past decade, Ontop has become a reference open-source solution for materializing knowledge graphs from relational data sources in large organizations.

In this article, we present two ways to materialize your knowledge graph using Ontop.

Comparison of deploying a knowledge graph in a graph database with deploying it as a virtual knowledge graph

How to materialize data into a graph database using Ontop

Let’s start with the most common way to materialize with Ontop:

  1. Materialize in RDF files and load into a triplestore

    For this tutorial, you will need the Ontop CLI, a relational database containing your data, a mapping file, a connection properties file, and a triplestore such as GraphDB.

    Using the CLI command ontop-materialize (https://ontop-vkg.org/guide/cli#ontop-materialize), you can materialize your KG into one or more files. For simplicity, we keep the default option and materialize it into a single file.

    ./ontop materialize -m mapping.ttl -p credentials.properties -f turtle -o materialized-triples.ttl
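
    The credentials.properties file referenced above holds the JDBC connection settings of the relational data source. Here is a minimal sketch, assuming a PostgreSQL database; the URL, user, password, and driver values are placeholders to adapt to your setup:

         # JDBC connection to the relational data source (placeholder values)
         jdbc.url = jdbc:postgresql://localhost:5432/mydb
         jdbc.user = ontop_user
         jdbc.password = secret
         jdbc.driver = org.postgresql.Driver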

    After running the command, we have all the content of our KG copied to the file materialized-triples.ttl.

    Now we load this file into the triplestore of our choice; in this case, we use GraphDB. This graph database offers several ways to load files. Here, since our file is only 200 MB, we go for the simplest option and load it directly from the UI.

    Once this is done, we can query this KG using GraphDB.
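
    For example, a quick sanity check is to count the loaded triples with a generic query (not specific to this dataset):

         SELECT (COUNT(*) AS ?tripleCount)
         WHERE {
           ?s ?p ?o
         }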

  2. Deploy a VKG and fetch its content from the graph database

    For this second solution, we make use of the concept of KG virtualization, which you can learn more about in this article.

    This approach is more direct: we deploy the KG as a virtual KG first and then query it from the graph database, which retrieves the triples and stores them locally.

    Triples are streamed straight into the graph database, with no intermediate file storage involved.

    Going back to our example, instead of using the ontop-materialize CLI command, let’s deploy the KG as a virtual KG using the ontop-endpoint command:

    ./ontop endpoint -m mapping.ttl -p credentials.properties

    Now Ontop is deployed as a SPARQL endpoint available at http://localhost:8080/sparql.
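
    Before switching to GraphDB, you can check that the endpoint responds, for instance with a standard SPARQL Protocol request over HTTP (curl is used here purely as an illustration):

    curl -G http://localhost:8080/sparql \
         -H 'Accept: application/sparql-results+json' \
         --data-urlencode 'query=SELECT * WHERE { ?s ?p ?o } LIMIT 10'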

    Let’s go now to GraphDB. To fetch and insert all the triples from the VKG exposed by Ontop, we run the following SPARQL INSERT query from GraphDB itself:

         INSERT {
           ?s ?p ?o
         }
         WHERE {
           SERVICE <http://localhost:8080/sparql> {
             ?s ?p ?o
           }
         }
    

    This query materializes the same triples as with the first approach.

Which approach to choose for your use case?

If your dataset is not particularly large and a communication channel is easy to set up between the Ontop SPARQL endpoint and the graph database, we recommend solution #2 as it avoids dealing with files and allocating intermediate storage.

If your dataset is very large, you want to use the most efficient loading solution supported by the triplestore, even if it requires more effort to set it up.

Another interesting feature of solution #2 is that it makes it easy to materialize only fragments of the KG: it just requires adapting the SPARQL query.
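
For instance, to materialize only the part of the KG describing buildings, you can restrict the pattern inside the SERVICE clause; the :Building class and the prefix below are purely hypothetical:

    PREFIX : <http://example.org/ontology#>

    INSERT {
      ?s ?p ?o
    }
    WHERE {
      SERVICE <http://localhost:8080/sparql> {
        ?s a :Building ;
           ?p ?o .
      }
    }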

This allows for hybrid KGs, where one part is stored in a graph database, and the rest is kept virtual.

Keeping data virtual is particularly advantageous when dealing with large volumes of sensor data that are constantly updated. It makes sense to keep this part of the knowledge graph virtual while storing rich contextual information in the graph database.
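
With such a hybrid setup, a single SPARQL query run in GraphDB can combine the materialized context with the virtual sensor data through a SERVICE clause. Below is a sketch, where the :Sensor, :locatedIn, and :hasValue terms are hypothetical:

    PREFIX : <http://example.org/ontology#>

    SELECT ?sensor ?building ?value
    WHERE {
      # Contextual part, materialized in the graph database
      ?building a :Building .
      # Measurements, kept virtual and fetched from Ontop on the fly
      SERVICE <http://localhost:8080/sparql> {
        ?sensor a :Sensor ;
                :locatedIn ?building ;
                :hasValue ?value .
      }
    }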

What about ontology?

Users familiar with Ontop may have noticed that we didn’t use an ontology in this example.

If you provide an ontology to Ontop, the resulting KG may be significantly larger than the one produced without an ontology, due to the reasoning capabilities embedded in Ontop.

As GraphDB also embeds reasoning capabilities, the reasoning can be done later in GraphDB rather than beforehand in Ontop. This makes the materialization simpler and faster.

However, if your graph database doesn’t support reasoning, you can rely on Ontop to perform it.
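
If you do want Ontop to take the ontology into account, both commands accept it as an additional input through the -t option (see the Ontop CLI documentation); the ontology.ttl file name below is just a placeholder:

    ./ontop materialize -m mapping.ttl -p credentials.properties -t ontology.ttl \
            -f turtle -o materialized-triples.ttl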

What about mapping?

Any R2RML mapping is supported (read more on the mapping approach here), as is the Ontop native format (.obda).
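
As an illustration, here is a minimal R2RML mapping in Turtle that turns the rows of a hypothetical person table into instances of a :Person class; the table, columns, and IRIs are placeholders:

    @prefix rr: <http://www.w3.org/ns/r2rml#> .
    @prefix :   <http://example.org/ontology#> .

    <#PersonMapping> a rr:TriplesMap ;
      rr:logicalTable [ rr:tableName "person" ] ;
      rr:subjectMap [
        rr:template "http://example.org/person/{id}" ;
        rr:class :Person
      ] ;
      rr:predicateObjectMap [
        rr:predicate :name ;
        rr:objectMap [ rr:column "name" ]
      ] .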

You can create these mappings manually or use a specialized platform like Ontopic Studio. Ontopic Studio is a no-code environment especially built for designing knowledge graphs and managing large mappings.

Get demo access to Ontopic Studio

Ready to do mapping with a no-code approach? Let us help you. Get demo access.

