SPARQL Federation Tutorial

This tutorial will walk you through the steps for configuring and using SPARQL Federated Objects.

The examples in this section are based on the Star Wars dataset. In this tutorial, we will extend the existing Star Wars SOML schema with an additional object, Studio, and link it to our existing data.

Create a New Repository

  1. First, we need to create an external repository. For the purposes of this tutorial, we will do so in the GraphDB instance that hosts the main Star Wars repository. See how to do it here.
  2. Name the new repository swapi-studios.
  3. To insert data into the newly created repository, execute the following SPARQL update:
PREFIX : <https://swapi.co/resource/>
PREFIX voc: <https://swapi.co/vocabulary/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
insert data {
    wd:Q242446 a voc:Studio ;
               rdfs:label "Lucasfilm" ;
               voc:founder wd:Q38222;
               voc:staff :John;
               voc:film <https://swapi.co/resource/film/1>,
                        <https://swapi.co/resource/film/2>,
                        <https://swapi.co/resource/film/3>,
                        <https://swapi.co/resource/film/4>,
                        <https://swapi.co/resource/film/5>,
                        <https://swapi.co/resource/film/6> .

    :John a voc:Employee;
          rdfs:label "John Doe";
          voc:salary 5000 .
}
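
If you would rather script this step, the update can also be sent over HTTP. The sketch below is only an illustration: it assumes GraphDB is reachable at http://127.0.0.1:7200 and accepts SPARQL updates on the RDF4J-style /repositories/<id>/statements endpoint. The helper names and the shortened update string are made up for this example.

```python
import urllib.request

# Assumption: GraphDB listens on this address; adjust to your deployment.
GRAPHDB_URL = "http://127.0.0.1:7200"

# Shortened version of the update above, for illustration only.
UPDATE = """
PREFIX voc: <https://swapi.co/vocabulary/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
INSERT DATA { wd:Q242446 a voc:Studio ; rdfs:label "Lucasfilm" . }
"""


def build_update_request(repository: str, update: str,
                         base_url: str = GRAPHDB_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a SPARQL update."""
    return urllib.request.Request(
        url=f"{base_url}/repositories/{repository}/statements",
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


def send_update(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status (needs a running GraphDB)."""
    with urllib.request.urlopen(request) as response:
        return response.status


request = build_update_request("swapi-studios", UPDATE)
print(request.full_url)
```

Calling send_update(request) against a running instance performs the actual insert. Building the request separately makes it easy to inspect the target URL first.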

Define the Federated Service in the Semantic Objects

  1. Define the external service by adding the following property in the application.properties file of the Semantic Objects:

    sparql.federated.services.studios=repository:swapi-studios

  2. Alternatively, if you have started the services following the Quick Start guide, you can add the property to the docker-compose file and restart the Semantic Objects:

    services:
      semantic-objects:
        ...
        environment:
          sparql.federated.services.studios: "repository:swapi-studios"
    

    Note

    In this example, we are using local federation. This is why the address of the service starts with repository:. Alternatively, you can perform the federation over HTTP by using the following service address: http://127.0.0.1:7200/repositories/swapi-studios.

    The federated repository can be located on a completely different server as well, as long as it is reachable from the server hosting the local repository.

  3. Start the Semantic Objects.
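
The two address forms described in the note above can be summarised with a small helper. This is a hypothetical sketch, not part of the Semantic Objects: the function service_property exists only for this example, and the HTTP base URL is the one used in the note.

```python
from typing import Optional


def service_property(name: str, repository: str,
                     http_base: Optional[str] = None) -> str:
    """Return an application.properties line for a federated service.

    Without http_base the local "repository:" form is produced; with it,
    the address points at the repository's HTTP endpoint instead.
    """
    if http_base is None:
        address = f"repository:{repository}"
    else:
        address = f"{http_base.rstrip('/')}/repositories/{repository}"
    return f"sparql.federated.services.{name}={address}"


local_line = service_property("studios", "swapi-studios")
http_line = service_property("studios", "swapi-studios", "http://127.0.0.1:7200")
print(local_line)
print(http_line)
```

In both cases the last segment of the property key (studios) is the service name that the SOML schema refers to.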

Define the New Object in the SOML

  1. Add the following object to the Star Wars SOML schema:

    Studio:
      descr: 'Film production Studio.'
      sparqlFederatedService: studios
      name: rdfs:label
      props:
        founder: {range: Person}
        film: {range: Film}
        staff: {range: Employee, max: inf}
    Employee:
      sparqlFederatedService: studios
      name: rdfs:label
      props:
        salary: {range: integer}
    

    Notice that Studio is configured to use our new studios service. It has two properties, founder and film, that link this object to our local data, as well as a property, staff, whose range is in the studios service.

  2. Add the following property to the existing Film object:

    Film:
      props:
        studio: {range: Studio, inverseAlias: film}
    

    It will allow us to query the Studio of a Film using an inverseAlias property.

  3. Update the SOML schema in the Semantic Objects instance.
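
Before uploading the schema, it can help to confirm that every sparqlFederatedService value matches a service name configured earlier. The check below is a hypothetical sketch: it mirrors the relevant part of the schema as a plain Python dictionary instead of parsing the YAML, and the helper name is made up for this tutorial.

```python
# Service names taken from the sparql.federated.services.<name> properties.
CONFIGURED_SERVICES = {"studios"}

# Hand-written mirror of the SOML objects above (only the relevant key).
SCHEMA_OBJECTS = {
    "Studio": {"sparqlFederatedService": "studios"},
    "Employee": {"sparqlFederatedService": "studios"},
    "Film": {},  # local object, no federated service
}


def unknown_services(objects: dict, configured: set) -> dict:
    """Map object name -> service name for services that are not configured."""
    return {
        name: spec["sparqlFederatedService"]
        for name, spec in objects.items()
        if "sparqlFederatedService" in spec
        and spec["sparqlFederatedService"] not in configured
    }


problems = unknown_services(SCHEMA_OBJECTS, CONFIGURED_SERVICES)
print(problems or "all federated services are configured")
```

A typo such as sparqlFederatedService: studio would show up in the returned dictionary under the name of the affected object.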

Run a Federated Query

Now let’s run a GraphQL query that will collect the federated data:

query film {
  film (ID: "https://swapi.co/resource/film/1") { #local repository
    id #local repository
    name #local repository
    studio { #federated object
      id #object in the local repository, subject in swapi-studios
      name #swapi-studios
      founder {
        id #object in the swapi-studios, subject in the local repository
        name #local repository
      }
      staff { #federated object
        name #swapi-studios
        salary #swapi-studios
      }
    }
  }
}

The location from which each selection is fetched is marked with a # comment. The ids are marked as present in both repositories because they are the links between the objects.
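
The same query can be submitted programmatically as a JSON POST body. The sketch below assumes the Semantic Objects GraphQL endpoint is available at http://localhost:9995/graphql. That address is an assumption and may differ in your deployment, and the helper names are illustrative.

```python
import json
import urllib.request

# Assumption: GraphQL endpoint of the Semantic Objects; adjust as needed.
GRAPHQL_URL = "http://localhost:9995/graphql"

# The query from above, without the location comments.
QUERY = """
query film {
  film (ID: "https://swapi.co/resource/film/1") {
    id
    name
    studio {
      id
      name
      founder { id name }
      staff { name salary }
    }
  }
}
"""


def build_graphql_request(query: str,
                          url: str = GRAPHQL_URL) -> urllib.request.Request:
    """Build (but do not send) a GraphQL POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (needs a running service)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_graphql_request(QUERY)
print(request.full_url)
```

Calling run_query(request) against a running instance returns the decoded response as a dictionary.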

You will see that the data from the external service is returned and added to the film instance:

{
  "data": {
    "film": [
      {
        "id": "https://swapi.co/resource/film/1",
        "name": "Star Wars",
        "studio": {
          "id": "http://www.wikidata.org/entity/Q242446",
          "name": "Lucasfilm",
          "founder": {
            "id": "http://www.wikidata.org/entity/Q38222",
            "name": "George Lucas"
          },
          "staff": [
            {
              "name": "John Doe",
              "salary": "5000"
            }
          ]
        }
      }
    ]
  }
}
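
To work with the result in a script, the nested response can be flattened. The following is a minimal sketch that uses the response above pasted in as a string; the summarize helper is hypothetical. Note that salary arrives as a string in this response, so the sketch converts it to an integer.

```python
import json

# The response shown above, pasted as a string for illustration.
RESPONSE = """
{
  "data": {
    "film": [
      {
        "id": "https://swapi.co/resource/film/1",
        "name": "Star Wars",
        "studio": {
          "id": "http://www.wikidata.org/entity/Q242446",
          "name": "Lucasfilm",
          "founder": {
            "id": "http://www.wikidata.org/entity/Q38222",
            "name": "George Lucas"
          },
          "staff": [
            {"name": "John Doe", "salary": "5000"}
          ]
        }
      }
    ]
  }
}
"""


def summarize(response: dict) -> list:
    """Flatten each film into (film, studio, founder, [(staff name, salary)])."""
    rows = []
    for film in response["data"]["film"]:
        studio = film["studio"]
        staff = [(person["name"], int(person["salary"]))
                 for person in studio["staff"]]
        rows.append((film["name"], studio["name"],
                     studio["founder"]["name"], staff))
    return rows


rows = summarize(json.loads(RESPONSE))
print(rows)
```

Each row combines values from both repositories: the film and founder names come from the local repository, while the studio name and staff details come from swapi-studios.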