Schema Storage & Management

Overview

The Semantic Objects Service manages a collection of Semantic Object schemas. It also manages access to these schemas using Role-Based Access Controls.

This section of the documentation does not discuss the SOML language and syntax, but rather the management of these schemas and the Role-Based Access Controls that constrain access to the schemas.

A collection of SOML schemas is shared amongst instances of the Semantic Objects Service. A Semantic Objects Service instance can bind any one of these Semantic Object schemas to generate a GraphQL endpoint, which allows client applications to perform GraphQL queries and mutations.

The Semantic Objects Service manages and stores schemas within a configured schema storage. See more about the supported stores in the Schema Storage section.

Schema management is achieved by using the Semantic Objects /soml REST API.

RBACs are applied to the collection of schemas. These controls determine which roles can create, update, delete, or bind a particular schema. RBACs are managed by using the Semantic Objects /soml-rbac REST API.

Quick Start

Hint

You can perform all of the actions below from the Workbench, the Platform’s web-based administration interface.

A docker-compose.yaml file is required to ensure all mandatory components are started correctly before you can manage SOML schemas.

This docker-compose.yaml configuration will download and start the important containers on a single machine.

Once you have downloaded the compose file, follow the Quick Start guide using this file instead of the one defined in the guide. (Skip the download operation in the Docker Compose section of the guide.)

SOML Schema Creation

To create a schema from the Platform Workbench, follow the steps described here (generate a schema from an existing ontology file) and here (create a new schema).

To create the SWAPI schema within the Semantic Objects Service with the identifier /soml/swapi, first download the Semantic Object schema.yaml definition.

You can invoke the following cURL request:

curl --location -X POST 'http://localhost:9995/soml' \
    -H 'Content-Type: text/yaml' \
    -H 'X-Request-ID: some-uuid-correlation-id' \
    -H 'Accept: application/ld+json' \
    -T schema.yaml

SOML Schema Validation

To validate a SOML schema from the Workbench as part of the schema generation workflow without creating it, follow the steps described here.

To validate a SOML schema via cURL without creating it, you can use the following cURL request:

curl --location -X POST 'http://localhost:9995/soml/validate' \
    -H 'Content-Type: text/yaml' \
    -H 'X-Request-ID: some-uuid-correlation-id' \
    -H 'Accept: application/ld+json' \
    -T schema.yaml

The validation consists of two phases:

  • SOML schema validation: the same check performed during SOML schema creation or update

  • SOML schema binding: checks if the SOML schema can be transformed to a GraphQL schema, and, optionally (if SHACL is enabled), to SHACL shapes

SOML Schema List

To view all currently existing SOML schemas managed within the Semantic Objects Service (and MongoDB) from the Workbench, follow the instructions here.

To retrieve a list of all the SOML schemas managed within the Semantic Objects Service (and MongoDB), invoke the following cURL request:

curl --location --request GET 'http://localhost:9995/soml' \
    -H 'X-Request-ID: some-uuid-correlation-id' \
    -H 'Accept: application/ld+json'

SOML Schema Retrieval

To open a particular SOML schema managed within the Semantic Objects Service (and MongoDB) from the Workbench, follow the instructions here.

To retrieve a particular SOML schema (in this case the SWAPI schema) managed within the Semantic Objects Service (and MongoDB), invoke the following cURL request:

curl --location -X GET 'http://localhost:9995/soml/swapi' \
    -H 'X-Request-ID: some-uuid-correlation-id' \
    -H 'Accept: text/yaml'

You can also request a response in application/ld+json format:

curl --location -X GET 'http://localhost:9995/soml/swapi' \
    -H 'X-Request-ID: some-uuid-correlation-id' \
    -H 'Accept: application/ld+json'

SOML Schema Updates

To update a SOML schema from the Workbench, follow the instructions here.

To update the SWAPI schema within the Semantic Objects Service with the identifier /soml/swapi, invoke the following cURL request:

curl --location -X PUT 'http://localhost:9995/soml/swapi' \
    -H 'Content-Type: text/yaml' \
    -H 'X-Request-ID: some-uuid-correlation-id' \
    -H 'Accept: application/ld+json' \
    -T schema.yaml

SOML Schema Binding

To bind (i.e., activate) a SOML schema to the Semantic Objects Service from the Workbench, follow the instructions here.

To bind the SWAPI schema to the Semantic Objects Service, and generate the GraphQL endpoint/schema from it, invoke the following cURL request:

curl --location -X PUT 'http://localhost:9995/soml/swapi/soaas' \
    -H 'X-Request-ID: some-uuid-correlation-id'

SOML Schema Deletion

You can delete a SOML schema from the Workbench only if it is not bound (i.e., activated). This can be done from the Schema Registry and the Manage Schema views.

To delete a schema and, if it is bound, unbind it from the Semantic Objects Service, invoke the following cURL request:

curl --location -X DELETE 'http://localhost:9995/soml/swapi' \
    -H 'X-Request-ID: some-uuid-correlation-id'

Note

The above examples are with security ON, meaning that -H 'X-Request-ID: some-uuid-correlation-id' should be removed if the Platform is started without security.

SOML Schema Statistics

You can see schema statistics from the Workbench by clicking on the schema ID in the Schema Registry list. The Workbench home screen will display statistics only for the currently active schema.

To retrieve SOML schema statistics, use the following cURL request:

curl --location -X GET 'http://localhost:9995/soml/swapi/stats?limit=10' \
    -H 'X-Request-ID: some-uuid-correlation-id' \
    -H 'Accept: application/ld+json'

This will return the top 10 shapes in the SOML schema.

Note

SOML schema statistics are available only for the active schema.

Administration

Schema Management API

See the Quick Start cURL examples for create, read, update, delete, and bind operation examples.
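
For clients that prefer not to shell out to cURL, the same operations can be expressed as a small table of HTTP methods and paths. The following is a minimal Python sketch, not an official client: the `OPERATIONS` table and the `build_request` helper are hypothetical names, the methods and paths are copied from the Quick Start cURL examples, and `BASE_URL` assumes the local endpoint used there. The function only builds the request; nothing is sent.

```python
from urllib.request import Request

BASE_URL = "http://localhost:9995"  # assumption: local endpoint from the cURL examples

# (HTTP method, path template) per operation, copied from the Quick Start examples.
OPERATIONS = {
    "create": ("POST", "/soml"),
    "list": ("GET", "/soml"),
    "read": ("GET", "/soml/{schema_id}"),
    "update": ("PUT", "/soml/{schema_id}"),
    "bind": ("PUT", "/soml/{schema_id}/soaas"),
    "delete": ("DELETE", "/soml/{schema_id}"),
}


def build_request(operation, schema_id=None, body=None, accept="application/ld+json"):
    """Build (but do not send) the HTTP request for a schema operation."""
    method, template = OPERATIONS[operation]
    if "{schema_id}" in template and not schema_id:
        raise ValueError(f"operation {operation!r} needs a schema_id")
    headers = {"X-Request-ID": "some-uuid-correlation-id"}
    if accept is not None:
        headers["Accept"] = accept
    data = None
    if body is not None:
        # Schema documents are uploaded as YAML text, as in the cURL examples.
        data = body.encode("utf-8")
        headers["Content-Type"] = "text/yaml"
    url = BASE_URL + template.format(schema_id=schema_id)
    return Request(url, data=data, headers=headers, method=method)
```

A request built this way can be sent with `urllib.request.urlopen(build_request("bind", "swapi", accept=None))`, which corresponds to the binding cURL example.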

Schema Management RBAC

Only users with the role claim schema-rbac-admin can modify the schema RBAC (read/write/delete). RBACs authorize access to the following schema endpoints:

  • GET /soml: Retrieve all schemas

  • GET /soml/{schema-id}: Retrieve a particular schema

  • POST /soml/{schema-id}: Create a particular schema

  • PUT /soml/{schema-id}: Update a particular schema

  • DELETE /soml/{schema-id}: Delete a particular schema

  • PUT /soml/{schema-id}/soaas: Bind a particular schema to the Semantic Objects Service

Schema RBAC Endpoints

The Semantic Objects Service provides the following endpoints to manage the schema RBAC:

  • Read: GET /soml-rbac retrieves the schema RBACs (only accessible to users with the role SchemaRBACAdmin)

  • Update: PUT /soml-rbac updates the schema RBACs (only accessible to users with the role SchemaRBACAdmin)

An example schema RBAC is as follows:

id:          /soml-rbac
label:       Schema Management Role-Based Access Control
creator:     http://ontotext.com
created:     2019-06-15
updated:     2019-06-16
versionInfo: 0.1

rbac:
    roles:
        # Default role which does not need to be configured or declared. Included for completeness.
        Default:
            description: "Default role, which does not need to be declared; restricts all schema management access (read, write and delete)"
            notActions: ["*/*"]
        # Example role definitions which need to be declared by the SOML user:
        SchemaRBACAdmin:
            description: "Administrator role, can read, write and delete objects and schema"
            actions: ["*/*"]
        ReadOnlyUser:
            description: "User who can read all schemas"
            actions: ["*/read"]
        SwapiSchemaManager:
            description: "User who has admin access (read, write and delete) on the swapi schema only"
            actions: ["swapi/*"]
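
To make the action patterns in this example easier to reason about, here is a minimal Python sketch of how they could be evaluated. The semantics are an assumption, not taken from the service: patterns are treated as "<schema>/<operation>" globs, a match in notActions always denies, anything not matched by actions is denied, and unknown roles fall back to Default. The `is_allowed` function and the `ROLES` dictionary are hypothetical names.

```python
from fnmatch import fnmatchcase

# Roles from the example RBAC, reduced to their action patterns.
ROLES = {
    "Default": {"notActions": ["*/*"]},
    "SchemaRBACAdmin": {"actions": ["*/*"]},
    "ReadOnlyUser": {"actions": ["*/read"]},
    "SwapiSchemaManager": {"actions": ["swapi/*"]},
}


def is_allowed(role, schema_id, operation, roles=ROLES):
    """Return True if `role` may perform `operation` on `schema_id`.

    Assumed semantics (not confirmed by the service documentation):
    notActions wins over actions, and no match means no access.
    """
    definition = roles.get(role, roles["Default"])
    target = f"{schema_id}/{operation}"
    if any(fnmatchcase(target, p) for p in definition.get("notActions", [])):
        return False
    return any(fnmatchcase(target, p) for p in definition.get("actions", []))
```

Under these assumptions, `is_allowed("SwapiSchemaManager", "swapi", "write")` is true, while the same role is denied any operation on a schema with a different identifier.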

Reading the Schema RBACs

The following cURL request (with a valid JWT token containing a role claim == SchemaRBACAdmin) will retrieve the schema RBAC:

curl --location -X GET 'http://localhost:9995/soml-rbac' \
    -H 'X-Request-ID: some-uuid-correlation-id' \
    -H 'Accept: text/yaml'

Updating the Schema RBACs

The following cURL request (with a valid JWT token containing a role claim == SchemaRBACAdmin) will update the schema RBAC with the contents of the rbac.yaml file:

curl --location -X PUT 'http://localhost:9995/soml-rbac' \
    -H 'X-Request-ID: some-uuid-correlation-id' \
    -H 'Content-Type: text/yaml' \
    -H 'Accept: application/ld+json' \
    -T rbac.yaml

Getting Schema RBAC Roles

The following cURL request will return the RBAC roles for the currently authenticated user.

curl --location -X GET 'http://localhost:9995/soml-rbac/roles' \
    -H 'X-Request-ID: some-uuid-correlation-id' \
    -H 'Accept: application/ld+json'

Schema Storage

Semantic Objects can store created SOML schemas in:

  • An RDF4J-compatible repository

  • MongoDB

  • Memory (in-memory storage)

The storage provider is controlled by the soml.storage.provider configuration property. To activate a specific provider, the required properties for that storage must be added as well.

With version 3.5 of the Ontotext Platform, the default storage option is an RDF4J repository replacing the previously used MongoDB one. To ease this transition, we have developed a schema storage migration. See how to enable it here.

RDF4J-compatible Repository

This storage option is enabled by setting soml.storage.provider: rdf4j.

The Platform can use any RDF4J-compatible repository provider to store the managed schemas. This storage option was developed to replace MongoDB, which was the default storage option until version 3.5 of the Platform. The benefits compared to MongoDB are reduced deployment dependencies and complexity, as this option requires neither local persistence nor the deployment of an additional database for the service to operate.

The only requirement is access to a writable SPARQL endpoint with an existing repository, or the rights to create a new one.

In order to not interfere with user data, by default the schemas are stored in a separate repository named otp-system. This repository can be created automatically by the Platform or created and provided externally. If you prefer to not create a separate repository, then any compatible existing repository can be used. The schemas are stored in a separate named graph and all operations are executed in that context.

Note

The service requires this repository to not be read-only, at least until all required schemas are created and bound.

The automatic repository creation will try the following steps to create a repository on the configured endpoint address:

  1. A repository with provided custom configuration via soml.storage.rdf4j.repositoryConfig.

  2. A GraphDB cluster worker repository (for GraphDB Standard and Enterprise deployments).

  3. A GraphDB Free repository instance (for GraphDB Free deployment).

  4. Generic Sail in-memory repository.
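
The ordered attempts above amount to a fallback chain: each strategy is tried in turn and the first one that succeeds wins. The following Python sketch illustrates only that control flow. The `create_repository` function and the stub callables are hypothetical, and the real service's creation logic and error handling are not described in this document.

```python
def create_repository(strategies):
    """Try each (name, create) pair in order and return the first name that succeeds."""
    errors = []
    for name, create in strategies:
        try:
            create()
            return name
        except Exception as exc:  # a failed attempt just moves on to the next strategy
            errors.append(f"{name}: {exc}")
    raise RuntimeError("no repository could be created: " + "; ".join(errors))


# Stubs standing in for the real creation calls (illustrative only).
def unsupported():
    raise RuntimeError("not supported by this endpoint")


def succeeds():
    pass


strategies = [
    ("custom repositoryConfig", unsupported),
    ("GraphDB cluster worker", unsupported),
    ("GraphDB Free", succeeds),
    ("in-memory Sail", succeeds),
]
print(create_repository(strategies))  # prints: GraphDB Free
```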

Note

For cluster deployments with one or more masters and multiple workers, it is not advisable to rely on the automatic repository creation as the repository will be created on the first configured master and will not have resilience in case of master failure.

The Platform’s official Helm charts provide automatic repository creation on the configured workers in the deployed topology.

All related configurations for the current store are prefixed by soml.storage.rdf4j.

If no specific configuration is provided, the default value for it will be taken from the main SPARQL endpoint configuration with the same name, prefixed with sparql.endpoint. This excludes the repository name.

If a specific configuration should not be inherited, it should be explicitly set to its default value. All possible configurations can be found here.
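
The inheritance rule can be illustrated with a short Python sketch. It assumes the configuration is available as a flat dictionary of property names. The setting names `address` and `repository` are hypothetical placeholders chosen for illustration, as is the `resolve_rdf4j_setting` helper; only the two prefixes and the otp-system default come from this page.

```python
RDF4J_PREFIX = "soml.storage.rdf4j."
SPARQL_PREFIX = "sparql.endpoint."
NOT_INHERITED = {"repository"}  # hypothetical key name for the repository name


def resolve_rdf4j_setting(config, name, default=None):
    """Resolve a schema store setting, falling back to the main SPARQL endpoint."""
    explicit = config.get(RDF4J_PREFIX + name)
    if explicit is not None:
        return explicit  # an explicit store setting always wins
    if name not in NOT_INHERITED:
        inherited = config.get(SPARQL_PREFIX + name)
        if inherited is not None:
            return inherited  # same-named setting of the main endpoint
    return default


config = {
    "sparql.endpoint.address": "http://graphdb:7200",  # illustrative values
    "sparql.endpoint.repository": "swapi",
}
print(resolve_rdf4j_setting(config, "address"))
print(resolve_rdf4j_setting(config, "repository", default="otp-system"))
```

In this sketch the address is inherited from the main endpoint, while the repository name is not and falls back to the given default.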

If this store is enabled, the configuration application.name is required. It specifies the service name and must be unique among the deployed Platform services. If two or more service instances have the same name (horizontal scaling), they will use the same bound schema. If application.name is not defined, the value of spring.application.name is used, provided it is set.

The health status of this repository is reported by the SOML health check and does not have a separate health check report.

MongoDB

Warning

In Ontotext Platform version 3.5, MongoDB has been deprecated and will be removed in a future version.

This storage option is enabled by setting soml.storage.provider: mongodb.

The related configurations for the current store are prefixed by soml.storage.mongodb. All available options can be found here.

The option requires an available local file storage with write permissions.

The Platform’s MongoDB Docker container can be accessed at http://localhost:9997.

For more information on how to administer a MongoDB instance, see the Mongo Administration page.

MongoDB Compass

There is no GUI included within the MongoDB container, but it is possible to use one of the popular clients such as MongoDB Compass.

In-memory

This storage option is enabled by setting soml.storage.provider: in-memory or by leaving the property empty.

The option does not actually provide schema persistence, as the schemas are stored in the operating memory, so after service restart they are lost.

The option, combined with an initialization script, is well suited to small installations that have no complex schema management requirements and no additional resources.

The initialization can be one of the following:

  • Using the official CLI tool to upload and bind a schema.

  • Using the configuration soml.preload.schemaPath to provide a schema to automatically load and bind on startup.

  • Using a custom script to initialize the service in case of service restart.
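
As an illustration of the custom script option, the following Python sketch uploads a schema and then binds it, mirroring the creation and binding cURL requests from the Quick Start. It is a sketch under assumptions: `BASE_URL` is the local endpoint from those examples, the function names are hypothetical, and the sender is injectable so the sequence can be exercised without a running service.

```python
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:9995"  # assumption: endpoint used by the cURL examples


def initialization_requests(schema_id, schema_yaml):
    """Return the create and bind requests, in the order they must be sent."""
    correlation = {"X-Request-ID": "some-uuid-correlation-id"}
    create = Request(
        BASE_URL + "/soml",
        data=schema_yaml.encode("utf-8"),
        headers={**correlation, "Content-Type": "text/yaml",
                 "Accept": "application/ld+json"},
        method="POST",
    )
    bind = Request(f"{BASE_URL}/soml/{schema_id}/soaas",
                   headers=correlation, method="PUT")
    return [create, bind]


def send_with_urlopen(request):
    """Send one request; urlopen raises HTTPError on a 4xx/5xx response."""
    with urlopen(request) as response:
        return response.status


def initialize(schema_id, schema_yaml, send=send_with_urlopen):
    """Upload the schema, then bind it; an exception stops the sequence."""
    return [send(r) for r in initialization_requests(schema_id, schema_yaml)]
```

Calling `initialize("swapi", open("schema.yaml").read())` after each service restart would recreate and rebind the schema.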

The main drawback of this provider is the inability to communicate and synchronize with other instances of the deployed service. It also prevents the use of the Search Service as it cannot manage schemas on its own.

Storage Migration

The Ontotext Platform provides a means to copy the contents of one schema store to another. This functionality is available for version 3.5 and onward, and is provided to enable existing deployments to switch from the deprecated mongodb provider to the new rdf4j one.

To enable the functionality, set soml.storage.migration.enabled: true. You must also set soml.storage.migration.source to the source store to copy from. If not explicitly set, the destination is inferred from the current storage provider.

Note

This functionality is not applicable if the source or the destination is resolved to in-memory.

The migration process is performed on service start and will be skipped if:

  • the source is empty. This is the first check the migration performs; if the source is empty, the migration exits without doing anything else.

  • the destination contains any schema.

To override the destination contents, use soml.storage.migration.forceStoreUpdate. If set to true, the migration process will clear the destination store and upload all the schemas from the source. If the source is empty, then it would do nothing.

If soml.storage.migration.forceStoreUpdate is combined with soml.storage.migration.cleanBeforeMigration set to true, a clean migration is performed; otherwise, an update is made. The update process overrides any schema that already exists in the destination and does not affect the rest.
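
The skip conditions and the two flags can be summarized as a small decision function. The Python sketch below models each store as a dictionary of schema ID to schema content. It encodes one reading of the rules above, namely that forceStoreUpdate alone performs an update and that the destination is cleared only when cleanBeforeMigration is also true. The function names are hypothetical and the real migration may differ in detail.

```python
def plan_migration(source_schemas, destination_schemas,
                   force_store_update=False, clean_before_migration=False):
    """Return 'skip', 'clean' or 'update' for one migration run."""
    if not source_schemas:
        return "skip"  # empty source: checked first, nothing else happens
    if destination_schemas and not force_store_update:
        return "skip"  # destination already contains schemas
    if force_store_update and clean_before_migration:
        return "clean"  # destination is cleared before copying
    return "update"  # overwrite matching schemas, keep the rest


def migrate(source_schemas, destination_schemas, **flags):
    """Return the destination contents after applying the plan."""
    plan = plan_migration(source_schemas, destination_schemas, **flags)
    if plan == "skip":
        return dict(destination_schemas)
    if plan == "clean":
        return dict(source_schemas)
    return {**destination_schemas, **source_schemas}


source = {"swapi": "v2"}
destination = {"swapi": "v1", "other": "v1"}
print(migrate(source, destination, force_store_update=True))
# prints: {'swapi': 'v2', 'other': 'v1'}
```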

The following example shows the minimal configuration set to enable the migration:

soml.storage.provider: rdf4j

application.name: soaas-swapi                           # required when provider is 'rdf4j'

# MongoDB must be configured, otherwise the migration will be skipped
soml.storage.mongodb.endpoint: mongodb://mongodb:27017
soml.storage.mongodb.database: soaas
soml.storage.mongodb.collection: soml

rbac.storage.mongodb.collection: soml-rbac              # required, if security is enabled


storage.location: data                                  # if not set, the bound schema will not be set
                                                        # and the schema will have to be bound manually

soml.storage.migration.enabled: true
soml.storage.migration.source: mongodb
soml.storage.migration.destination: rdf4j               # optional, inferred from soml.storage.provider

All possible configurations can be found here.