Schema Storage & Management¶
Overview¶
The Semantic Objects service manages a collection of Semantic Object schemas. It also manages access to these schemas using Role-Based Access Controls (RBACs).
This section of the documentation does not discuss the SOML language and syntax, but rather the management of these schemas and the Role-Based Access Controls that constrain access to the schemas.
A collection of SOML schemas is shared amongst instances of the Semantic Objects Service. A Semantic Objects instance can bind any one of these Semantic Object schemas to generate a GraphQL endpoint, which allows client applications to perform GraphQL queries and mutations.
The Semantic Objects service manages and stores schemas within a configured schema storage. See more about the supported stores in the Schema Storage section.
Schema management is achieved by using the Semantic Objects /soml REST API.
RBACs are applied to the collection of schemas. These controls determine which roles can create, update, delete, or bind a particular schema. RBACs are managed by using the Semantic Objects /soml-rbac REST API.
Quick Start¶
Hint
You can perform all of the below actions from the Ontotext Semantic Services web-based administration interface, the Workbench.
A docker-compose.yaml file is required to ensure all mandatory components are started correctly before you can manage SOML schemas.
This docker-compose.yaml configuration will download and start the important containers on a single machine.
Once you have downloaded the compose file, follow the Quick Start guide using this file instead of the one defined in the guide. (Skip the download operation in the Docker Compose section of the guide.)
SOML Schema Creation¶
To create a schema from the Workbench, follow the steps described here (generate a schema from an existing ontology file) and here (create a new schema).
To create the SWAPI schema within the Semantic Objects service with the identifier /soml/swapi, first download the Semantic Object schema.yaml definition.
Then invoke the following cURL request:
curl --location -X POST 'http://localhost:9995/soml' \
-H 'Content-Type: text/yaml' \
-H 'X-Request-ID: some-uuid-correlation-id' \
-H 'Accept: application/ld+json' \
-T schema.yaml
SOML Schema Validation¶
To validate a SOML schema from the Workbench as part of the schema generation workflow without creating it, follow the steps described here.
To validate a SOML schema via cURL without creating it, you can use the following cURL request:
curl --location -X POST 'http://localhost:9995/soml/validate' \
-H 'Content-Type: text/yaml' \
-H 'X-Request-ID: some-uuid-correlation-id' \
-H 'Accept: application/ld+json' \
-T schema.yaml
The validation consists of two phases:
- SOML schema validation: the same check performed during SOML schema creation or update
- SOML schema binding: checks if the SOML schema can be transformed to a GraphQL schema, and, optionally (if SHACL is enabled), to SHACL shapes
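The validate and create endpoints can be chained so that a schema is only created once it has passed validation. The following is a minimal sketch, not part of the product: the function names are hypothetical, it assumes that a failed validation is reported with a non-2xx HTTP status (the text above does not state this), and it sends the file with --data-binary instead of -T.

```shell
# Assumption: the service listens on localhost:9995, as in the examples above.
SOML_URL="${SOML_URL:-http://localhost:9995/soml}"

# POST a schema file to the given URL and print only the HTTP status code.
post_schema() {
  local url="$1" file="$2"
  curl --silent --output /dev/null --write-out '%{http_code}' \
    -X POST "$url" \
    -H 'Content-Type: text/yaml' \
    -H 'X-Request-ID: some-uuid-correlation-id' \
    -H 'Accept: application/ld+json' \
    --data-binary "@$file"
}

# Validate first; create only when validation answers with a 2xx status.
validate_then_create() {
  local file="$1" status
  status="$(post_schema "$SOML_URL/validate" "$file")"
  case "$status" in
    2??) ;;
    *) echo "validation failed with HTTP $status" >&2; return 1 ;;
  esac
  status="$(post_schema "$SOML_URL" "$file")"
  case "$status" in
    2??) echo "created (HTTP $status)" ;;
    *) echo "creation failed with HTTP $status" >&2; return 1 ;;
  esac
}
```

Calling validate_then_create schema.yaml after sourcing the snippet runs both requests in order and stops at the first failure.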
SOML Schema List¶
To view all currently existing SOML schemas managed within the Semantic Objects (and MongoDB) from the Workbench, follow the instructions here.
To retrieve a list of all the SOML schemas managed within the Semantic Objects (and MongoDB), invoke the following cURL request:
curl --location --request GET 'http://localhost:9995/soml' \
-H 'X-Request-ID: some-uuid-correlation-id' \
-H 'Accept: application/ld+json'
SOML Schema Retrieval¶
To open a particular SOML schema managed within the Semantic Objects (and MongoDB) from the Workbench, follow the instructions here.
To retrieve a particular SOML schema (in this case the SWAPI schema) managed within the Semantic Objects (and MongoDB), invoke the following cURL request:
curl --location -X GET 'http://localhost:9995/soml/swapi' \
-H 'X-Request-ID: some-uuid-correlation-id' \
-H 'Accept: text/yaml'
You can also request a response in application/ld+json format:
curl --location -X GET 'http://localhost:9995/soml/swapi' \
-H 'X-Request-ID: some-uuid-correlation-id' \
-H 'Accept: application/ld+json'
SOML Schema Updates¶
To update a SOML schema from the Workbench, follow the instructions here.
To update the SWAPI schema within the Semantic Objects with the identifier /soml/swapi, invoke the following cURL request:
curl --location -X PUT 'http://localhost:9995/soml/swapi' \
-H 'Content-Type: text/yaml' \
-H 'X-Request-ID: some-uuid-correlation-id' \
-H 'Accept: application/ld+json' \
-T schema.yaml
SOML Schema Binding¶
To bind (i.e., activate) a SOML schema to the Semantic Objects from the Workbench, follow the instructions here.
To bind the SWAPI schema to the Semantic Objects, and generate the GraphQL endpoint/schema from it, invoke the following cURL request:
curl --location -X PUT 'http://localhost:9995/soml/swapi/soaas' \
-H 'X-Request-ID: some-uuid-correlation-id'
SOML Schema Deletion¶
You can delete a SOML schema from the Workbench only if it is not bound (i.e., activated). This can be done from the Schema Registry and the Manage Schema views.
To delete the schema, and unbind it from the Semantic Objects if it is bound, invoke the following cURL request:
curl --location -X DELETE 'http://localhost:9995/soml/swapi' \
-H 'X-Request-ID: some-uuid-correlation-id'
Note
The above examples are with security ON, meaning that -H 'X-Request-ID: some-uuid-correlation-id' should be removed if the Semantic Objects are started without security.
SOML Schema Statistics¶
You can see schema statistics from the Workbench by clicking on the schema ID in the Schema Registry list. The Workbench home screen will display statistics only for the currently active schema.
To retrieve SOML schema statistics, use the following cURL request:
curl --location -X GET 'http://localhost:9995/soml/swapi/stats?limit=10' \
-H 'X-Request-ID: some-uuid-correlation-id' \
-H 'Accept: application/ld+json'
This will return the top 10 shapes in the SOML schema.
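The number of returned shapes is controlled by the limit query parameter. As a small sketch, a helper can build the statistics URL for any schema and limit; the function name stats_url is hypothetical, and the host and port are taken from the example above.

```shell
# Build the statistics URL for a schema; the limit defaults to 10.
stats_url() {
  schema_id="$1"
  limit="${2:-10}"
  # Reject anything that is not made up of digits before it reaches the URL.
  case "$limit" in
    *[!0-9]*) echo "limit must be a whole number" >&2; return 1 ;;
  esac
  printf 'http://localhost:9995/soml/%s/stats?limit=%s\n' "$schema_id" "$limit"
}
```

The result can be passed straight to cURL, for example: curl --location -X GET "$(stats_url swapi 5)" -H 'Accept: application/ld+json'.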
Note
SOML schema statistics are available only for the active schema.
Administration¶
Schema Management API¶
See the Quick Start cURL examples for create, read, update, delete, and bind operation examples.
Schema Management RBAC¶
Only users with the role claim schema-rbac-admin are able to modify the schema RBAC (read/write/delete). RBACs authorize access to the following schema endpoints:
- GET /soml: Retrieve all schemas
- GET /soml/{schema-id}: Retrieve a particular schema
- POST /soml/{schema-id}: Create a particular schema
- PUT /soml/{schema-id}: Update a particular schema
- DELETE /soml/{schema-id}: Delete a particular schema
- PUT /soml/{schema-id}/soaas: Bind a particular schema to the Semantic Objects
Schema RBAC Endpoints¶
The Semantic Objects provide the following endpoints to manage the schema RBAC:
- Read: GET /soml-rbac retrieves the schema RBACs (only accessible for users/role SchemaRBACAdmin)
- Update: PUT /soml-rbac updates the schema RBACs (only accessible for users/role SchemaRBACAdmin)
An example schema RBAC is as follows:
id: /soml-rbac
label: Schema Management Role-Based Access Control
creator: http://ontotext.com
created: 2019-06-15
updated: 2019-06-16
versionInfo: 0.1
rbac:
  roles:
    # Default role which does not need to be configured or declared. Included for completeness.
    Default:
      description: "Default role, which does not need to be declared; restricts all schema management access (read, write and delete)"
      notActions: ["*/*"]
    # Example role definitions which need to be declared by the SOML user:
    SchemaRBACAdmin:
      description: "Administrator role, can read, write and delete objects and schema"
      actions: ["*/*"]
    ReadOnlyUser:
      description: "User which can read all schemas"
      actions: ["*/read"]
    SwapiSchemaManager:
      description: "User which has admin access (read, write and delete) on the swapi schema only"
      actions: ["swapi/*"]
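To make the action patterns in the example easier to read, the sketch below models how they could be evaluated. It is an illustration only, not the service's implementation: it assumes that each action has the form schema-id/operation, that * behaves like a shell wildcard, and that the operations are named read, write and delete. It ignores notActions. The function name is_allowed is hypothetical.

```shell
# Succeed when the "<schema-id>/<operation>" request matches any given pattern.
# Usage: is_allowed REQUEST PATTERN...
is_allowed() {
  request="$1"
  shift
  for pattern in "$@"; do
    # An unquoted variable in a case pattern is matched as a glob.
    case "$request" in
      $pattern) return 0 ;;
    esac
  done
  return 1
}
```

Under these assumptions, is_allowed swapi/read '*/read' succeeds for ReadOnlyUser, while is_allowed swapi/write '*/read' fails, and SwapiSchemaManager's 'swapi/*' pattern matches every operation on swapi but nothing on other schemas.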
Reading the Schema RBACs¶
The following cURL request (with a valid JWT token containing a role claim == SchemaRBACAdmin) will retrieve the schema RBAC:
curl --location -X GET 'http://localhost:9995/soml-rbac' \
-H 'X-Request-ID: some-uuid-correlation-id' \
-H 'Accept: text/yaml'
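The request above does not show how the JWT token is passed. A common convention is an Authorization header with the Bearer scheme; the sketch below assumes the service follows it, which this page does not confirm. The TOKEN variable and the function name rbac_get are hypothetical.

```shell
# Read the schema RBAC, adding a bearer token when TOKEN is set.
# Assumption: the JWT is sent as "Authorization: Bearer <token>".
rbac_get() {
  set -- --silent --location -X GET 'http://localhost:9995/soml-rbac' \
    -H 'X-Request-ID: some-uuid-correlation-id' \
    -H 'Accept: text/yaml'
  if [ -n "${TOKEN:-}" ]; then
    set -- "$@" -H "Authorization: Bearer $TOKEN"
  fi
  curl "$@"
}
```

With security disabled, leaving TOKEN unset sends the request without the Authorization header.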
Updating the Schema RBACs¶
The following cURL request (with a valid JWT token containing a role claim == SchemaRBACAdmin) will update the schema RBAC with the contents of the rbac.yaml file:
curl --location -X PUT 'http://localhost:9995/soml-rbac' \
-H 'X-Request-ID: some-uuid-correlation-id' \
-H 'Content-Type: text/yaml' \
-H 'Accept: application/ld+json' \
-T rbac.yaml
Getting Schema RBAC roles¶
The following cURL request will return the RBAC roles for the currently authenticated user.
curl --location -X GET 'http://localhost:9995/soml-rbac/roles' \
-H 'X-Request-ID: some-uuid-correlation-id' \
-H 'Accept: application/ld+json'
Schema Storage¶
Semantic Objects can store created SOML schemas in:
- an RDF4J-compatible repository
- MongoDB
- memory (no persistence)
The storage provider is controlled by the soml.storage.provider configuration property. To activate a specific provider, the required properties for that storage must be added as well.
With version 3.5 of the Ontotext Semantic Objects, the default storage option is an RDF4J repository replacing the previously used MongoDB one. To ease this transition, we have developed a schema storage migration. See how to enable it here.
RDF4J-compatible Repository¶
This storage option is enabled by setting soml.storage.provider: rdf4j.
The Semantic Objects can use any RDF4J-compatible repository provider to store the managed schemas. This storage option was developed to replace MongoDB, which was the default storage option until version 3.5 of the Semantic Objects. Its benefit compared to MongoDB is reduced deployment dependencies and complexity, as this option requires neither local persistence nor the deployment of an additional database for the service to operate.
The only requirement is access to a writable SPARQL endpoint with an existing repository, or the rights to create a new one.
In order not to interfere with user data, the schemas are stored by default in a separate repository named otp-system.
This repository can be created automatically by the Semantic Objects or created and provided externally.
If you prefer not to create a separate repository, any compatible existing repository can be used.
The schemas are stored in a separate named graph and all operations are executed in that context.
Note
The service requires this repository to not be read-only, at least until all required schemas are created and bound.
The automatic repository creation will try the following steps to create a repository on the configured endpoint address:
- A repository with provided custom configuration via soml.storage.rdf4j.repositoryConfig.
- A GraphDB cluster worker repository (for GraphDB Standard and Enterprise deployments).
- A GraphDB Free repository instance (for GraphDB Free deployment).
- Generic Sail in-memory repository.
Note
For cluster deployments with one or more masters and multiple workers, it is not advisable to rely on the automatic repository creation as the repository will be created on the first configured master and will not have resilience in case of master failure.
Ontotext Semantic Services’ official Helm charts provide automatic repository creation on the configured workers in the deployed topology.
All related configurations for the current store are prefixed by soml.storage.rdf4j.
If a property is not set explicitly, its default value is taken from the main SPARQL endpoint property with the same name, prefixed with sparql.endpoint. The repository name is the exception and is not inherited.
If a specific configuration should not be inherited, it should be explicitly set to its default value. All possible configurations can be found here.
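The inheritance rule is a plain prefix swap, which the following sketch spells out. The function name inherited_key is hypothetical, and the property suffix used in the usage example (address) is only a placeholder; see the configuration reference for the real property names.

```shell
# Map a schema-store property name to the SPARQL endpoint property it falls back to.
inherited_key() {
  case "$1" in
    soml.storage.rdf4j.*)
      # Strip the store prefix and put the main endpoint prefix in its place.
      printf 'sparql.endpoint.%s\n' "${1#soml.storage.rdf4j.}"
      ;;
    *)
      echo "not a soml.storage.rdf4j property: $1" >&2
      return 1
      ;;
  esac
}
```

For example, inherited_key soml.storage.rdf4j.address prints sparql.endpoint.address. The sketch does not model the repository name exception.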
If this store is enabled, the configuration application.name is required.
It specifies the service name and must be unique among the deployed Semantic Services. If two or more service instances have the same name (horizontal scaling), they will use the same bound schema. If application.name is not defined, the value of spring.application.name is used, if present.
The health status of this repository is reported by the SOML health check and does not have a separate health check report.
MongoDB¶
Warning
In Ontotext Semantic Objects version 3.5, MongoDB has been deprecated and will be removed in a future version.
This storage option is enabled by setting soml.storage.provider: mongodb.
The related configurations for the current store are prefixed by soml.storage.mongodb.
All available options can be found here.
The option requires an available local file storage with write permissions.
The Semantic Objects MongoDB Docker container can be accessed at http://localhost:9997.
For more information on how to administer a MongoDB instance, see the Mongo Administration page.
MongoDB Compass¶
There is no GUI included within the MongoDB container, but it is possible to use one of the popular clients such as MongoDB Compass.
In-memory¶
This storage option is enabled by setting soml.storage.provider: in-memory or by leaving the property empty.
The option does not provide schema persistence: the schemas are kept in memory and are lost when the service restarts.
Combined with an initialization script, the option is well suited to small installations that have no complex schema management requirements and no additional resources.
The initialization can be one of the following:
- Using the official CLI tool to upload and bind a schema.
- Using the configuration soml.preload.schemaPath to provide a schema to automatically load and bind on startup.
- Using a custom script to initialize the service in case of service restart.
The main drawback of this provider is the inability to communicate and synchronize with other instances of the deployed service. It also prevents the use of the Semantic Search as it cannot manage schemas on its own.
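A custom initialization script for the in-memory provider can reuse the create and bind requests from the Quick Start. The following is a minimal sketch under stated assumptions: it treats a 200 response from GET /soml as the signal that the service is ready (this page does not describe a readiness check), and the function names, retry count and delay are hypothetical.

```shell
# Assumption: the service listens on localhost:9995, as in the Quick Start.
BASE_URL="${BASE_URL:-http://localhost:9995}"

# Print only the HTTP status code of a request; extra arguments go to curl.
http_status() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
    -H 'X-Request-ID: some-uuid-correlation-id' "$@"
}

# Poll GET /soml until it answers 200 or the attempts run out.
wait_for_service() {
  attempts="${1:-30}"
  while [ "$attempts" -gt 0 ]; do
    [ "$(http_status "$BASE_URL/soml")" = "200" ] && return 0
    attempts=$((attempts - 1))
    sleep 2
  done
  return 1
}

# Upload the schema file, then bind it so the GraphQL endpoint is generated.
init_schema() {
  schema_id="$1"
  file="$2"
  wait_for_service || { echo "service not reachable" >&2; return 1; }
  status="$(http_status -X POST "$BASE_URL/soml" \
    -H 'Content-Type: text/yaml' --data-binary "@$file")"
  case "$status" in
    2??) ;;
    *) echo "upload failed: HTTP $status" >&2; return 1 ;;
  esac
  status="$(http_status -X PUT "$BASE_URL/soml/$schema_id/soaas")"
  case "$status" in
    2??) echo "bound $schema_id" ;;
    *) echo "bind failed: HTTP $status" >&2; return 1 ;;
  esac
}
```

Running init_schema swapi schema.yaml after each restart restores the schema that the in-memory store lost.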
Storage Migration¶
The Ontotext Semantic Objects provide a means to copy the contents of one schema store to another. This functionality is available for version 3.5 and onward, and is provided to enable existing deployments to switch from the deprecated mongodb provider to the new rdf4j one.
To enable the functionality, set soml.storage.migration.enabled: true. You must also set soml.storage.migration.source to the source store to copy from. If not explicitly set, the destination is inferred from the current storage provider.
Note
This functionality is not applicable if the source or the destination is resolved to in-memory.
The migration process is performed on service start and will be skipped if:
- the source is empty. This is the first thing the migration checks; if true, it exits and nothing else is done.
- the destination contains any schema.
To override the destination contents, use soml.storage.migration.forceStoreUpdate. If set to true, the migration process will clear the destination store and upload all the schemas from the source. If the source is empty, it does nothing.
If combined with soml.storage.migration.cleanBeforeMigration set to true, it can be used to perform a clean migration; otherwise an update is made.
The update process will override any already existing schema and will not affect the rest.
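The skip and override rules can be summarized as a small decision function. This is a model of one reading of the behavior described above, not the service's code: it assumes that forceStoreUpdate only matters when the destination already holds schemas, and that cleanBeforeMigration then chooses between a clean migration and an update. The function name migration_plan is hypothetical.

```shell
# Print the action the migration is expected to take.
# Arguments: source schema count, destination schema count,
#            forceStoreUpdate (true/false), cleanBeforeMigration (true/false).
migration_plan() {
  source_count="$1"
  dest_count="$2"
  force="${3:-false}"
  clean="${4:-false}"
  if [ "$source_count" -eq 0 ]; then
    # Checked first: an empty source ends the migration immediately.
    echo "skip: source is empty"
  elif [ "$dest_count" -eq 0 ]; then
    echo "copy all schemas"
  elif [ "$force" != "true" ]; then
    echo "skip: destination is not empty"
  elif [ "$clean" = "true" ]; then
    echo "clear destination, then copy all schemas"
  else
    echo "update: overwrite matching schemas, keep the rest"
  fi
}
```

For example, migration_plan 3 2 true false reports an update, while migration_plan 3 2 reports that the migration is skipped because the destination is not empty.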
The following example shows the minimal configuration set to enable the migration:
soml.storage.provider: rdf4j
application.name: soaas-swapi # required when provider is 'rdf4j'
# MongoDB must be configured, otherwise the migration will be skipped
soml.storage.mongodb.endpoint: mongodb://mongodb:27017
soml.storage.mongodb.database: soaas
soml.storage.mongodb.collection: soml
rbac.storage.mongodb.collection: soml-rbac # required, if security is enabled
storage.location: data # if not set, the bound schema will not be set and must be set manually
soml.storage.migration.enabled: true
soml.storage.migration.source: mongodb
soml.storage.migration.destination: rdf4j # optional, inferred from soml.storage.provider
All possible configurations can be found here.