Migration Guide¶
The Ontotext Platform migration guide walks you through handling the breaking changes and deprecations introduced in the different releases of the Platform.
Migration from 3.x to 4.0¶
Semantic Objects version 4.0 has several breaking changes that are described in the following sections.
GraphDB 10¶
In Semantic Objects version 4.0, support for GraphDB 9.x has been discontinued. This is due to the incompatible versions of the RDF4J library used by GraphDB 9.x (3.7.6) and GraphDB 10.x (4.2.x). Furthermore, the cluster protocol in GraphDB 10 has undergone significant changes, necessitating a different cluster client. As a result of this change, some configuration settings have been removed. The following configurations are no longer available:
- sparql.endpoint.cluster.unavailableReadTimeout
- sparql.endpoint.cluster.unavailableWriteTimeout
- sparql.endpoint.cluster.scanFailedInterval
- sparql.endpoint.cluster.retryOnHttp4xx
- sparql.endpoint.cluster.retryOnHttp5xx
- soml.storage.rdf4j.cluster.unavailableReadTimeout
- soml.storage.rdf4j.cluster.unavailableWriteTimeout
- soml.storage.rdf4j.cluster.scanFailedInterval
- soml.storage.rdf4j.cluster.retryOnHttp4xx
- soml.storage.rdf4j.cluster.retryOnHttp5xx
The following configurations have been added to change the behavior of the cluster client for GraphDB 10:
- Configurations about the primary SPARQL endpoint used to access the client data:
  - sparql.endpoint.cluster.clusterStatusTimeout
  - sparql.endpoint.cluster.clusterStatusConnectTimeout
  - sparql.endpoint.cluster.concurrentStatusCheck
  - sparql.endpoint.cluster.leaderDiscoveryRetries
  - sparql.endpoint.cluster.leaderDiscoveryRetryDelay
  - sparql.endpoint.cluster.leaderOperationRetries
- The SOML storage configurations used to access the schema store:
  - soml.storage.rdf4j.cluster.clusterStatusTimeout
  - soml.storage.rdf4j.cluster.clusterStatusConnectTimeout
  - soml.storage.rdf4j.cluster.concurrentStatusCheck
  - soml.storage.rdf4j.cluster.leaderDiscoveryRetries
  - soml.storage.rdf4j.cluster.leaderDiscoveryRetryDelay
  - soml.storage.rdf4j.cluster.leaderOperationRetries
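For illustration, here is a minimal sketch of tuning the new cluster client, written in the dotted-property style used for the settings above. The values and duration format are assumptions, not documented defaults; consult the configuration reference before applying them.

```properties
# Hypothetical tuning values - verify units and defaults against the configuration reference
sparql.endpoint.cluster.clusterStatusTimeout=5s
sparql.endpoint.cluster.clusterStatusConnectTimeout=2s
sparql.endpoint.cluster.concurrentStatusCheck=true
sparql.endpoint.cluster.leaderDiscoveryRetries=3
sparql.endpoint.cluster.leaderDiscoveryRetryDelay=2s
sparql.endpoint.cluster.leaderOperationRetries=3
```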
MongoDB¶
A significant breaking change in Semantic Objects version 4.0 is the discontinuation of support for MongoDB as a schema storage solution. As a consequence of this change, the schema migration functionality from MongoDB to GraphDB has also been removed.
In addition, certain configuration settings have been removed as a result of this change. The following configurations are no longer available:
- soml.storage.mongodb.endpoint
- soml.storage.mongodb.database
- soml.storage.mongodb.collection
- soml.storage.mongodb.connectTimeout
- soml.storage.mongodb.readTimeout
- soml.storage.mongodb.readConcern
- soml.storage.mongodb.writeConcern
- soml.storage.mongodb.applicationName
- soml.storage.mongodb.serverSelectionTimeout
- soml.storage.mongodb.healthCheckTimeout
- soml.storage.mongodb.healthcheckSeverity
- soml.storage.migration.enabled
- soml.storage.migration.source
- soml.storage.migration.destination
- soml.storage.migration.forceStoreUpdate
- soml.storage.migration.cleanBeforeMigration
- soml.storage.migration.somlMigration
- soml.storage.migration.cleanOnComplete
- soml.storage.migration.async
- soml.storage.migration.retries
- soml.storage.migration.delay
- rbac.storage.mongodb.endpoint
- rbac.storage.mongodb.database
- rbac.storage.mongodb.collection
- rbac.storage.mongodb.healthCheckTimeout
As a result of removing MongoDB support, the configuration setting soml.storage.provider no longer includes the mongodb option.
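With the mongodb option gone, the schema store must use the RDF4J-backed provider. A minimal sketch, assuming rdf4j is the provider key implied by the soml.storage.rdf4j.* settings above:

```properties
# Assumed provider value, implied by the soml.storage.rdf4j.* settings
soml.storage.provider=rdf4j
```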
GraphQL¶
Semantic Objects version 4.0 introduces a GraphQL schema optimization feature that modifies the behavior of the GraphQL schema generator. Specifically, the generator will no longer include Scalars and related input types that are not referenced in the schema.
For APIs that utilize GraphQL schema merging or federation and rely on these scalars and input definitions, it is necessary to provide them during schema merging. Alternatively, the optimization feature can be disabled by setting the configuration graphql.enableReducedSchema to false.
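For example, to restore the pre-4.0 behavior and keep the unreferenced scalars and input types in the generated schema, the optimization can be switched off. A sketch in the dotted-property style used elsewhere in this guide:

```properties
# Disable the GraphQL schema optimization introduced in 4.0
graphql.enableReducedSchema=false
```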
Data types¶
In Semantic Objects version 4.0, changes have been made to the way the data types xsd:time, xsd:dateTime, and xsd:dateTimeStamp are returned to clients. In previous versions, these types always returned three digits for the fractional seconds (e.g. 14:23:30.000, 14:22:44.120, or 12:20:42.124). In the new version, trailing zeroes are no longer returned, and all available fractional digits, up to the maximum of nine allowed, are included. For instance, the examples above would be returned as 14:23:30, 14:22:44.12, or 12:20:42.124765.
Logging¶
Semantic Objects version 4.0 includes updates to the GraphQL and SPARQL loggers aimed at simplifying logging configuration, improving log readability, and facilitating log management.
The following loggers have been modified as part of these updates:
- Renamed sparql-queries to sparql.query
- Renamed query-results to sparql.query.results
- Renamed query-durations to sparql.query.times
- Added sparql.update, which logs SPARQL updates
- Changed com.ontotext.soaas.controllers.QueryServiceController to graphql
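Assuming the service follows standard Spring Boot logging conventions (an assumption; only the logger names above come from this guide), the renamed loggers could be tuned with overrides such as:

```yaml
# Hypothetical logging overrides using the renamed logger names
logging:
  level:
    sparql.query: INFO
    sparql.query.results: DEBUG
    sparql.query.times: INFO
    sparql.update: INFO
    graphql: INFO
```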
Extensions¶
Semantic Objects version 4.0 includes a new approach for loading extensions and plugins, utilizing the com.ontotext.soaas.plugin.PluginsManager. This expands the capability for loading extensions using java.util.ServiceLoader with the ability to discover Spring beans or to manually register plugin instances at runtime.
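As a rough sketch of the java.util.ServiceLoader path: a plugin jar ships a provider-configuration file under META-INF/services, named after the fully qualified plugin interface and listing the implementation classes. The interface and implementation names below are hypothetical; only PluginsManager is named by this guide.

```
# File: META-INF/services/com.ontotext.soaas.plugin.Plugin (hypothetical interface name)
com.example.extensions.MyCustomPlugin
```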
Workbench¶
Semantic Objects Workbench version 4.0 includes new configuration options aimed at providing greater control over security setups with a wider range of Identity providers. For more information on these updates, please refer to the Workbench Administration section.
Migration from 3.7 to 3.8¶
- Elasticsearch-related configurations (elasticsearch.*) have been moved from the Semantic Objects to the Semantic Search.
Migration from 3.5 to 3.6¶
- MongoDB has been removed from all Docker/Docker Compose examples. If you need a reference, please consult the documentation for version 3.5.
Helm Deployments¶
This is a version with major breaking changes that resolves many issues with the old monolithic Helm chart.
The chart is now composed entirely of sub-charts, so make sure you familiarize yourself with their values.yaml files.
For more detailed information, please refer to the CHANGELOG.md file included in the Helm chart.
Migration from 3.4 to 3.5¶
Before proceeding with the migration, make sure you have read the release notes for Ontotext Platform 3.5.0.
Helm Deployments¶
In version 3.5, the Helm chart introduces the following breaking changes:
- High Availability deployment of PostgreSQL with a replication manager. This requires migrating the persistent data, because the deployment moves to Bitnami’s PostgreSQL HA chart.
- Deprecation of MongoDB in favor of RDF4J SOML schema storage.
- GraphDB’s official Helm chart is now used as a sub-chart.
If you wish to preserve the persistent data of existing deployments, follow the steps described below.
SOML Schema Storage Migration¶
Starting from version 4.0 of the Semantic Objects, schema migration from MongoDB is no longer supported, as MongoDB support for SOML schema storage has been removed.
Migration Steps¶
The following steps assume an existing deployment named platform in the default namespace.
Note
The migration will cause temporary downtime of several Platform components due to updates in their configuration maps, pod specifications, persistence changes, etc.
Back up all persistent volume data.
PostgreSQL migration
Add Bitnami’s Helm charts repository:
```bash
helm repo add bitnami https://charts.bitnami.com/bitnami
```
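If the repository was already added earlier, refreshing the local chart index is a standard follow-up:

```bash
helm repo update
```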
Prepare an override file named fusion-ha.migration.yaml with the following content:
```yaml
# Should be the same as in the platform's 3.5 chart
fullnameOverride: fusionauth-postgresql
# If the existing deployment has different passwords, update the next configurations to match
postgresql:
  username: fusionauth
  password: fusionauth
  database: fusionauth
  postgresPassword: postgres
  repmgrPassword: fusionauth
  replicaCount: 1
pgpool:
  adminPassword: fusionauth
# Update the persistence to the required settings
persistence:
  storageClass: standard
  size: 1Gi
resources:
  limits:
    memory: 256Mi
```
Install a temporary deployment of bitnami/postgresql-ha with the prepared values and wait until the new pods are running:
```bash
helm install -n default -f fusion-ha.migration.yaml --version 7.6.2 postgresql-mig bitnami/postgresql-ha
```
This deployment will serve to migrate the existing PostgreSQL data into the new HA replica set.
Execute the PostgreSQL data migration with:
```bash
kubectl -n default exec -it fusionauth-postgres-0 -- sh -c "pg_dumpall -U fusionauth | psql -U postgres -h fusionauth-postgresql-pgpool"
```
Enter the password for the system postgres user from fusion-ha.migration.yaml. The default is postgres.
Note
If the existing deployment has different credentials, update the command above with the relevant ones.
Uninstall the temporary deployment:
```bash
helm uninstall -n default postgresql-mig
```
Wait until the pods are removed. The migrated data is stored in dynamically provisioned PVs/PVCs that will be bound when the Platform chart is upgraded later on.
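Optionally, confirm that the migrated PostgreSQL PVCs survived the uninstall (Helm does not delete PVCs provisioned through StatefulSet volume claim templates):

```bash
kubectl get pvc -n default
```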
GraphDB migration
Due to the migration to the official GraphDB Helm chart, the PVs must be migrated as well. To migrate GraphDB’s data, the new pods must reuse the old pods’ PVs. To achieve this, follow these steps:
Patch all GraphDB PVs (masters and workers) with "persistentVolumeReclaimPolicy":"Retain":
```bash
kubectl patch pv <graphdb-pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```
This will ensure that the PVs won’t be accidentally deleted.
Delete the GraphDB deployment. If a cluster is used, delete all master and worker deployments.
```bash
kubectl delete deployment.apps/<graphdb-deployment-name>
```
Delete the GraphDB PVCs. If a cluster is used, delete all master and worker PVCs.
```bash
kubectl delete pvc <graphdb-pvc-name>
```
This will release the PVs so they can be reused by the new masters/workers.
Patch the PVs with "claimRef":null so their status can go from Released to Available:
```bash
kubectl patch pv <graphdb-pv-name> -p '{"spec":{"claimRef":null}}'
```
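You can verify that the patched PVs have moved from Released to Available before continuing:

```bash
kubectl get pv
```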
Patch the PVs with a claimRef matching the PVCs that will be generated by the volumeClaimTemplates.
In Platform 3.5, the default volumes used for GraphDB are dynamically provisioned using volumeClaimTemplates. The newly created pods must create PVCs that can claim the old PVs. To do this, the volumeClaimTemplates for GraphDB’s instances in the values.yaml file must be configured so that they match the PV specs.
For example, if you have an old GraphDB PV that is 10Gi with storageClassName: standard and accessModes: ReadWriteOnce, then the volumeClaimTemplates for the GraphDB instance must be set like this:
```yaml
volumeClaimTemplateSpec:
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: "10Gi"
  storageClassName: standard
```
After you have set the correct volumeClaimTemplates, the old GraphDB PVs must be patched so that they are available to be claimed by the generated PVCs. The PVC names generated by the GraphDB chart have the following format:
- For masters (and standalone instances): graphdb-master-X-data-dynamic-pvc
- For workers: graphdb-worker-Y-data-dynamic-pvc
Where X and Y are the counters for masters and workers, respectively.
Also, the namespace of the PVs’ claimRefs must be updated with the namespace in use. The PV patch is done like this (example for a standalone GraphDB):
```bash
kubectl patch pv graphdb-default-pv -p '{"spec":{"claimRef":{"name":"graphdb-master-1-data-dynamic-pvc-graphdb-master-1-0"}}}'
kubectl patch pv graphdb-default-pv -p '{"spec":{"claimRef":{"namespace":"default"}}}'
```
If a cluster is used, repeat this with the respective PV names and master/worker counters in the claimRef name. After the PVs are patched, they are ready for helm upgrade. When the upgrade is done, the new GraphDB pod(s) should create PVCs that claim the correct PVs used by the previous GraphDB.
Provisioning user
The official GraphDB chart uses a special user for all health checks and provisioning. If you are using the Ontotext Platform with GraphDB security enabled, set graphdb.graphdb.security.provisioningUsername and graphdb.graphdb.security.provisioningPassword to a user that has an Administrator role in GraphDB, so that the health checks and provisioning jobs work correctly.
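In values.yaml override form, this corresponds to something like the following; the credentials are placeholders:

```yaml
graphdb:
  graphdb:
    security:
      provisioningUsername: <admin-user>      # placeholder - must have the Administrator role in GraphDB
      provisioningPassword: <admin-password>  # placeholder
```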
(Optional) Elasticsearch PVs
In Platform 3.5, the default persistence is changed to use dynamic PV provisioning. If you wish to preserve any existing Elasticsearch data, set the following in your values.yaml overrides:
```yaml
elasticsearch:
  volumeClaimTemplate:
    storageClassName: ""
```
This override disables dynamic PV provisioning and uses the existing PVs.
Note
This step can be skipped in favor of simply rebinding the SOML schema, which will trigger reindexing in Elasticsearch.
Upgrade the existing chart deployment:
```bash
helm upgrade --install -n default --set graphdb.deployment.host=<your hostname> --version 3.5.0 platform ontotext/ontotext-platform
```
Note
The upgrade process will take up to several minutes due to redeployment of updated components.