Helm Charts

The Helm chart allows you to deploy the Ontotext Platform in Kubernetes with data provisioning, security, and monitoring. It is recommended to use Kubernetes version 1.19 or higher.

On this page, you will find the most important information about how this Helm chart works and how to use it properly.

The Ontotext Platform Helm chart is split into multiple sub-charts that are combined into an umbrella chart. Some external Helm charts are used as sub-charts as well, such as InfluxDB, Grafana, Elasticsearch, Kibana, GraphDB, and PostgreSQL.

For detailed information on how to configure the external sub-charts, please refer to their values.yaml and README files.

Prerequisites

Helm Chart Download

You can download the Ontotext Platform Helm chart, including all sub-charts managed by Ontotext, from the Helm repository.

If you do not have access to the Ontotext Helm repository, please contact us at sales@ontotext.com to obtain the charts.

To add Ontotext’s Helm repository to your setup, execute:

helm repo add ontotext http://maven.ontotext.com/repository/helm-public/

See more information on how to use the external sub-charts:

Local Deployment

If this is your first time installing a Helm chart, make sure to read the following introductions before continuing:

Binaries

Note

sudo may be required.

Kubernetes Environment

Note

It is recommended to use Kubernetes version 1.19 or above.

Minikube

Follow the install documentation for Minikube.

Driver

Choose a Minikube driver that is suitable for your system.

Warning

Some of the drivers are compatible but have known issues. For example, the docker driver does not support the required ingress add-on, and the none driver enters a DNS resolution loop on some Linux distributions.

Resources

It is important to define resource limitations for the Minikube environment. Otherwise it will default to limits that may not be sufficient to deploy the whole Platform.

The default resource limitations require around 12 gigabytes of RAM. This is configurable per service in the values.yaml file and should be tuned for every deployment.

When starting Minikube, it is preferable to allocate a bit more than the required amount. For example, to create a Minikube environment in VirtualBox with 8 CPUs and 16 gigabytes of memory, use:

minikube start --driver=virtualbox --cpus=8 --memory=16000

Add-ons

Minikube has built-in services as part of its add-on system. By default, some of the required add-ons are disabled and have to be enabled.

To expose services, enable Minikube’s ingress with:

minikube addons enable ingress

To collect metrics, enable Minikube’s metrics server with:

minikube addons enable metrics-server

DNS Resolving

The Platform is deployed with a Kubernetes ingress service that is configured to listen for requests on a specific hostname. Any other requests are not handled.

This hostname is specified in values.yaml under global.ingressHost. By default, it is configured for localhost, which is suitable for the none Minikube driver. In every other case, you need to reconfigure it to a hostname that is DNS-resolvable.

Some of the options are:

  • Configure or update an existing DNS server - recommended for production deployment
  • Update your hosts file - suitable for local development

To find out the IP address of the Minikube environment, use:

minikube ip

If you wish to access the Platform locally on http://platform.local/ and the IP address of the Minikube environment is 192.168.99.102, you should modify your hosts file with:

192.168.99.102  platform.local

See this how-to on modifying the hosts file on different operating systems.
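
As a minimal sketch, the hosts entry above can also be generated from the shell. The helper name hosts_entry is hypothetical, the IP and hostname are the example values from this section, and the /etc/hosts path in the comment assumes Linux or macOS.

```shell
#!/bin/sh
# Hypothetical helper: print a hosts-file line for the given IP and hostname.
hosts_entry() {
  printf '%s  %s\n' "$1" "$2"
}

# Example values from this section; on a real setup, take the IP from `minikube ip`.
hosts_entry 192.168.99.102 platform.local

# To append the line to the hosts file (assumes Linux/macOS, needs root):
#   hosts_entry "$(minikube ip)" platform.local | sudo tee -a /etc/hosts
```

The append step is left as a comment because it modifies a system file; review the generated line before running it.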

Deploying the Helm Chart

Once you have set up your Kubernetes cluster and the necessary binaries (kubectl and Helm), you can start with the Helm chart deployment.

Secrets

First, you need to create some secrets with the Ontotext Platform and GraphDB licenses that will be used by the Helm chart later.

  1. Create a secret for the Platform license.

    The Platform requires a license in order to operate, which will be provided to you by our sales team. After obtaining it, create a secret from the platform.license file:

    kubectl create secret generic platform-license --from-file platform.license
    
  2. Create a secret for the GraphDB license.

    The Platform uses GraphDB to query and store Semantic Objects. It is a mandatory component that requires a license. After obtaining one from our sales team, create a secret from the graphdb.license file:

    kubectl create secret generic graphdb-license --from-file graphdb.license
    
  3. (Optional) Create a secret for pulling Docker images.

    All Platform-related Docker images are published in Docker Hub by default. This step is required only if you will be pulling images from a secured Docker registry.

    kubectl create secret docker-registry $DOCKER_REGISTRY \
            --docker-server=$DOCKER_REGISTRY_SERVER \
            --docker-username=$DOCKER_USER \
            --docker-password=$DOCKER_PASSWORD \
            --docker-email=$DOCKER_EMAIL
    

    Replace the variables with the given credentials or export them in your environment before running the command. Then define or update the global.imagePullSecrets map in the values.yaml file.

Note

Secret names can differ from the given samples but their configurations should be updated to refer to the correct ones. See values.yaml.

Quick Start

The Helm chart includes an example SOML schema and example security configurations. You can install the Platform with them.

However, they are only samples and should not be used in a production deployment. See Customizing for how to override them.

To install the Platform on http://platform.local, run:

helm install --set global.ingressHost=platform.local ontotext-platform ontotext/ontotext-platform

After a minute or two, Helm will print out the result from installing the Platform and the URLs that can be accessed.

You should see the following output:

--------------------------------------------------------------------------------------------
  ___        _        _            _         ____  _       _    __
 / _ \ _ __ | |_ ___ | |_ _____  _| |_      |  _ \| | __ _| |_ / _| ___  _ __ _ __ ___
| | | | '_ \| __/ _ \| __/ _ \ \/ / __|     | |_) | |/ _` | __| |_ / _ \| '__| '_ ` _ \
| |_| | | | | || (_) | ||  __/>  <| |_      |  __/| | (_| | |_|  _| (_) | |  | | | | | |
 \___/|_| |_|\__\___/ \__\___/_/\_\\__|     |_|   |_|\__,_|\__|_|  \___/|_|  |_| |_| |_|

--------------------------------------------------------------------------------------------
version: 3.7.0 | security: true | monitoring: true | federation: false | search: true
GDB cluster: false | GDB security: true

** Please be patient while the chart is being deployed and services are available **
You can check their status with kubectl get pods

Web applications:
* OTP workbench UI: http://platform.local/workbench
* GraphDB workbench UI: http://platform.local/graphdb
* FusionAuth UI: http://platform.local/admin
* Grafana UI: http://platform.local/grafana
* Kibana UI: http://platform.local/kibana

GraphQL endpoints:
* Semantic objects service GraphQL endpoint: http://platform.local/soaas/graphql
* Semantic search service GraphQL endpoint: http://platform.local/semantic-search/graphql

Provisioning

The Kubernetes configurations are agnostic to data provisioning.

Initial provisioning of the SOML schema and RBAC schema is done with additional configmaps. See templates/platform/platform-soml-schemas-configmap.yaml.

Additionally, GraphDB repositories and settings are provisioned using configmaps containing the needed configuration files. See the configmaps in the templates/graphdb folder.

If the Platform is deployed with security, there is additional security provisioning for FusionAuth done by using the FusionAuth kickstart. You can find the default provisioning configmaps in templates/security and the provisioning files in files/fusionauth.

If the monitoring services are started, there is additional provisioning for Grafana, providing some default dashboards and the necessary InfluxDB datasource. Some default Telegraf configurations are made as well – for more information, see templates/monitoring and files/grafana.

This achieves separation of data and infrastructure provisioning.

Once run, the data provisioning will not execute again. The only exception is Grafana’s provisioning that detects certain annotations and can provision additional datasources and dashboards at any time (check their Helm chart docs for more info).

Note

The first install of the Platform will be slower because the provisioning has to finish before Helm completes the install phase.

Persistence

By default, the Helm chart creates persistent volumes dynamically using volumeClaimTemplate. This means that on the first launch, the deployment will create PVCs and PVs that will not be destroyed if the Helm chart is uninstalled. On the next install, the same PVCs/PVs will be reused. If you want to make a clean installation, the old ones must be deleted manually.

Usage of Existing PVs

By default, deploying the Helm chart will create PVCs and PVs dynamically. However, you can use your existing PVs for the deployed components. There is no easy way to know in advance what the PV names and their claimRef values should be. The simplest way to find out, so that you can rename your existing PVs and add the necessary claimRef, is to:

  1. Deploy the Helm chart once.
  2. Note the names of the created PVCs.
  3. Uninstall the chart.
  4. Delete the created PVCs and PVs.
  5. Rename your existing PVs according to the gathered information and set the correct claimRef on each.
  6. Install the chart.

After you install the chart with these changes, the dynamically created PVCs should match your PVs and claim them. Also, make sure that your PVs are in a state that allows binding (normally Available).
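
To illustrate step 5, here is a minimal sketch of a PV manifest that reserves a volume for one specific PVC through claimRef. Every name, the capacity, and the hostPath backend are placeholders rather than values used by the chart; take the real PVC names and namespace from step 2.

```shell
#!/bin/sh
# All names below are placeholders; replace them with the PVC names and
# namespace observed after the first deployment (step 2 above).
PV_NAME="my-existing-pv"          # hypothetical PV name
PVC_NAME="data-example-0"         # hypothetical PVC name noted in step 2
PVC_NAMESPACE="default"           # namespace the chart is installed in

cat > existing-pv.yaml <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${PV_NAME}
spec:
  capacity:
    storage: 10Gi                 # must satisfy the request of the PVC
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  claimRef:                       # reserves this PV for one specific PVC
    namespace: ${PVC_NAMESPACE}
    name: ${PVC_NAME}
  hostPath:                       # placeholder backend; use your real volume source
    path: /mnt/data/example
EOF

# Apply the manifest before installing the chart, only when explicitly requested.
if command -v kubectl >/dev/null 2>&1 && [ "${APPLY:-no}" = "yes" ]; then
  kubectl apply -f existing-pv.yaml
fi
```

The storage class, access modes, and capacity of the PV also have to be compatible with what the PVC requests; otherwise the claim will not bind to it.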

For more info, see Reserving a persistent volume.

Cloud Deployment

For cloud deployment, you have to prepare persistent disks, a storage class (or classes), and finally persistent volume manifests. Once this is done, every component must be reconfigured in values.yaml to point to the new persistent volume and not the default one. Each component has a persistence section that has to be updated.

Note

Avoid deploying PostgreSQL using Azure File for storage, as PostgreSQL needs to be able to create hard links that are not supported by Azure File.

Additionally, PostgreSQL requires specific permissions for its data directory. Since Azure File does not support changing the permissions after it is mounted, special care should be taken when configuring the Azure File storage. See here for more details.

API Gateway

The Platform services are proxied using the Kong API Gateway. By default, it is configured to route:

  • Semantic Objects Service
  • GraphDB Workbench: configurable with platform.graphdb.expose from values.yaml
  • FusionAuth (if security is enabled)
  • Grafana (if monitoring is enabled): configurable with monitoring.grafana.expose from values.yaml
  • Ontotext Platform’s Search Service (if deployed)
  • Kibana (used to manage Elasticsearch): configurable with kibana.expose

See the Kong default declarative configuration files/kong.dbless.yaml to understand what is being proxied and how.

To learn more about the declarative syntax, see the Kong documentation.

Customizing

Each component is configured with sensible defaults, some of which are applied from the values.yaml file. Make sure to read it thoroughly and to understand each property and the impact of changing any one of them.

Some default values of the sub-charts are overridden, but the sub-charts' own values.yaml files define further defaults, so make sure to read them too before customizing any of the underlying services.

Most of the components allow the overriding of their configuration maps and secrets from values.yaml. The default configuration overrides are separated by service in the templates folder. Each service has overridden configmaps and/or secrets. Some of those overrides use additional configuration files located in the files directory.

Note

If you are familiar with Kubernetes, you could modify the components' configuration templates directly.

values.yaml

Helm allows you to override values from values.yaml in several ways:

  • Preparing another values.yaml:

    helm install -f overrides.yaml ontotext-platform ontotext/ontotext-platform
    
  • Overriding specific values:

    helm install --set monitoring.enabled=false --set security.enabled=false ontotext-platform ontotext/ontotext-platform
    

For more information, see the Helm values files documentation.
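
As a minimal sketch, an overrides file could look like the one written below. It uses only keys that appear in this document (global.ingressHost and monitoring.enabled); the file name overrides.yaml matches the command above.

```shell
#!/bin/sh
# Write an overrides file using only keys mentioned in this document.
cat > overrides.yaml <<'EOF'
global:
  ingressHost: platform.local   # must be a resolvable hostname
monitoring:
  enabled: false                # skip the monitoring components
EOF

# Print the install command; run it yourself once the file looks right.
echo "helm install -f overrides.yaml ontotext-platform ontotext/ontotext-platform"
```

Both approaches can be combined in one command. When the same key is given in a file and with --set, the --set value takes precedence.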

Deployment

Some of the important properties to update according to your deployment are ingress.tls.*, graphdb.graphdb.protocol, and global.ingressHost. They configure the Ingress controller and some of the components with the address on which they are accessible. The global.ingressHost value must be a resolvable hostname, not an IP address.

Resources

Each component is defined with default resource limits that are sufficient to deploy the Platform and use it with small sets of data. However, for production deployments you must revise these resource limits and tune them for your environment. Consider factors such as the amount of data, the number of users, and the expected traffic.

Look for <component>.resources blocks in values.yaml. During Helm’s template rendering, these YAML blocks are inserted in the Kubernetes pod configurations as pod resource limits. Most resource configuration blocks refer to the official documentation.

See the Kubernetes documentation on defining resource limits.
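
The sketch below shows the general shape of such a block, following the standard Kubernetes requests/limits layout. The key example-component is a placeholder and not a real key of this chart, the real nesting in values.yaml may be deeper, and the CPU and memory figures are illustrative only.

```shell
#!/bin/sh
# "example-component" is a placeholder; use a real <component> key from values.yaml.
# The CPU and memory figures are illustrative, not recommendations.
cat > resources-overrides.yaml <<'EOF'
example-component:
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: "1"
      memory: 2Gi
EOF

# Print the install command; run it yourself after replacing the placeholder key.
echo "helm install -f resources-overrides.yaml ontotext-platform ontotext/ontotext-platform"
```

Keeping resource overrides in a separate file makes it easier to maintain different sizing per environment.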

Assigning Pods to Nodes

Each component in the Helm chart supports specifying nodeSelector, affinity, and tolerations in values.yaml. This allows you to schedule pods across a multi-node cluster with different roles and resources.

Additionally, most components also support topologySpreadConstraints, which can be used to spread pods across failure domains in the cluster.

By default, no node restrictions are applied.

For more details, see the Kubernetes Assigning Pods to Nodes, Taints and Tolerations, and Pod Topology Spread Constraints documentation pages.

GraphDB Configuration

The Helm chart uses GraphDB’s Helm chart as a sub-chart and allows the usage of all options provided by the sub-chart. This includes:

  • Different cluster topologies
  • Security and properties provisioning
  • Automated backups
  • Repository consistency scans
  • Extended master/worker nodes configurations

GraphDB Repository

By default, the provisioning creates a default repository in GraphDB. This repo is provided by graphdb-master-repo-configmap or graphdb-worker-repo-configmap (depending on whether or not a cluster is used) that reads it from the examples/graphdb/master.default.ttl or examples/graphdb/worker.default.ttl file.

To change the default TTL file, you can prepare another configuration map containing a config.ttl file entry:

kubectl create configmap graphdb-worker-repo-configmap --from-file=config.ttl=worker-config.ttl
kubectl create configmap graphdb-master-repo-configmap --from-file=config.ttl=master-config.ttl

After that, update the properties graphdb.graphdb.masters.repositoryConfigmap and graphdb.graphdb.workers.repositoryConfigmap in values.yaml to refer to the new configuration maps.

GraphDB Cluster Mode

The GraphDB Helm chart allows you to deploy GraphDB in cluster mode. By default, this is disabled and only a single master node is deployed.

To deploy in a cluster, set graphdb.graphdb.topology in the values.yaml file to one of 1m_3w, 2m3w_rw_ro, or 2m3w_muted, and configure graphdb.graphdb.clusterConfig to match the desired topology.

See GraphDB cluster topologies and the GraphDB documentation on Setting up cluster with a single master.

Search Service

The Platform Helm chart allows you to deploy the Ontotext Platform Search Service as a sub-chart. The Search Service needs Elasticsearch and Kibana Helm charts in order to work properly. Those are added in Chart.yaml as dependencies.

To enable the deployment of the Search Service, search.enabled must be set to true in the values.yaml file. A few sub-sections in the values.yaml for platform-search, elasticsearch, and kibana must be filled in as well.

When the Search Service is enabled and the Helm chart is deployed, the SOML schema will be provisioned to the Search Service as well as the Semantic Objects Service.

By default, the Search Service is enabled (see the values.yaml for details).

Important

Make sure to configure the necessary persistent volumes for the Search Service and Elasticsearch.

SOML and RBAC Schema

By default, the provisioning uses an example SOML schema and an RBAC schema. Both are loaded from templates/platform/platform-soml-schemas-configmap.yaml, which reads the schemas from the files/soml/schema.yaml and files/soml/rbac-schema.yaml files.

You can change the schemas by preparing another configuration map with the desired schema(s) inside. For example, if you want to override the SOML schema, the config map should contain a schema.yaml file entry:

kubectl create configmap soaas-soml-configmap --from-file=schema.yaml

After that, edit the semantic-objects.soml.schema.configmap property from values.yaml to point to the new configuration map.

The same applies to the RBAC schema. The property for it is semantic-objects.soml.rbac.configmap.
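
Putting the SOML schema override together, a minimal sketch could look as follows. The YAML nesting simply mirrors the dotted property path semantic-objects.soml.schema.configmap; the file name soml-overrides.yaml and the APPLY switch are assumptions made for this example.

```shell
#!/bin/sh
# Assumes schema.yaml (your SOML schema) exists in the current directory.
# Create the configmap only when kubectl is available and APPLY=yes is set.
if command -v kubectl >/dev/null 2>&1 && [ "${APPLY:-no}" = "yes" ]; then
  kubectl create configmap soaas-soml-configmap --from-file=schema.yaml
fi

# Point the chart at the new configmap; the key path mirrors the property
# semantic-objects.soml.schema.configmap named above.
cat > soml-overrides.yaml <<'EOF'
semantic-objects:
  soml:
    schema:
      configmap: soaas-soml-configmap
EOF

# Print the install command; run it yourself once the configmap exists.
echo "helm install -f soml-overrides.yaml ontotext-platform ontotext/ontotext-platform"
```

Because data provisioning runs only once, the override should be in place before the first install of the Platform.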

Security

Security is enabled by default and can be turned on and off with the security.enabled boolean property. However, due to some sub-chart configuration limitations, the deployment of FusionAuth and its database (PostgreSQL) must be turned off separately with the fusionauth.enabled and fusionauth_postgresql.enabled properties.
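
As a sketch of what turning security off completely involves, the three properties named above can be collected in one overrides file. The file name security-off.yaml is an assumption made for this example.

```shell
#!/bin/sh
# Disable the security stack together with FusionAuth and its PostgreSQL database,
# using the three properties named in this section.
cat > security-off.yaml <<'EOF'
security:
  enabled: false
fusionauth:
  enabled: false
fusionauth_postgresql:
  enabled: false
EOF

# Print the install command; run it yourself once the file looks right.
echo "helm install -f security-off.yaml ontotext-platform ontotext/ontotext-platform"
```

Note that GraphDB security is controlled by a separate property, graphdb.graphdb.security.enabled, which stays at its default unless it is overridden as well.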

By default, the provisioning uses FusionAuth’s Kickstart with configuration files from files/fusionauth/*. They contain example users, JWT settings, FusionAuth tenant, and application settings.

To change the default users/roles or other provisioning settings, see the properties under security.*, such as security.roles and security.users.

PostgreSQL (used by FusionAuth) is deployed in HA mode with Pgpool. By default, only one replica is deployed, but this can be changed in the values.yaml file.

GraphDB Security

GraphDB can be secured separately without enabling the whole security stack. If security.enabled is false and graphdb.graphdb.security.enabled is true (the default), then the security provisioning will secure GraphDB automatically. By default, only basic security will be enabled. This is done through the settings.js, where you can set users and roles. More security-related options can be configured by providing a graphdb.properties file.

For more information, see the GraphDB Helm chart and the GraphDB Security documentation.

Apollo Federation Gateway

The Platform Helm chart allows you to include another /graphql endpoint and federate it under the Apollo federation gateway.

By default, the Apollo gateway is disabled and no federation is applied behind the /graphql endpoint. To change that, set apollo.enabled to true and supply another configuration map that configures Apollo to federate all required /graphql endpoints. Then update the apollo.services property to include the new configuration map.

When Apollo is enabled, the API gateway configures a /federation route through the Apollo gateway.

The Helm chart provides some default configurations for the Apollo gateway located in templates/platform/platform-apollo-*.

Note

See the GraphQL Federation documentation for Apollo configurations. After deploying with Apollo enabled, you must create the federation user in FusionAuth and assign the proper roles.

Monitoring

Monitoring is enabled by default. This can be changed by setting monitoring.enabled to false. When it is turned off, Helm does not deploy any of the monitoring configurations and components.

The Ontotext Platform Helm chart deploys Grafana and InfluxDB using their official Helm charts as sub-charts.

Telegraf is deployed with a custom sub-chart managed by Ontotext.

Grafana Dashboards

By default, the Ontotext Platform Helm chart deploys Grafana with a few preconfigured dashboards that are loaded through the sidecar. See Sidecar for dashboards.

You can see the default dashboards JSON files in files/grafana/dashboards and the configmap used to provision them in templates/monitoring/platform-grafana-dashboards.yaml.

Grafana Datasources

By default, Grafana is preconfigured with a datasource matching the deployed InfluxDB. However, if necessary, the datasource(s) can be changed to an external InfluxDB instance or another database. This is done again by the Sidecar method of provisioning.

You can see the default datasource configuration file in files/grafana/datasources and the configmap used to provision it in templates/monitoring/platform-grafana-datasources.yaml.

Grafana Notifications

By default, there are no configured notifiers, but you can use the sidecar provisioning if necessary. See Sidecar for notifiers.

Uninstall

To remove the deployed Platform, use:

helm uninstall ontotext-platform

Note

Keep in mind that this will not remove any data, so the next time the Platform is installed, the data will be loaded by its components.

Provisioning will be skipped.

Troubleshooting

Helm Install Hangs

If there is no output after helm install, it is likely that a hook cannot execute. Check the logs with kubectl logs.

Connection Issues

If connections time out or the pods cannot resolve each other, it is likely that the Kubernetes DNS is broken. This is a common issue with Minikube between system restarts or when an incompatible Minikube driver is used. See the Kubernetes documentation on debugging DNS resolution.