Sanitizing

Overview

The Ontotext Semantic Objects work with many different datasets and ontologies, so they need to support the specific formatting and naming conventions used to describe each data model. The base language for SOML is YAML (see SOML Introduction), which allows almost any character in key names. GraphQL is stricter: it does not allow hyphens (-) or dots (.) in the names of types and properties.
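The GraphQL restriction can be checked mechanically. The sketch below uses the name pattern from the GraphQL specification (a letter or underscore, followed by letters, digits, or underscores); the helper name `is_valid_graphql_name` is hypothetical and is not part of the Semantic Objects API.

```python
import re

# Name pattern as defined by the GraphQL specification.
GRAPHQL_NAME = re.compile(r"[_A-Za-z][_0-9A-Za-z]*")


def is_valid_graphql_name(name: str) -> bool:
    """Return True if the whole string is a legal GraphQL name."""
    return GRAPHQL_NAME.fullmatch(name) is not None


# Hyphens and dots are rejected, underscores are accepted.
print(is_valid_graphql_name("my.prefix-ns"))
print(is_valid_graphql_name("my_prefix_ns"))
```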

To work around this limitation and allow a variety of naming conventions in schemas, the Semantic Objects add a stage to the process that uploads the SOML to the schema store. This stage sanitizes the disallowed punctuation characters in names by replacing them with an allowed, recommended one: each hyphen or dot becomes an underscore (_).
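As a rough illustration of the replacement rule, the following Python sketch maps hyphens and dots to underscores and leaves every other character, including the colon between prefix and local name, untouched. The function name `sanitize_name` is an assumption made for this example; it is not the platform's actual implementation.

```python
# Translation table: '-' and '.' both become '_'.
_SANITIZE_TABLE = str.maketrans("-.", "__")


def sanitize_name(name: str) -> str:
    """Replace hyphens and dots in a SOML name with underscores."""
    return name.translate(_SANITIZE_TABLE)


# The prefix/local-name separator ':' is preserved.
print(sanitize_name("my.prefix-ns:prop-weight"))
```

Note that only names are rewritten. Prefix IRIs, labels, and descriptions keep their original punctuation, as the example schema in the next section shows.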

Examples and Results

To see what happens when we upload a SOML schema using hyphens and/or dots in the names of the objects, properties, or prefixes, let’s have a look at the following example illustrating the schema transformation and the way the names should be used afterwards.

Suppose we have built the following SOML schema (or have produced it with a tool like OWL2SOML):

id: "/soml/simple-schema"
created: 2020-05-12
updated: 2020-05-14
creator: http://ontotext.com
versionInfo: 1.0


prefixes:
  ont: "http://ontotext.com/ontology/"
  character: "http://ontotext.com/ontology/character/"
  human: "http://ontotext.com/ontology/human/"
  droid: "http://ontotext.com/ontology/droid/"
  my.prefix-ns: "http://ontotext.com/ontology/sanitizable.prefix-ns/"
  object-name.prefix: "http://ontotext.com/ontology/object-name.prefix/"
  properties.name.prefix: "http://ontotext.com/ontology/properties.name.prefix/"


specialPrefixes:
  vocabUrl: "http://ontotext.com/ontology/"


properties:
  height: {inverseAlias: "alias"}
  my.prefix-ns:name: {rdfProp: "prop"}
  properties.name.prefix:weight: {descr: "weight-property", max: 1}


objects:

  Character:
    type: "ont:Character"
    descr: "A character in a film"
    props:
      height: {rdfProp: "characterProp"}
      name: {inverseAlias: "characterAlias"}
      my.prefix-ns:prop: {rdfProp: "characterProp"}

  object-name.prefix:Droid:
    type: ["ont:Droid"]
    descr: "A droid that will take over the world"
    name: my.prefix-ns:name
    props:
      my.prefix-ns:prop-weight: {label: "My weight-property", inverseOf: "properties.name.prefix:weight", min: 1}
      rdf_property: {rdfProp: "my.prefix-ns:prop"}


rbac:
  roles:
    Default:
      description: Overridden default role that can read anything but not write or delete
      actions: [
        "*/*/read",
        object-name.prefix:Droid/*/*
      ]
      notActions: [
        "*/*/write",
        "*/*/delete",
        object-name.prefix:Droid/my.prefix-ns:prop-weight/*
      ]

After sanitizing, the schema will look like this:

id: "/soml/simple-schema"
created: 2020-05-12
updated: 2020-05-14
creator: http://ontotext.com
versionInfo: 1.0


prefixes:
  ont: "http://ontotext.com/ontology/"
  character: "http://ontotext.com/ontology/character/"
  human: "http://ontotext.com/ontology/human/"
  droid: "http://ontotext.com/ontology/droid/"
  my_prefix_ns: "http://ontotext.com/ontology/sanitizable.prefix-ns/"
  object_name_prefix: "http://ontotext.com/ontology/object-name.prefix/"
  properties_name_prefix: "http://ontotext.com/ontology/properties.name.prefix/"


specialPrefixes:
  vocabUrl: "http://ontotext.com/ontology/"


properties:
  height: {inverseAlias: "alias"}
  my_prefix_ns:name: {rdfProp: "prop"}
  properties_name_prefix:weight: {descr: "weight-property", max: 1}


objects:

  Character:
    type: "ont:Character"
    descr: "A character in a film"
    props:
      height: {rdfProp: "characterProp"}
      name: {inverseAlias: "characterAlias"}
      my_prefix_ns:prop: {rdfProp: "characterProp"}

  object_name_prefix:Droid:
    type: ["ont:Droid"]
    descr: "A droid that will take over the world"
    name: my_prefix_ns:name
    props:
      my_prefix_ns:prop_weight: {label: "My weight-property", inverseOf: "properties_name_prefix:weight", min: 1}
      rdf_property: {rdfProp: "my_prefix_ns:prop"}


rbac:
  roles:
    Default:
      description: Overridden default role that can read anything but not write or delete
      actions: [
        "*/*/read",
        object_name_prefix:Droid/*/*
      ]
      notActions: [
        "*/*/write",
        "*/*/delete",
        object_name_prefix:Droid/my_prefix_ns:prop_weight/*
      ]

As you can see, the names of the prefixes, properties, and objects are transformed into a format that GraphQL accepts. Every place where those names are referenced is updated as well, so the schema stays consistent.

For example, a query for the Droid weight against the schema above should now look like this:

query {
  object_name_prefix_Droid {
    my_prefix_ns_prop_weight
  }
}
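Judging only from the example above, the GraphQL name appears to be the sanitized SOML name with the colon between prefix and local name also replaced by an underscore. The Python sketch below reproduces that mapping; `to_graphql_name` is a hypothetical helper, and the platform may apply further naming rules that this page does not describe.

```python
# Assumption drawn from the example: '-', '.', and ':' all map to '_'.
_GRAPHQL_TABLE = str.maketrans("-.:", "___")


def to_graphql_name(soml_name: str) -> str:
    """Derive the GraphQL name for a (possibly unsanitized) SOML name."""
    return soml_name.translate(_GRAPHQL_TABLE)


print(to_graphql_name("object-name.prefix:Droid"))
print(to_graphql_name("my.prefix-ns:prop-weight"))
```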

Restrictions

Although sanitizing lets you use punctuation characters by replacing each of them with an underscore, it can make two distinct names map to the same result. For this reason, the SOML validation process includes several restrictions. They detect such ambiguous cases, reject the schema, and report the issue, so that you can fix it and retry the upload.

The following combinations map to conflicting GraphQL names and are not allowed in the same schema:

  • prefix namespaces like: pre-fix, pre_fix, pre.fix
  • property names like: pfx:some-prop, pfx:some_prop, pfx:some.prop
  • object names like: pfx:Object.name, pfx:Object-name, pfx:Object_name
  • combinations of the above, e.g.: pre-fix:Object.name, pre.fix:Object-name, pfx:some-prop, pfx_some:prop, pfx:some_prop, pfx:some.prop, some-more:prop, some:more-prop, etc.
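To see why the names listed above collide, one can group names by the GraphQL name they would map to and flag every group with more than one member. The Python sketch below does this under the assumption that hyphens, dots, and the prefix colon all become underscores; `to_graphql_name` and `find_conflicts` are hypothetical helpers, and the platform's own validator may work differently.

```python
from collections import defaultdict

# Assumed mapping: '-', '.', and ':' all become '_'.
_GRAPHQL_TABLE = str.maketrans("-.:", "___")


def to_graphql_name(soml_name: str) -> str:
    return soml_name.translate(_GRAPHQL_TABLE)


def find_conflicts(names):
    """Map each clashing GraphQL name to the distinct SOML names behind it."""
    groups = defaultdict(list)
    for name in names:
        key = to_graphql_name(name)
        if name not in groups[key]:  # ignore exact duplicates
            groups[key].append(name)
    return {key: group for key, group in groups.items() if len(group) > 1}


names = [
    "pfx:some-prop", "pfx_some:prop", "pfx:some.prop",
    "some-more:prop", "some:more-prop",
    "ont:Character",
]
conflicts = find_conflicts(names)
for graphql_name, soml_names in conflicts.items():
    print(graphql_name, "<-", soml_names)
```

A schema would pass this check only if `find_conflicts` returns an empty dictionary.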