Properties

Properties are the ‘fields’ of semantic objects. They come in two varieties (see Domain and Range):

  • Relations to other objects (object properties)

  • Attributes that hold scalars (datatype properties)

Props are defined in two ways:

  • You can first define a section properties: with global property definitions.

  • Each object has a section props: listing all its properties (props inherited from a super-class are added to this list).

  • Prop characteristics may be added or overwritten (redefined); see Characteristic Inheritance for how to do it.

  • If you do not need to redefine characteristics, you can list the prop without any characteristics. In the example below, both prop1 and prop2 are reused without change:

properties:
  prop1: {range: integer}
  prop2: {range: Obj1}
objects:
  Obj1:
    props:
      prop1:    # Doesn't list any characteristics
      prop2: {} # Same but more verbose

  • You can also define new props directly in objects, i.e., the section properties: is optional and should be used when a property is used several times.

properties:
  prop1: {range: integer}
  prop2: {range: Obj1}
objects:
  Obj1:
    props:
      prop1:    # Doesn't list any characteristics
      prop2: {} # Same but more verbose
      prop3: {range: string}

Property Characteristics

Properties have the following characteristics:

| Characteristic | Default | Description/notes |
|---|---|---|
| name | (mandatory) | lowerCamelCase symbol. The YAML key |
| label | name | Human-readable name |
| descr | (optional) | Description or clarification |
| range | string | Datatype or SOML object type |
| rangeCheck | false | Must check the type discriminator of the target object (boolean) |
| typeCast | false | Cast data prop to its target datatype (boolean) |
| min | 0 | Minimum number of values, integer (mutations) |
| max | 1 | Maximum number of values, integer. inf means unlimited (mutations) |
| nonNullable | false | Controls the nullability value for the query return types |
| nonNullableElements | false | Controls the nullability value for the collection elements |
| inverseAlias | (none) | Virtual inverse of a property |
| inverseOf | (none) | Materialized inverse properties |
| rdfProp | (none) | RDF property name (if not allowed in GraphQL or hard to read) |
| symmetric | false | Self-inverse (boolean) |
| regex | (none) | Unanchored regex pattern (can use ^...$ to anchor it) |
| prefix | (none) | String (not a regex) that is anchored to the start |
| pattern | (none) | An array of a pattern and pattern flags, in that order |
| lang | (none) | String or YAML dictionary containing the language configurations for langString and stringOrLangString properties |
| maxInclusive | (none) | Defines the max inclusive value of a property with a numerical range (date or number) |
| minInclusive | (none) | Defines the min inclusive value of a property with a numerical range (date or number) |
| maxExclusive | (none) | Defines the max exclusive value of a property with a numerical range (date or number) |
| minExclusive | (none) | Defines the min exclusive value of a property with a numerical range (date or number) |
| maxLength | (none) | Defines the max length (inclusive) of a String or langString property |
| minLength | (none) | Defines the min length (inclusive) of a String or langString property |
| valuesIn | (none) | An array of permitted values for a property. The values must be comma separated and conform to the property’s range |
| valuesListExclusive | true | Defines if mutations are blocked from providing values that are NOT part of the values defined within valuesIn. When valuesListExclusive is false, at least one of the values defined within valuesIn must be provided, but values outside of the valuesIn list will be accepted as long as there is one value that is within the valuesIn list. |
| search | (none) | Defines the search configuration for the property. It contains information about whether the property is indexable or not, what language analyzers should be used, etc. The configuration is partially inheritable for abstract shapes or global schema configuration. More information about default and allowed values can be found here. |
| scaleFactor | 1.0 | Required when the property is declared as searchable and is of type Decimal. Accepts double value and is used to generate proper search mapping for Elastic. |
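
As a rough illustration of how several of the value constraints above combine, here is a minimal sketch. The property names (status, genre, rating, nickname) and their values are hypothetical and chosen only for this example; the characteristics themselves are the ones listed in the table.

```yaml
properties:
  # Only the listed values are accepted (valuesListExclusive defaults to true)
  status:   {label: "Status", valuesIn: [draft, published, archived]}
  # At least one listed value is required, but other values are accepted as well
  genre:    {label: "Genre", max: inf, valuesIn: [scifi, fantasy], valuesListExclusive: false}
  # Numeric bounds: 1 <= rating <= 5
  rating:   {label: "Rating", range: integer, minInclusive: 1, maxInclusive: 5}
  # String length bounds (both inclusive)
  nickname: {label: "Nickname", minLength: 3, maxLength: 20}
```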

Furthermore, we are considering adding the following characteristics. Please send us feedback on whether they would be useful to you.

| Characteristic | Default | Description/notes |
|---|---|---|
| subPropertyOf | (none) | Super-property (involves RDFS inference) |
| kind | computed from range | One of object, literal, mixed; see below |
| transitive | false | Transitively closed (chain p,p infers p) |
| transitiveOver | (none) | Transitive over another prop q (chain p,q infers p) |
| inverseFunctional | false | A node can have only one incoming relation of this kind (boolean) |
| required | false | An external required property used for a calculation @requires |
| external | false | An external property owned by another federated service @external |

And at a later stage also:

  • display characteristics for using on UIs:

    • group (string): mapped to SHACL sh:group

    • order (integer): mapped to SHACL sh:order

Characteristic Inheritance

Properties can be inherited using two mechanisms:

  • An object’s props: list may refer to props predefined in the global list properties:. In that case, the inherited characteristics can be overwritten freely.

  • When an object Obj2 inherits Obj1 (see Inheritance), it inherits all its props with their characteristics. Because Obj2 must fulfill the promises (interface) of Obj1, only the following changes are possible:

    • Tighten the cardinality interval min..max (see Cardinality for more details).

    • Change inverseAlias. This allows a specialized property to be used in a subclass.

    • Change label and descr to describe the property more accurately.

    • All other changes are forbidden. In particular, you must keep the same range.

      Note

      Please be aware that we would like to allow changing the range to a covariant type (a sub-type), but this is currently not possible due to limitations of GraphQL. Covariance is supported for Object Type Validation (“field must return a type which is equal to or a sub-type of (covariant) the return type of implementedField field’s return type”), but it is not supported for Input Objects: “That named argument on field must accept the same type (invariant) as that named argument on implementedField”. This means that if you used a covariant subclass, you would not be able to search for its fields (see Where Filtering).

      We have posted issue #629 against the GraphQL specification project.

If the same prop name is used in objects not related by inheritance, its characteristics can vary independently and without restriction. In other words, every prop is defined locally to its object class and can vary between classes.
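
The second mechanism (inheritance between objects) can be sketched as follows. The objects Vehicle and Car and the prop wheel are hypothetical, and the sketch assumes the super-class is declared with kind: abstract, as in the language configuration example later on this page.

```yaml
objects:
  Vehicle:
    kind: abstract
    props:
      wheel: {label: "Wheel", range: string, min: 0, max: inf}
  Car:
    inherits: Vehicle
    props:
      # Allowed changes only: a tighter cardinality interval (min raised,
      # max lowered but still multi-valued) and a more specific label.
      # The range is not redeclared, so it stays string.
      wheel: {label: "Car wheel", min: 2, max: 8}
```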

The following sections describe property characteristics in detail.

Name and IRI

Two characteristics control the prop name and IRI: “name” (the YAML key) and rdfProp.

Names are mapped to IRI as follows:

  • If rdfProp is provided, it is used to form the IRI, otherwise “name” is used.

    • A value like prop is mapped to the default vocab namespace.

    • A value like pfx:prop is mapped using prefix pfx:.

  • “name” is always used as the field name in GraphQL.

    • A value like pfx:name is represented as field pfx_name in GraphQL.
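
A minimal sketch of these mapping rules, assuming the prefixes dct: and skos: are declared elsewhere in the schema. The property names are illustrative only.

```yaml
properties:
  # No prefix, no rdfProp: the IRI is in the default vocab namespace;
  # the GraphQL field is "title"
  title:       {label: "Title"}
  # Prefixed name: the IRI is resolved using prefix dct:;
  # the GraphQL field is "dct_created"
  dct:created: {label: "Created", range: date}
  # rdfProp given: the IRI is formed from skos:prefLabel;
  # the GraphQL field is still the name, "prefLabel"
  prefLabel:   {label: "Preferred label", rdfProp: skos:prefLabel}
```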

Domain and Range

The source and target of properties are determined by:

  • domain: the domain (originating class) of a property is the object class where it appears.

  • range: determines what values a prop can hold: object class, super-class (GraphQL interface), or a scalar. Default is string (a simple scalar).

For increased flexibility, we do not restrict properties to a single domain and range: they may vary for the same prop name.
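
For instance, the same prop name can be an object property in one class and a datatype property in another, unrelated class. The sketch below uses hypothetical objects (City, Company, Event) to show this.

```yaml
objects:
  City:
    props:
      population: {range: integer}
  Company:
    props:
      # Object property: domain Company, range City
      location: {range: City}
  Event:
    props:
      # Same prop name in an unrelated class: a datatype property
      # with the default range (string)
      location: {}
```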

rangeCheck and typeCast

The following characteristics determine how values are treated concerning the declared range while querying, and can be used to “fix” some data quality problems:

  • rangeCheck (boolean): whether to check the type discriminator of the target object, see Object Typing. Applies only to object properties, and the default is false, except when the property has inverseAlias characteristic. For more information see Inverses and rangeCheck. Use true when the domain class uses the same prop to point to a variety of RDF types, and you want to select only one of them.

For example, DBpedia’s dbo:parent property has many cases where the target is not dbo:Person, which you can find using this query.

select * {
  ?x dbo:parent ?y
  filter not exists {?y rdf:type dbo:Person}
}

One example case is Herman I, Margrave of Meissen who has the house of Billung listed as one of his parents. The reason is that the corresponding Wikipedia page has this infobox:

{{ infobox nobility
| father = [[Eckard I, Margrave of Meissen]]
| mother = Suanhild of [[Billung]]

The father correctly links to a person, but the mother links only to a house, not a person. You can eliminate such values while querying by using the following schema:

objects:
  dbo:Person:
    props:
      dbo:parent: {range: dbo:Person, rangeCheck: true}

  • typeCast (boolean): Must cast a data prop to its target datatype. Use “true” when the RDF database has wrong/missing datatypes.

For example, the Geonames dump includes population, latitude, and longitude fields whose values are mere strings (lack a datatype) - see the RDF for Bulgaria as an example:

gn:population  "7000039" ;
wgs84:lat      "42.66667" ;
wgs84:long     "25.25" .

You can “fix” this with the following schema:

properties:
  gn:population: {label: "Population", range: integer, typeCast: true}
  wgs84:lat:     {label: "Latitude",   range: decimal, typeCast: true}
  wgs84:long:    {label: "Longitude",  range: decimal, typeCast: true}

Cardinality

The (outgoing) cardinality of a property in the context of mutations input is controlled by these characteristics:

  • min (integer), default 0. min: 1 or greater means the field is required (mandatory).

  • max (integer), default 1. max: inf or any value greater than 1 means the field is multi-valued.

SOML adopts the GraphQL default [0..1], thus properties are optional and single-valued (functional) by default (see owl:FunctionalProperty in OWL2 Primer: Property Characteristics).
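
The common combinations of min and max can be summarized in a short sketch. The property names are hypothetical; the intervals in the comments follow directly from the defaults described above.

```yaml
properties:
  nickname: {}                # 0..1:   optional, single-valued (the default)
  email:    {min: 1}          # 1..1:   required, single-valued
  tag:      {max: inf}        # 0..inf: optional, multi-valued
  author:   {min: 1, max: 3}  # 1..3:   required, multi-valued with an upper bound
```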

We validate the following constraints:

  • 0<=min<=max

  • max>=1

  • When inheriting a property from a super-class:

    • You can tighten the cardinality interval min..max (keep or increase min and keep or decrease max).

    • You cannot change a multi-valued property (max>1) to single-valued (max=1) due to GraphQL restrictions (section Objects states this compatibility condition: “If it is a List type and the interface field type is also a List type” where “it” refers to the subclass, “interface” refers to the superclass, and “List type” means a multi-valued property.)

In the generated GraphQL types, min and max are represented by a constraints directive added to the field. The directive is used to validate the input data for the property when executing mutations. For example, the following definition:

Droid:
  props:
    primaryFunction: {label: "primary function", min: 1, max: 1}

will produce GraphQL types like:

type Droid {
  "primary function"
  primaryFunction: String @constraints(minCount : 1, maxCount : 1)
}

input Droid_Create_Input {
  primaryFunction: String!
}

where type Droid is the output type for all Droid objects, and Droid_Create_Input is the type used in the mutations when creating Droid objects.

Please note that multi-valued properties generate significantly more expensive SPARQL queries (see Queries). Each such property causes:

  • A new iterative sub-query, if accessed with limit, offset, or order.

  • A new UNION clause, if accessed “unadorned”.

Therefore we recommend using single-valued properties whenever possible. However, if you mis-declare a property as single-valued while the data includes multiple values, this will cause the following problems:

  • The generated SPARQL will involve a Cartesian Product i.e., a combinatorial explosion of all value combinations of the mis-declared fields.

  • A limit on the parent object may return the wrong (fewer) number of objects.

Nullability

The characteristic that controls the nullability of a property is called nonNullable. It is used in the generation of the GraphQL schema, more specifically that of the output types (query responses). The main purpose is to provide information on how a specific property should be handled in situations where it is requested but there is no value for it - see GraphQL Errors and Non-Nullability.

In order to ensure compliance with the specification for fields (properties), the default value for nonNullable is false. This means that null is acceptable when the property is requested but its value is missing.

Following the GraphQL nullability specification, if a property is marked as non-nullable (nonNullable: true) and there is no value for it, it will result in an error when requested. The error will be propagated further into the selection chain until a nullable selection is found. All errors will be applied to the result set.

Note

Keep in mind the difference between the nonNullable, min, and max characteristics. nonNullable affects the result when querying properties, while min and max affect the property when it is supplied through a mutation.

Multi-valued (max > 1) properties can have an additional characteristic nonNullableElements that controls whether the elements of the returned collection can be null themselves. This can be combined with the nonNullable characteristic to produce the following GraphQL schema and behavior:

| nonNullable | nonNullableElements | GraphQL schema | Description/notes |
|---|---|---|---|
| false | false | [Character] | The entire field can be null, but if it does return a value, it will be an array. However, any member of the array may also be null. |
| true | false | [Character]! | The field cannot return null, but any individual item in the returned list can be null. |
| false | true | [Character!] | The entire field can be null, but if it does return a value, it needs to be an array and no item in that array can be null. |
| true | true | [Character!]! | The field cannot return null, must resolve to an array, and none of the individual items inside that array can be null. |
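
In SOML, two of the combinations from the table above could be declared as in the sketch below. The props friends and enemies are hypothetical; the comments state the GraphQL type that the table gives for each combination.

```yaml
objects:
  Character:
    props:
      # [Character!]!: neither the list nor any of its elements can be null
      friends: {range: Character, max: inf, nonNullable: true, nonNullableElements: true}
      # [Character]: the list or any of its elements may be null (the defaults)
      enemies: {range: Character, max: inf}
```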

Note

From Ontotext Platform version 3.4.0 onwards, multi-valued properties are no longer always generated as non-nullable arrays, and their nullability can be controlled by the nonNullable and nonNullableElements characteristics.

Immutability

This characteristic has been introduced with version 3.2.0 of the Ontotext Platform.

A property can be made immutable for the GraphQL endpoint if it has the characteristic readOnly set to true. If set, the property will be excluded from the mutation input types for the particular type and its subtypes.

The characteristic can be used to:

  • Mark a property as externally managed.

  • Allow defining materialized computed values into the database.

  • Define a fixed alias of a predicate but with different configurations.
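
A minimal sketch of the first use case, with a hypothetical Order object. It assumes readOnly is written like any other boolean characteristic.

```yaml
objects:
  Order:
    props:
      # Externally managed value: readable in queries,
      # excluded from the mutation input types
      totalPrice: {label: "Total price", range: decimal, readOnly: true}
      # Regular property: remains available in the mutation input types
      note:       {label: "Note"}
```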

With the introduction of this characteristic, the name property of the Nameable interface is marked as read-only, and as such is removed from all mutations. As this is a major change, any existing queries must be updated to use the aliased property instead. Check the Migration guide on how to update any existing queries.

Inverses

When creating a knowledge graph, it is important to allow navigation (connectivity) from each part of the graph to each other part. RDF and SPARQL allow every property to be navigated in either direction: given a node :y, you can find its incoming links :p using the triple pattern ?x :p :y, or the SPARQL property path :y ^:p ?x. (This innate bidirectional connectivity has led the creators of the PROV-O ontology to recommend not creating pairs of inverse properties so as to reduce duplication.)

But that is not the case in GraphQL, which can navigate only explicit relations (in the forward direction). This is why we provide a property characteristic to introduce virtual inverses:

  • inverseAlias (property): defines this as a virtual inverse of the indicated property, i.e., this is only an alias that is not stored in the RDF repository.

    • If p: {inverseAlias: q} then a GraphQL query for p will generate a SPARQL property path ^q.

    • You still need to define all characteristics of p, including range, min, max.

    • The platform makes several consistency checks, e.g., that q is defined in the class that is the range of p (or a super-class), and the range of q is the domain of p (or a super-class).

    • If different object types can link to the current object using q, use rangeCheck to ensure that p will select only one of them.
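
The rules above can be sketched with a hypothetical Company/Person pair, where only employer is stored in the repository.

```yaml
objects:
  Company:
    props:
      # Virtual inverse: a GraphQL query for employee is answered with the
      # SPARQL property path ^employer. range and max are still declared here.
      employee: {range: Person, inverseAlias: employer, max: inf, rangeCheck: true}
  Person:
    props:
      # The stored (forward) property. It is defined in the range of employee,
      # and its own range is the domain of employee.
      employer: {range: Company}
```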

There are also some characteristics that declare real inverses, i.e., inverse triples stored in the repository:

  • inverseOf (property): declares two props to be inverses of each other, i.e., with proper inference each will be inferred from the other.

  • symmetric (boolean): declares this prop to be symmetric, i.e., with proper inference each direction will be inferred from the other.

Inverses and rangeCheck

A specific rule prevents creating a property that combines the inverseAlias characteristic with rangeCheck: false. If a property declares inverseAlias, rangeCheck must be true in order to prevent mixing data about different types when the inverse property is used by multiple types or is inherited from a common abstract type. If rangeCheck is not set explicitly, it defaults to true. This is a special case that applies only when both characteristics are used. The example below demonstrates what happens otherwise, and the reason for enforcing this behavior.

Consider the following schema, data, and query:

objects:
  Character:
    props:
      actors: {inverseAlias: character, max: inf, range: Actor, rangeCheck: false }
      films: {inverseAlias: character, max: inf, range: Film, rangeCheck: false }
  Actor:
    props:
      character: {range: Character, max: inf}
  Film:
    props:
      character: {range: Character, max: inf}

<actor/MarkHammil> a voc:Actor;
  voc:character <character/LukeSkywalker>.

<film/StartWarsIV> a voc:Film;
  voc:character <character/LukeSkywalker>.

<character/LukeSkywalker> a voc:Character.

query {
    character {
        id
        films {
            id
        }
    }
}

The query returns the following result:

{
  "data":{
    "character":[
      {
        "id":"/character/LukeSkywalker",
        "films":[
          {
            "id":"/actor/MarkHammil"
          },
          {
            "id":"/film/StartWarsIV"
          }
        ]
      }
    ]
  }
}

As you can see, the films section contains an additional result for the actor, which should not be present. This happens because no filtering by range is applied when the data is queried from the data store. When the rangeCheck characteristic is set to true, that filtering is applied and the retrieved result is correct.

Note

The schema used in the example above will be detected as invalid as it violates the rule for properties with inverseAlias and rangeCheck characteristics. It is used only for the purpose of the example.
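
For comparison, a version of the schema that satisfies the rule might look like the sketch below. It reuses the objects from the example and changes only rangeCheck.

```yaml
objects:
  Character:
    props:
      # rangeCheck set explicitly
      actors: {inverseAlias: character, max: inf, range: Actor, rangeCheck: true}
      # rangeCheck omitted: it defaults to true because inverseAlias is present
      films:  {inverseAlias: character, max: inf, range: Film}
  Actor:
    props:
      character: {range: Character, max: inf}
  Film:
    props:
      character: {range: Character, max: inf}
```

With range filtering applied, the films field should select only the Film nodes that link to the character.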

Pattern and Prefix

A couple of characteristics restrict the possible string values of a property:

  • regex: unanchored regex pattern (can use ^...$ to anchor it). Deprecated in favour of pattern. Will be removed in a later release. Information on the removal, including its exact time, will be made public prior to removal.

  • prefix: string (not a regex) that is anchored to the start, similar to ShEx “stem”. Will be deprecated in a later release.

  • pattern: an array containing the unanchored regex pattern and any regex flags that should be used with it. If a string is passed, it assumes that it is the regex string and no flags are set. Checked via sh:pattern in SHACL.

These can be used for both data properties and IRIs.

  • If the property has range iri, you can use an IRI relative to base_iri.

  • If you want to check the IRI of an object property, use the same characteristics on the target object’s class.
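
The two forms of pattern, and the plain-string prefix, can be sketched as follows. The property names and the example.org address are hypothetical.

```yaml
properties:
  # String form: the regex only, no flags. Anchored explicitly with ^...$
  postCode:    {label: "Post code", pattern: '^\d{5}$'}
  # Array form: the regex followed by its flags ('i' makes it case-insensitive)
  countryCode: {label: "Country code", pattern: ['^[a-z]{2}$', 'i']}
  # prefix: a plain string (not a regex) matched at the start of the value
  homepage:    {label: "Homepage", range: iri, prefix: "https://example.org/"}
```

Single quotes are used for the regexes so that backslashes such as \d do not need to be escaped in YAML.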

Language Configurations

The lang characteristic can be applied only to properties of range langString or stringOrLangString. It can be used for defining defaults for:

  • Default language to use during fetching of the properties using the fetch sub-characteristic

  • A validation language spec that will be used to restrict the values that are allowed to be written in that property using the validate sub-characteristic

  • A default language to apply to String literals during property value insert using the implicit sub-characteristic

The characteristic can have two forms:

  • Short form lang: en - allows setting only the fetch sub-characteristic. This is equivalent to lang: {fetch: en}

  • Full form - any of the sub-characteristics can be set

The characteristic also supports inheritance in the following way: if the spec is not present in the sub-property, it is inherited as a whole; if it is present, it is merged characteristic by characteristic. The same rule is applied to referenced fields defined in the properties section.

If the lang spec is not defined in a property and is not inherited from a parent, it will be initialized with a value found in the config section of the current schema, where a global schema configuration can be found for the lang configuration characteristics.

Here is an example of language configuration usage:

config:
  lang: {validate: UNIQ}

properties:
  desc: {range: stringOrLangString, lang: en}

objects:
  Character:
    kind: abstract
    props:
      desc: {lang: {implicit: en}}
  Human:
    inherits: Character
    props:
      desc: {lang: {fetch: 'en,fr', validate: "en,fr;UNIQ"}}

Here, the property desc with range stringOrLangString sets its default fetch language to English using the short form. Applying the inheritance rules yields the following effective schema:

config:
  lang: {validate: UNIQ}

properties:
  desc: {range: stringOrLangString, lang: {fetch: en, validate: UNIQ}}

objects:
  Character:
    props:
      desc: {lang: {fetch: en, validate: UNIQ, implicit: en}}
  Human:
    inherits: Character
    props:
      desc: {lang: {fetch: 'en,fr', validate: "en,fr;UNIQ", implicit: en}}

From this example, we can see that Character.desc got the fetch: en characteristic as well as the validate: UNIQ from the global configuration, and the Human.desc inherited the implicit value from the Character while overriding the other two configurations.

If a subtype needs to remove any of the inherited characteristics, it can use the following format:

  • lang: "" will remove the inherited fetch spec without affecting the other two

  • lang: {validate: ""} will remove the inherited validate spec

  • lang: {} will reset all three specs
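
As a sketch of the second format, a hypothetical subtype Droid of the Character object from the example above could drop only the inherited validation:

```yaml
objects:
  Droid:
    inherits: Character
    props:
      # Removes only the inherited validate spec;
      # fetch and implicit are still inherited from Character
      desc: {lang: {validate: ""}}
```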

For more information on how to use the different lang specs, see the tutorials:

Predefined Properties

There are some built-in (predefined) props that you can use in GraphQL and should not redefine in SOML instances.

"Semantic object (GraphQL) type. Determined by object discriminators (typeProp, type). Mandatory, single-value"
__typename: String!
"Objects: rdf:type, literals: datatype", range: iri, max: inf, descr: "Most but not all nodes have some. Optional, multi-value"
type:   [ID]!
"Pref name of an object to represent it. This abstracts from the specific RDF prop used as a label, e.g. x:prefName, skos:prefLabel, rdfs:label. Optional in interface Nameable, single-value"
name: String

See Interface Object and Interface Nameable for details.

Property Examples

In the following example from a family relations domain, we highlight the use of inverseAlias:

properties:

objects:
  Person:
    props:
      parent:   {range: Person}
      mother:   {range: Female, rangeCheck: true}
      father:   {range: Male,   rangeCheck: true}
      child:    {range: Person, inverseAlias: parent, max: inf}
      daughter: {range: Female, inverseAlias: parent, rangeCheck: true, max: inf}
      son:      {range: Male,   inverseAlias: parent, rangeCheck: true, max: inf}
      sibling:  {range: Person, symmetric: true, max: inf}
      sister:   {range: Female, rangeCheck: true, max: inf}
      brother:  {range: Male,   rangeCheck: true, max: inf}
  Male:
    inherits: Person
  Female:
    inherits: Person

  • parent, sibling are present in the underlying data.

  • child is defined as an inverseAlias, so it can be accessed without being present in the data.

  • sibling is declared symmetric (both triples of the pair should be present or inferred in the repository).

  • The gender-dependent variants mother, father, daughter, son, sister, brother are declared with rangeCheck so they should sub-set the respective super-property (parent, child, sibling) to comply with the indicated range.

Below are some example props from the Company Graph domain. Remember that string is the default range, so it does not need to be specified. We start with some simple characteristics like name and description (string), industry/technology (classifications), founded/closed (date).

prefName:              {label: "Preferred name",   min: 1,   descr: "A single selected name"}
altName:               {label: "Alternative name", max: inf, descr: "Former or Trade names"}
description:           {label: "Description"}
industry:              {label: "Industry",   range: Industry,   max: inf}
technology:            {label: "Technology", range: Technology, max: inf}
foundedOn:             {label: "Founded on", range: date}
closedOn:              {label: "Closed on",  range: date}

Let’s define some IRIs that point to a company profile on an external site:

  • We define prefix patterns so the IRIs can be validated.

  • To define a pattern, you often need to use \ (e.g., to escape . or use a special construct like \w), so it is best to use single quotes in YAML.

crunchbaseUrl:         {label: "CrunchBase URL", range: iri, pattern: '^https?://www\.crunchbase\.com/organization/'}
dbpediaUrl:            {label: "DBPedia URL",    range: iri, prefix: "http://dbpedia.org/resource/"}
facebookUrl:           {label: "Facebook URL",   range: iri, prefix: "https://www.facebook.com/"}
linkedinUrl:           {label: "LinkedIn URL",   range: iri, pattern: '^https?://www\.linkedin\.com/company-beta'}
twitterUrl:            {label: "Twitter",        range: iri, prefix: "http://twitter.com/"}
wikidataUrl:           {label: "Wikidata",       range: iri, pattern: ['http://www.wikidata.org/entity/Q\d+', 'i'] }
websiteUrl:            {label: "Website",        range: iri}

Now let’s define some lookup lists (nomenclatures) following the SKOS ontology. level and order are extra custom properties.

skos:inScheme:         {label: "In scheme", min: 1, range: ConceptScheme}
skos:notation:         {label: "Notation", descr: "Code of a Concept"}
skos:prefLabel:        {label: "Preferred label", min: 1}
skos:broader:          {label: "Broader",        max: inf, range: Concept, descr: "Broader concept in same ConceptScheme"}
skos:broaderMatch:     {label: "Broader match",  max: inf, range: Concept, descr: "Broader concept in different ConceptScheme"}
skos:narrower:         {label: "Narrower",       max: inf, range: Concept, descr: "Narrower concept in same ConceptScheme",      inverseOf: skos:broader}
skos:narrowerMatch:    {label: "Narrower match", max: inf, range: Concept, descr: "Narrower concept in different ConceptScheme", inverseOf: skos:broaderMatch}
level:                 {label: "Level", range: integer, descr: "Concept level in its ConceptScheme hierarchy"}
order:                 {label: "Order", range: integer, descr: "Tree order of the concept"}

We can represent geographic information using GeoNames. It contains plenty of data, but in SOML we capture only a few props:

  • featureCode is a list of over 650 place types such as A.PCLI “independent political entity” (i.e., country) or A.ADM1 “first-order administrative division” (e.g., US state). Later we will use this to capture three main object types of interest: Country, Region, City.

  • featureClass is a super-type, such as A “Administrative Boundary Features (country, state, region,…)”.

  • a primary name and multiple alternateName in various languages.

  • ancestor links to the level of region (parentADM1), country (parentCountry), and any levels (parentFeature). These are mandatory only within the object class they are applied to (e.g., City has parentADM1 but State and Country do not).

  • countryCode which is present in all administrative places but not natural features such as oceans and mountains.

  • population, lat, long are represented as mere strings in the data, so we cast them to appropriate datatypes.

gn:name:               {label: "Name", range: string, min: 1}
gn:alternateName:      {label: "Alt name", range: langString, max: inf}
gn:featureClass:       {label: "Feature class", range: iri, min: 1}
gn:featureCode:        {label: "Feature code",  range: iri, min: 1}
gn:parentADM1:         {label: "Parent state",   range: Geoname, min: 1}
gn:parentCountry:      {label: "Parent country", range: Geoname, min: 1}
gn:parentFeature:      {label: "Ancestor place", range: Geoname, min: 1, max: inf, descr: "Includes all ancestors"}
gn:countryCode:        {label: "Country code", range: string}
gn:population:         {label: "Population", range: integer, typeCast: true}
wgs84:lat:             {label: "Latitude",   range: decimal, typeCast: true}
wgs84:long:            {label: "Longitude",  range: decimal, typeCast: true}