Datatypes

SOML predefines some common datatypes and their mappings to RDF (XML) and GraphQL. Their configuration file is located in meta-model/models. The mapping complies with the TopQuadrant GraphQL-SHACL mapping. The Platform implements a subset of the XML built-in datatypes, which are also used in RDF. They are highlighted in red below:

[Figure: XML built-in datatypes, with the implemented subset highlighted in red (platform-XSD-types.png)]

These types are mapped between SOML, RDF (XML), and GraphQL, as shown below. Currently, you cannot define your own datatypes.

types:
  # GraphQL builtin types
  int:                {rdf: 'xsd:int',                graphql: Int,                descr: "Signed 32‐bit integer"}
  double:             {rdf: 'xsd:double',             graphql: Float,              descr: "Signed double-precision 64-bit floating point (IEEE 754-1985)"}
  string:             {rdf: 'xsd:string',             graphql: String,             descr: "Unicode string, default RDF and SOML datatype"}
  boolean:            {rdf: 'xsd:boolean',            graphql: Boolean,            descr: "True/false"}
  iri:                {rdf: 'rdfs:Resource',          graphql: ID,                 descr: "IRI of object or external resource (RFC 3987)"}

  # GraphQL extension types
  long:               {rdf: 'xsd:long',               graphql: Long,               descr: "Signed 64‐bit integer",                                      graphqlExtension: true}
  short:              {rdf: 'xsd:short',              graphql: Short,              descr: "Signed 16‐bit integer",                                      graphqlExtension: true}
  byte:               {rdf: 'xsd:byte',               graphql: Byte,               descr: "Signed 8‐bit integer",                                       graphqlExtension: true}
  unsignedLong:       {rdf: 'xsd:unsignedLong',       graphql: UnsignedLong,       descr: "Unsigned 64‐bit integer",                                    graphqlExtension: true}
  unsignedInt:        {rdf: 'xsd:unsignedInt',        graphql: UnsignedInteger,    descr: "Unsigned 32‐bit integer",                                    graphqlExtension: true}
  unsignedShort:      {rdf: 'xsd:unsignedShort',      graphql: UnsignedShort,      descr: "Unsigned 16‐bit integer",                                    graphqlExtension: true}
  unsignedByte:       {rdf: 'xsd:unsignedByte',       graphql: UnsignedByte,       descr: "Unsigned 8‐bit integer",                                     graphqlExtension: true}
  decimal:            {rdf: 'xsd:decimal',            graphql: Decimal,            descr: "Decimal, unlimited-precision number",                        graphqlExtension: true}
  integer:            {rdf: 'xsd:integer',            graphql: Integer,            descr: "Integer, unlimited digits",                                  graphqlExtension: true}
  positiveInteger:    {rdf: 'xsd:positiveInteger',    graphql: PositiveInteger,    descr: "Positive integer (>0), unlimited digits",                    graphqlExtension: true}
  nonPositiveInteger: {rdf: 'xsd:nonPositiveInteger', graphql: NonPositiveInteger, descr: "Non-positive integer (<=0), unlimited digits",               graphqlExtension: true}
  negativeInteger:    {rdf: 'xsd:negativeInteger',    graphql: NegativeInteger,    descr: "Negative integer (<0), unlimited digits",                    graphqlExtension: true}
  nonNegativeInteger: {rdf: 'xsd:nonNegativeInteger', graphql: NonNegativeInteger, descr: "Non-negative integer (>=0), unlimited digits",               graphqlExtension: true}
  negativeFloat:      {rdf: 'xsd:float',              graphql: NegativeFloat,      descr: "A Float scalar that must be a negative value",               graphqlExtension: true}
  nonNegativeFloat:   {rdf: 'xsd:float',              graphql: NonNegativeFloat,   descr: "A Float scalar that must be greater than or equal to zero",  graphqlExtension: true}
  positiveFloat:      {rdf: 'xsd:float',              graphql: PositiveFloat,      descr: "A Float scalar that must be a positive value",               graphqlExtension: true}
  nonPositiveFloat:   {rdf: 'xsd:float',              graphql: NonPositiveFloat,   descr: "A Float scalar that must be less than or equal to zero",     graphqlExtension: true}
  dateTime:           {rdf: 'xsd:dateTime',           graphql: DateTime,           descr: "Date and Time: yyyy-mm-ddThh:mm:ss, no timezone",            graphqlExtension: true}
  time:               {rdf: 'xsd:time',               graphql: Time,               descr: "Time: hh:mm:ss, no timezone",                                graphqlExtension: true}
  date:               {rdf: 'xsd:date',               graphql: Date,               descr: "Date: yyyy-mm-dd",                                           graphqlExtension: true}
  year:               {rdf: 'xsd:gYear',              graphql: Year,               descr: "Year: yyyy",                                                 graphqlExtension: true}
  yearMonth:          {rdf: 'xsd:gYearMonth',         graphql: YearMonth,          descr: "Year & Month: yyyy-mm",                                      graphqlExtension: true}

  # Literal and union datatypes
  literal:            {rdf: 'rdf:Literal',             graphql: Literal, descr: "Any RDF literal"}
  langString:         {rdf: 'rdf:langString',          graphql: Literal, descr: "Language-tagged string"}
  stringOrLangString: {union: [string, langString],    graphql: Literal, descr: "string or langString"}
  dateOrYearOrMonth:  {union: [date, year, yearMonth], graphql: Literal, descr: "date or year or yearMonth"}
Notes on these mappings:

  • iri is considered rdfs:Resource (an RDF object), rather than a literal with datatype xsd:anyURI. Properties that link to internal resources are declared with a specific object type, not the iri datatype. iri is mapped to the GraphQL built-in type ID. This type is validated according to RFC 3987. We require all objects to have an IRI.

  • double is an IEEE 754 double-precision number. It is mapped to GraphQL Float, which despite its name is a Double number.

  • In addition to the built-in 32-bit Int, we implement 8-bit Byte, 16-bit Short, and 64-bit Long, as well as their unsigned variants.

  • If you need an xsd:float (to be mapped to GraphQL extension single) please send us feedback.

  • We implement unlimited-digits Integer and its variants Positive, Negative, NonPositive, NonNegative.

  • We implement unlimited-precision Decimal. Note the difference between double (built-in but limited) and decimal (infinite precision but more expensive).
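The value ranges implied by the fixed-size integer types can be worked out from their bit widths. The following is a minimal Python sketch of that arithmetic; the SOML type names come from the table above, while the helper names `value_range` and `fits` are hypothetical and not part of the Platform.

```python
from typing import Tuple

# Bit widths of the fixed-size integer datatypes listed in the table above.
SIGNED_BITS = {"byte": 8, "short": 16, "int": 32, "long": 64}
UNSIGNED_BITS = {
    "unsignedByte": 8,
    "unsignedShort": 16,
    "unsignedInt": 32,
    "unsignedLong": 64,
}


def value_range(soml_type: str) -> Tuple[int, int]:
    """Return the inclusive (min, max) range of a fixed-size integer type."""
    if soml_type in SIGNED_BITS:
        bits = SIGNED_BITS[soml_type]
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    if soml_type in UNSIGNED_BITS:
        return 0, 2 ** UNSIGNED_BITS[soml_type] - 1
    raise ValueError(f"not a fixed-size integer type: {soml_type}")


def fits(soml_type: str, value: int) -> bool:
    """Check whether an integer lies within the range of the given type."""
    lo, hi = value_range(soml_type)
    return lo <= value <= hi
```

For example, `fits("int", 2**31)` is false while `fits("long", 2**31)` is true, which is the kind of case where a wider type is needed.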

Note

All custom GraphQL scalar extensions currently provided by the Platform return numbers as strings. The main reason is that JavaScript, GraphQL, and Java differ in how they support numbers. Returning the results as strings leaves the client free to decide how to process the number and how to present it to the end user.
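To illustrate why a string is the safer carrier, here is a small Python sketch. Python's `float` is an IEEE 754 double, the same format JavaScript uses for its numbers, so it shows what a double-based client would hold. The sample values are made up for the illustration.

```python
from decimal import Decimal

# A Long value as it would arrive in a response: a string, not a number.
raw_long = "9007199254740993"  # 2**53 + 1, well within the 64-bit range

exact = int(raw_long)         # Python integers are arbitrary-precision
as_double = float(raw_long)   # what a double-based client would store

long_is_exact = exact == 2 ** 53 + 1
double_is_exact = int(as_double) == exact  # the double cannot hold 2**53 + 1

# A Decimal value also keeps all its digits only when parsed from the string.
raw_decimal = "0.10000000000000000000000001"
decimal_survives_double = Decimal(raw_decimal) == Decimal(str(float(raw_decimal)))

print(long_is_exact, double_is_exact, decimal_survives_double)
```

Parsing the string with an arbitrary-precision type keeps the value intact, whereas routing it through a double silently changes it.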

Note

Handling of numbers with leading zeros

GraphQL has issues processing numbers with leading zeros. These should be resolved once the new version of the GraphQL specification is accepted. That specification is in a pre-release state and is not yet applied in the graphql-java library, which the Platform uses to process queries. Once it is applied, numbers with leading zeros will be invalid and should be reported as errors.
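Until then, a client can reject such input before sending a query. The sketch below is an assumption about how such a pre-check might look, not what graphql-java does: it matches a whole string against the integer shape of the GraphQL grammar (an optional minus sign, then either a single 0 or a non-zero digit followed by more digits). The function name is hypothetical.

```python
import re

# Optional minus sign, then "0" alone or a non-zero digit followed by digits.
INT_VALUE = re.compile(r"-?(0|[1-9][0-9]*)")


def is_valid_int_literal(text: str) -> bool:
    """Return True if the whole string is an integer without leading zeros."""
    return INT_VALUE.fullmatch(text) is not None
```

With this check, `"0"`, `"-7"`, and `"123"` pass, while `"0123"` and `"00"` are rejected.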

GraphQL Extension Datatypes

graphqlExtension datatypes are not GraphQL built-ins. They are declared as GraphQL scalars and are implemented in a supporting library that provides parsing, serialization, and validation of values. For example:

"Decimal infinite-precision number"
scalar Decimal

"Year: yyyy"
scalar Year

"Year & Month: yyyy-mm"
scalar YearMonth

"Date: yyyy-mm-dd"
scalar Date

"Date and Time: yyyy-mm-ddThh:mm:ss"
scalar DateTime

Literals and Union Datatypes

RDF Literals consist of a string value, and a datatype IRI (e.g., ^^xsd:integer) or language tag (e.g., @en). Whenever the type is known and fixed we use one of the simpler types (GraphQL built-in or extension). There are, however, many situations where the type is not known in advance or can vary. In such cases the literal must carry its lang tag or datatype.

We declare a GraphQL object type Literal that represents an RDF literal, with the fields value, type, and lang. (Note: TopQuadrant uses a similar approach for LangString, but our approach is more general):

"Literal value"
type Literal @descr(_:"Includes optional datatype and language-tag (but not both)") {
  "Value"
  value: String!
  "Datatype"
  type: ID
  "Language tag"
  lang: String
}

Both type and lang are optional, allowing the flexibility to represent:

  • Plain string: type is null (we do not use xsd:string, which would be redundant), and lang is also null.

  • Datatyped value: type is a datatype IRI, typically from the xsd: namespace.

  • langString: type is null and lang is a valid IANA language tag (as used in XML and RDF), case-normalized according to BCP47 section 2.1.1.
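The three cases can be modeled on the client side as follows. This is a minimal Python sketch that mirrors the Literal type declared above; the class, its "not both" check, and the `kind` method are illustrative assumptions, not Platform code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    value: str
    type: Optional[str] = None  # datatype IRI, e.g. "xsd:gYearMonth"
    lang: Optional[str] = None  # language tag, e.g. "de"

    def __post_init__(self):
        # The schema description allows a datatype or a language tag, not both.
        if self.type is not None and self.lang is not None:
            raise ValueError("a literal has a datatype or a language tag, not both")

    def kind(self) -> str:
        """Classify the literal into one of the three cases listed above."""
        if self.lang is not None:
            return "langString"
        if self.type is not None:
            return "datatyped value"
        return "plain string"
```

For instance, `Literal("1990-03", type="xsd:gYearMonth").kind()` gives `"datatyped value"`, and a literal with only a value is a plain string.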

We use Literal to represent:

  • A generic literal (note: this is a future feature, send us feedback if you need it)

  • A langString

  • Union datatypes, which are useful in situations where data values for the same field come with syntactic differences:

    • Different “precision” (dateOrYearOrMonth).

    • With or without lang tag (stringOrLangString).

    • If you need more union datatypes (e.g., of Numeric types), please send us feedback.

Examples of such literals:

{
  "createdOn": {
    "type": "xsd:gYearMonth",
    "value": "1990-03"
  },
  "prefName": {
    "lang": "de",
    "value": "Du hast Mich"
  }
}
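When preparing data for a dateOrYearOrMonth field, the datatype can often be derived from the shape of the value. The sketch below assumes simplified lexical patterns (four-digit years, no timezone, no range check on month or day); the function name `to_literal` is hypothetical and the approach is only one way a client might do this.

```python
import re

# Simplified lexical patterns for the three members of dateOrYearOrMonth,
# tried from the most specific to the least specific.
PATTERNS = [
    ("xsd:date", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("xsd:gYearMonth", re.compile(r"\d{4}-\d{2}")),
    ("xsd:gYear", re.compile(r"\d{4}")),
]


def to_literal(value: str) -> dict:
    """Build a {"type", "value"} literal, choosing the datatype by shape."""
    for datatype, pattern in PATTERNS:
        if pattern.fullmatch(value):
            return {"type": datatype, "value": value}
    raise ValueError(f"not a date, year, or year-month: {value!r}")
```

Calling `to_literal("1990-03")` produces the same shape as the createdOn example above.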

Querying such data in GraphQL is a bit less convenient, e.g.:

{
  company(ID:"...") {
    prefName {value}
    createdOn {value}
  }
}

Lexical vs Value Space

RDF Datatypes have a lexical (string) space, a value (normalized) space, and a mapping between them. For example:

  • Both "1"^^xsd:boolean and "true"^^xsd:boolean (as well as the Turtle shortcut true) map to the same value, the Boolean true.

  • All lexical values "1"^^xsd:integer, "+1"^^xsd:integer, "+01"^^xsd:integer (as well as the respective Turtle shortcuts 1, +1, +01) map to the same value, the integer 1.

  • All lexical values "1"^^xsd:decimal, "+1.0"^^xsd:decimal, "+01.00"^^xsd:decimal (as well as the respective Turtle shortcuts 1.0, +1.0, +01.00) map to the same value, the decimal 1.

  • Both "2019-12-01"^^xsd:date and "002019-12-01"^^xsd:date map to the same date.

If two literals have the same value but different lexical form, then:

  • They compare as equal with =.

  • They compare as equal with ... in (...).

  • They compare as different with sameTerm().

  • A direct triple pattern that uses one of the literals does not match a triple that was recorded with the other.

You can check the first three bullets (e.g., for integers) using a SPARQL query like this:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
select
  ("1"^^xsd:integer="+1"^^xsd:integer as ?b1)
  ("1"^^xsd:integer="+01"^^xsd:integer as ?b2)
  ("1"^^xsd:integer in ("+01"^^xsd:integer) as ?b3)
  (sameTerm("1"^^xsd:integer,"+01"^^xsd:integer) as ?b4)
where {}
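The same distinction can be reproduced outside SPARQL. The Python sketch below is only an analogy: parsing a lexical form stands in for the mapping to value space (the role of =), and comparing the raw strings stands in for sameTerm().

```python
from decimal import Decimal

integer_forms = ["1", "+1", "+01"]
decimal_forms = ["1", "+1.0", "+01.00"]

# Value-space comparison: parse first, then compare.
same_integer_value = len({int(s) for s in integer_forms}) == 1
same_decimal_value = len({Decimal(s) for s in decimal_forms}) == 1

# Term comparison: compare the lexical forms as written.
same_term = len(set(integer_forms)) == 1

print(same_integer_value, same_decimal_value, same_term)
```

All forms agree once parsed, but no two of them are the same term.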

In our translation of GraphQL to SPARQL queries, we take care to eliminate the difference between lexical space and value space. In other words, you can find a literal by any of its lexical forms, regardless of how it was recorded in the database.

We do this by comparing literals with the equality operator =. This is somewhat slower than direct triple access, but the GraphDB Literal Index keeps it fast.

Timezones

RDF defines three datatypes that can be used with or without timezone (xsd:date, xsd:time, xsd:dateTime), and one for which the timezone is required (xsd:dateTimeStamp).

According to OWL2 Time Instants, dates and times without timezone are only partially comparable because such a value could denote an absolute value that varies by +/-14 hours. (An OWL DateTime wiki discussion from 2008 considers allowing only xsd:dateTimeStamp in OWL.)

GraphDB compares dateTime as if it had a Z timezone, but date and time without timezone are not comparable to those with timezone. (Every date/time literal is equal to itself, regardless of whether it has a timezone or not.)

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
select
  ("2019-12-01T04:00:00-05:00"^^xsd:dateTime ="2019-12-01T10:00:00+01:00"^^xsd:dateTime as ?b1)  # true
  ("2019-12-01T10:00:00"^^xsd:dateTime       ="2019-12-01T10:00:00+00:00"^^xsd:dateTime as ?b2)  # true
  ("2019-12-01T10:00:00"^^xsd:dateTime       ="2019-12-01T10:00:00-00:00"^^xsd:dateTime as ?b3)  # true
  ("2019-12-01T10:00:00"^^xsd:dateTime       ="2019-12-01T10:00:00Z"^^xsd:dateTime      as ?b4)  # true
  ("2019-12-01T10:00:00"^^xsd:dateTime       ="2019-12-01T10:00:00+02:00"^^xsd:dateTime as ?b5)  # false
  ("2019-12-01T10:00:00"^^xsd:dateTime       ="2019-12-01T10:00:00-02:00"^^xsd:dateTime as ?b6)  # false
  ("2019-12-01"^^xsd:date                    ="2019-12-01"^^xsd:date                    as ?b7)  # true
  ("2019-12-01"^^xsd:date                    ="2019-12-01+00:00"^^xsd:date              as ?b8)  # false
  ("2019-12-01"^^xsd:date                    ="2019-12-01-00:00"^^xsd:date              as ?b9)  # false
  ("2019-12-01"^^xsd:date                    ="2019-12-01+01:00"^^xsd:date              as ?b10) # false
  ("10:00:00"^^xsd:time                      ="10:00:00"^^xsd:time                      as ?b11) # true
  ("10:00:00"^^xsd:time                      ="10:00:00+00:00"^^xsd:time                as ?b12) # false
  ("10:00:00"^^xsd:time                      ="10:00:00-00:00"^^xsd:time                as ?b13) # false
  ("10:00:00"^^xsd:time                      ="10:00:00+02:00"^^xsd:time                as ?b15) # false
  ("10:00:00"^^xsd:time                      ="10:00:00-02:00"^^xsd:time                as ?b16) # false
where {}

Warning

Given these complications, we strongly recommend not mixing date/time values with and without timezone.
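Client code runs into the same problem. As an analogy only (this is Python's behavior, not GraphDB's), the sketch below shows that two offset-qualified values denoting the same instant compare as equal, while a value without timezone is neither equal to nor orderable against one with a timezone unless a timezone is assumed for it.

```python
from datetime import datetime, timezone

a = datetime.fromisoformat("2019-12-01T04:00:00-05:00")
b = datetime.fromisoformat("2019-12-01T10:00:00+01:00")
naive = datetime.fromisoformat("2019-12-01T10:00:00")
utc = datetime.fromisoformat("2019-12-01T10:00:00+00:00")

same_instant = a == b        # both denote 09:00 UTC
mixed_equal = naive == utc   # values with and without timezone are never equal

try:
    naive < utc
    mixed_orderable = True
except TypeError:
    mixed_orderable = False  # ordering mixed values raises TypeError

# Reading the value without timezone as UTC makes the two comparable.
assumed_utc_equal = naive.replace(tzinfo=timezone.utc) == utc

print(same_instant, mixed_equal, mixed_orderable, assumed_utc_equal)
```

Normalizing every value to carry an explicit timezone before storing it avoids these mixed comparisons altogether.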

Additional Resources

In addition to the custom GraphQL scalars implemented in the Platform, we provide and support a JavaScript implementation of the same set of scalars. It is available in our public GitHub repository, ontotext-platform-custom-scalars. The library can be used as a standard package from the public npm registry. It will be updated and published whenever the scalars change in the Platform.