GraphQL Search Query Tutorial¶
What’s in this document?
The following sections provide examples of GraphQL queries and responses, all based on the Star Wars dataset.
If you have started the services by following the Quick Start guide, you can also try out the queries and modify them as you see fit.
Queries¶
The GraphQL implementation of the Search Service allows you to invoke queries with syntax nearly identical to that of Elasticsearch.
Basic¶
The most basic search query you can execute is:
query getWookiees { wookiee_search { max_score hits { score wookiee { id name rdfs_label {value lang} } } } }
As you can see, the query response contains all wookiees present in the dataset, their Elasticsearch score, and the fields requested in the sub-selection.
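The basic query above can be sent from a script as an ordinary GraphQL-over-HTTP POST. The following is a minimal sketch using only the Python standard library. The endpoint URL is an assumption (use the address of your own Search Service), and the helper names build_request, run_query, and hit_names are hypothetical.

```python
import json
import urllib.request

# Assumption: hypothetical endpoint; replace with your Search Service URL.
SEARCH_ENDPOINT = "http://localhost:8080/graphql"

WOOKIEE_QUERY = """
query getWookiees {
  wookiee_search {
    max_score
    hits {
      score
      wookiee { id name rdfs_label { value lang } }
    }
  }
}
"""


def build_request(query: str, endpoint: str = SEARCH_ENDPOINT) -> urllib.request.Request:
    """Wrap a GraphQL query in a JSON POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query: str, endpoint: str = SEARCH_ENDPOINT) -> dict:
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(query, endpoint)) as response:
        return json.load(response)


def hit_names(response: dict) -> list:
    """Collect the name of every wookiee in a wookiee_search response."""
    hits = response["data"]["wookiee_search"]["hits"]
    return [hit["wookiee"]["name"] for hit in hits]
```

Calling hit_names(run_query(WOOKIEE_QUERY)) would list the names returned by a running service. The response path mirrors the selection set of the query, so it changes whenever the selected fields change.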
Compound Queries¶
Compound queries wrap other compound or leaf queries with the purpose of either combining their results and scores, changing their behavior, or switching from query to filter context.
Boolean Queries¶
The following GraphQL query shows a simple term query that finds all characters with a height of 177.
query character_query { character_search( query: { term: { height: { value: "177" } } } ) { max_score hits { score character { id name height } } } }
You can mix different query types using Boolean queries. For example, this GraphQL query will find all characters with the name “Lando Calrissian” and a height of 177:
query character_query_bool { character_search( query: { bool: { must: [{ match: { name: { query: "Lando Calrissian" } } }, { term: { height: { value: "177" } } }] } } ) { max_score hits { score character { id name height } } } }
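When queries like the one above are assembled programmatically, note that GraphQL input literals are not JSON: object keys are unquoted, and enum values such as desc are written without quotes. The following is a minimal sketch of a renderer that turns nested Python values into that syntax. The names EnumValue and to_graphql_input are hypothetical helpers, not part of any Platform client library.

```python
import json


class EnumValue(str):
    """Marks a string that must be emitted unquoted, as a GraphQL enum value."""


def to_graphql_input(value) -> str:
    """Render nested Python data as a GraphQL input literal."""
    if isinstance(value, EnumValue):
        return str(value)
    if isinstance(value, bool):  # checked before int, since bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (str, int, float)):
        return json.dumps(value)  # JSON string escaping is valid in GraphQL strings
    if isinstance(value, dict):
        fields = ", ".join(f"{key}: {to_graphql_input(item)}" for key, item in value.items())
        return "{" + fields + "}"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(to_graphql_input(item) for item in value) + "]"
    raise TypeError(f"unsupported input value: {value!r}")


# The Boolean query from the example above, expressed as Python data.
bool_query = {
    "bool": {
        "must": [
            {"match": {"name": {"query": "Lando Calrissian"}}},
            {"term": {"height": {"value": "177"}}},
        ]
    }
}

rendered = to_graphql_input(bool_query)
```

The rendered string can be placed after query: inside character_search( ... ). Passing the input through GraphQL variables is an alternative, provided you look up the input type names in the schema.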
We can query the character index for a name matching "Luke SkywalekS" using fuzziness and fuzzy_transpositions, and boost the score of the found results by 4.0:
query test { character_search( query: { bool: {must: {match: {name: {query: "Luke SkywalekS" fuzziness: "2" boost: 4.0 fuzzy_transpositions: true } }}} }) { max_score hits { score character { id name } } } }
You can limit the results by using different query types like Match queries, Wildcard queries, or Term queries.
For example, the following GraphQL query will return all characters whose type contains “Droi”, whose birthYear is 19BBY, or whose name is Jek Tono Porkins.
query qq { character_search( query: { bool: {should: [ {match: {name: {query: "Jek Tono Porkins" boost: 6.0 } } }, {match: {birthYear: {query: "19BBY" } } }, {wildcard: {type: {value: "*Droi*" boost: 8.0 } } } ]} } size: 1000 ) { max_score hits { score character { id name type birthYear } } } }
Function Score Queries¶
The function_score allows you to modify the score of documents that are retrieved by a query. This can be useful if, for example, a score function is computationally expensive and it is sufficient to compute the score on a filtered set of documents.
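Here is an example that returns all Characters, but boosts by 5 the score of those whose skinColor is “white”, “blue”, or “pale”. The function_score clause also applies random_score, which is combined with the query score through boost_mode: multiply: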
{ character_search(query:{ bool: { must: {match_all:{}} should: { function_score:{ query:{ terms: {skinColor: ["white", "blue", "pale"]}} boost: 5 random_score:{} boost_mode:multiply } } } } size: 20){ total hits{ score character{ id name skinColor } } } }
Term-level Queries¶
You can use term-level queries to find documents based on precise values in structured data.
Types of Term-level Queries¶
Fuzzy queries, as well as the fuzziness parameter in other queries, are also supported.
They find results containing terms similar to the search term.
For example, the following fuzzy query finds characters with green eyeColor even though the search term contains a substituted and a transposed character:
query fuzzy { character_search( query: { fuzzy: { eyeColor: { value: "brene" fuzziness: "2" transpositions: true } } } size: 1000 ) { max_score hits { score character { name eyeColor } } } }
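To see why "brene" still matches green at fuzziness 2, it helps to count the edits by hand. The following is a minimal sketch of an edit distance in which a swap of two adjacent characters can optionally count as a single edit (the optimal string alignment variant). It is an illustration of the idea only. Elasticsearch evaluates fuzziness internally, and its implementation may differ in detail from this function, whose name osa_distance is hypothetical.

```python
def osa_distance(a: str, b: str, transpositions: bool = True) -> int:
    """Edit distance with insertions, deletions, substitutions and,
    optionally, swaps of two adjacent characters counted as one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if (
                transpositions
                and i > 1
                and j > 1
                and a[i - 1] == b[j - 2]
                and a[i - 2] == b[j - 1]
            ):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]


print(osa_distance("brene", "green"))                        # 2
print(osa_distance("brene", "green", transpositions=False))  # 3
```

With transpositions enabled, "brene" is two edits away from "green" (b to g, then "ne" to "en"), which is within fuzziness 2. Without them the distance is 3 and the term would not match. The same reasoning applies to the token "skywaleks" against "skywalker" in the earlier match query, assuming the name field is analyzed into lowercase tokens.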
Exists queries return documents that contain an indexed value for a field.
For example, the following GraphQL query will return only Characters that have an indexed value for cybernetics. Note that cybernetics is a property of the Human subclass, not of Character, so in the query input we refer to it as Human_cybernetics, and in the selection set we retrieve it using the inline fragment ... on Human:
{ character_search(query:{ exists:{ field: Human_cybernetics, boost: 5 } }){ hits{ score character{ id name ... on Human { cybernetics } } } } }
Term queries return documents that contain an exact term in a provided field.
You can use the term query to find documents based on a precise value such as a price, a product ID, or a username.
{ character_search( query: { term: { skinColor: { value: "white" } } } ) { total hits { score character { id name skinColor } } } }
Terms queries work like term queries, except that you can search for multiple values.
{ character_search( query:{ terms: { skinColor:["white", "blue"] } } size: 15 ){ total hits{ score character{ id name skinColor } } } }
Match_all Query¶
You can also use match_all queries to match all documents of a given indexed object type.
For example, the following GraphQL query will return all indexed Films (up to 1,000) with their score boosted to 5.0:
query allFilms { film_search( query: { match_all: {boost: 5.0} } size:1000 ) { hits { score film { id name } } } }
Boost¶
Some matching results can be boosted using the Elasticsearch boost parameter. For example, the query below will return all characters with type Human or with a height over 190.
The score of the Human characters will be boosted by 5.0, and the score of the characters taller than 190 will be boosted by 10.0.
The score of the characters that are both Human and taller than 190 will be the highest, followed by the taller non-Human characters, followed by the remaining Human characters.
query boost { character_search( query: { bool: {should: [ {match: {type: {query: "https://swapi.co/vocabulary/Human" boost: 5.0 } } }, {range: {height: {gt: "190" boost: 10.0 } } } ]} } size: 4 ) { max_score hits { score character { name type height } } } }
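The ordering described above follows from how a bool query scores should clauses: the scores of all matching clauses are added together. The following toy model makes that arithmetic explicit. It assumes, purely for illustration, that every matching clause has a base score of 1.0 before its boost is applied. Real match clauses produce relevance scores that vary per document, so the actual numbers will differ. The function name toy_should_score is hypothetical.

```python
HUMAN_BOOST = 5.0   # boost on the match clause for type Human
TALL_BOOST = 10.0   # boost on the range clause for height > 190


def toy_should_score(is_human: bool, height) -> float:
    """Sum the boosts of the should clauses that a character matches.

    Simplification: each matching clause contributes 1.0 * boost.
    """
    score = 0.0
    if is_human:
        score += 1.0 * HUMAN_BOOST
    if height is not None and height > 190:
        score += 1.0 * TALL_BOOST
    return score


print(toy_should_score(True, 202))   # Human and tall: 15.0
print(toy_should_score(False, 228))  # tall only: 10.0
print(toy_should_score(True, 172))   # Human only: 5.0
```

Under this simplification a character matching both clauses always outranks one matching a single clause, and the larger boost on the range clause places tall non-Human characters above the remaining Human ones.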
Sorting¶
The following GraphQL query shows how to sort the results on the height field in descending order:
query character_sorting { character_search( sort: { height: { order: desc } } ) { max_score hits { score character { id name height } } } }
Currently, the following sorting options are supported:
order
mode
missing
numeric_type
For more details, see the Elasticsearch documentation on sorting search results.
Paging¶
This GraphQL query shows how to apply paging to the response.
The size argument defines the page size, while from defines the number of records to skip:
query character_query { character_search( sort: { height: { order: desc } } size: 6 from: 10 ) { max_score hits { score character { id name height } } } }
Note
By default, the size of the response is limited to 10.
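Since from counts records rather than pages, a client that thinks in page numbers has to convert them. The following is a minimal sketch of that conversion, with zero-based page numbers and the default page size of 10 mentioned in the note above. The helper name page_arguments is hypothetical.

```python
def page_arguments(page: int, size: int = 10) -> dict:
    """Translate a zero-based page number into from/size arguments."""
    if page < 0:
        raise ValueError("page must be zero or greater")
    if size < 1:
        raise ValueError("size must be at least 1")
    return {"from": page * size, "size": size}


def render_arguments(arguments: dict) -> str:
    """Render the arguments in the form used inside character_search( ... )."""
    return " ".join(f"{name}: {value}" for name, value in arguments.items())


# The third page (index 2) with six results per page skips the first 12 records.
third_page = page_arguments(page=2, size=6)
print(render_arguments(third_page))  # from: 12 size: 6
```

Keep the sort argument identical across pages. Otherwise consecutive pages may overlap or skip records.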
Aggregations¶
Elasticsearch aggregations summarize data as metrics, statistics, or other analytics. The Ontotext Platform supports two aggregation types: metrics and bucket aggregations, as well as sub-aggregations between them. Let’s have a look at them in more detail.
Metrics Aggregations¶
Metrics aggregations compute metrics based on values extracted in one way or another from the documents that are being aggregated. The values are usually extracted from the fields of the document using the field data. We will take a look at some of them below.
Avg aggregations are single-value metrics aggregations that compute the average of numeric values that are extracted from the aggregated documents. This query computes the average height of Characters twice: once ignoring documents without a height, and once counting a missing height as 50.
{ character_search ( aggs: [ { name: "avg01" value: { avg: { field: "height" } } } { name: "avg01missing" value: { avg: { field: "height" missing: 50 } } } ] ) { aggregations } }
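The effect of the missing parameter is easiest to see with a few numbers. The following sketch reproduces both variants of the avg aggregation over a small list in which None stands for a document without a height. The sample values are illustrative only and are not taken from the dataset, and the function name avg_with_missing is hypothetical.

```python
def avg_with_missing(values, missing=None):
    """Average numeric values.

    Without `missing`, documents lacking the field (None) are ignored.
    With `missing`, they are counted as if they had that value.
    """
    if missing is None:
        counted = [value for value in values if value is not None]
    else:
        counted = [missing if value is None else value for value in values]
    if not counted:
        return None
    return sum(counted) / len(counted)


heights = [172, None, 96, None]  # illustrative values, not real dataset content

print(avg_with_missing(heights))              # (172 + 96) / 2 = 134.0
print(avg_with_missing(heights, missing=50))  # (172 + 50 + 96 + 50) / 4 = 92.0
```

The two results can differ considerably when many documents lack the field, so choose the substitute value deliberately.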
Cardinality aggregations are single-value metrics aggregations that calculate an approximate count of distinct values. This query aggregates the cardinality of different Character fields.
{ character_search ( aggs: [ { name: "homeworld nested" value: { nested: { path: "homeworld" } aggs: { name: "homeworld cardinality" value: { cardinality: { field: "homeworld.id" precision_threshold: 10 } } } } } { name: "gender cardinality" value: { cardinality: { field: "gender" missing: "N/A" } } } ] ) { aggregations } }
Max aggregations are single-value metrics aggregations that keep track of and return the maximum value among the numeric values extracted from the aggregated documents. This query aggregates the max value of the Character height field in three ways: by field, with a missing value, and with a script.
{ character_search ( aggs: [ { name: "height max" value: { max: { field: "height" } } } { name: "height max missing" value: { max: { field: "height" missing: 1000 } } } { name: "height max script" value: { max: { field: "height" script: { lang: "painless" source: "if (_value != null) return _value * params.multiplier; return -1;" params: {multiplier: 3} } } } } ] ) { aggregations } }
Min aggregations are single-value metrics aggregations that keep track of and return the minimum value among the numeric values extracted from the aggregated documents. This query aggregates the min value of the height field for Human characters in three ways: by field, with a missing value, and with a script.
{ character_search ( query:{ term: { type: { value: "https://swapi.co/vocabulary/Human" } } } aggs: [ { name: "height min" value: { min: { field: "height" } } } { name: "height min missing" value: { min: { field: "height" missing: 10 } } } { name: "height min script" value: { min: { field: "height" script: { lang: "painless" source: "if (_value != null) return _value * params.multiplier; return 10000;" params: {multiplier: 3} } } } } ] ) { aggregations } }
Percentile ranks aggregations are multi-value metrics aggregations that calculate one or more percentile ranks over numeric values extracted from the aggregated documents. This query aggregates the percentile ranks for the averageHeight of Species.
{ species_search ( aggs: [ { name: "averageHeight percentile ranks" value: { percentile_ranks: { field: "averageHeight" values: [70, 100, 200] } } } ] ) { aggregations } }
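A percentile rank answers the question "what percentage of the observed values lie at or below this threshold?". The following sketch computes exact percentile ranks over a small list for the thresholds used in the query above. Elasticsearch computes these values with approximation algorithms, so its results on real data may deviate slightly from an exact calculation. The sample values are illustrative and the function name percentile_ranks is hypothetical.

```python
def percentile_ranks(values, thresholds):
    """For each threshold, return the percentage of values at or below it."""
    present = [value for value in values if value is not None]
    if not present:
        return {threshold: None for threshold in thresholds}
    return {
        threshold: 100.0 * sum(1 for value in present if value <= threshold) / len(present)
        for threshold in thresholds
    }


average_heights = [60, 100, 180, 220]  # illustrative values, not real dataset content

print(percentile_ranks(average_heights, [70, 100, 200]))
# {70: 25.0, 100: 50.0, 200: 75.0}
```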
Percentiles aggregations are multi-value metrics aggregations that calculate one or more percentiles over numeric values extracted from the aggregated documents. This query aggregates the percentiles for Character mass with different Elasticsearch options: missing value, keyed response, HDR histogram, and compression.
{
  character_search (
    aggs: [
      { name: "mass percentile ranks" value: { percentiles: { field: "mass" } } }
      { name: "mass percentile ranks missing" value: { percentiles: { field: "mass" missing: 300 } } }
      { name: "mass percentile ranks not keyed" value: { percentiles: { field: "mass" keyed: false } } }
      {
        name: "mass percentile ranks hdr"
        value: {
          percentiles: {
            field: "mass"
            hdr: {
              # A low value here lowers the accuracy of the returned values,
              # so the 99.9 percentile gets higher than the max mass value.
              number_of_significant_value_digits: 1
            }
            percents: [ 95, 99, 99.9 ]
          }
        }
      }
      { name: "mass percentile ranks compression" value: { percentiles: { field: "mass" tdigest: { compression: 101 } } } }
    ]
  ) {
    aggregations
  }
}
Stats aggregations are multi-value metrics aggregations that compute stats over numeric values extracted from the aggregated documents. This query aggregates stats for Character height and mass.
{ character_search ( aggs: [ { name: "character height stats" value: { stats: { field: "height" } } } { name: "character height stats missing" value: { stats: { field: "height" missing: 100 } } } { name: "character mass stats" value: { stats: { field: "mass" } } } ] ) { aggregations } }
String stats aggregations are multi-value metrics aggregations that compute statistics over string values extracted from the aggregated documents. This query aggregates string stats for Character gender in four ways: by field, by script, using missing, and with show_distribution.
{ character_search ( aggs: [ { name: "character gender string_stats" value: { string_stats: { field: "gender" } } } { name: "character gender string_stats missing" value: { string_stats: { field: "gender" missing: "droiddddddddddddddddd" } } } { name: "character gender string_stats script" value: { string_stats: { script: { source: "if (doc.gender.size() != 0) doc.gender.value" } } } } { name: "character gender string_stats with distribution" value: { string_stats: { field: "gender" show_distribution: true } } } ] ) { aggregations } }
Sum aggregations are single-value metrics aggregations that sum up numeric values that are extracted from the aggregated documents. This query aggregates the sum of Character height and mass.
{ character_search ( aggs: [ { name: "character height sum" value: { sum: { field: "height" missing: 100 } } } { name: "character mass sum" value: { sum: { field: "mass" } } } ] ) { aggregations } }
Top metrics aggregations select metrics from the document with the largest or smallest “sort” value. This query returns the height and mass of the Character with the largest mass among the Characters with eyeColor matching “blue”, “gray”, “yellow”, or “green”.
{ character_search ( query: { terms: { eyeColor: ["blue", "gray", "yellow", "green"] } } aggs: [ { name: "character's height for top mass" value: { top_metrics: { metrics: [ {field: "height"} {field: "mass"} ] sort: {mass: desc} } } } ] size: 2 ) { aggregations hits{ score character{ id name height mass eyeColor } } } }
Value_count aggregations are single-value metrics aggregations that count the number of values that are extracted from the aggregated documents. This query aggregates the value count of the residents of each planet (requires nestingLevel >= 1).
{ planet_search ( aggs: [ { name: "planet" value: { terms:{ field: "id" } aggs:{ name: "residents stats" value: { nested: { path: "resident" } aggs:{ name: "residentStats" value: { value_count: { field: "resident.id" } } } } } } } ] ) { aggregations } }
Weighted avg aggregations are single-value metrics aggregations that compute the weighted average of numeric values that are extracted from the aggregated documents. This query aggregates the weighted average of Film boxOffice, weighted by cost.
{ film_search ( aggs: [ { name: "film" value: { terms:{ field: "id" } aggs:[ { name: "boxOffice weighted by cost" value: { weighted_avg: { value: { field: "boxOffice" } weight: { field: "cost" } } } } { name: "boxOffice weighted by cost with missing" value: { weighted_avg: { value: { field: "boxOffice" missing: "333" } weight: { field: "cost" missing: "1" } value_type: "double" } } } ] } } ] ) { aggregations } }
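A weighted average is the sum of value times weight, divided by the sum of the weights. The following sketch applies that formula to a short list of (value, weight) pairs. It covers only the core formula: the handling of missing values or weights shown in the second aggregation above is not modeled. The sample numbers are illustrative and the function name weighted_avg is hypothetical.

```python
def weighted_avg(pairs):
    """Compute sum(value * weight) / sum(weight) over (value, weight) pairs."""
    total_weight = sum(weight for _, weight in pairs)
    if total_weight == 0:
        return None  # undefined when there is nothing to weigh
    return sum(value * weight for value, weight in pairs) / total_weight


# Illustrative (boxOffice, cost) pairs, not real dataset content.
films = [(100.0, 1.0), (300.0, 3.0)]

print(weighted_avg(films))  # (100 * 1 + 300 * 3) / (1 + 3) = 250.0
```

The plain average of the two values would be 200.0. The higher weight of the second pair pulls the result towards 300.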
Bucket Aggregations¶
Bucket aggregations create buckets of documents. Depending on the aggregation type, each bucket is associated with a criterion that determines whether or not a document in the current context falls into it. Bucket aggregations also compute and return the number of documents that fell into each bucket. Let’s take a look at some of them below.
Date range aggregations are range aggregations dedicated to date values.
The following GraphQL query shows a date range bucket aggregation for awardRecognition objects over the awardDate field.
{ awardRecognition_search ( aggs: { name: "awardDate_aggs" value: { date_range: { field: "awardDate" format: "yyyy-MM-dd" ranges: [ {to: "now-30y/y"} {from: "now-30y/y", to:"now-20y/y"} {from: "now-20y/y", to:"now-10y/y"} {from: "now-10y/y"} ] } } } ) { aggregations } }
Filter aggregations define a single bucket of all the documents in the current document set context that match a specified filter.
This query shows an aggregation that will return the average mass of characters with a height of 183.0.
{ character_search ( aggs: { name: "filter" value: { filter: { term: {height: {value: "183.0"}} } aggs:[ { name: "avg_mass" value: {avg: {field: "mass"}} } ] } } ) { aggregations } }
Missing aggregations are field data based single bucket aggregations that create a bucket of all documents in the current document set context that are missing a field value. This query shows an aggregation that will return the average height of characters without mass.
{ character_search ( aggs: { name: "missing_mass" value: { missing: { field: "mass"} aggs: { name: "avg_height" value: {avg: {field: "height"}} } } } ) { aggregations } }
Nested aggregations are special single bucket aggregations that enable aggregating nested documents. This query will return the string_stats of location.type for film/4.
{ film_search ( query: { terms: {id: ["https://swapi.co/resource/film/4"]} } aggs: { name: "location_min_type" value: { nested: { path: "location" } aggs: { name: "min_type" value: { string_stats: {field: "location.type"} } } } } size: 100 ) { aggregations hits{ film{ id name location { id type } } } } }
Range aggregations are multi-bucket value source based aggregations that enable you to define a set of ranges, each representing a bucket.
The next query shows an aggregation that groups characters by height into four ranges: under 170, from 170 to 190, from 190 to 200, and 200 and above.
{ character_search ( aggs: { name: "height_ranges" value: { range: { field: "height" ranges: [ {to: 170} {from: 170, to:190} {from: 190, to:200} {from: 200} ] } } } ) { aggregations } }
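In an Elasticsearch range aggregation each range includes its from value and excludes its to value, so a height of exactly 190 falls into the "190 to 200" bucket and not into "170 to 190". The following sketch applies the same rule to the four ranges of the query above. The bucket labels and sample heights are illustrative, and the function name range_buckets is hypothetical.

```python
# (label, from, to); None means the range is open on that side.
HEIGHT_RANGES = [
    ("under 170", None, 170),
    ("170 to 190", 170, 190),
    ("190 to 200", 190, 200),
    ("200 and above", 200, None),
]


def range_buckets(values, ranges=HEIGHT_RANGES):
    """Count values per range; `from` is inclusive and `to` is exclusive."""
    counts = {label: 0 for label, _, _ in ranges}
    for value in values:
        if value is None:
            continue  # documents without the field fall into no bucket
        for label, lower, upper in ranges:
            if (lower is None or value >= lower) and (upper is None or value < upper):
                counts[label] += 1
    return counts


heights = [150, 170, 189, 190, 200, 228, None]  # illustrative values

print(range_buckets(heights))
# {'under 170': 1, '170 to 190': 2, '190 to 200': 1, '200 and above': 2}
```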
Terms aggregations are multi-bucket value source based aggregations where buckets are dynamically built, one per unique value. This query shows a terms aggregation over homeworld, limited to 5 buckets and using the default sorting, which is by document count in descending order.
{ character_search ( aggs: { name: "terms1" value: { nested: { path: "homeworld" } aggs: { name: "terms1" value: { terms: { field: "homeworld.id" size: 5 show_term_doc_count_error: true } } } } } ) { aggregations } }
This is another example of a terms aggregation, this time over height, with the buckets explicitly sorted by document count in descending order.
{ character_search ( aggs: { name: "terms2" value: { terms: { field: "height" order: { _count: "desc" } size: 16 shard_size: 5 } } } size: 100 ) { aggregations } }
Another terms aggregation over gender, excluding values that match the pattern fem.*:
{character_search ( aggs: { name: "terms2" value: { terms: { field: "gender" size: 15 exclude: "fem.*" } } } size: 100 ) { aggregations } }
A final terms aggregation over the root object creates one bucket per type and computes, for each bucket, the stats for height and mass and the maximum diameter and length.
{ root_search ( aggs: { name: "rootTerms" value: { terms: { field: "type" size: 19 show_term_doc_count_error: true } aggs: [ { name: "height_stats" value: {stats: {field: "height"}} } { name: "mass_stats" value: {stats: {field: "mass"}} } { name: "diameter_max" value: {max: {field: "diameter"}} } { name: "length_max" value: {max: {field: "length"}} } ] } } ) { aggregations } }
Sub-aggregations¶
Bucket aggregations support bucket or metric sub-aggregations. For example, a terms aggregation with an avg sub-aggregation calculates an average value for each bucket of documents. There is no level or depth limit for nesting sub-aggregations.
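As a rough mental model, a terms aggregation with an avg sub-aggregation behaves like a group-by followed by a per-group average. The following sketch performs that computation over plain Python dictionaries. The sample documents are illustrative, the result layout is a simplification of what Elasticsearch returns, and the function name terms_with_avg is hypothetical.

```python
from collections import defaultdict


def terms_with_avg(documents, terms_field, avg_field):
    """Group documents by `terms_field`, then average `avg_field` per bucket."""
    groups = defaultdict(list)
    for document in documents:
        key = document.get(terms_field)
        if key is None:
            continue  # documents without the terms field form no bucket
        groups[key].append(document.get(avg_field))

    buckets = {}
    for key, values in groups.items():
        present = [value for value in values if value is not None]
        buckets[key] = {
            "doc_count": len(values),  # every document in the bucket counts
            "avg": sum(present) / len(present) if present else None,
        }
    return buckets


characters = [  # illustrative documents, not real dataset content
    {"gender": "male", "height": 172},
    {"gender": "male", "height": 188},
    {"gender": "female", "height": 150},
    {"gender": "female", "height": None},
]

print(terms_with_avg(characters, "gender", "height"))
# {'male': {'doc_count': 2, 'avg': 180.0}, 'female': {'doc_count': 2, 'avg': 150.0}}
```

Deeper nesting repeats the same step: each bucket becomes the document set for the next level of sub-aggregations.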
This query uses metrics aggregations as sub-aggregations: it creates buckets by Planet id and calculates the min, max, average, and percentiles of resident.height for each planet.
{ planet_search ( aggs: [ { name: "planet" value: { terms:{ field: "id" size: 10 } aggs:{ name: "residents" value: { nested: { path: "resident" } aggs:[ { name: "residentAvgHeight" value: { avg: { field: "resident.height" } } } { name: "residentMinHeight" value: { min: { field: "resident.height" } } } { name: "residentMaxHeight" value: { max: { field: "resident.height" } } } { name: "residentPercentilesHeight" value: { percentiles: { field: "resident.height" } } } ] } } } } ] ) { aggregations } }
This query shows nested sub-aggregations on Characters: a terms aggregation over film.id and, inside each film bucket, a reverse_nested terms aggregation over gender.
{ character_search ( aggs: { name: "film" value: { nested: { path: "film" } aggs: { name: "film_terms" value: { terms: { field: "film.id" } aggs: { name: "reverse_nested" value: { reverse_nested: {} aggs: { name: "gender" value: { terms: { field: "gender" missing: "probably_droid" } } } } } } } } } ) { aggregations } }
The following GraphQL query shows how to aggregate the height field with different aggregation types: the count of each distinct height, the minimum height, and the maximum height:
query character_aggregations { character_search( aggs: [{ name: "heights", value: { terms: { field: "height" } } }, { name: "max-height", value: { max: { field: "height" } } }, { name:"min-height", value: { min: { field: "height" } } }] ) { max_score aggregations } }
You can explore the GraphQL schema to see all available query and aggregation input types. This can be done with any GraphQL introspection tool, for example GraphiQL or GraphQL Playground.
You can also mix together queries, sorting, and aggregations as long as they are valid according to the Elasticsearch DSL.
query character_aggregations { character_search( query: { match: { eyeColor: { query: "blue" } } } from: 5 size: 5 sort: { height: { order: desc } } aggs: [{ name: "heights", value: { terms: { field: "height" } } }, { name: "max-height", value: { max: { field: "height" } } }, { name:"min-height", value: { min: { field: "height" } } }] ) { max_score hits { score character { id name } } aggregations } }