Overview and features
Ontotext Refine (“Refine”) is a version of the open-source OpenRefine data transformation tool adapted to work with Ontotext GraphDB.
Refine lets you quickly clean, map, and transform structured data to RDF and load it into GraphDB.
It supports input from:
- Tabular formats (TSV, CSV, *SV)
- Fixed-width text files
- Excel (XLS, XLSX)
- JSON, JSON-LD, XML
- RDF (RDF/XML, Turtle/N3)
- Databases (PostgreSQL, MySQL, MariaDB, SQLite)
You can input data from local files, remote URLs, and clipboard snippets.
Refine enables you to:
Create projects and upload your data file(s)
Clean and transform the data using row and column manipulations, faceting, and clustering
Implement complex transformations using:
- Expressions and GREL (General Refine Expression Language, formerly Google Refine Expression Language)
- GraphDB Functions including SPIN functions
- The cross() function, which combines datasets across Refine projects
- SPARQL Federation and the virtual SPARQL endpoint of each Refine project, which together combine multiple repositories and projects
- The Refine command-line interface
Create a visual RDF mapping of the cleaned data
- The RDF mapping visual UI is optimized to guide you in defining URLs, choosing the right predicates and types, defining datatypes, etc.
- Generate the respective SPARQL query
- Export the RDF data
Expose a virtual SPARQL endpoint that allows you to write complex SPARQL queries
- Export RDF data using a SPARQL CONSTRUCT query
- Load RDF data to a GraphDB repository using a federated SPARQL UPDATE query
Export project configurations and mappings to automate the same transformation on other data with a similar structure
You can generate Refine queries from semantic models using the open-source rdf2rml toolkit.
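
The two query shapes mentioned in the feature list, a SPARQL CONSTRUCT export and a federated SPARQL UPDATE load, can be sketched as plain query strings. The Python below is only an illustration under stated assumptions: the helper names `construct_query` and `federated_insert` are hypothetical, the endpoint IRI is a placeholder rather than the real address of a Refine project, and the triple template and graph pattern are passed in by the caller because how a project's data appears in the virtual endpoint depends on your own mapping.

```python
def construct_query(template: str, pattern: str) -> str:
    """Build a SPARQL CONSTRUCT query from a triple template and a graph pattern.

    Hypothetical helper for illustration; not part of Refine or GraphDB.
    """
    return f"CONSTRUCT {{ {template} }}\nWHERE {{ {pattern} }}"


def federated_insert(service_iri: str, template: str, pattern: str) -> str:
    """Build a SPARQL UPDATE that reads through a SERVICE clause and inserts
    the resulting triples into the repository that executes the update.

    Hypothetical helper for illustration; not part of Refine or GraphDB.
    """
    return (
        f"INSERT {{ {template} }}\n"
        f"WHERE {{ SERVICE <{service_iri}> {{ {pattern} }} }}"
    )


if __name__ == "__main__":
    # Placeholder IRI (assumption): substitute the virtual SPARQL endpoint
    # of your own Refine project.
    endpoint = "http://example.org/refine-project"

    # Generic template and pattern (assumption): replace them with the
    # ones that match your RDF mapping.
    print(construct_query("?s ?p ?o", "?s ?p ?o"))
    print()
    print(federated_insert(endpoint, "?s ?p ?o", "?s ?p ?o"))
```

In this sketch the CONSTRUCT query would be sent to the project's virtual endpoint to export RDF, while the UPDATE would be run in the target GraphDB repository, which fetches the data from the project through the `SERVICE` clause.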