SPARQL To Objects (S2O)
A SPARQL extension for expressing mappings between RDF graphs and JSON objects.

Authors:
Alberto Reggiori, Asemantics S.r.l.
Andrea Marchesini, Asemantics S.r.l.
Version:
0.41 2007-11-08

Abstract

This document describes SPARQL To Objects (nicknamed "S2O"), an extension for mapping RDF graphs to JSON objects. It leverages the SPARQL syntax to express graph patterns matched against an input source, and the expressiveness of the JSON syntax to mock up templates for the generated object data structures.

This document does not discuss any JSON-specific object modeling language other than the universal one built into JSON itself (a collection of name/value pairs or an ordered list of values). Rather than mandating a JSON-based object modeling language to express object hierarchies, object types, object identifiers, relationships and so on, most of this is left to the user's decision when mocking up the JSON data structures that best fit the target application. We expect future specifications to layer more specific semantics on top of the JSON structures generated by an S2O engine.

In addition, this document does not discuss protocol details of any specific service or software that might use S2O to express mappings between RDF graphs and other object services (e.g. object or relational stores).

Status of This Document

Published for discussion.


1 Introduction

The SPARQL query language provides a way to construct a single RDF graph by specifying a graph template, but it does not provide a way to construct arbitrary object structures. S2O is a SPARQL extension for expressing declarative mappings between RDF graphs and user-defined JSON objects. More generally, it is meant to provide a generic mechanism for expressing mappings between RDF models and domain- or application-specific models.

S2O uses a text-based, human-readable syntax built on the existing SPARQL query syntax, together with a template syntax based on the JSON grammar; this reduces both the learning curve for developers and implementation costs.

SPARQL for JSON aims to:

1.1 Scope and Limitations

SPARQL for JSON:

1.2 Document Conventions

1.2.1 Namespaces

In this document, examples assume the following namespace prefix bindings unless otherwise stated:

Prefix IRI
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
xsd: http://www.w3.org/2001/XMLSchema#

1.2.2 Data Descriptions

This document uses the TURTLE data format to show each triple explicitly. Turtle allows IRIs to be abbreviated with prefixes:

@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix :     <http://sparql4json.org/example/> .
:example  dc:title  "SPARQL for JSON" .

1.2.3 Result Descriptions

Resulting JSON objects are illustrated as follows:

[ { "dc:title": "SPARQL for JSON" } ]

2 Mapping RDF triples to JSON objects

S2O mapping queries consist of a SPARQL basic graph pattern and an object template. The basic graph pattern matches a subgraph of the RDF Dataset, while the object template expresses how to generate a JSON structure from the matched RDF triples. The result of a mapping query is a mapping sequence, corresponding to the ways in which the query's graph pattern matches the data. There may be zero, one or multiple object mappings for a query. The structure of each result object can be completely different from the structure of the RDF data in the source. In constructing a result object, RDF terms from the source graph can be filtered and reordered, and arbitrary structure can be added. This mechanism allows an S2O query to be applicable to a wide class of RDF Datasets that have similar structures.

2.1 Examples

This section provides a set of example mappings expressed using the default Universal Object Modeling Language.

2.1.1 Mapping a simple object

Data:

<http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> "SPARQL To Objects" .

Mapping Query:

MAPPING JSON {
   "title": ?title
}
WHERE
{
  <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title .
}    

This query, on the data above, has one solution:

Query Result:

[
   {
      "title": "SPARQL To Objects"
   }
]
    

2.1.2 Mapping multiple objects

Data:

@prefix foaf:  <http://xmlns.com/foaf/0.1/> .

_:a  foaf:name   "Johnny Lee Outlaw" .
_:a  foaf:mbox   <mailto:jlow@example.com> .
_:b  foaf:name   "Peter Goodguy" .
_:b  foaf:mbox   <mailto:peter@example.org> .
_:c  foaf:mbox   <mailto:carol@example.org> .

Mapping Query:

PREFIX foaf:   <http://xmlns.com/foaf/0.1/>
MAPPING JSON {
   "name": ?name,
   "contact": ?mbox
}
WHERE
  { ?x foaf:name ?name .
    ?x foaf:mbox ?mbox }

Query Result:

[
   {
      "name": "Johnny Lee Outlaw",
      "contact": "mailto:jlow@example.com"
   },
   {
      "name": "Peter Goodguy",
      "contact": "mailto:peter@example.org"
   }
]
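
The result above contains only two objects because the basic graph pattern requires both a foaf:name and a foaf:mbox for the same subject, and _:c has no name. As an illustration only, the following Python sketch emulates that join over the example triples. It is not how an S2O engine is specified to work; the function names are hypothetical.

```python
# Triples from the data above as (subject, predicate, object) tuples.
FOAF = "http://xmlns.com/foaf/0.1/"
triples = [
    ("_:a", FOAF + "name", "Johnny Lee Outlaw"),
    ("_:a", FOAF + "mbox", "mailto:jlow@example.com"),
    ("_:b", FOAF + "name", "Peter Goodguy"),
    ("_:b", FOAF + "mbox", "mailto:peter@example.org"),
    ("_:c", FOAF + "mbox", "mailto:carol@example.org"),
]


def match_name_and_mbox(data):
    """Emulate { ?x foaf:name ?name . ?x foaf:mbox ?mbox } with a nested-loop join."""
    solutions = []
    for s1, p1, o1 in data:
        if p1 != FOAF + "name":
            continue
        for s2, p2, o2 in data:
            # Both triple patterns share ?x, so the subjects must be equal.
            if p2 == FOAF + "mbox" and s2 == s1:
                solutions.append({"name": o1, "mbox": o2})
    return solutions


def apply_template(solutions):
    """Instantiate { "name": ?name, "contact": ?mbox } once per solution."""
    return [{"name": s["name"], "contact": s["mbox"]} for s in solutions]


mapping_sequence = apply_template(match_name_and_mbox(triples))
```

Here _:c never contributes a solution, since no foaf:name triple exists for it, so its mailbox does not appear in the mapping sequence.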
    

2.1.3 Mapping an RSS 1.0 channel to a JSON object

Data:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://purl.org/rss/1.0/> .

<http://example.com/rss/feed1>
    a :channel;
    :link "http://example.com/rss/feed1";
    :title "Sample RSS 1.0 feed";
    :items [
        a rdf:Seq;
        rdf:_1 <http://example.com/post/1>;
        rdf:_2 <http://example.com/post/2>;
        rdf:_3 <http://example.com/post/3>
    ] .

<http://example.com/post/1>
    a :item;
    :link "http://example.com/post/1";
    :title "post 1";
    :description "this is description of post 1" .

<http://example.com/post/2>
    a :item;
    :link "http://example.com/post/2";
    :title "post 2";
    :description "this is description of post 2" .

<http://example.com/post/3>
    a :item;
    :link "http://example.com/post/3";
    :title "post 3";
    :description "this is description of post 3" .

Mapping Query:

PREFIX rss: <http://purl.org/rss/1.0/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
MAPPING JSON {
   "type": "Channel",
   "link": ?channel_link,
   "title": ?channel_title,
   "items": [
      {
         "type": "Item",
         "link": ?item_link,
         "title": ?item_title,
         "description": ?item_description
      }
   ]
}
WHERE
{
  ?channel a rss:channel ;
                 rss:items ?seq ;
                 rss:title ?channel_title ;
                 rss:link ?channel_link .
        ?seq a rdf:Seq ;
             ?li ?item .
        ?item a rss:item ;
              rss:title ?item_title ;
              rss:link ?item_link ;
              rss:description ?item_description .
}
ORDER BY str(?li)

This mapping query, on the data above, produces a single result object; the query solutions (one per item) are grouped under the channel:

Query Result:

[
   {
      "type": "Channel",
      "link": "http://example.com/rss/feed1",
      "title": "Sample RSS 1.0 feed",
      "items": [
         {
            "type": "Item",
            "link": "http://example.com/post/1",
            "title": "post 1",
            "description": "this is description of post 1"
         },
         {
            "type": "Item",
            "link": "http://example.com/post/2",
            "title": "post 2",
            "description": "this is description of post 2"
         },
         {
            "type": "Item",
            "link": "http://example.com/post/3",
            "title": "post 3",
            "description": "this is description of post 3"
         }
      ]
   }
]
    

2.1.4 Mapping RDF Literals constructs

Because the S2O mapping model is orthogonal to the underlying RDF triple model, a conversion between RDF literals and JSON built-in scalar data types might be necessary. An S2O engine should automatically cast most of the main XML Schema datatypes to JSON string values. In addition, the language provides a few functions to explicitly map RDF literals to JSON built-in scalars such as null, boolean, number and string. Conversely, it might be necessary to map literal-specific components such as language and datatype to specific JSON values.

Here is an example converting a few RDF literals into specific JSON scalar built-in types.

The data below contains three RDF literals:

@prefix dt:   <http://example.org/datatype#> .
@prefix ns:   <http://example.org/ns#> .
@prefix :     <http://example.org/ns#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

:x   ns:p     "cat"@en .
:y   ns:p     "42"^^xsd:integer .
:z   ns:p     "abc"^^dt:specialDatatype .

Mapping Query:

PREFIX ns:   <http://example.org/ns#>
PREFIX dt:   <http://example.org/datatype#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
PREFIX s2o:  <http://www.asemantics.com/specs/s2o/functions/>
PREFIX json: <http://www.json.org/>
MAPPING JSON {
   "cat": {
      "value": ?cat,
      "language": s2o:lang(?cat)
   },
   "num": {
      "value": json:number(?num),
      "datatype": s2o:datatype(?num)
   },
   "abc": {
      "value": json:string(?abc),
      "datatype": s2o:datatype(?abc)
   }
}
WHERE
{
   ?r1 ns:p ?cat .
   FILTER( lang(?cat) = "en" ) .
   ?r2 ns:p ?num .
   FILTER( datatype(?num) = xsd:integer ) .
   ?r3 ns:p ?abc .
   FILTER( datatype(?abc) = dt:specialDatatype ) .
}

Note that the s2o prefix above is bound to the http://www.asemantics.com/specs/s2o/functions/ namespace in order to instruct the S2O engine to use extension functions for managing RDF literals. Likewise, the json prefix is bound to the http://www.json.org/ namespace in order to use the JSON-specific functions.

Query Result:

[
   {
      "cat": {
         "value": "cat",
         "language": "en"
      },
      "num": {
         "value": 42,
         "datatype": "http://www.w3.org/2001/XMLSchema#integer"
      },
      "abc": {
         "value": "abc",
         "datatype": "http://example.org/datatype#specialDatatype"
      }
   }
]
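
The literal handling in this example can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the Literal class and the functions s2o_lang, s2o_datatype, json_number and json_string are hypothetical stand-ins for the s2o: and json: extension functions, whose exact behaviour this document does not define in detail.

```python
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Literal:
    """Hypothetical in-memory form of an RDF literal."""
    lexical: str
    language: Optional[str] = None
    datatype: Optional[str] = None


def s2o_lang(lit):
    """Stand-in for s2o:lang(): the language tag, or None if absent."""
    return lit.language


def s2o_datatype(lit):
    """Stand-in for s2o:datatype(): the datatype IRI as a string, or None."""
    return lit.datatype


def json_number(lit):
    """Stand-in for json:number(): parse the lexical form as an int, else a float."""
    try:
        return int(lit.lexical)
    except ValueError:
        return float(lit.lexical)


def json_string(lit):
    """Stand-in for json:string(): the lexical form as a JSON string."""
    return lit.lexical


cat = Literal("cat", language="en")
num = Literal("42", datatype=XSD + "integer")
abc = Literal("abc", datatype="http://example.org/datatype#specialDatatype")

result = {
    "cat": {"value": json_string(cat), "language": s2o_lang(cat)},
    "num": {"value": json_number(num), "datatype": s2o_datatype(num)},
    "abc": {"value": json_string(abc), "datatype": s2o_datatype(abc)},
}
```

In this sketch the typed literal "42"^^xsd:integer becomes the JSON number 42, and its datatype is reported as the full XML Schema integer IRI.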
    

3 Mapping Model

An S2O mapping query defines a transformation of an RDF graph matched against an input RDF Dataset to a sequence of JSON objects. A mapping returns zero, one or multiple JSON objects as specified in an object-template expressed using a specific object modeling language. The result forms a mapping sequence by taking a SPARQL query solution sequence, substituting for the variables in the object-template, and combining the JSON values into a single JSON array or object structure.
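
The substitution step described above can be sketched in Python as follows. This is only an illustration: the Var placeholder class and the instantiate and run_mapping functions are hypothetical names, and the handling of unbound variables (a KeyError here) is an assumption, since this document does not specify it.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """Hypothetical placeholder standing for a SPARQL variable such as ?title."""
    name: str


def instantiate(template, solution):
    """Return a copy of the template with every Var replaced by its bound value."""
    if isinstance(template, Var):
        # Assumption: an unbound variable is an error (raises KeyError).
        return solution[template.name]
    if isinstance(template, dict):
        return {key: instantiate(value, solution) for key, value in template.items()}
    if isinstance(template, list):
        return [instantiate(item, solution) for item in template]
    return template  # constant JSON scalar, copied as-is


def run_mapping(template, solutions):
    """Build the mapping sequence: one instantiated object per query solution."""
    return [instantiate(template, solution) for solution in solutions]


# Template and solution sequence corresponding to the example in section 2.1.1.
template = {"title": Var("title")}
solutions = [{"title": "SPARQL To Objects"}]
result_json = json.dumps(run_mapping(template, solutions))
```

Because instantiate recurses through nested objects and arrays, the same sketch handles templates of arbitrary depth. It does not group solutions, which is a separate step.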

3.1 Definitions

This section provides definitions of concepts used elsewhere in this document.

3.1.1 SPARQL specific definitions

This document uses the definitions from the SPARQL Query Language for RDF.

3.1.2 JSON specific definitions

This document uses the definitions from the JSON syntax definition.

3.1.3 Mapping Query

A mapping query is a transformation that takes a basic graph pattern and an object template and produces a mapping sequence.

3.1.4 Object Template

An object template represents a prototype for a JSON object structure. Each instance of a template is a valid JSON object. A mapping sequence consists of JSON objects. Each mapping query must express one and only one object template.

3.1.5 Object Modeling Language

An object modeling language defines the syntax and semantics of an object template, so that an instance of an application domain object model can be expressed in terms of JSON objects. An instance of an object template expressed in a specific object modeling language is a JSON object model expressed in that language.

3.1.6 Mapping Sequence

A mapping sequence is an ordered collection of JSON objects representing the result of a mapping query. A mapping sequence must be a JSON array of objects.
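
The structural constraint on a mapping sequence can be checked mechanically. The following Python sketch, using only the standard json module, tests whether a piece of text is a JSON array whose elements are all objects; the function name is_mapping_sequence is hypothetical.

```python
import json


def is_mapping_sequence(text):
    """Return True if text parses as a JSON array whose elements are all objects."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return False
    # An empty array is accepted: a mapping may return zero objects.
    return isinstance(value, list) and all(isinstance(item, dict) for item in value)
```

For instance, a top-level JSON object or an array of scalars would be rejected by this check.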

3.2 Universal Object Modeling Language

The universal object modeling language corresponds to the syntax and semantics defined by JSON itself. The format defines two structures: a collection of name/value pairs and an ordered list of values. This document specifies that a mapping sequence resulting from processing an object template expressed using the universal object modeling language must be a JSON array of objects.

3.2.1 Universal Object Template

The universal object template resembles the JSON syntax and leverages the expressivity of SPARQL query syntax.

3.2.1.1 Syntax

The universal object template syntax follows the standard JSON object syntax, with the following extensions:

To the above syntax changes the following applies:

3.2.1.2 Relationships

The JSON syntax allows object relationships to be defined implicitly using collections of name/value pairs (objects) or ordered lists of values (arrays). The core definitions do not provide any notation to express relationship cardinality or the semantics of a specific relationship (association, aggregation, etc.). Most of this is left to the application for interpretation.

The universal object modeling language defines that:

Given the above the following applies:

3.2.1.3 Grouping query solutions

An S2O engine must form a mapping sequence by taking a SPARQL query solution sequence, substituting for the variables in the object-template, and combining the JSON values into a single JSON object structure. An object template might express one or more 1-to-many relationships between the mapped object and one or more related objects. In order to correctly generate the corresponding JSON data structure, the S2O engine must group the sequence of SPARQL query solutions (variable bindings) by the corresponding object. This means that for each unique group of variable bindings, an object must be instantiated as specified in the corresponding sub-part of the array object template.

The core universal object template syntax allows arrays, or arrays of objects, to be expressed; to semantically group the sub-parts of the object template, the s2o:group-by() extension function might be used. This document defines that this grouping extension function must receive as its first argument an object template corresponding to the sub-part of the 1-to-many object relationship, and as its second argument a SPARQL expression. The result is a JSON value, which acts as a placeholder for the grouping function within the calling object template.
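
The grouping of query solutions into a parent object with an array of related objects can be sketched in Python as follows. This is an illustration under assumptions, not a definition of s2o:group-by(): the function group_solutions and its parameters are hypothetical, and the grouping key is simplified to a single variable name rather than a general SPARQL expression.

```python
def group_solutions(solutions, parent_key, parent_fields, child_fields, child_name):
    """Collapse flat solutions into one object per parent with an array of children.

    parent_fields and child_fields map JSON keys to variable names.
    """
    grouped = {}  # dicts preserve insertion order, so solution order is kept
    for row in solutions:
        key = row[parent_key]
        if key not in grouped:
            parent = {name: row[var] for name, var in parent_fields.items()}
            parent[child_name] = []
            grouped[key] = parent
        child = {name: row[var] for name, var in child_fields.items()}
        # Instantiate one child object per unique group of variable bindings.
        if child not in grouped[key][child_name]:
            grouped[key][child_name].append(child)
    return list(grouped.values())


# Two flat solutions for the same channel, as a SPARQL engine would return them.
solutions = [
    {"channel_link": "http://example.com/rss/feed1",
     "channel_title": "Sample RSS 1.0 feed",
     "item_link": "http://example.com/post/1", "item_title": "post 1"},
    {"channel_link": "http://example.com/rss/feed1",
     "channel_title": "Sample RSS 1.0 feed",
     "item_link": "http://example.com/post/2", "item_title": "post 2"},
]

channels = group_solutions(
    solutions,
    parent_key="channel_link",
    parent_fields={"link": "channel_link", "title": "channel_title"},
    child_fields={"link": "item_link", "title": "item_title"},
    child_name="items",
)
```

The two solutions share the same channel bindings, so the sketch yields one channel object whose "items" array holds one entry per item, in solution order.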

3.3 Using Alternative Object Modeling Languages

To some extent, JSON can be considered a neutral object modeling language for expressing objects and the relationships between them. Likewise, S2O is meant to be a generic language for expressing mappings between RDF graphs and object models. Nevertheless, there might be several existing JSON-based object modeling languages (for example JDIL - Data Integration with JSON) that can express more complex application domain models than the universal object model above.

It is expected that several JSON object-modeling best-practice schemes and vocabularies will crystallize and be adopted in the future. To provide a fully extensible object mapping solution, the S2O language provides an optional USING modifier to the MAPPING JSON construct, enabling implementations to process and validate the corresponding object templates and generated structures accordingly.

By default, this specification only mandates that an S2O implementation support the Universal Object Modeling Language.

The definition and best-practice usage of alternative JSON based object modeling languages is outside the scope of this document.

4 S2O Syntax

The AS keyword can be used to associate a short name/identifier with a specific mapping. This can be useful in the implementation of Internet protocols exchanging S2O mappings.

A S2O Grammar

B Internet Media Type, File Extension and Macintosh File Type

The Internet Media Type / MIME Type for the S2O JSON Format is "application/json".

It is recommended that S2O query files have the extension ".s2o" (all lowercase) on all platforms.

It is recommended that S2O query files stored on Macintosh HFS file systems be given a file type of "TEXT".

C References

[SPARQL-Q]
SPARQL Query Language for RDF, Eric Prud'hommeaux, Andy Seaborne (editors), W3C Candidate Recommendation 14 June 2007.
[JSON]
JSON.
[JSON-SYNTAX]
The application/json Media Type for JavaScript Object Notation (JSON)
[RDF-OBJECTS]
RDF Objects.
[TURTLE]
Turtle - Terse RDF Triple Language, Dave Beckett.
[SPARQL-JSON-RES]
Serializing SPARQL Query Results in JSON.

 

D FAQ

Why map to objects?
A majority of current software is developed using the object-oriented paradigm, while RDF is generally triple based. Both Java and JavaScript (as well as most modern programming languages) have native support for OO programming, and most DBMSs support Object-Relational Mapping (ORM) or equivalent transformations. In addition, an object model is more natural to an end user or programmer than a graph or other composite data types.
How does S2O differ from other JSON serialization formats of RDF?
There have been several proposals for providing round-trippable or alternative RDF serializations. However, most applications work at a less granular domain-object level, which is generally not an RDF graph, even though the semi-structured nature of the RDF model provides extensible and evolving data models. S2O aims to fill that gap by
How does S2O differ from the proposed DAWG serialization of SPARQL Query Results in JSON?
The Serializing SPARQL Query Results in JSON specification is meant to provide a simple enough JSON serialization of RDF graphs into a single, self-contained tabular structure, while leaving most of the mapping to application-specific domain objects to the user. S2O aims to provide a more generic transformation of RDF graphs to domain-specific data models.
Is S2O similar to ActiveRDF?
S2O is declarative and leverages the SPARQL query model, while ActiveRDF works directly on RDF triple representations. In addition, ActiveRDF relies on full-blown domain-specific RDF Schema/OWL ontology definitions to map the resulting objects, whereas S2O allows RDF triples to be converted more naturally to JSON data structures using ad-hoc, SPARQL-query-driven transformations.
Why does the S2O default Universal Object Modeling Language return an array of objects rather than a top container object directly? Isn't RDF an ordered graph?
Each S2O result is a JSON array of zero, one or more objects. This gives the user more freedom in restructuring the semi-structured input RDF information, without mandating JSON-specific object type and identifier schemes (e.g. @type or @id hash keys). In fact, each array element can represent an opaque JSON object, and the semantics of each object's key/value pairs are left to the application processing the data rather than being hardcoded into the language. For example, if we had returned a JSON object (in fact a hash table) as the top-most structure, we would have had to define specific semantics in the S2O object-template structure to map RDF terms to specific JSON strings (hash keys). Returning an array has the additional benefit that, when bNodes are serialized, they are hidden behind an in-memory anonymous JSON hash reference.