Definition, quoted from the SPARQL 1.1 Overview:
SPARQL 1.1 is a set of specifications that provide languages and protocols to query and manipulate RDF graph content on the Web or in an RDF store.
I roughly divide W3C's SPARQL 1.1 Specifications into 5 parts:
protocol
SPARQL 1.1 Protocol
SPARQL 1.1 Graph Store HTTP Protocol
result format
SPARQL 1.1 Query Results JSON Format
SPARQL 1.1 Query Results CSV and TSV Formats
SPARQL Query Results XML Format (Second Edition)
application
SPARQL 1.1 Service Description
SPARQL 1.1 Federated Query
SPARQL 1.1 Entailment Regimes
conformance assessment
SPARQL 1.1 Test Cases
This note records the SPARQL languages (Query and Update) and how Jena's ARQ, one implementation, supports them.
No one really wants to read the specifications directly, especially the impatient, myself included. So I chose to stand on the shoulders of giants: Section 2.1 Query is mostly drawn from [1].
At first glance, SPARQL Query's keywords seem to be copied from SQL, the relational query language. Be careful: although the keywords look familiar, the underlying data model and storage are different.
(1) a sample with projection and modifier:
# prefixes
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# select clause, with projection
SELECT ?sub ?prop ?obj
WHERE {
?sub ?prop ?obj.
}
OFFSET 0 LIMIT 10 # using query modifiers
The WHERE clause specifies a graph pattern. Each line in the WHERE clause above is a triple pattern, and triple patterns are separated by a dot (.). Names that begin with ? are variables; when a SPARQL endpoint generates a query answer, each variable is bound to an RDF term (an IRI, a literal, or a blank node).
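To make the idea of binding concrete, here is a minimal sketch in plain Java (no Jena involved), assuming a toy in-memory list of triples written as strings. The class name TriplePatternDemo, the match helper and the sample data are all hypothetical; a real engine such as ARQ works on RDF terms and joins many patterns, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: triples and pattern terms are plain strings,
// and a term starting with '?' is treated as a variable.
class TriplePatternDemo {

    // Match one triple pattern {s, p, o} against every triple and return
    // one variable-to-value map (a "solution") per matching triple.
    static List<Map<String, String>> match(List<String[]> triples, String[] pattern) {
        List<Map<String, String>> solutions = new ArrayList<>();
        for (String[] triple : triples) {
            Map<String, String> binding = new LinkedHashMap<>();
            boolean ok = true;
            for (int i = 0; i < 3 && ok; i++) {
                String term = pattern[i];
                if (term.startsWith("?")) {
                    // A variable must bind to the same value everywhere it occurs.
                    String previous = binding.putIfAbsent(term, triple[i]);
                    ok = previous == null || previous.equals(triple[i]);
                } else {
                    // A constant must equal the triple's term exactly.
                    ok = term.equals(triple[i]);
                }
            }
            if (ok) {
                solutions.add(binding);
            }
        }
        return solutions;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from any real dataset.
        List<String[]> triples = Arrays.asList(
            new String[] {"ex:alice", "foaf:name", "\"Alice\""},
            new String[] {"ex:alice", "foaf:knows", "ex:bob"},
            new String[] {"ex:bob", "foaf:name", "\"Bob\""});
        // Corresponds to the triple pattern: ?sub foaf:name ?obj .
        System.out.println(match(triples, new String[] {"?sub", "foaf:name", "?obj"}));
    }
}
```

Each returned map corresponds to one row of a SELECT result; the pattern ?sub ?prop ?obj from sample (1) would simply match every triple.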
(2) a sample cited from "Linked Data in Action" with additional RDF dataset and abbreviation of triple:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX pos: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?name ?latitude ?longitude
# graphs merged into the default graph
FROM <http://3roundstones.com/dave/me.rdf>
FROM <http://semanticweb.org/wiki/Special:ExportRDF/Michael_Hausenblas>
WHERE {
?person foaf:name ?name ;
foaf:based_near ?near .# abbreviation
?near pos:lat ?latitude ;
pos:long ?longitude .
}
?person foaf:name ?name ; foaf:based_near ?near . uses an abbreviation from Turtle syntax. Here are some of the Turtle abbreviations:
a :b c.    ==    a :b c;
a :e f.            :e f.

a :b c.    ==    a :b c;    ==    a :b c, d.
a :b d.            :b d.
The FROM clauses specify the RDF dataset to query: the listed graphs are merged to form the default graph. Without FROM, the default graph of the SPARQL endpoint is used.
(3) queries with named graph and blank nodes
PREFIX tbl: <http://www.w3.org/People/Berners-Lee/card#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT *
FROM NAMED <http://www.koalie.net/foaf.rdf>
FROM NAMED <http://heddley.com/edd/foaf.rdf>
FROM NAMED <http://www.cs.umd.edu/~hendler/2003/foaf.rdf>
FROM NAMED <http://www.dajobe.org/foaf.rdf>
FROM NAMED <http://www.isi.edu/~gil/foaf.rdf>
FROM NAMED <http://www.ivan-herman.net/foaf.rdf>
FROM NAMED <http://www.kjetil.kjernsmo.net/foaf>
FROM NAMED <http://www.lassila.org/ora.rdf>
FROM NAMED <http://www.mindswap.org/2004/owl/mindswappers>
WHERE {
GRAPH ?originGraph {
_:blank1 foaf:knows _:blank2.
_:blank2 rdf:type foaf:Person;
foaf:nick ?nickname;
foaf:name ?realname
}
}
The FROM NAMED <IRI> clauses declare named graphs, which are used to identify the data source a resource comes from. GRAPH ?originGraph {<graph-pattern>} matches a graph pattern inside the named graphs; ?originGraph is the graph variable, bound to the IRI of the named graph that matched. We can also name a particular graph IRI explicitly in the GRAPH clause; here is an example:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?nickname
WHERE {
GRAPH <http://www.w3.org/People/Berners-Lee/card> {
_:blank3 foaf:nick ?nickname
}
}
Terms like _:blank1 are blank nodes. In a query they play the same role as variables, but they cannot be included in the query answers. Blank nodes have two forms of notation, with the following equivalences:
_:blank :prop obj.      ==    [] :prop obj.    or    [ :prop obj ]

_:blank :prop1 obj1.    ==    [ :prop1 obj1 ]
_:blank :prop2 obj2.            :prop2 obj2.

sub :prop1 _:blank.     ==    sub :prop1 [ :prop2 obj ].
_:blank :prop2 obj.

_:blank :prop1 obj1;    ==    [ :prop1 obj1;
        :prop2 obj2.            :prop2 obj2 ].

_:blank :prop obj1,     ==    [ :prop obj1, obj2 ]
              obj2.
DISTINCT removes duplicate results from the query answers.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?picture
WHERE {
?person rdfs:label "George Washington"@en;
dbprop:occupation ?job;
dbprop:birthPlace ?birthLoc;
foaf:img ?picture
}
REDUCED is just like DISTINCT, except that the SPARQL endpoint is allowed to return any number of the duplicate results.
ORDER BY sorts results alphabetically or numerically, in ascending order (ASC()) by default. Use DESC() to specify descending order.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?job ?birthLoc ?picture
WHERE {
?person rdfs:label "George Washington"@en;
dbprop:occupation ?job;
dbprop:birthPlace ?birthLoc;
foaf:img ?picture
}
ORDER BY ?birthLoc DESC(?job)
OFFSET and LIMIT should be used together with ORDER BY; they return the results one slice (page) at a time.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?job ?birthLoc ?picture
WHERE {
?person rdfs:label "George Washington"@en;
dbprop:occupation ?job;
dbprop:birthPlace ?birthLoc;
foaf:img ?picture
}
ORDER BY ?birthLoc DESC(?job)
OFFSET 0 LIMIT 10
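The interplay of these modifiers can be sketched in plain Java (no Jena). SPARQL applies ORDER BY first, then DISTINCT, then OFFSET, then LIMIT; the sketch below mimics that order on a toy list where each row is a single projected value. The class name SolutionModifierDemo, the page helper and the job titles are hypothetical and only serve the illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Toy illustration only: each "row" is a single already-projected value.
class SolutionModifierDemo {

    // Apply the modifiers in the order SPARQL defines them:
    // ORDER BY, then DISTINCT, then OFFSET, then LIMIT.
    static List<String> page(List<String> rows, boolean distinct, long offset, long limit) {
        List<String> ordered = rows.stream()
            .sorted()                            // ORDER BY ASC(?row)
            .collect(Collectors.toList());
        List<String> deduplicated = distinct
            ? ordered.stream().distinct().collect(Collectors.toList())  // DISTINCT
            : ordered;
        return deduplicated.stream()
            .skip(offset)                        // OFFSET
            .limit(limit)                        // LIMIT
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> jobs = Arrays.asList("Surveyor", "Farmer", "Soldier", "Farmer", "President");
        // Second page of size 2 over the distinct, sorted values.
        System.out.println(page(jobs, true, 1, 2));   // prints [President, Soldier]
    }
}
```

Without the sorting step the slice chosen by OFFSET and LIMIT would depend on whatever order the rows happen to arrive in, which is why the two are paired with ORDER BY.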
FILTER restricts the query answers to those that satisfy a boolean condition.
These conditions can be specified using: (a) a subset of XQuery, XPath operators and functions, or (b) SPARQL specific operators.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?prop ?object
WHERE {
?person rdfs:label "George Washington"@en;
dbprop:presidentStart ?start;
?prop ?object.
FILTER(xsd:integer(?start) + 4 <= xsd:integer(?object))
}
The position of FILTER within its enclosing { } group of the WHERE clause is not important.
OPTIONAL adds further graph patterns to the WHERE clause without discarding results that have no bindings for those patterns.
PREFIX ex: <http://www.example.com/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbprop: <http://dbpedia.org/property/>
SELECT ?l1 ?l2 ?l3 ?l4
WHERE {
?person rdfs:label "George Washington"@en.
?l1 dbprop:namedFor ?person.
OPTIONAL { ?l2 dbprop:blankInfo ?person }
OPTIONAL { ?l3 ex:isNamedAfter ?person }
OPTIONAL { ?person ex:wasFamousAndGaveNameTo ?l4 }
}
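One way to picture OPTIONAL is as a left join over solutions. The following is a minimal sketch in plain Java (no Jena), assuming each solution is a map from variable name to value; the class OptionalDemo, its helpers and the ex: values are hypothetical. It ignores FILTERs inside the optional part, which the real SPARQL algebra also takes into account.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: a solution is a map from variable name to value.
class OptionalDemo {

    // Small helper to build a solution from alternating variable/value arguments.
    static Map<String, String> row(String... keyValues) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            m.put(keyValues[i], keyValues[i + 1]);
        }
        return m;
    }

    // Two solutions are compatible when every variable they share has the same value.
    static boolean compatible(Map<String, String> a, Map<String, String> b) {
        for (Map.Entry<String, String> e : a.entrySet()) {
            String other = b.get(e.getKey());
            if (other != null && !other.equals(e.getValue())) {
                return false;
            }
        }
        return true;
    }

    // Extend each required solution with every compatible optional solution,
    // and keep it unchanged when there is none.
    static List<Map<String, String>> leftJoin(List<Map<String, String>> required,
                                              List<Map<String, String>> optional) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> left : required) {
            boolean extended = false;
            for (Map<String, String> right : optional) {
                if (compatible(left, right)) {
                    Map<String, String> merged = new LinkedHashMap<>(left);
                    merged.putAll(right);
                    result.add(merged);
                    extended = true;
                }
            }
            if (!extended) {
                result.add(left);   // the optional variables simply stay unbound
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> required = Arrays.asList(
            row("?person", "ex:washington", "?l1", "ex:WashingtonDC"),
            row("?person", "ex:other", "?l1", "ex:Somewhere"));
        List<Map<String, String>> optional = Arrays.asList(
            row("?person", "ex:washington", "?l3", "ex:WashingtonState"));
        System.out.println(leftJoin(required, optional));
    }
}
```

The second required row survives even though nothing in the optional part matches it; in a SELECT result its ?l3 column would be empty.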
UNION combines the results of two or more alternative graph patterns into a single result.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT *
WHERE {
{ ?unknown foaf:gender "male" }
UNION
{ ?unknown foaf:gender "female" } .
{ ?unknown foaf:interest <http://www.iamhuman.com> }
UNION
{ ?unknown foaf:interest <http://lovebeinghuman.org> }
}
CAUTION: the query examples above are all SELECT queries; none of them uses CONSTRUCT, ASK or DESCRIBE.
CONSTRUCT transforms data from the queried RDF dataset into data in another RDF graph.
ASK is used to check whether a given graph pattern has a match in the SPARQL endpoint; it returns a boolean result.
DESCRIBE is used when the query client does not know the details of the structure of the data; the SPARQL endpoint decides what to return and how to organize it.
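As a rough picture of how ASK and CONSTRUCT differ from SELECT, here is a minimal sketch in plain Java (no Jena), assuming the graph pattern has already been matched and produced a list of solutions. The class QueryFormDemo, its methods and the sample values are hypothetical. A real CONSTRUCT returns an RDF graph (a set of triples), whereas this toy returns a list and does not merge duplicates; DESCRIBE is left out because its result is chosen by the endpoint.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: solutions are variable-to-value maps, triples are string arrays.
class QueryFormDemo {

    // ASK only reports whether the graph pattern produced at least one solution.
    static boolean ask(List<Map<String, String>> solutions) {
        return !solutions.isEmpty();
    }

    // CONSTRUCT instantiates a template triple once per solution,
    // replacing each variable with the value bound in that solution.
    static List<String[]> construct(String[] template, List<Map<String, String>> solutions) {
        List<String[]> triples = new ArrayList<>();
        for (Map<String, String> solution : solutions) {
            String[] triple = new String[3];
            boolean complete = true;
            for (int i = 0; i < 3; i++) {
                String term = template[i];
                triple[i] = term.startsWith("?") ? solution.get(term) : term;
                complete = complete && triple[i] != null;
            }
            if (complete) {
                triples.add(triple);   // a triple with an unbound variable is dropped
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        Map<String, String> s = new LinkedHashMap<>();
        s.put("?person", "ex:alice");
        s.put("?name", "\"Alice\"");
        List<Map<String, String>> solutions = Arrays.asList(s);
        System.out.println(ask(solutions));
        // Template corresponds to: CONSTRUCT { ?person rdfs:label ?name } WHERE { ... }
        for (String[] t : construct(new String[] {"?person", "rdfs:label", "?name"}, solutions)) {
            System.out.println(String.join(" ", t) + " .");
        }
    }
}
```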
TODO
<dependency>
<groupId>org.apache.jena</groupId>
<artifactId>apache-jena-libs</artifactId>
<type>pom</type>
<version>2.13.0</version>
</dependency>
see SelectQueryUsingModel.java
see SelectQueryUsingDataset.java
or play with existing models:
Dataset dataset = DatasetFactory.create() ;
dataset.setDefaultModel(model) ;
dataset.addNamedModel("http://example/named-1", modelX) ;
dataset.addNamedModel("http://example/named-2", modelY) ;
try(QueryExecution qExec = QueryExecutionFactory.create(query, dataset)) {
...
}
see SelectQueryUsingRemoteService.java
ResultSet results = qexec.execSelect() ;
results = ResultSetFactory.copyResults(results) ;
return results;
The command-line classes are located in package arq.
java -cp .. arq.rsparql --service 'http://www.sparql.org/books/sparql' \
'PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?title WHERE { ?book dc:title ?title }'
java -cp .. arq.rsparql --service='http://www.sparql.org/books/sparql' \
--file=query/books.rq
Here the query directory given in the --file argument is on the CLASSPATH.
For the other command-line classes, refer to ARQ - Command Line Applications for more details.
cd $JENA_HOME/bin
./rsparql --service 'http://www.sparql.org/books/sparql' \
'PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?title WHERE { ?book dc:title ?title }'
# or
./rsparql --service='http://www.sparql.org/books/sparql' --file=./query/books.rq
QueryExecution qexec = ...;
Model resultModel = qexec.execConstruct() ;
QueryExecution qexec = ...;
Model resultModel = qexec.execDescribe() ;
QueryExecution qexec = ...;
boolean result = qexec.execAsk() ;
[1] Hebeler J, Fisher M, et al. Web 3.0 and Semantic Web Programming [M] (Chinese edition). Beijing: Tsinghua University Press, 2010.