SPARQL Integration tests with SolRDF
In 2014 I had the chance to contribute to a great project, CumulusRDF, an RDF store on a cloud-based architecture. Building the project's integration test suite was definitely a challenging task.
I used JUnit to run some examples coming from Learning SPARQL by Bob DuCharme (O'Reilly, 2013). Both O'Reilly and the author (BTW, thanks a lot) gave me permission to use that material in the project.
So, when I set up the first prototype of SolRDF, I wondered how I could create a complete (integration) test suite doing more or less the same thing…and I came to the obvious conclusion that part of that work could be reused.
Something had to change, though, mainly because CumulusRDF uses Sesame as its underlying RDF framework, while SolRDF uses Jena…but in the end it was a minor change: they are both valid, easy and powerful.
So, in LearningSparql_ITCase there are:
- A setup method for loading the example data;
- A teardown method for cleaning up the store.
The example data is provided on the Learning SPARQL website as several files, each containing a small dataset, a query, or an expected result (in tabular format). Coming back to the test suite, the flow is therefore: load the small dataset X, run the query Y, and verify the results Z (a sketch of that flow follows).
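Just to fix ideas, here is a minimal sketch of how such a flow could map onto a JUnit 4 test class; the class name, file names and helper methods are illustrative, not the actual members of LearningSparql_ITCase:

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class LearningSparqlSketch_ITCase {

    @Before
    public void setUp() throws Exception {
        // Load the example data (the small dataset X) into SolRDF
        loadData("ex_data.ttl");
    }

    @Test
    public void queryWithPrefixes() throws Exception {
        // Run the query (Y) and compare the results (Z)
        executeAndCompare("ex_query.rq");
    }

    @After
    public void tearDown() throws Exception {
        // Clear the store so each test starts from scratch
        clearStore();
    }

    // Illustrative helpers: their bodies correspond to the snippets below
    private void loadData(final String dataFileName) { /* see the loading snippet */ }
    private void executeAndCompare(final String queryFileName) { /* see the query snippet */ }
    private void clearStore() { /* see the clean-up snippet */ }
}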
Although a previous post described how to load a sample dataset into SolRDF, that is something you do from the command line, not within a JUnit test. Using Jena, instead, we can automate the data loading into SolRDF with these few lines:
// DatasetAccessor provides access to remote datasets
// using the SPARQL 1.1 Graph Store HTTP Protocol.
// GRAPH_STORE_ENDPOINT_URI (an illustrative constant) points to SolRDF's /rdf-graph-store endpoint
DatasetAccessor dataset = DatasetAccessorFactory.createHTTP(GRAPH_STORE_ENDPOINT_URI);

// Create a local, in-memory dataset and get its default model
Dataset memoryDataset = DatasetFactory.createMem();
Model memoryModel = memoryDataset.getDefaultModel();

// Read the example data into the local memory model
memoryModel.read(dataURL, ...);

// Load the memory model into the remote dataset
dataset.add(memoryModel);
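A note about the elided arguments: Model.read has several overloads, so besides the URL of the data you can pass the serialization language (e.g. "TTL") and, optionally, a base URI for resolving relative references.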
Great, the data has been loaded! In another post I will explain what I did in SolRDF to support the SPARQL 1.1 Graph Store HTTP Protocol.
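In the meantime, just to give an idea of what happens on the wire, dataset.add(memoryModel) translates into a Graph Store Protocol POST towards that endpoint; host, content type and payload below are, of course, just examples:

POST /rdf-graph-store?default HTTP/1.1
Host: example.com
Content-Type: text/turtle

<http://example.org/book1> <http://purl.org/dc/elements/1.1/title> "A new book" .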
It's time to run some queries and assert on the corresponding results. As you can see below, each test executes the same query twice: first against a local memory model, then against SolRDF. In this way, taking the Jena memory model behaviour as ground truth, each test can check and compare the results coming from SolRDF:
final Query query = QueryFactory.create(readQueryFromFile(...));
QueryExecution execution = null;
QueryExecution memExecution = null;
try {
    // One execution against SolRDF, one against the local memory dataset
    execution = QueryExecutionFactory.sparqlService(SOLRDF_URL, query);
    memExecution = QueryExecutionFactory.create(query, memoryDataset);

    ResultSet rs = execution.execSelect();
    ResultSet mrs = memExecution.execSelect();
    assertTrue(ResultSetCompare.isomorphic(rs, mrs));
} catch (...) {
    ...
} finally {
    // Close both executions and free the underlying resources
    if (execution != null) execution.close();
    if (memExecution != null) memExecution.close();
}
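One remark about the assertion: ResultSetCompare.isomorphic considers the two result sets equal up to blank node renaming, which is exactly what we need here, since the blank node labels returned by SolRDF will hardly match those of the memory model. Also keep in mind that a ResultSet is a one-shot iterator, so the comparison consumes both result sets.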
Last but not least, the RDF store needs to be cleared after each test. Although the Graph Store Protocol would be perfect for this purpose, it cannot be implemented in Solr, because some HTTP methods (namely PUT and DELETE) cannot be used in RequestHandlers: Solr allows those methods only for /schema and /config requests. So, while the clean-up could easily be done with something like this:
dataset.deleteDefault();
Or, in HTTP:
DELETE /rdf-graph-store?default HTTP/1.1
Host: example.com
It's not possible to implement that in SolRDF, so the only remaining approach is the plain Solr way:
SolrServer solr = new HttpSolrServer(SOLRDF_URI);
solr.deleteByQuery("*:*");
solr.commit();
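Note the explicit commit(): in Solr, deletions (like any other update) are not visible to searchers until a commit happens, so without it the next test could still see the triples indexed by the previous one.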
That has nothing to do with RDF or the Graph Store Protocol, but for this (strictly test-scoped) purpose it sounds like a good compromise.
That's all! I just merged all that stuff into master, so feel free to have a look. If you want to run the integration test suite, you can do that from the command line:
> cd $SOLRDF_HOME
> mvn clean install
or in Eclipse, using the predefined Maven launch configuration
solrdf/src/dev/eclipse/run-integration-test-suite.launch.
Just right-click on that file and choose “Run as…”. After starting the suite, you will see messages like these:
...
(build messages)
...
[INFO] ---------------------------------------------------
[INFO] Building Solr RDF plugin 1.0
[INFO] ---------------------------------------------------
...
(unit tests)
...
[INFO] ---------------------------------------------------
[INFO] TESTS
[INFO] ---------------------------------------------------
...
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.691 sec
Tests run: 15, Failures: 0, Errors: 0, Skipped: 0
...
(cargo section. It starts the embedded Jetty)
...
[INFO] [beddedLocalContainer] Jetty 7.6.15.v20140411 Embedded starting...
...
[INFO] [beddedLocalContainer] Jetty 7.6.15.v20140411 Embedded started
...
(integration tests section)
...
[INFO] ---------------------------------------------------
[INFO] TESTS
[INFO] ---------------------------------------------------
...
Running org.gazzax.labs.solrdf.integration.LearningSparql_ITCase
[INFO] Running Query with prefixes test...
[INFO] [store] webapp=/solr path=/rdf-graph-store params={default=} status=0 QTime=712
...
[DEBUG] : Query type 222, incoming Accept header...
...
[INFO] [store] Closing main searcher on request.
...
[INFO] [beddedLocalContainer] Jetty 7.6.15.v20140411 Embedded is stopped
[INFO] --------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] --------------------------------------------------
[INFO] Total time: 42.302s
[INFO] Finished at: Tue Feb 10 18:19:21 CET 2015
[INFO] Final Memory: 39M/313M
[INFO] --------------------------------------------------