The Resource Description Framework (RDF) is a World Wide Web Consortium (W3C) standard. Originally designed as a data model for metadata, it quickly became the de facto standard for describing and exchanging graph data, especially in semantic web applications.
It provides a way to describe resources and their relationships in a machine-readable format: the information is structured as triples, statements that consist of a subject, a predicate, and an object. These triples form a graph structure, where nodes are resources, and edges represent relationships between resources, like in the picture below:
The same graph can be represented textually using the following statements:
Andrea knows Mario
Andrea lives in Viterbo
Viterbo is in Italy
Italy is in Europe
Mario lives in Italy
Mario knows John
John lives in Germany
Germany is in Europe
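In RDF syntax such as Turtle, the same graph could be sketched as follows (the `ex:` prefix and the property names are illustrative, not taken from any specific vocabulary):

```turtle
@prefix ex: <http://example.org/> .

ex:Andrea  ex:knows   ex:Mario ;
           ex:livesIn ex:Viterbo .
ex:Viterbo ex:isIn    ex:Italy .
ex:Italy   ex:isIn    ex:Europe .
ex:Mario   ex:livesIn ex:Italy ;
           ex:knows   ex:John .
ex:John    ex:livesIn ex:Germany .
ex:Germany ex:isIn    ex:Europe .
```

Each line is a subject-predicate-object triple; the semicolon is Turtle shorthand for repeating the same subject.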
How To Store RDF Data? The RDF Store
An RDF Store, triple store, or RDF database is a specialized database designed for storing and querying RDF data. It typically supports SPARQL, a query language for querying RDF data.
SPARQL allows users to express complex queries to retrieve information from the RDF store based on patterns and conditions. The following SPARQL example retrieves the title of all books in the dataset whose price is less than 30.
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX ns: <http://example.org/ns#>
SELECT ?title ?price
WHERE {
  ?x ns:price ?price .
  FILTER (?price < 30)
  ?x dc:title ?title .
}
How To Provide RDF Data?
When I started diving into the topic of Linked Data Fragments, I was impressed by the following image on the LDF homepage.
The simple image quickly gives the reader an immediate idea of what we will discuss. The two extremes above are valid options if the goal is to publish and interact with RDF data. Let’s quickly explore them.
Data Dump
RDF portals like DBpedia or VIAF provide online access to their search services. However, being public services, such access is limited by usage quotas.
For example, you cannot build a system that quickly processes a considerable amount of data by executing a correspondingly massive number of API calls to those services. The reason is clear: the portal is meant to provide its services at medium-to-high quality to a large, potentially very vast audience; as a consequence, the marginal effort, in terms of resource usage, spent on a single API call cannot exceed a given rate.
If you are in that context, what can you do? Download the public dataset and manage it on your own in a local RDF Store, which brings us to the next extreme.
SPARQL Server
You have an RDF dataset, and you want to expose SPARQL capabilities. A first approach would be, “Okay, let’s use an RDF Store.”
So far, so good! However, the approach has some drawbacks, especially if the dataset is large. In that scenario, the “high server cost” item (the first of the three points listed under the SPARQL endpoint extreme in the image) can play a relevant role.
A scalable RDF storage solution has a cost, which is often significant when the dataset is large; from that perspective, the open-source landscape offers few options.
Even if you leave the “on-premises” option behind and subscribe to a cloud-managed solution, be prepared to add a relevant new item to your bill.
Linked Data Fragments
Linked Data Fragments aims to offer an alternative: a third scenario in the middle, between the two extremes above. The idea is to decompose the query and distribute its execution in small computation fragments.
Instead of relying on a centralized (i.e., server-side) SPARQL engine, which requires a high computation cost, the SPARQL query is decomposed by an intermediate layer into its constituent blocks, called “triple patterns.”
Let’s ignore any potential query optimization that the intermediate layer could apply (e.g., query rewriting, pattern reordering). Once the query has been decomposed, each pattern is sent to a Triple Pattern Server, a remote service in charge of “resolving” a single triple pattern.
The response to the triple pattern resolution is a fragment, that is, a partial response composed of:
- metadata about the result, the dataset, the service, and the system behind
- hypermedia controls for understanding and navigating the results
- triples matching the request pattern
The intermediate layer acts as a kind of query coordinator and federation engine: it decomposes the original query, requests the resolution of each resulting pattern, receives the responses from the pattern resolution servers, applies a merge logic, and returns the response to the caller.
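The merge logic mentioned above is essentially a join over the partial variable bindings returned for each triple pattern. Here is a minimal, self-contained sketch of that idea (not FragLink's actual implementation): two patterns from our earlier SPARQL example, `?x ns:price ?price` and `?x dc:title ?title`, each yield a set of solutions, and the coordinator keeps only the combinations that agree on the shared variable `?x`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of a coordinator's merge step: joining the partial
// solutions produced by two independently resolved triple patterns.
public class PatternMerge {

    // A solution maps a variable name to its bound value.
    static List<Map<String, String>> join(List<Map<String, String>> left,
                                          List<Map<String, String>> right) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> l : left) {
            for (Map<String, String> r : right) {
                // Two solutions are compatible if every shared variable
                // is bound to the same value in both.
                boolean compatible = r.entrySet().stream()
                        .allMatch(e -> !l.containsKey(e.getKey())
                                    || l.get(e.getKey()).equals(e.getValue()));
                if (compatible) {
                    Map<String, String> merged = new HashMap<>(l);
                    merged.putAll(r);
                    out.add(merged);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Partial solutions for the pattern (?x ns:price ?price)
        List<Map<String, String>> prices = List.of(
                Map.of("x", ":book1", "price", "25"),
                Map.of("x", ":book2", "price", "42"));
        // Partial solutions for the pattern (?x dc:title ?title)
        List<Map<String, String>> titles = List.of(
                Map.of("x", ":book1", "title", "SPARQL Basics"),
                Map.of("x", ":book2", "title", "RDF in Depth"));

        // Only bindings agreeing on ?x survive the join.
        System.out.println(join(prices, titles));
    }
}
```

A real client would also apply the query's filters (e.g., `?price < 30`) on top of the joined solutions, and would typically reorder patterns so the most selective one is resolved first.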
FragLink
FragLink is a framework for building Linked Data Fragments servers. In other words, FragLink adds Linked Data Fragments capabilities to your server application.
That means it’s not a server itself. Instead, it comes as a Spring Boot autoconfigure module that you can easily plug into your application. Once FragLink is plugged in, everything related to the Linked Data Fragments Web API is enabled (i.e., HTTP endpoint, metadata, controls); of course, the concrete data binding is up to you.
Let’s see how it works.
Step 1: SpringBoot App Skeleton
This will be your Linked Data Fragments server. It is strongly recommended to use Spring Initializr to define the initial shape of the module (e.g., components, dependencies, frameworks, starters).
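For reference, the generated skeleton boils down to a single bootstrap class like the following (package and class names are illustrative, not prescribed by FragLink):

```java
package org.example.fragments;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// Standard Spring Boot entry point; FragLink's autoconfiguration
// will be picked up automatically once its starter is on the classpath.
@SpringBootApplication
public class FragmentServerApplication {
    public static void main(String[] args) {
        SpringApplication.run(FragmentServerApplication.class, args);
    }
}
```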
Step 2: FragLink Dependencies
Once the project skeleton is created, open the pom.xml (if you’re using Gradle, there’s a corresponding configuration) and add the following section:

<repositories>
    <repository>
        <id>fraglink-package-registry</id>
        <url>https://gitlab.com/api/v4/projects/52914288/packages/maven</url>
    </repository>
</repositories>

The snippet above declares the coordinates of the Maven repository where the FragLink artifacts are hosted. Then, in the dependencies section:
<dependency>
    <groupId>com.spaziocodice.labs.rdf</groupId>
    <artifactId>fraglink-starter</artifactId>
    <version>1.1.1</version>
</dependency>

Step 3: Configuration

Assuming you have already set up the Spring Boot module (e.g., dependencies and so on), here’s the minimal configuration required by FragLink:
fraglink:
  base:
    url: https://fragments.yourproject.org   # example value
  page:
    maxStatements: 50   # maximum number of statements returned in a response page
  dataset:
    name: "The dataset / project name"
    description: "An optional description about the project"
Step 4: Start
Start your server; after a few seconds, you should see the following message:
... : FragLink v1.1.1 has been enabled on this server.
The server is running: great! Linked Data Fragments are exposed through the root (/) REST endpoint. The endpoint template is

http://fragments.yourproject.org/{?subject,predicate,object,graph,page}
However, being a triple/quad pattern resolver, it still doesn’t know how to fetch data. The default implementation is a simple NoOp, meaning no data is returned in the response, only metadata. Here’s an example of such a response:
@prefix hydra:   <http://www.w3.org/ns/hydra/core#> .
@prefix void:    <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

<https://fragments.yourproject.org/fragments#dataset>
    a void:Dataset, hydra:Collection ;
    hydra:search [
        hydra:mapping [
            hydra:property rdf:subject ;
            hydra:variable "subject"
        ], [
            hydra:property rdf:predicate ;
            hydra:variable "predicate"
        ], [
            hydra:property rdf:object ;
            hydra:variable "object"
        ] ;
        hydra:template "https://fragments.yourproject.org/fragments{?subject,predicate,object,page}"
    ] ;
    dcterms:description "An optional description of the dataset." ;
    dcterms:title "The Dataset project/name" .

<https://fragments.yourproject.org/fragments>
    a hydra:PartialCollectionView ;
    dcterms:description "Linked Data Fragment of Share-VDE dataset containing triples matching the pattern {?s ?p ?o ?q}"@en ;
    dcterms:source <https://fragments.yourproject.org/fragments#dataset> ;
    dcterms:title "Linked Data Fragment of The Share-VDE Project Dataset"@en ;
    hydra:totalItems "0"^^xsd:integer ;
    hydra:itemsPerPage "50"^^xsd:integer ;
    void:triples "0"^^xsd:integer .
Step 5: Linked Data Fragment Resolver
To create a binding tied to a specific data source, you must create an implementation of
com.spaziocodice.labs.fraglink.service.impl.LinkedDataFragmentResolver
The interface contains a single method, which takes a triple/quad pattern as input and returns the list of matching triples as output. The FragLink framework uses Apache Jena, a powerful, open-source set of tools for dealing with RDF data.
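To make the idea concrete, here is a hypothetical resolver sketch backed by an in-memory Jena model. The method name `resolve` and its signature are assumptions for illustration; the actual contract is defined by the LinkedDataFragmentResolver interface, so check its Javadoc before implementing.

```java
import java.util.List;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.RDFNode;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.rdf.model.Statement;

// Hypothetical sketch, not the real interface: it resolves a triple
// pattern against an in-memory Jena model.
public class InMemoryResolver /* implements LinkedDataFragmentResolver */ {

    private final Model model = ModelFactory.createDefaultModel();

    // Each pattern term may be null, which Jena interprets as a wildcard
    // ("match any value" for that position).
    public List<Statement> resolve(Resource subject, Property predicate, RDFNode object) {
        return model.listStatements(subject, predicate, object).toList();
    }
}
```

A production resolver would instead delegate to whatever backend holds the data (a search index, a relational database, another RDF store) and honor the `maxStatements` paging limit configured earlier.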
Implementing a resolver is straightforward; nevertheless, we will soon publish an example repository with a sample resolver. Stay tuned!
Links
Resource Description Framework
https://en.wikipedia.org/wiki/Resource_Description_Framework
Linked Data Fragments
https://linkeddatafragments.org/
FragLink
https://github.com/spaziocodice/FragLink
Apache Jena
https://jena.apache.org/
We would love to hear questions, doubts, and feedback about this blog post!
Feel free to contact us or leave a message in the comment box below.