One of the areas I've been focusing on lately is the so-called "Semantic Web", in particular Open Data as a way to make governments more transparent and provide data to citizens. From a technical point of view, these data are published as RDF and Linked Data (RDF/LD).

I’m particularly excited to have worked on the release of what I think is a very important dataset, one that helps to understand how decisions are taken in the Council of the European Union.

The Council of the European Union published how member states have voted since 2010

In April 2015, the Council of the European Union released as open data how Member States vote on legislative acts. In other words, when the Council votes to adopt a legislative act (i.e. a regulation or a directive), the votes of each country are stored and made publicly visible. This means that you can see how your country voted when a given law was adopted, or you can get more aggregate data on trends and voting patterns.

Recently, the Council has also released two additional open datasets containing the metadata of all Council documents and metadata on requests for Council documents.

DiploHack, Open Data Hackathon

Together with the Dutch Presidency, the Council is also organising DiploHack, a hackathon about open data, in Brussels tomorrow and the day after, 29 and 30 April. The goal of the hackathon is to make use of the Council’s open datasets, linking them with the other datasets available from other EU institutions, and to build something useful for citizens. You can still register for the hackathon.

This post will show you how to access the votes using SPARQL, which is a query language for data published in RDF format, and how to access those data using AngularJS.

A brief introduction to RDF/LD and SPARQL

In the context of the Semantic Web, entities and the relations between them are represented as triples, which can be serialized in formats such as “Turtle”, RDF/XML (which is what people usually mean by RDF) and many others.

You can imagine a set of triples as a database table with 3 columns: subject, predicate, object. Each triple is a row, and each of its elements is represented with a URI (the object can also be a plain literal value, such as a string). This is a very flexible format that can be used to represent anything. For example, you can say that the author of this blog is myself (uniquely identified by my GitHub account URL and with the name “Simone Chiaretta”) and that the topic of this blog is Web Development. The corresponding serialization in Turtle (using the simple notation) of these three pieces of information is:

<http://codeclimber.net.nz/>
  <http://purl.org/dc/elements/1.1/creator>
  <https://github.com/simonech> .

<http://codeclimber.net.nz/>
  <http://purl.org/dc/elements/1.1/subject>
  "Web Development" .

<https://github.com/simonech>
  <http://xmlns.com/foaf/0.1/name>
  "Simone Chiaretta" .

Notice the use of URIs to represent entities, which gives each of them a unique identifier. In this case, the URIs starting with http://purl.org/dc/elements are defined by Dublin Core’s Metadata Terms. Another way to represent the topic would have been to refer to a URI coming from a managed taxonomy instead of a plain string. That would have made it possible to “link” to other datasets.

But how do you query these data? With SPARQL.

SPARQL uses a syntax very similar to Turtle, with SQL-like keywords such as SELECT and WHERE. Using the example above, one could query for all publications written by Simone Chiaretta. The syntax would be:

  SELECT ?publication
  WHERE {
    ?publication <http://purl.org/dc/elements/1.1/creator> <https://github.com/simonech> . 
  }

Basically, you write the query by putting a variable in place of the element you want as a result and by specifying the other two elements of the triple: a kind of query by example. The other two elements of the triple can also be variables, in case you want to “join” different triples. For example, if we want to search for all publications written by Simone Chiaretta, identified by his name instead of the URI, the query will be:

  SELECT ?publication
  WHERE {
    ?publication <http://purl.org/dc/elements/1.1/creator> ?author . 
    ?author <http://xmlns.com/foaf/0.1/name> "Simone Chiaretta" . 
  }
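To make the “query by example” idea more concrete, here is a minimal JavaScript sketch of how triple patterns with variables could be matched against the three example triples. It is only an illustration, not how a real SPARQL engine is implemented: the function names (`matchTriple`, `select`) are made up for this example, and URIs and literals are both treated as plain strings.

```javascript
// In-memory copy of the three example triples: [subject, predicate, object].
const triples = [
  ['http://codeclimber.net.nz/', 'http://purl.org/dc/elements/1.1/creator', 'https://github.com/simonech'],
  ['http://codeclimber.net.nz/', 'http://purl.org/dc/elements/1.1/subject', 'Web Development'],
  ['https://github.com/simonech', 'http://xmlns.com/foaf/0.1/name', 'Simone Chiaretta'],
];

// Terms starting with "?" are variables; anything else must match exactly.
const isVariable = (term) => term.startsWith('?');

// Try to match one pattern against one triple, extending the given bindings.
// Returns the extended bindings, or null when the triple does not fit.
function matchTriple(pattern, triple, bindings) {
  const result = { ...bindings };
  for (let i = 0; i < 3; i++) {
    const term = pattern[i];
    if (isVariable(term)) {
      if (term in result && result[term] !== triple[i]) return null;
      result[term] = triple[i];
    } else if (term !== triple[i]) {
      return null;
    }
  }
  return result;
}

// Evaluate a list of patterns. Each pattern extends or discards the solutions
// found so far, which is what produces the "join" on shared variables.
function select(patterns, data) {
  let solutions = [{}];
  for (const pattern of patterns) {
    const next = [];
    for (const bindings of solutions) {
      for (const triple of data) {
        const extended = matchTriple(pattern, triple, bindings);
        if (extended) next.push(extended);
      }
    }
    solutions = next;
  }
  return solutions;
}

// Same two patterns as the query by author name.
const byName = select([
  ['?publication', 'http://purl.org/dc/elements/1.1/creator', '?author'],
  ['?author', 'http://xmlns.com/foaf/0.1/name', 'Simone Chiaretta'],
], triples);

console.log(byName.map((solution) => solution['?publication']));
```

The shared variable `?author` is what ties the two patterns together: the second pattern only keeps the solutions in which the author found by the first pattern also has the requested name.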

With this basic knowledge, we can now look at how to access the data released by the Council of the European Union about votes on legislative acts.

How the data is modelled and how to query it

The data released include the information about an act (title, act number, various document numbers, policy area, etc…), the session in which it was voted on (its date, the Council configuration, the number of the Council session) and how each country voted.

Instead of modelling the data as a hierarchical graph, we’ve modelled it as a Data Cube to make it easier to analyze and to aggregate: an “observation” includes all the information in a flat and denormalized structure. So a “line” contains how one country voted on a given act, followed by all the information about the act and the session, and that information is repeated once for every country that voted on the act. This approach is less space efficient (all act and Council session information is replicated every time) but easier and faster to query, as there is no need to “link” different entities with “joins” in order to compute aggregated results.
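As a rough illustration of why the flat structure makes aggregation easy, here is a small JavaScript sketch. The observations, field names and values below are invented for the example and do not come from the actual dataset; the point is only that each record carries everything needed, so counting requires a filter and a tally but no joins.

```javascript
// Hypothetical observations: one flat record per (act, country) pair, with the
// act and session details repeated on every record.
const observations = [
  { act: 'act-1', policyArea: 'fisheries', sessionDate: '2015-01-10', country: 'A', vote: 'in-favour' },
  { act: 'act-1', policyArea: 'fisheries', sessionDate: '2015-01-10', country: 'B', vote: 'against' },
  { act: 'act-1', policyArea: 'fisheries', sessionDate: '2015-01-10', country: 'C', vote: 'in-favour' },
  { act: 'act-2', policyArea: 'environment', sessionDate: '2015-02-20', country: 'A', vote: 'against' },
  { act: 'act-2', policyArea: 'environment', sessionDate: '2015-02-20', country: 'B', vote: 'against' },
  { act: 'act-2', policyArea: 'environment', sessionDate: '2015-02-20', country: 'C', vote: 'in-favour' },
];

// Tally the records by the value of one of their fields.
function countBy(records, key) {
  const counts = {};
  for (const record of records) {
    counts[record[key]] = (counts[record[key]] || 0) + 1;
  }
  return counts;
}

// "How many times did each country vote against?": a filter plus a tally.
const againstByCountry = countBy(
  observations.filter((o) => o.vote === 'against'),
  'country'
);

// Act details are repeated per country, so listing acts needs de-duplication.
const fisheriesActs = [...new Set(
  observations.filter((o) => o.policyArea === 'fisheries').map((o) => o.act)
)];

console.log(againstByCountry, fisheriesActs);
```

The price of this convenience shows up in the second query: because every act appears once per country, anything that lists acts has to remove the duplicates.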

Simple queries

For example, if you want to list all the acts about fisheries, you run:

  SELECT DISTINCT ?act
  where {
    ?observation
    <http://data.consilium.europa.eu/data/public_voting/qb/dimensionproperty/policyarea>
    <http://data.consilium.europa.eu/data/public_voting/consilium/policyarea/fisheries> .
    
    ?observation
    <http://data.consilium.europa.eu/data/public_voting/qb/dimensionproperty/act>
    ?act .
  }

The query basically asks: give me all the “observations” whose policy area is fisheries, and then, for these observations, give me their “act”. 

Notice the DISTINCT clause: this is important because, given the “data cube” approach, every act is replicated 28 times (there are usually 28 countries voting), so we need to take it only once.

The result will be 27 acts, each one identified by its URI. You can also execute the query directly in the interactive query tool online, and you will get the results as HTML.

[Screenshot: all acts on fisheries]

If you want the title of the act, you also need to ask for the “definition” of that URI, which has been mapped using the predicate http://www.w3.org/2004/02/skos/core#definition. So the query becomes:

  SELECT DISTINCT ?act ?title
  where {
    ?observation
    <http://data.consilium.europa.eu/data/public_voting/qb/dimensionproperty/policyarea>
    <http://data.consilium.europa.eu/data/public_voting/consilium/policyarea/fisheries> .
    
    ?observation
    <http://data.consilium.europa.eu/data/public_voting/qb/dimensionproperty/act>
    ?act .
  
    ?act
    <http://www.w3.org/2004/02/skos/core#definition>
    ?title .
  }

The result is as shown in the following screenshot (or can be seen online directly).

[Screenshot: all acts on fisheries, with title]

More complex aggregation queries

Now that you have the grasp of it, let’s do some more interesting aggregated queries. Given the way the data is modelled, they are conceptually more complex but easier to implement.

For example, do you want to know how many times each country voted against the adoption of an act?

  PREFIX eucodim: <http://data.consilium.europa.eu/data/public_voting/qb/dimensionproperty/>
  PREFIX eucoprop: <http://data.consilium.europa.eu/data/public_voting/qb/measureproperty/>
  PREFIX eucovote: <http://data.consilium.europa.eu/data/public_voting/consilium/vote/>
  
  SELECT (COUNT(?act) AS ?count) ?country
  from <http://data.consilium.europa.eu/id/dataset/votingresults>
  where {
    ?observation eucodim:country ?country .
    ?observation eucoprop:vote eucovote:votedagainst .
    ?observation eucodim:act ?act .
  }
  GROUP BY ?country
  ORDER BY DESC(?count)

To keep the query more concise and readable, I used another SPARQL keyword, PREFIX, to avoid writing the whole URI every time. Here are the countries that voted against the adoption of an act, sorted by who voted no the most (using the ORDER BY DESC keyword).

[Screenshot: who voted no]

What if you want to see how a given country voted across all the acts? It’s enough to swap country and vote, and you “pivot” the view of the data, aggregating by vote instead of by country:

  PREFIX eucodim: <http://data.consilium.europa.eu/data/public_voting/qb/dimensionproperty/>
  PREFIX eucoprop: <http://data.consilium.europa.eu/data/public_voting/qb/measureproperty/>
  PREFIX eucocountries: <http://data.consilium.europa.eu/data/public_voting/consilium/country/>
  
  SELECT (COUNT(?act) AS ?count) ?vote
  from <http://data.consilium.europa.eu/id/dataset/votingresults>
  where {
    ?observation eucodim:country eucocountries:uk .
    ?observation eucoprop:vote ?vote .
    ?observation eucodim:act ?act .
  }
  GROUP BY ?vote
  ORDER BY DESC(?count)

You can see that the country in the example (the UK) voted 554 times in favor of adoption and 45 times against, abstained 42 times and did not participate in the vote 39 times (this happens because countries outside of the Eurozone do not vote on Euro-related matters).

[Screenshot: how the country voted]

The Council’s GitHub repository contains more information on the model itself, as well as a list of other SPARQL queries.

How to use all this information from code

Now that you know how to query the dataset via the interactive query tool, you probably want to do something with the data.

There are a few JavaScript libraries that make it easier to interact with SPARQL endpoints and can also navigate graphs, like RDFSTORE-JS or rdflib.js. If you are looking to do some processing on the server side in .NET, there is dotNetRDF.

But if you just want to query a SPARQL endpoint, you can make a standard HTTP GET request, passing the URL-encoded SPARQL query as a parameter. In return you can get the results in a variety of formats, including JSON. The format of this JSON is a W3C standard (like all the other formats described on the page): SPARQL 1.1 Query Results JSON Format.
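As a small sketch of what such a request URL could look like, the helper below builds it in plain JavaScript. The function name `buildSparqlUrl` is made up for this example; the `query` and `format` parameter names and the endpoint are the ones used by the AngularJS sample later in this post, and other endpoints may expect the result format to be negotiated differently.

```javascript
// Build the GET URL for a SPARQL endpoint. Both values are percent-encoded,
// because a query contains characters such as <, >, #, ? and spaces that
// would otherwise break or truncate the URL.
function buildSparqlUrl(endpoint, query, format) {
  return endpoint +
    '?query=' + encodeURIComponent(query) +
    '&format=' + encodeURIComponent(format);
}

const query =
  'SELECT DISTINCT ?act ?title\n' +
  'WHERE {\n' +
  '  ?act <http://www.w3.org/2004/02/skos/core#definition> ?title .\n' +
  '}';

const url = buildSparqlUrl(
  'http://data.consilium.europa.eu/sparql',
  query,
  'application/sparql-results+json'
);

console.log(url);
```

Encoding matters more than it may seem: an unencoded `#` inside a URI in the query would be read as the start of the URL fragment, and everything after it would never reach the server.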

The last query, with the results requested in JSON format, would return the following JSON.

[Screenshot: JSON result]

Basically, this JSON format has a head, which lists the variables used in the query, followed by the results, which contain a small set of metadata about the query (was it distinct, was it sorted) and then all the results, inside a bindings array. For each variable, the type and the value are specified, plus the datatype for typed literals.
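To show how this structure can be consumed, here is a minimal JavaScript sketch that flattens the bindings into simple rows. The function name `toRows` is made up for this example and the sample response is hand-written: its first row reuses the vote URI and the count from the query above, while the second row is a pure placeholder.

```javascript
// Hand-written sample in the SPARQL 1.1 Query Results JSON Format.
const sampleResponse = {
  head: { vars: ['count', 'vote'] },
  results: {
    bindings: [
      {
        count: { type: 'literal', datatype: 'http://www.w3.org/2001/XMLSchema#integer', value: '45' },
        vote: { type: 'uri', value: 'http://data.consilium.europa.eu/data/public_voting/consilium/vote/votedagainst' },
      },
      {
        // Placeholder row, not real data.
        count: { type: 'literal', datatype: 'http://www.w3.org/2001/XMLSchema#integer', value: '3' },
        vote: { type: 'uri', value: 'http://example.org/vote/placeholder' },
      },
    ],
  },
};

// Turn each binding into a plain object keyed by variable name.
// Values are kept as the strings found in the JSON; converting them
// (for example with Number) is left to the caller.
function toRows(response) {
  const vars = response.head.vars;
  return response.results.bindings.map((binding) => {
    const row = {};
    for (const name of vars) {
      // A variable can be unbound in a solution, in which case it is absent.
      row[name] = name in binding ? binding[name].value : null;
    }
    return row;
  });
}

const rows = toRows(sampleResponse);
console.log(rows);
```

Note that every value arrives as a string, even for numeric literals, so a count has to be converted explicitly before it can be used in calculations.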

Sample request with Angular

Using AngularJS, you can send SPARQL queries using the standard $http.get method. The following sample is part of the open source demo we published on the Council’s GitHub repository. The demo allows searching for acts by specifying some of their properties. It is available online at: http://eucouncil.github.io/CouncilVotesOnActsDatasetSample/

First I built an AngularJS Factory to encapsulate the query to the SPARQL endpoint (http://data.consilium.europa.eu/sparql) and the manipulation of results.

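angular.module('opendataApp', []).factory('sparqlQuery', function($http) {
  return function(query) {
    var baseAPI = "http://data.consilium.europa.eu/sparql?";
    // The query must be passed in unencoded: it is percent-encoded here, because
    // SPARQL contains characters (<, >, #, ?, spaces) that are not safe in a URL.
    var requestUrl = baseAPI + "query=" + encodeURIComponent(query) +
      "&format=application%2Fsparql-results%2Bjson";

    return $http.get(requestUrl)
      .then(function successCallback(response) {
        console.log(response.data.results.bindings);
        var acts = [];
        var bindings = response.data.results.bindings;
        for (var i = 0; i < bindings.length; i++) {
          var variable = bindings[i];
          // Does some processing to put together all properties of an act
        }
        return acts;
      }, function errorCallback(response) {
        // Log the failure and resolve with an empty list, so that callers
        // always receive an array.
        console.error("SPARQL request failed with status " + response.status);
        return [];
      });
  };
});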

Then, with this in place and using another service for concatenating the SPARQL string, I can send the query to the server and get back the results and display them in the page.

  vm.performSearch = function() {
    vm.searching = true;
    vm.noresults = false;
    vm.acts = [];
    vm.sparqlQuery = sparqlGenerator(vm.search); // builds the SPARQL string from the search criteria
    sparqlQuery(vm.sparqlQuery).then(function(data) {
      vm.acts = data || []; // guard against a request that produced no data
      vm.searching = false;
      vm.noresults = vm.acts.length === 0;
    });
  };
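The service that concatenates the SPARQL string is not shown in this post. Purely as an illustration, a function of that kind could look like the sketch below. This is not the code of the demo: the name `buildActsQuery` and the shape of the `search` object are assumptions, and only the predicates and URI namespaces are taken from the queries earlier in this post.

```javascript
// Namespaces taken from the queries in this post.
const DIMENSION = 'http://data.consilium.europa.eu/data/public_voting/qb/dimensionproperty/';
const CONCEPT = 'http://data.consilium.europa.eu/data/public_voting/consilium/';

// Hypothetical helper: "search" is assumed to look like
// { policyArea: 'fisheries', country: 'uk' }, with both keys optional.
function buildActsQuery(search) {
  const patterns = ['?observation <' + DIMENSION + 'act> ?act .'];

  if (search.policyArea) {
    // The value is encoded so that user input cannot close the <...> early.
    patterns.push('?observation <' + DIMENSION + 'policyarea> <' +
      CONCEPT + 'policyarea/' + encodeURIComponent(search.policyArea) + '> .');
  }
  if (search.country) {
    patterns.push('?observation <' + DIMENSION + 'country> <' +
      CONCEPT + 'country/' + encodeURIComponent(search.country) + '> .');
  }

  return 'SELECT DISTINCT ?act\nWHERE {\n  ' + patterns.join('\n  ') + '\n}';
}

console.log(buildActsQuery({ policyArea: 'fisheries' }));
```

Each search criterion simply adds one more triple pattern on the same `?observation`, which works because of the flat data cube model: every observation already carries the act, the policy area and the country.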

You can play around with the demo online at: http://eucouncil.github.io/CouncilVotesOnActsDatasetSample/

So, come to the hackathon, and even if you cannot, play with the data and run some nice analyses on it. If you do, please post your links in the comment section.

Voting Simulator Application

On a slightly related topic, if you want to see how agreements are reached and how the actual voting happens, you can play around with the Council Voting Calculator, available on the website and also as an iOS app and an Android app (both in phone and tablet versions). Following is a screenshot from the iPad version of the app.

Disclaimer: The views expressed are solely those of the writer and may not be regarded as stating an official position of the Council of the EU
