DBTune blog


Thursday 29 January 2009

Prolog message queue

It's been a long time since I last posted anything here, but things have been pretty hectic recently (I am a doctor, now!! I'll post my thesis here soon).

I've just hacked a really small implementation of an HTTP-driven SWI-Prolog message queue. I've often found myself doing quite expensive computation in Prolog, and the easiest way to distribute it is to have a message queue on which you post messages to process (in this case, Prolog terms), and a pool of workers that pick up messages and process them. Then, if you find your program is still too slow, you can easily add a couple of workers to speed things up.
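The queue-plus-workers pattern can be sketched in a few lines of Python. This is only an illustration of the idea, not the HTTP-driven SWI-Prolog implementation mentioned above: the `run_pool` function, its parameters and the use of in-process threads are all assumptions made for the example, and messages are plain Python values standing in for Prolog terms.

```python
import queue
import threading


def run_pool(messages, process, n_workers=2):
    """Post every message to a queue and let n_workers threads process them.

    Returns a list of (message, result) pairs, in completion order.
    """
    todo = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            message = todo.get()
            if message is None:  # sentinel: no more work for this worker
                return
            outcome = process(message)
            with lock:  # protect the shared results list
                results.append((message, outcome))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for message in messages:
        todo.put(message)
    for _ in threads:
        todo.put(None)  # one sentinel per worker, queued after the real work
    for thread in threads:
        thread.join()
    return results
```

Calling `run_pool(["foo(1)", "foo(2)"], len, n_workers=3)` processes each message exactly once; raising `n_workers` is the equivalent of adding a couple of workers. Threads keep the sketch short, but for CPU-bound work the workers would more likely be separate processes or machines polling the queue.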

Monday 22 December 2008

New server for DBTune

I completed the move of DBTune to a new shiny server yesterday. Things should go way faster, and the server should have a much better uptime. Overall, our experience with 1and1 hosting has been pretty bad: random server reboots, configuration files erased for no reason, and extremely long delays in getting customer support...

Many many thanks to the Centre for Digital Music for hosting the new DBTune!

Monday 15 December 2008


Last week, around 10 people from BBC A&Mi, including myself, gathered for two days of hardware hacking. The goal was to build a Rockterscale -- a device able to measure how much a band rocks. Since I haven't done any real-time audio processing in a long time, I decided to give that a go - analysing a live audio input and extracting some of its characteristics. I used Paul Brossier's Aubio library to do so, as it seemed relatively easy to hack, and was already doing something we thought was great for visualisation purposes: beat tracking from a live audio input. After the first day, we had a bit of C code that extracted the loudness, the spectral centroid and the spectral spread from the live audio input. Then, we sent the normalised data over to the visualisation components using Open Sound Control.
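For readers unfamiliar with these three features, here is a rough Python sketch of how they can be computed for one frame of samples. It is not the C code written at the event and does not use Aubio: the function names are made up for the example, loudness is approximated by the RMS level (an assumption, as the measure actually used is not stated), and a naive DFT is used so that the sketch needs nothing beyond the standard library.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (only sensible for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def frame_features(frame, sample_rate):
    """Return (rms, centroid_hz, spread_hz) for one frame of samples."""
    n = len(frame)
    # RMS level, used here as a simple stand-in for loudness.
    rms = math.sqrt(sum(x * x for x in frame) / n)
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:  # silent frame: no spectral shape to describe
        return rms, 0.0, 0.0
    freqs = [k * sample_rate / n for k in range(len(mags))]
    # Centroid: magnitude-weighted mean frequency.
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total
    # Spread: magnitude-weighted standard deviation around the centroid.
    spread = math.sqrt(sum((f - centroid) ** 2 * m for f, m in zip(freqs, mags)) / total)
    return rms, centroid, spread
```

For a pure 1000 Hz sine the centroid sits at 1000 Hz and the spread is close to zero; noisier, brighter input pushes both values up.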

But, of course, the audio signal is not the only thing to consider in order to determine how much a band rocks! We used a number of sensors to capture the reactions of the crowd:

  • The Hat of Rock, capturing some headbanging data:


  • An accelerometer under the dance-floor/mosh-pit, and a force sensor hooked on the crash barrier:


  • A webcam capturing how much movement there is in the crowd:


All the data fed by these different components was visualised on a screen:

and on a physical rockterscale (yes, it does go up to 11 :-))

Here is a small video of all that in action! (I think the best part is the BBC A&Mi people dancing to Ace of Spades to try out the system :-) ).

Friday 14 November 2008

Reuters OpenCalais joins the linked data cloud

Still more fancy linked data to play with - just a couple of weeks after Freebase announced that they publish linked data, OpenCalais just announced that they are going to publish linked data as well, by joining up the results of their entity extraction service to DBpedia URIs.

Wednesday 29 October 2008

SPARQLing a funk legend

I just came across this awesome blog post from Kurt. It starts from a real music question (he saw Maceo Parker live, and wanted to know if he wrote one of the songs he played in the encore: Pass the Peas), and finds an answer to it using Semantic Web technologies, in particular SPARQL.

Great stuff!

Freebase does linked data!

Just a small post, live from ISWC: Freebase does linked data!

You can try it there, and you can try this instance, for example.

Freebase linked data

Added to the wonderful David Huynh's Parallax, that's a lot of great news coming from the other side of the Atlantic :-)

Now, to see whether their linked data actually use the Web! Do they link to other web identifiers, available outside Freebase?

I just noticed something weird, also: the read/write permissions are attached to the tracks/films/whatever resources, instead of being attached to the RDF document itself.

Friday 17 October 2008

Next week conferences

I'll be traveling to Vienna next week, for the Web of Data practitioners days, where Keith Alexander and I will be giving the first talk (3 hours, yay!). We plan to do quite an exhaustive introduction to linked data and to what happened over the last few years, with quite a few interesting (hopefully!) examples. I'll also give a small introduction to the Music Ontology and to DBTune in the Multimedia session, on the Thursday. The other speakers are truly amazing, so if you're in Vienna next week, please come along! :-)

Next, Patrick and I will be travelling to Karlsruhe, to attend ISWC 2008. This will be my first ISWC, so I am really looking forward to it! And I just noticed the SWI-Prolog folks are presenting a paper there (Thesaurus-based search in large heterogeneous collections), so this will be the perfect occasion for thanking them for the software framework underlying DBTune :-)

Tuesday 7 October 2008

Vocabulary interlinkage diagram

The UMBEL people (Fred and Mike) just released a new interlinkage diagram. This time, it doesn't represent the different links amongst datasets made available within the Linking Open Data project, but rather a map of the links amongst vocabularies used on the data web.

Vocabulary interlinkage

The Music Ontology sits right between FOAF, FRBR and the Event ontology (although I would have added the Timeline ontology as well). There is also the Programmes Ontology, at the bottom (which is also interlinked with the Event ontology, btw).

This diagram really helps to see how the current web ontology landscape is structured, and I hope we can keep using it to track the evolution of web ontologies, a bit like what has been done for the available datasets and interlinks.

Monday 29 September 2008

D2R server, SNORQL and Firefox 3

In case it might be useful for someone else (I've had several requests for it offline), here is a small patch to make D2R server work with the latest versions of ARQ, in order to make the SNORQL SPARQL explorer work in Firefox 3. I sent the patch to Richard some time ago, so hopefully the newest D2R should work with latest versions of ARQ.

Oh, and I've finished (well, almost, just a couple more lines to add to the conclusion) writing up, and started this morning at the BBC!

Sunday 7 September 2008

DBTune wins the second prize in the Triplify challenge!

I submitted DBTune to the Triplify challenge, a couple of months ago. The text of the submission is there. The results of the challenge were given on Friday, at the i-semantics conference. Many many thanks to Michael Hausenblas for representing DBTune there!

And, DBTune won the second prize! Here is a picture of the prize ceremony:

Congratulations to the winners, LinkedMDB, for their amazing work and well-deserved prize, and many thanks to Sören Auer for organizing the challenge!

Wednesday 3 September 2008

Good-bye C4DM, hello BBC!

I've been rather quiet for the last month: intense PhD writing. I have been trying to get it fully written by the end of September. Indeed, I will be joining BBC Audio & Music at the end of the month. I am really really excited about that! Of course, I am a bit sad to leave the Centre for Digital Music, after three fantastic years spent there: great people, great work, great projects, great art and great beer :-)

Thursday 31 July 2008

Semantic search on aggregated music data

I just moved the semantic search demo to a faster server, so it should hopefully be a lot more reliable. This demo uses the amazing ClioPatria on top of an aggregation of music-related data. This aggregation was simply constructed by taking a bunch of Creative Commons MP3s, running GNAT on them, and crawling linked data starting from the web identifiers output by GNAT.
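The crawling step can be pictured as a breadth-first walk over links between web identifiers. The sketch below is an assumption-laden illustration rather than the code behind the demo: `crawl` and its `fetch_links` callback are hypothetical names, and in a real setting `fetch_links` would dereference the identifier and return the URIs found in the RDF that comes back.

```python
from collections import deque


def crawl(seeds, fetch_links, max_resources=100):
    """Breadth-first crawl from seed identifiers.

    fetch_links(uri) must return an iterable of identifiers linked from uri.
    Returns the visited identifiers in visiting order, at most max_resources.
    """
    queued = set()
    frontier = deque()
    for uri in seeds:
        if uri not in queued:
            queued.add(uri)
            frontier.append(uri)

    visited = []
    while frontier and len(visited) < max_resources:
        uri = frontier.popleft()
        visited.append(uri)
        for linked in fetch_links(uri):
            if linked not in queued:  # never queue the same identifier twice
                queued.add(linked)
                frontier.append(linked)
    return visited
```

The `max_resources` cap is there because linked data keeps linking onwards; without some bound, the aggregation would never finish.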

I also set up the search tab to work correctly. For example, when you search for "punk", you get the following results.

Punk search 1

Punk search 2

Note that the results are explained: "punk" might be related to the title, the biography, a tag, the lyrics, content-based similarity to something tagged as punk (although it looks like Henry crashed in the middle of the aggregation, so not a lot of such data is available yet), etc. Moreover, you get back different types of resources: artists, records, tracks, lyrics, performances etc.

For example, if you click on one of the records, you get the following.

Punk search 3

This record is available under a Creative Commons license, so you can get direct access to the corresponding XSPF playlist, Bittorrent items etc., by following the Music Ontology "available as" property. For example, you can click on an XSPF playlist, and listen to the selected record.

Punk search 4

Of course, you can still do the previous things - plotting music artists (or search results, just take a look at the "view" drop-down box) on a map, on a time-line, browse using facets, etc.

Btw, if you like DBTune, please vote for it in the Triplify Challenge! :-)

Wednesday 30 July 2008

Last.fm events and DBpedia mobile

For a recent event at the Dana Centre, I was asked to make a small demo of some nice things you can do with Semantic Web technologies. As it is not much fun to re-use demos, I decided to go for something new. So after two hours of hacking and skyping with Christian Becker, we added support for recommended events to the last.fm linked data exporter. I also implemented a bit of geo-coding on the server side (although, with the new last.fm API, I guess this part is becoming useless).

Then, thanks to RDF goodness, it was really straightforward to make that work with DBpedia mobile. DBpedia mobile is a service that gets your geo-location from your mobile device and shows you a map with nearby sights, using data from DBpedia. DBpedia mobile also uses the RDF cache of a really nice linked data browser called Marbles.

So, after browsing your DBTune last-fm URI in Marbles, you can go to DBpedia mobile and see recommended events alongside nearby sights. To do so, select the Performances (by moustaki) filter. Here is what I get for my profile, when at the university:

DBpedia mobile and last.fm events

Monday 28 July 2008

Music Ontology linked data on BBC.co.uk/music

Just a couple of minutes ago on the Music Ontology mailing list, Nicholas Humfrey from the BBC announced the availability of linked data on BBC Music.

$ rapper -o turtle \

   a mo:MusicGroup;
   foaf:name "Coldplay";
   owl:sameAs <http://dbpedia.org/resource/Coldplay>;

This is just really, really, really great... Congratulations to the /music team!

Update: Tom Scott just wrote a really nice post about the new BBC music site, explaining what the BBC is trying to achieve by going down the linked data path.

Sunday 27 July 2008

Musicbrainz RDF updated

Well, I guess everything is in the title :-) The dump used is now of the 26th of July. I also moved everything to a much faster server. Also, the D2R mapping is still not 100% complete - I am really slowly getting through it, as PhD writing takes almost all my time these days. I recently added owl:sameAs links to the DBTune Myspace service, so you can easily get from Musicbrainz artists to the corresponding MP3s available on MySpace and their social networks. See for example Madonna, linked through owl:sameAs to the corresponding DBpedia artist and to the corresponding Myspace artist.

Friday 25 July 2008

List of accepted ISMIR 2008 papers

Just spotted through Paul's blog: the list of accepted ISMIR 2008 papers is now available online. All the papers sound really interesting, so I guess it will be a really good ISMIR!! I am especially glad to see that the Variations3 people will present their work on FRBR-based musical metadata. They seem to have done a lot of interesting things over the last year! I also hope we can make things connect in some ways with MO, thanks to this common FRBR backbone.

Anyway, I can't wait for the actual proceedings which, apparently, will be available online prior to the conference. Quite a few of the selected papers are already available on the Web as pre-prints, though (this really interesting one from Patrick Rabbat and Francois Pachet, for example).

I should have uploaded it earlier, but here is the paper we wrote with Mark Sandler. It describes all the structured data publishing and interlinking work we've been doing over the last year, based on the Music Ontology framework we described last year. We tried to illustrate that by (hopefully) fun examples (Mozart and Metallica are closer than you think... :-) ). It also describes a SPARQL-based web service for feature extraction, driven by workflows written in N3.

Thursday 17 July 2008

Literal search using the Jamendo SPARQL end-point

I just wrote a small SWI-Prolog module for literal search using the ClioPatria SPARQL end-point. It uses the rdf_litindex module, and performs a metaphone search on existing literals in the database. All of that is triggered through a built-in RDF predicate.

Here is an example query you can perform on the Jamendo SPARQL end-point (make sure you select lit as the entailment - it will be the default one soon):

SELECT ?o
WHERE {"punk jazz" <http://purl.org/ontology/swi#soundslike> ?o}

This query binds ?o to all resources within the end-point that are associated with matching literals. For example, you would get back:

The module is available there.

Wednesday 9 July 2008


We learned yesterday that DBTune was nominated for the Triplify Challenge! The other seven projects are really interesting as well, so I guess the competition will be really tough! The final results will be given at the I-Semantics conference in early September.

Also, Tim Berners-Lee gave a great talk about linked data and the semantic web on Radio 4 earlier today. The first use-case he mentions sounds quite familiar: finding bands based on geo-location data. He already mentioned that in one of his blog posts, linking to this screencast.

An interesting discussion took place on the Linking Open Data mailing list just afterwards, to gather use-cases for explaining to a general public what linked data can be useful for.

Tuesday 1 July 2008

Echonest Analyze XML to Music Ontology RDF

I wrote a small XSL stylesheet to transform the XML results of the Echonest Analyze API to Music Ontology RDF. The Echonest Analyze API is a really great (and simple) web service to process audio files and get back an XML document describing some of their features (rhythm, structure, pitch, timbre, etc.). A lot of people already did really great things with it, from collection management to visualisation.

The XSL is available on that page. The resulting RDF can be queried using SPARQL. For example, the following query selects the boundaries of structural segments (chorus, verse, etc.):

PREFIX af: <http://purl.org/ontology/af/>
PREFIX event: <http://purl.org/NET/c4dm/event.owl#>
PREFIX tl: <http://purl.org/NET/c4dm/timeline.owl#>

SELECT ?start ?duration
FROM <http://dbtune.org/echonest/analyze-example.rdf>
WHERE {
?e      a af:StructuralSegment;
        event:time ?time.
?time   tl:start ?start;
        tl:duration ?duration.
}

I also added on that page the small bit to add to the Echonest Analyze XML to make it GRDDL-ready. That means that the XML document can be automatically translated to actual RDF data (which can then be aggregated, stored, linked to, queried, etc.).

<Analysis    xmlns:grddl="http://www.w3.org/2003/g/data-view#" 

This provides a lot more data to aggregate for describing my music collection!

If there is one thing I really wish could be integrated in the Echonest API, it would be a Musicbrainz lookup... Right now, I have to manually link the data I get from it to the rest of my aggregated data. If the Echonest results could include a link to the corresponding Musicbrainz resource, it would really simplify this step :-)

Wednesday 25 June 2008

Linking Open Data: BBC playcount data as linked data

For the Mashed event this weekend, the BBC released some really interesting data. This includes playcount data, stating how often an artist is featured within a particular BBC programme (at the brand or episode level).

During the event, I wrote some RDF translators for this data, linking web identifiers in the DBTune Musicbrainz linked data to web identifiers in the BBC Programmes linked data. We used it with Kurt and Ben in our hack. Ben made a nice write-up about it. By finding web identifiers for tracks in a collection and following links to the BBC Programmes data, and finally connecting this Programmes data to the box holding all recorded BBC radio programmes over a year that was available at the event, we can quite easily generate playlists from an audio collection. Two python scripts implementing this mechanism are available there. The first one uses solely brands data, whereas the second one uses episodes data (and therefore helps to get fewer and more accurate items in the resulting playlist). Finally, the thing we spent the most time on was the SQLite storage for our RDF cache :-)
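To make the playlist mechanism a little more concrete, here is a toy Python sketch of the last step. It is not one of the two scripts mentioned above: the `build_playlist` function, its arguments and the ranking by summed playcount are all assumptions for the sake of illustration, and plain tuples and dictionaries stand in for the linked data that would really be fetched by following links.

```python
def build_playlist(collection_artists, playcounts, recordings, min_count=1):
    """Pick recorded programmes that feature artists from a local collection.

    collection_artists: set of artist identifiers found for the local tracks.
    playcounts: iterable of (programme_id, artist_id, count) tuples.
    recordings: dict mapping programme_id to the recorded audio for it.
    Programmes are ranked by total playcount over the collection's artists.
    """
    scores = {}
    for programme, artist, count in playcounts:
        if artist not in collection_artists:
            continue  # artist is not in the collection
        if count < min_count or programme not in recordings:
            continue  # too few plays, or no recording available
        scores[programme] = scores.get(programme, 0) + count
    # Highest total first; ties broken by identifier to keep the order stable.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return [recordings[p] for p in ranked]
```

Passing episode identifiers instead of brand identifiers as `programme_id` corresponds to the second, more selective variant described above.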

This morning, I published the playcount data as linked data. I wrote a new DBTune service for that. It publishes a set of web identifiers for playcount data, interlinking Musicbrainz and BBC Programmes. I also put online a SPARQL end-point holding all this playcount data along with aggregated data from Musicbrainz and the BBC Programmes linked data (around 2 million triples overall).

For example, you can try the following SPARQL query:

SELECT ?brand ?title ?count
WHERE {
   ?artist a mo:MusicArtist;
      foaf:name "The Beatles". 
   ?pc pc:object ?artist;
       pc:count ?count.
   ?brand a po:Brand;
       pc:playcount ?pc;
       dc:title ?title 
    FILTER (?count>10)}

This will return every BBC brand that has featured The Beatles more than 10 times.

Thanks to Nicholas and Patrick for their help!
