DBTune blog


Monday 15 June 2009

And another fun BBC SPARQL query

This query returns BBC programmes featuring artists originating from France (this is just a straight adaptation of the last query in my previous post).

The results are quite fun! Apparently, the big French hits on the BBC are from Jean-Michel Jarre, Air, Modjo, Phoenix (are they known in France? I've only heard of them in the UK) and Vanessa Paradis.

Note that the tracklisting data we expose in our RDF just goes back a couple of months, so that might explain why the list is not bigger.

Thursday 11 June 2009

BBC SPARQL end-points

We recently announced on the BBC backstage blog the availability of two SPARQL end-points, one hosted by Talis and one by OpenLink. These two companies aggregated the RDF data we publish at http://www.bbc.co.uk/programmes and http://www.bbc.co.uk/music. This opens up quite a lot of fascinating SPARQL queries. Talis already compiled a small list, and here are a couple I just designed:

  • Give me programmes that deal with the fictional character James Bond - results
PREFIX po: <http://purl.org/ontology/po/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?uri ?label
WHERE {
  ?uri po:person
    <http://www.bbc.co.uk/programmes/people/bmFtZS9ib25kLCBqYW1lcyAobm8gcXVhbGlmaWVyKQ#person> ;
    rdfs:label ?label .
}
  • Give me artists that were featured in the same programme as the Foo Fighters - results
PREFIX po: <http://purl.org/ontology/po/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX mo: <http://purl.org/ontology/mo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX event: <http://purl.org/NET/c4dm/event.owl#>
PREFIX tl: <http://purl.org/NET/c4dm/timeline.owl#>
SELECT DISTINCT ?artist2 ?label2
WHERE {
  ?event1 po:track ?track1 .
  ?track1 foaf:maker <http://www.bbc.co.uk/music/artists/67f66c07-6e61-4026-ade5-7e782fad3a5d#artist> .
  ?event2 po:track ?track2 .
  ?track2 foaf:maker ?artist2 .
  ?artist2 rdfs:label ?label2 .
  ?event1 po:time ?t1 .
  ?event2 po:time ?t2 .
  ?t1 tl:timeline ?tl .
  ?t2 tl:timeline ?tl .
  FILTER (?t1 != ?t2)
}
  • Give me programmes that featured both Al Green and the Foo Fighters (yes! there is one result!!) - results
PREFIX po: <http://purl.org/ontology/po/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX mo: <http://purl.org/ontology/mo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX event: <http://purl.org/NET/c4dm/event.owl#>
PREFIX tl: <http://purl.org/NET/c4dm/timeline.owl#>
SELECT DISTINCT ?programme ?label
WHERE {
  ?event1 po:track ?track1 .
  ?track1 foaf:maker <http://www.bbc.co.uk/music/artists/67f66c07-6e61-4026-ade5-7e782fad3a5d#artist> .
  ?event2 po:track ?track2 .
  ?track2 foaf:maker <http://www.bbc.co.uk/music/artists/fb7272ba-f130-4f0a-934d-6eeea4c18c9a#artist> .
  ?event1 po:time ?t1 .
  ?event2 po:time ?t2 .
  ?t1 tl:timeline ?tl .
  ?t2 tl:timeline ?tl .
  ?version po:time ?t .
  ?t tl:timeline ?tl .
  ?programme po:version ?version .
  ?programme rdfs:label ?label .
}
  • All programmes that featured an artist originating from Northern Ireland - results
PREFIX po: <http://purl.org/ontology/po/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX mo: <http://purl.org/ontology/mo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX event: <http://purl.org/NET/c4dm/event.owl#>
PREFIX tl: <http://purl.org/NET/c4dm/timeline.owl#>
PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?programme ?label ?artistlabel ?dbpmaker
WHERE {
  ?event1 po:track ?track1 .
  ?track1 foaf:maker ?maker .
  ?maker rdfs:label ?artistlabel .
  ?maker owl:sameAs ?dbpmaker .
  ?dbpmaker dbprop:origin <http://dbpedia.org/resource/Northern_Ireland> .
  ?event1 po:time ?t1 .
  ?t1 tl:timeline ?tl .
  ?version po:time ?t .
  ?t tl:timeline ?tl .
  ?programme po:version ?version .
  ?programme rdfs:label ?label .
}

(Note that we only need the explicit owl:sameAs pattern in the above query because the Talis end-point doesn't support inference.)
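For readers who would rather run such queries from a script than from a web form, here is a minimal sketch of sending one over the standard SPARQL protocol with Python's standard library. The endpoint URL is a placeholder and an assumption (substitute the address of whichever end-point you use), and the sketch assumes the end-point can return results in the SPARQL JSON results format.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint URL: a placeholder, not the real Talis or OpenLink address.
ENDPOINT = "http://example.org/sparql"

# The James Bond query from the list above.
QUERY = """
PREFIX po: <http://purl.org/ontology/po/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?uri ?label
WHERE {
  ?uri po:person
    <http://www.bbc.co.uk/programmes/people/bmFtZS9ib25kLCBqYW1lcyAobm8gcXVhbGlmaWVyKQ#person> ;
    rdfs:label ?label .
}
"""


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(endpoint, query):
    """Send the query and return (uri, label) pairs from the JSON bindings."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        results = json.load(response)
    return [
        (row["uri"]["value"], row["label"]["value"])
        for row in results["results"]["bindings"]
    ]
```

Calling `run_query(ENDPOINT, QUERY)` against a real end-point would return one pair per matching programme. `build_request` is kept separate so the request can be inspected without touching the network.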

Let us know what kind of query you can come up with this data! :-)

Tuesday 12 May 2009

Yahoo Hackday 2009

We went to the Yahoo Hackday this weekend, with a couple of people from the C4DM and the BBC. Apart from a flaky wireless connection on the Saturday, it was a really great event, with lots of interesting talks and interesting hacks.

On the Saturday, we learned about Searchmonkey. I tried to create a small Searchmonkey application during the talk, but eventually got frustrated. Apparently, Searchmonkey indexes RDFa and eRDF, but doesn't follow <link rel="alternate"/> links towards RDF representations (nor does it attempt content negotiation). So in order to create a Searchmonkey application for BBC Programmes, I needed to either include RDFa in all the pages (which, hem, was difficult to do in an hour :-) ) or write an XSLT against our RDF/XML representations, which would just be Wrong, as there are lots of different ways to serialise the same RDF in an RDF/XML document.

We also learned about the Guardian Open Platform and Data Store, which holds a huge amount of interesting information. The license terms are also really permissive, even allowing commercial uses of this data. I can't even imagine how useful this data would be if it were linked to other open datasets, e.g. DBpedia, Geonames or Eurostat.

I also got a bit confused by YQL, which seems to be really similar to SPARQL, at least in the underlying concept ("a query language for the web"). However, it seems to be backed by lots of interesting data: almost all of Yahoo's services, and a few third-party wrappers, e.g. for Last.fm. I wonder how hard it would be to write a SPARQL end-point that would wrap YQL queries?

Finally, on Saturday evening and Sunday morning, we got some time to actually hack :-) Kurt made a nice MySpace hack, which does an artist lookup on MySpace using BOSS and exposes relevant information extracted using the DBTune RDF wrapper, without having to look at an overloaded MySpace page. It uses the Yahoo Media Player to play the audio files this page links to.

At the same time, we got around to trying out some of the things that can be built using the linked data we publish at the BBC, especially the segment RDF I announced on the linked data mailing list a couple of weeks ago. We built a small application which, given a place, returns BBC programmes that feature an artist related in some way to that place. For example, Cardiff, Bristol, London or Lancashire. It might be a bit slow (and the number of results is limited) as I didn't have time to implement any sort of caching: the application crawls from DBpedia to BBC Music to BBC Programmes at each request. I just put the (really hacky) code online.

And we actually won the Backstage prize with these hacks! :-)

This last hack illustrates to some extent the things we are investigating as part of the BBC use-cases of the NoTube project. Using these rich connections between things (programmes, artists, events, locations, etc.), it begins to be possible to provide data-rich recommendations backed by real stories (and not only "if you like this, you may like that"). I mentioned these issues in the last chapter of my thesis, and will try to follow up on that here!

Friday 17 April 2009

Brands, series, categories and tracklists on the new BBC Programmes

I just posted a small article on the BBC Radio Labs blog about the new features of the BBC Programmes website. Hopefully that makes some sense and highlights some of the things we've been working on over the last six months! Spoiler: lots of nice nice RDF :-)

Tuesday 24 March 2009

A sneak peek at the BBC Music RDF

The new BBC Music website was launched yesterday, with a lot of Linked Data and RDF goodness. BBC Music provides a truly RESTful API. Congratulations to the whole team, they did amazing work! In short, that means that you can easily build applications on top of BBC Music data.

Each artist in BBC Music has an RDF representation. For example, Nirvana has one, which exposes the aggregated BBC data about this band. The site also supports content negotiation, so doing

$ curl -L -H "Accept: application/rdf+xml" http://www.bbc.co.uk/music/artists/5b11f4ce-a62d-471e-81fc-a69a8278c7da

will lead you to the RDF representation.
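The same content negotiation can be done from a script. Here is a minimal sketch using Python's standard library; the function names are mine, purely for illustration, and whether the URI still resolves today is not something this sketch can promise.

```python
import urllib.request

# The Nirvana artist URI used in the curl example above.
ARTIST_URI = "http://www.bbc.co.uk/music/artists/5b11f4ce-a62d-471e-81fc-a69a8278c7da"


def rdf_request(uri):
    """Build a request that asks the server for RDF/XML instead of HTML."""
    return urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})


def fetch_rdf(uri):
    """Fetch the RDF representation; urlopen follows redirects, like curl -L."""
    with urllib.request.urlopen(rdf_request(uri)) as response:
        # geturl() reports the final URL after any redirects.
        return response.geturl(), response.read()
```

`fetch_rdf(ARTIST_URI)` returns the URL of the document that was finally served, together with its raw bytes, which can then be handed to any RDF parser.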

Note that this representation includes links to further URIs, allowing you to discover more data, e.g. about members of that band. It also includes an owl:sameAs link to the corresponding DBpedia resource, allowing you to aggregate more data about that band, extracted from Wikipedia's infoboxes.

As an example of a "linked data journey", you can get from Nirvana to Krist Novoselic to the corresponding Krist Novoselic in DBpedia to Compton, California to N.W.A. Lots of really rich data to do interesting things with, like, say, a music recommender :-)

BBC Music also includes RDF representations of reviews, e.g. that one. It also includes an RDF representation of the A to Z, and a search interface returning RDF links to matched artists. For example, here are the results of a search for "Bad Religion", which include a link to an RDF document about the band on BBC Music.

Congrats again to Patrick and Nicholas, who did this work on the RDF side of BBC Music!

Tuesday 10 February 2009

Thesis uploaded!

I just uploaded my PhD thesis entitled A Distributed Music Information System, which I defended on the 22nd of January. My examiners were David de Roure from University of Southampton and Nicolas Gold from King's College. My PhD supervisor was Mark Sandler.

Here is the abstract:

Information management is an important part of music technologies today, covering the management of public and personal collections, the construction of large editorial databases and the storage of music analysis results. The information management solutions that have emerged for these use-cases are still isolated from each other. The information one of these solutions manages does not benefit from the information another holds.

In this thesis, we develop a distributed music information system that aims at gathering music-related information held by multiple databases or applications. To this end, we use Semantic Web technologies to create a unified information environment. Web identifiers correspond to any items in the music domain: performance, artist, musical work, etc. These web identifiers have structured representations permitting sophisticated reuse by applications, and these representations can quote other web identifiers leading to more information.

We develop a formal ontology for the music domain. This ontology allows us to publish and interlink a wide range of structured music-related data on the Web. We develop an ontology evaluation methodology and use it to evaluate our music ontology. We develop a knowledge representation framework for combining structured web data and analysis tools to derive more information. We apply these different technologies to publish a large amount of pre-existing music-related datasets on the Web. We develop an algorithm to automatically relate such datasets among each other. We create automated music-related Semantic Web agents, able to aggregate musical resources, structured web data and music processing tools to derive and publish new information. Finally, we describe three of our applications using this distributed information environment. These applications deal with personal collection management, enhanced access to large audio streams available on the Web and music recommendation.

So far, just a PDF is available, as I am still fighting with LaTeX2HTML, but there will be an HTML version some time soon :-) I am also planning to upload, at the same place, some extra annexes and extra results I didn't include in the main document. I think I will also blog here about some of the things included in this thesis.

In case you just want to jump to a particular chapter, here are some keywords for each of the thesis chapters:

  1. Introduction
  2. Knowledge Representation and Semantic Web technologies: FOL, Description Logics, RDF, Linked Data, OWL, N3.
  3. Conceptualisation of music-related information: web ontologies, music ontology, time ontology, event ontology, workflow-based modelling
  4. Evaluation of the Music Ontology framework: ontology evaluation, data-driven evaluation, task-based evaluation, latent dirichlet allocation
  5. Music processing workflows on the Web: workflows, concurrent transaction logic, N3, N3-Tr, DLP, publication of dynamically generated results, Semantic Web Services
  6. A web of music-related data: linking open data, dbtune, automated interlinking, quantification of structured web data
  7. Automated music processing agents: N3-Tr, Henry, music analysis, workflows, prolog
  8. Case studies: gnat, gnarql, personal music collection management, zempod, music recommendation
  9. Conclusion

Thursday 29 January 2009

Prolog message queue

It's been a long time since I last posted anything here, but things have been pretty hectic recently (I am a doctor, now!! I'll post my thesis here soon).

I've just hacked together a really small implementation of an HTTP-driven SWI-Prolog message queue. I've often found myself doing quite expensive computations in Prolog, and the easiest way to distribute them is to have a message queue on which you post messages to process (in this case, Prolog terms), and a pool of workers that pick up messages and process them. Then, if you find your program is still too slow, you can simply add a couple of workers to speed things up.
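To illustrate the pattern itself (not the actual implementation, which is HTTP-driven and written in SWI-Prolog), here is a minimal in-process sketch in Python, where threads stand in for the workers and a `queue.Queue` stands in for the message queue. The `run_workers` name and its parameters are hypothetical, and `None` is reserved as a stop signal, so it cannot be used as a message.

```python
import queue
import threading


def run_workers(messages, handler, n_workers=2):
    """Process every message with a pool of workers pulling from one queue."""
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            message = tasks.get()
            if message is None:  # sentinel: no more work for this worker
                break
            outcome = handler(message)
            with lock:  # protect the shared result list
                results.append((message, outcome))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for message in messages:
        tasks.put(message)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results
```

For example, `run_workers(range(10), lambda x: x * x, n_workers=3)` returns ten `(message, result)` pairs in whatever order the workers finished. Making things go faster is then a matter of raising `n_workers`, which is the point of the design: the producer never needs to know how many workers exist.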

Monday 22 December 2008

New server for DBTune

I completed the move of DBTune to a new shiny server yesterday. Things should go way faster, and the server should have a much better uptime. Overall, our experience with 1and1 hosting has been pretty bad: random server reboots, configuration files erased for no reason, and extremely long delays in getting customer support...

Many many thanks to the Centre for Digital Music for hosting the new DBTune!

Monday 15 December 2008

Rockterscale!

Last week, around 10 people from BBC A&Mi, including myself, gathered for two days of hardware hacking. The goal was to build a Rockterscale -- a device able to measure how much a band rocks. Since I hadn't done any real-time audio processing in a long time, I decided to give that a go: analysing a live audio input and extracting some of its characteristics. I used Paul Brossier's Aubio library to do so, as it seemed relatively easy to hack, and was already doing something we thought was great for visualisation purposes: beat tracking from a live audio input. After the first day, we had a bit of C code that extracted the loudness, the spectral centroid and the spectral spread from the live audio input. Then, we sent the normalised data over Open Sound Control to the visualisation components.
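For readers unfamiliar with these three features, here is a rough sketch of how they can be computed for a single frame of audio. This is not the C code we wrote, and it does not use Aubio. It assumes common textbook definitions (RMS energy for loudness, and the magnitude-weighted mean and standard deviation of frequency for the centroid and spread), which may differ from the ones Aubio uses, and it relies on a naive DFT that is only practical for short frames.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), fine for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def frame_features(frame, sample_rate):
    """Return (loudness, spectral centroid in Hz, spectral spread in Hz)."""
    n = len(frame)
    # Loudness approximated by the root-mean-square of the samples.
    loudness = math.sqrt(sum(x * x for x in frame) / n)
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:  # silent frame: no spectrum to describe
        return loudness, 0.0, 0.0
    freqs = [k * sample_rate / n for k in range(len(mags))]
    # Centroid: magnitude-weighted mean frequency.
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total
    # Spread: magnitude-weighted standard deviation around the centroid.
    spread = math.sqrt(
        sum((f - centroid) ** 2 * m for f, m in zip(freqs, mags)) / total
    )
    return loudness, centroid, spread
```

Feeding it a pure tone gives a centroid at the tone's frequency and a spread close to zero, while noisy, distorted material pushes both values up, which is roughly why these features are handy for visualising how "bright" a band sounds.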

But, of course, the audio signal is not the only thing to consider in order to determine how much a band rocks! We used a number of sensors to capture the reactions of the crowd:

  • The Hat of Rock, capturing some headbanging data

  • An accelerometer under the dance-floor/mosh-pit, and a force sensor hooked on the crash barrier

  • A webcam capturing how much movement there is in the crowd

All the data fed by these different components was visualised on a screen and on a physical rockterscale (yes, it does go up to 11 :-)).

Here is a small video of all that in action! (I think the best part is the BBC A&Mi people dancing to Ace of Spades to try out the system :-) ).

Friday 14 November 2008

Reuters OpenCalais joins the linked data cloud

Still more fancy linked data to play with - just a couple of weeks after Freebase announced that they publish linked data, OpenCalais just announced that they are going to publish linked data as well, by joining up the results of their entity extraction service to DBpedia URIs.

Wednesday 29 October 2008

SPARQLing a funk legend

I just came across this awesome blog post from Kurt. It starts from a real music question (he saw Maceo Parker live, and wanted to know if he wrote one of the songs he played in the encore: Pass the Peas), and finds an answer to it using Semantic Web technologies, in particular SPARQL.

Great stuff!

Freebase does linked data!

Just a small post, live from ISWC: Freebase does linked data!

You can try it there, and you can try this instance, for example.

Freebase linked data

Added to the wonderful David Huynh's Parallax, that's a lot of great news coming from the other side of the Atlantic :-)

Now, to see whether their linked data actually use the Web! Do they link to other web identifiers, available outside Freebase?

I just noticed something weird, also: the read/write permissions are attached to the tracks/films/whatever resources, instead of being attached to the RDF document itself.

Friday 17 October 2008

Next week conferences

I'll be travelling to Vienna next week for the Web of Data practitioners days, where Keith Alexander and I will be giving the first talk (3 hours, yay!). We plan to do quite an exhaustive introduction to linked data and to what happened over the last few years, with quite a few interesting (hopefully!) examples. I'll also give a small introduction to the Music Ontology and to DBTune in the Multimedia session, on the Thursday. The other speakers are truly amazing, so if you're in Vienna next week, please come along! :-)

Next, Patrick and I will be travelling to Karlsruhe, to attend ISWC 2008. This will be my first ISWC, so I am really looking forward to it! And I just noticed the SWI-Prolog folks are presenting a paper there (Thesaurus-based search in large heterogeneous collections), so this will be the perfect occasion for thanking them for the software framework underlying DBTune :-)

Tuesday 7 October 2008

Vocabulary interlinkage diagram

The UMBEL people (Fred and Mike) just released a new interlinkage diagram. This time, it doesn't represent the different links amongst datasets made available within the Linking Open Data project, but rather a map of the links amongst vocabularies used on the data web.

Vocabulary interlinkage

The Music Ontology sits right between FOAF, FRBR and the Event ontology (although I would have added the Timeline ontology as well). There is also the Programmes Ontology, at the bottom (which is also interlinked with the Event ontology, btw).

This diagram really helps to see how the current web ontology landscape is structured, and I hope we can keep using it to track the evolution of web ontologies, a bit like what has been done for the available datasets and interlinks.

Monday 29 September 2008

D2R server, SNORQL and Firefox 3

In case it might be useful for someone else (I've had several requests for it offline), here is a small patch to make D2R server work with the latest versions of ARQ, in order to make the SNORQL SPARQL explorer work in Firefox 3. I sent the patch to Richard some time ago, so hopefully the newest D2R should work with latest versions of ARQ.

Oh, and I've finished (well, almost, just a couple more lines to add to the conclusion) writing up, and started this morning at the BBC!

Sunday 7 September 2008

DBTune wins the second prize in the Triplify challenge!

I submitted DBTune to the Triplify challenge, a couple of months ago. The text of the submission is there. The results of the challenge were given on Friday, at the i-semantics conference. Many many thanks to Michael Hausenblas for representing DBTune there!

And, DBTune won the second prize! Here is a picture of the prize ceremony:

Congratulations to the winners, LinkedMDB, for their amazing work and well-deserved prize, and many thanks to Sören Auer for organizing the challenge!

Wednesday 3 September 2008

Good-bye C4DM, hello BBC!

I've been rather quiet for the last month: intense PhD writing. I have been trying to get it fully written by the end of September. Indeed, I will be joining BBC Audio & Music at the end of the month. I am really really excited about that! Of course, I am a bit sad to leave the Centre for Digital Music, after three fantastic years spent there: great people, great work, great projects, great art and great beer :-)

Thursday 31 July 2008

Semantic search on aggregated music data

I just moved the semantic search demo to a faster server, so it should hopefully be a lot more reliable. This demo uses the amazing ClioPatria on top of an aggregation of music-related data. This aggregation was simply constructed by taking a bunch of Creative Commons MP3s, running GNAT on them, and crawling linked data starting from the web identifiers outputted by GNAT.

I also set up the search tab to work correctly. For example, when you search for "punk", you get the following results.

Punk search 1

Punk search 2

Note that the results are explained: "punk" might be related to the title, the biography, a tag, the lyrics, content-based similarity to something tagged as punk (although it looks like Henry crashed in the middle of the aggregation, so not a lot of such data is available yet), etc. Moreover, you get back different types of resources: artists, records, tracks, lyrics, performances etc.

For example, if you click on one of the records, you get the following.

Punk search 3

This record is available under a Creative Commons license, so you can get direct access to the corresponding XSPF playlist, Bittorrent items etc., by following the Music Ontology "available as" property. For example, you can click on an XSPF playlist, and listen to the selected record.

Punk search 4

Of course, you can still do the previous things - plotting music artists (or search results, just take a look at the "view" drop-down box) on a map, on a time-line, browse using facets, etc.

Btw, if you like DBTune, please vote for it in the Triplify Challenge! :-)

Wednesday 30 July 2008

Last.fm events and DBpedia mobile

For a recent event at the Dana Centre, I was asked to make a small demo of some nice things you can do with Semantic Web technologies. As it is no fun to re-use demos, I decided to go for something new. So after two hours of hacking and skyping with Christian Becker, we added support for recommended events to the last.fm linked data exporter. I also implemented a bit of geo-coding on the server side (although, with the new last.fm API, I guess this part is becoming useless).

Then, thanks to RDF goodness, it was really straightforward to make that work with DBpedia mobile. DBpedia mobile is a service that gets your geo-location from your mobile device and shows you a map with nearby sights, using data from DBpedia. DBpedia mobile also uses the RDF cache of a really nice linked data browser called Marbles.

So, after browsing your DBTune last-fm URI in Marbles, you can go to DBpedia mobile and see recommended events alongside nearby sights. To do so, select the Performances (by moustaki) filter. Here is what I get for my profile, when at the university:

DBpedia mobile and last.fm events

Monday 28 July 2008

Music Ontology linked data on BBC.co.uk/music

Just a couple of minutes ago on the Music Ontology mailing list, Nicholas Humfrey from the BBC announced the availability of linked data on BBC Music.

$ rapper -o turtle \
   http://www.bbc.co.uk/music/artists/cc197bad-dc9c-440d-a5b5-d52ba2e14234

[...]
<http://www.bbc.co.uk/music/artists/cc197bad-dc9c-440d-a5b5-d52ba2e14234#artist>
   a mo:MusicGroup;
   foaf:name "Coldplay";
   owl:sameAs <http://dbpedia.org/resource/Coldplay>;
   mo:member
<http://www.bbc.co.uk/music/artists/18690715-59fa-4e4d-bcf3-8025cf1c23e0#artist>,
<http://www.bbc.co.uk/music/artists/d156ceb2-fd90-4e82-baea-829bbdf1c127#artist>,
<http://www.bbc.co.uk/music/artists/6953c4db-7214-4724-a140-e87550bde420#artist>,
<http://www.bbc.co.uk/music/artists/98d1ec5a-dd97-4c0b-9c83-7928aac89bca#artist>
[...]

This is just really, really, really great... Congratulations to the /music team!

Update: Tom Scott just wrote a really nice post about the new BBC music site, explaining what the BBC is trying to achieve by going down the linked data path.
