We just presented yesterday at ISMIR a tutorial about Linked Data for music-related information. More information on the tutorial is available on the tutorial website, and the slides are also available.

In particular, we had two sets of slides dealing with the relationship between music recommendation and linked data. As this is something we're investigating within the NoTube project, I thought I would write up a bit more about it.

Let's focus on artist to artist recommendation for now. If we look at last.fm for recommendations for New Order, here is what we get.

Artists similar to New Order, from last.fm

Similarly, using the Echonest API for similar artists, we get back an ordered list of artists similar to New Order, including Orchestral Manoeuvres in the Dark, Depeche Mode, etc.

Now, let's play word associations for a few bands and musical genres. My colleague Michael Smethurst took the Sex Pistols, Acid House and Public Enemy, and draw the following associations:

Sex Pistols associated words

Acid House word associations

Pubic Enemy word associations

We can see that among the different terms in these diagrams, some refer to people, to TV programmes, to fashion styles, to drugs, to music hardware, to places, to laws, to political groups, to record labels, etc. Just a couple of these terms are actually other bands or tracks. If you were to describe these artists just in musical terms, you'd probably be missing the point. And all these things are also linked to each other: you could play word associations for any of them and see what are the connections between Public Enemy and the Sex Pistols. So how does that relate to recommendations? When recommending an artist from another artist, the context is key. You need to provide an explanation of why they actually relate to each other, whether it's through common members, drugs, belonging to the same independent record label, acoustically similar (if so, how exactly), etc. The main hypothesis here being that users are much more likely to be accepting a recommendation that is explicitly backed by some contextual information.

On the BBC website, we cover quite a few domains, and we try to create as much links as possible between these domains, by following the Linked Data principles. From our BBC Music site, we can explore much more information, from other BBC content (programmes, news etc.) to other Linked Data sources, e.g. DBpedia, Freebase and Musicbrainz. This provides us with a wealth of structured information that we would ultimately want to use for driving and backing up our recommendations.

The MusicBore I've described earlier on this blog kind of uses the same approach. Playlists are generated by following paths in Linked Data. Introduction of each artists is done by generating a sentence from the path leading from the seed artist to the target artist. The prototype described in that paper from the SDOW workshop last year also illustrates that approach.

So we developed a small prototype of these kind of ideas, rqommend (and when I say small, it is very small :) ). Basically, we define "relatedness rules" in the form of SPARQL queries, like "Two artists born in Detroit in the 60s are related". We could go for very general rules, e.g. "Any paths between two artists make them related", but it would be very hard to generate an accurate textual explanation for it, and might give some, hem, not very interesting connections. Then, we just go through these rules on an aggregation of Linked Data, and generate recommendations from them. Here is a greasemonkey script injecting such recommendations with BBC Music (see for example the Fugazi page). It injects Linked Data based recommendations, along with the associated explanation, within BBC artist pages. For example, for New Order:

BBC Music recs for New Order

To conclude, I think there is a really strong influence of traditional information retrieval systems on the music information retrieval community. But what makes Google, for example, particularly successful is to exploit links, not the documents themselves. We definitely need to go towards the same sort of model. Exploiting links surrounding music, and all the cross-domain information that makes it so rich, to create better music recommendation systems which combine the what is recommended with the why it is recommended.