DBTune blog

Tag - event

Thursday 10 September 2009

Linked Data London event screencasts and London Web Standards meetup

Tom Scott and I presented a talk on contextualising BBC programmes using linked data at the Linked Data London event. For the occasion, I made a couple of screencasts.

The first one shows some browsing of the linked data we expose on the BBC website, using the Tabulator Firefox extension. I start from a Radio 2 programme, follow its segmentation into musical tracks, jump to another programme featuring one of those tracks, and finally reach another artist featured in that programme. The Tabulator ends up displaying data aggregated from BBC Programmes, BBC Music and DBpedia.

Exploring BBC programmes and music data using the Tabulator
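
Incidentally, this kind of link-following ("follow your nose" browsing) is easy to script outside the Tabulator too. Here is a minimal sketch in Python using rdflib; the starting URI is a placeholder rather than a real programme identifier, and it assumes that URI serves RDF directly.

from rdflib import Graph, URIRef

# Placeholder URI: substitute a real programme identifier that serves RDF.
start = URIRef("http://www.bbc.co.uk/programmes/some_pid.rdf")

g = Graph()
g.parse(start)  # fetch and parse the RDF description of the programme

# Every URI appearing as an object is a candidate link to follow,
# which is essentially what a linked data browser like the Tabulator does.
links = {o for o in g.objects() if isinstance(o, URIRef) and o != start}
for link in sorted(links):
    print(link)

# Following one of those links aggregates more data into the same graph,
# e.g. artist descriptions from BBC Music or DBpedia.
if links:
    g.parse(next(iter(links)))
    print(len(g), "triples after following one link")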

The second one shows what you can do with these programmes/artists and artists/programmes links. We built some very straightforward programme-to-programme recommendations using them. On the right-hand side of the programme page, there are recommendations based on artists played in common. The recommendations are restricted to programmes that are available on iPlayer or have an upcoming broadcast. Hovering over a recommendation displays what it was derived from: here, a list of artists played in both programmes. This work is part of our investigations within the NoTube European project.

Artist-based programme to programme recommendations
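
For illustration only, here is a toy sketch (not the actual implementation behind these pages) of how such artist-overlap recommendations and their explanations can be computed once the programme/artist links have been gathered; the programme and artist data below are made up.

# Toy artist-overlap recommender: given which artists each programme plays,
# rank other programmes by the number of shared artists, and keep those
# shared artists around so the overlap can be shown as an explanation.
# The data below is invented for illustration.
programmes = {
    "radio2/show_a": {"Nina Simone", "Metallica", "Mozart"},
    "radio2/show_b": {"Nina Simone", "Mozart", "Miles Davis"},
    "radio1/show_c": {"Metallica", "Slayer"},
}

def recommend(seed, available=None):
    """Rank other programmes by the number of artists shared with `seed`.

    `available` optionally restricts candidates, e.g. to programmes on
    iPlayer or with an upcoming broadcast."""
    seed_artists = programmes[seed]
    candidates = available if available is not None else programmes.keys()
    scored = []
    for other in candidates:
        if other == seed:
            continue
        common = seed_artists & programmes[other]
        if common:
            scored.append((len(common), other, sorted(common)))
    return sorted(scored, reverse=True)

for score, prog, common in recommend("radio2/show_a"):
    print(f"{prog}: {score} shared artist(s) -> {', '.join(common)}")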

Also, as Michael already posted on Radio Labs, we gave a presentation on Linked Data to the London Web Standards group. It was a very nice event, especially as it was mainly web developers who turned up. Linked data events tend to be mostly about linked data evangelists talking to other linked data evangelists (which is great too!), so this was quite different :-) Lots of interesting questions were asked about provenance and trustworthiness of data, which are always a bit difficult to answer, apart from the usual "it's just the Web, you can deal with it as you do (or don't) currently with Web data", e.g. by keeping track of provenance information and filtering based on that. Somebody suggested gathering statistics on how many times a particular statement is repeated in order to derive its trustworthiness, but this sounds a bit harmful... Currently, on the Linked Data cloud, lots of information gets repeated. For example, if a statement about an artist is available on DBpedia, there is a fair chance it will get repeated in BBC Music, just because we also use Wikipedia as an information source. The fact that this statement gets repeated doesn't make it more valid.

Skim-read introduction to linked data slides

Friday 17 October 2008

Next week's conferences

I'll be travelling to Vienna next week for the Web of Data practitioners days, where Keith Alexander and I will be giving the first talk (3 hours, yay!). We plan to do quite an exhaustive introduction to linked data and to what has happened over the last few years, with quite a few (hopefully!) interesting examples. I'll also give a short introduction to the Music Ontology and to DBTune in the Multimedia session, on the Thursday. The other speakers are truly amazing, so if you're in Vienna next week, please come along! :-)

Next, Patrick and I will be travelling to Karlsruhe, to attend ISWC 2008. This will be my first ISWC, so I am really looking forward to it! And I just noticed the SWI-Prolog folks are presenting a paper there (Thesaurus-based search in large heterogeneous collections), so this will be the perfect occasion for thanking them for the software framework underlying DBTune :-)

Friday 25 July 2008

List of accepted ISMIR 2008 papers

Just spotted through Paul's blog: the list of accepted ISMIR 2008 papers is now available online. All the papers sound really interesting, so I guess it will be a really good ISMIR!! I am especially glad to see that the Variations3 people will present their work on FRBR-based musical metadata. They seem to have done a lot of interesting things over the last year! I also hope we can make things connect in some ways with MO, thanks to this common FRBR backbone.

Anyway, I can't wait for the actual proceedings which, apparently, will be available online prior to the conference. Quite a few of the selected papers are already available on the Web as pre-prints, though (this really interesting one from Patrick Rabbat and Francois Pachet, for example).

I should have uploaded it earlier, but here is the paper Mark Sandler and I wrote. It describes all the structured data publishing and interlinking work we've been doing over the last year, based on the Music Ontology framework we described last year. We tried to illustrate it with (hopefully) fun examples (Mozart and Metallica are closer than you think... :-) ). It also describes a SPARQL-based web service for feature extraction, driven by workflows written in N3.

Wednesday 25 June 2008

Mashed!

I was at Mashed (the former Hack Day) this week-end - a really good and geeky event, organised by the BBC at Alexandra Palace. We arrived on the Saturday morning for some talks detailing the different things we'd be able to play with over the week-end. Amongst these: a full DVB-T multiplex (apparently, it was the first time since 1956 that a TV signal was broadcast from Alexandra Palace), lots of data from the BBC Programmes team, and a box full of radio content recorded over the last year.

After these presentations, the 24-hour hacking session began. I sat down with Kurt and Ben, and we wrote a small hack which basically starts from a personal music collection and creates a playlist of recorded BBC programmes for you. I will write a bit more about this later today.

During the 24-hour hack, we had a Rock Band session on a big screen, a real-world Tron game (basically, two guys running around with GPS phones, guided by two people watching their trails on a Google satellite map :-) ), a rocket launch...

Finally, at 2pm on the Sunday, people presented their hacks. Almost 50 hacks were presented, all extremely interesting. Take a look at the complete list of hacks! On the music side, Patrick's recommender was particularly interesting. It used Latent Semantic Analysis on playcount data for artists in BBC brands and episodes to recommend brands from artists or artists from artists. It gave some surprising results :-) Jamie Munroe resurrected the FPFF Musicbrainz fingerprinting algorithm (which was apparently due to replace the old TRM one before MusicIP offered their services to Musicbrainz) to identify tracks played several times in BBC programmes. The WeDoID3 team talked about creating RSS feeds from embedded metadata in audio and video, but the demo didn't work.

My personal highlight was the hack (which actually won a prize) from Team Bob. Here is a screencast of it:


BBC Dylan - News 24 Revisited (Clip) from James Adam on Vimeo.

Thanks to Matthew Cashmore and the rest of the BBC backstage team for this great event! (and thanks to the sponsors for all the free stuff - I think I have enough T-shirts for about a year now :-))

Tuesday 3 June 2008

Sorted Sound at the Dana Centre

If you're around London on Thursday, a couple of people from the Centre for Digital Music at Queen Mary, University of London (including myself) will be talking about the research we do in music technology, at the Dana Centre in South Kensington.

The event description is mainly focused on search. Kurt will indeed demo Soundbite, and Ben and Michela from Goldsmiths College will demo fast content-based search on large music databases. However, Chris, Katy and Matthew will demo Sonic Visualiser, a great open-source application for analysing and visualising audio data. I will talk about Semantic Web technologies, in particular the Music Ontology and Linked Data. I will also be demoing some things related to organising music collections using Semantic Web data, and user interfaces to interact with them in unusual ways. As Tom puts it, the Semantic Web is not all about search :-)

Monday 12 May 2008

Linked Data on the Web 2008

I just got back from Beijing (I did a two-week trip around China after the actual conference), where I attended the Linked Data on the Web workshop and the WWW conference.

The workshop was really good, gathering lots of people from the Linking Open Data community (it was the first time I met most of these people, after more than one year working with them :-) ).

The attendance was much higher than expected, with around 100 people registered for the workshop.

It started well with this sentence by Tim Berners-Lee in the workshop introduction:

Linked Data is the Semantic Web done right, and the Web done right.

That's a pretty good way to start a day :-) Then, Chris Bizer gave a good overview of what the community has achieved in one year, illustrated by the successive versions of Richard's diagram.

All the talks and papers were of extremely high quality. I was particularly interested in some of them, including Tim's presentation on the new SPARQL/Update capabilities of the Tabulator data browser, which allow easy interaction with data wikis, where everyone can add or correct information.
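
I am not describing Tabulator's exact write-back mechanism here, but the general idea of pushing an addition or correction to a data wiki via SPARQL Update looks roughly like the following sketch; the endpoint URI and the inserted triple are invented.

# Rough sketch of writing a correction back to a (hypothetical) data wiki
# over HTTP with SPARQL Update. The endpoint and triple are made up; the
# Tabulator's actual protocol may differ.
import urllib.request

update = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT DATA {
    <http://example.org/artist/123> foaf:name "Corrected artist name" .
}
"""

req = urllib.request.Request(
    "http://example.org/sparql-update",  # hypothetical update endpoint
    data=update.encode("utf-8"),
    headers={"Content-Type": "application/sparql-update"},
    method="POST",
)
with urllib.request.urlopen(req) as response:
    print(response.status)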

I really liked Alexandre Passant's presentation on the Flickr exporter, which highlights a mechanism I also used for the Last.fm linked data exporter: linking several identities on several websites is just an owl:sameAs link away. Alexandre also gave another presentation on MOAT (Meaning Of A Tag), a really interesting project for relating tags to Semantic Web URIs. For example, it makes it easy to link my tag "paris texas" to the movie Paris, Texas in DBpedia.
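
As a concrete (made-up) illustration, stating that two account URIs denote the same person really is a single triple; both URIs below are placeholders for whatever identifiers the exporters mint.

# A single owl:sameAs triple linking two identities of the same person.
from rdflib import Graph, URIRef
from rdflib.namespace import OWL

g = Graph()
g.add((
    URIRef("http://dbtune.org/last-fm/some_user"),  # placeholder URI
    OWL.sameAs,
    URIRef("http://example.org/flickr/some_user"),  # placeholder URI
))
print(g.serialize(format="turtle"))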

I got a bit confused by Paul Miller's presentation about licensing open data. I have been aware of these efforts mainly through the work of the Open Knowledge Foundation and the Open Data Commons project, and I think these are truly crucial issues: we need open data, and explicit licensing. But perhaps the audience was not so well chosen: most (if not all) of us in the Linking Open Data community do not own the data we publish as RDF and interlink. DBpedia exports data extracted from Wikipedia, DBTune exports data from different music-related sources such as Jamendo or Last.fm, etc. The only thing we could possibly license explicitly is the links (the only thing we actually own), and links have little value without the underlying data :-) So I guess the outreach should mainly be aimed at raw data publishers rather than Semantic Web translators? But hopefully, in the near future, the two communities will be the same!

One of my personal highlights was also Christian Becker's presentation about DBpedia Mobile, a location-enabled linked data browser for mobile devices, giving you nearby sights with detailed descriptions, restaurants, hotels, etc. After the workshop, Alexandre, Christian and I chatted a bit about adding Last.fm events to the DBTune exporter to also display nearby gigs (with optional filtering based on your foaf:interests, of course :-) ).

Jun Zhao's presentation about linked data and provenance for biological resources was extremely interesting: they are dealing with problems very similar to ours in a Music Information Retrieval context. How do you trust a particular statement (for example, a structural segmentation of a particular track) found on the web? We need to know whether it was written by a human or derived through a set of algorithms, and in the latter case we might, for example, want to choose timbre-based instead of chroma-based workflows for rock music. This is the sort of thing we implemented within our Henry software (more to come on that later, including an online demo as soon as I put it on better hardware, and (hopefully) a PhD :-D ).

Wolfgang Halb gave a presentation about our Riese project, but more on that later, as I wrote the back-end software powering it and I'd like to give it a full blog entry soon.

I gave a presentation about automatic interlinking algorithms on the data web, with a focus on music-related datasets. I detailed an algorithm we developed for this purpose, which propagates similarity measures around web data for as long as we cannot take an interlinking decision (i.e. create a bunch of owl:sameAs links). This algorithm is good in the sense that it gives a really low rate of false positives: on the test set detailed in the paper, it made no wrong decisions. I blogged about this algorithm earlier.
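
The real algorithm is in the paper; purely as a caricature of the general idea (commit to owl:sameAs links only when the evidence aggregated from surrounding resources is both strong and unambiguous, otherwise keep propagating), here is a sketch in Python. All similarity functions, weights and thresholds below are placeholders.

# Caricature of threshold-based interlinking: score a resource from one
# dataset against candidates from another by aggregating evidence from the
# resources around them, and only emit an owl:sameAs link when the decision
# is unambiguous. This is not the published algorithm.

def literal_similarity(a, b):
    """Placeholder string similarity in [0, 1]."""
    return 1.0 if a.lower() == b.lower() else 0.0

def score(record_a, record_b):
    """Aggregate similarity over the records and their neighbouring resources."""
    label_sim = literal_similarity(record_a["label"], record_b["label"])
    neighbour_sims = [
        max((literal_similarity(x, y) for y in record_b["neighbours"]), default=0.0)
        for x in record_a["neighbours"]
    ]
    neighbour_sim = sum(neighbour_sims) / max(len(neighbour_sims), 1)
    return 0.5 * label_sim + 0.5 * neighbour_sim  # placeholder weights

def interlink(a, candidates, accept=0.9, margin=0.2):
    """Return the single candidate worth an owl:sameAs link, or None.

    A link is created only when the best score is high *and* clearly ahead
    of the runner-up, which is what keeps false positives low."""
    candidates = list(candidates)
    scores = sorted((score(a, c) for c in candidates), reverse=True)
    if not scores:
        return None
    best = scores[0]
    runner_up = scores[1] if len(scores) > 1 else 0.0
    if best >= accept and best - runner_up >= margin:
        return max(candidates, key=lambda c: score(a, c))
    return None  # ambiguous: gather more evidence instead of guessing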

Some people expressed concerns about the proliferation of owl:sameAs links (highlighted in this presentation by Paolo Bouquet). But I truly think they are necessary, as long as web identifiers are tied to their actual representations. I need to be able to have a web identifier for a song in Jamendo and a web identifier for the same song in Musicbrainz, and I need a way to link these together: owl:sameAs is perfect for that. I wouldn't trust a centralised "identity" system (what is identity, anyway? :-) ), as it would break the nice decentralised information paradigm we're implementing within the Linking Open Data project.

Anyway, lots of great people, a great time, lots of interesting discussions and new ideas... I am really looking forward to WWW 2009 in Madrid and to the next workshop!!!

Sunday 17 February 2008

Yay for SemanticCamp!

I was at the SemanticCamp event this week-end - it was great fun! Lots and lots of Semantic Web/microformats geeks! We did a small C4DM session on the Saturday, with Chris, Kurt and David, basically walking through the Music Ontology, interlinked music datasets (especially the new classical music composers one), and the software in the MOTOOLS sourceforge project (mainly GNAT, our music collection linker, and GNARQL, our music information aggregator, with a live demo of this screencast). We finished with a SPARQL endpoint providing access to content-based features, based on our Henry software.

Slides, code and demos are available here.

My personal highlights of the Saturday were the DBpedia presentation by Georgi, the automatic indexing of science presentation by Andrew, and the BBC /programmes presentation, where they finally unveiled their evil plans :-)

I joined them at the end to talk about one of the bubbles on their pentagram of data: the RDF programmes data.

On the Sunday, we had a really great discussion about audio/video on the Semantic Web, with people from Joost, the BBC, Talis and URIPlay. I guess one of the main achievements was the mapping between the URIPlay ontology and the BBC one (well, ok, it was just an owl:sameAs away :-) ). I did not actually play, but the Semantopoly game looked like great fun!

I also really enjoyed Nicholas' presentation about streaming RDF along with radio streams, complete with his neat hardware hack to create a WiFi radio station out of a vintage Philips one! Then, I guess we had quite a geeky session with Tom, crawling the Linking Open Data cloud with curl, and hand-editing a FOAF file to manage several online identities.

I found Premasagar's presentation on compound microformats really interesting, as it made me realise a particular "limitation" of microformats (perhaps I am not using exactly the right word here) that I really hadn't grasped before.

Not to mention the great beers on Saturday, etc. etc. :-) It was a really great week-end! Thank you Tom and Daniel!!

Tuesday 30 October 2007

Specifications for the Event and the Timeline ontologies

It has been a long time since my last post, but I was busy traveling (ISMIR 2007, ACM Multimedia, and AES), and also took some holidays afterwards (first ones since last Xmas... it was great :-) ).

Anyway, in my slow process of getting back to work, I finally wrote specification documents for the Timeline ontology and the Event ontology, which Samer Abdallah and I worked on three years ago. These are really early documentation drafts, though, and might be a bit unclear - don't hesitate to send me comments about them!

The Timeline ontology, extending some OWL-Time concepts, allows you to address time points and intervals on multiple timelines backing signals, video, performances, works, scores, etc. For example, using this ontology you can express "from 1 minute and 21 seconds to 1 minute and 55 seconds on this signal".

Timeline ontology

The Event ontology deals with, well, events. In it, events are seen as arbitrary classifications of space/time regions. This definition makes it extremely flexible: it covers everything from music festivals to conferences, meeting notes or even annotations of a signal. It is also extremely simple, defining a single concept (event) and five properties (agent, factor, product, place and time).

Event ontology
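
As a small, made-up illustration: only the event.owl terms below come from the ontology, the resource URIs are placeholders, and the time value is simplified to a plain literal (the ontology really points at OWL-Time/Timeline temporal entities).

# Describing a gig as an event, classified by its agent, place and time.
from rdflib import Graph, Namespace, URIRef, Literal
from rdflib.namespace import RDF

EVENT = Namespace("http://purl.org/NET/c4dm/event.owl#")

g = Graph()
g.bind("event", EVENT)

gig = URIRef("http://example.org/events/gig42")  # placeholder resource
g.add((gig, RDF.type, EVENT.Event))
g.add((gig, EVENT.agent, URIRef("http://example.org/artists/some_band")))
g.add((gig, EVENT.place, URIRef("http://example.org/venues/some_venue")))
g.add((gig, EVENT.time, Literal("2007-10-30T20:00:00")))  # simplified

print(g.serialize(format="turtle"))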

The following representations are available for these ontology resources:

  • RDF/XML

$ curl -L -H "Accept: application/rdf+xml" http://purl.org/NET/c4dm/event.owl

  • RDF/Turtle

$ curl -L -H "Accept: text/rdf+n3" http://purl.org/NET/c4dm/event.owl

  • Default (XHTML)

$ curl -L http://purl.org/NET/c4dm/event.owl

And also, make sure you check out the Chord ontology designed by Chris Sutton, and the associated URI service (e.g. A major with an added 7th). All the code (RDF, specifications, specification generation scripts, URI parsing, 303 stuff, etc.) is available in the motools sourceforge project.

Monday 21 May 2007

"Music and the Web" workshop, AES 122 Vienna Convention

At the beginning of the month, I was invited to speak at the Music and the Web workshop, at the Audio Engineering Society convention, in Vienna.

The first talk was from Scott Cohen, co-founder of The Orchard (btw, I just noticed he was also speaking at the WWW conference last year). He spoke about The death of digital music sales (which is a bit ironic, coming from the co-founder of the leading digital music distributor). His main argument was that the music industry will never get enough money from selling digital music, and that it needs to understand the need for an alternative economic model based on a global license (as was discussed by the French parliament, for a really short time, during the DADVSI debates last year).

Slides

The second talk was from Mark Sandler, the head of the Centre for Digital Music at Queen Mary, University of London. He talked about the OMRAS2 project (OMRAS stands for Online Music Recognition and Searching) and some of the technologies it will use. Basically, OMRAS2 is about creating a decentralised research environment for musicologists and music information retrieval researchers. Therefore, the Semantic Web definitely seems to fit quite nicely into it :-)

Slides

The third talk was from Oscar Celma, who works at the Music Technology Group in Barcelona. He is the creator of the FOAFing-the-music recommender, which actually won the second prize at last year's Semantic Web Challenge. His talk was about music recommendation (the "oh, if you like this, you should like that!" problem) and the choice of different technologies (collaborative filtering, content-based) for different needs. He was terribly sick, though, but managed to give his 40-minute talk without his voice failing!

Slides

The fourth talk was, well, mine :-) I thought it would be a non-expert audience, so I tried to give a not-too-technical talk. I just did a quick introduction to some Semantic Web concepts, and then dived into the Music Ontology, explaining its foundations (Timeline, Event, FRBR, FOAF), the different levels of expressiveness it allows, etc. Then, I talked about linked data. As a conclusion (not much time left), I just highlighted a few bullet points, all related to this Semantic media player which keeps taking up a large space in my brain these days.

Slides.

I got some pretty good feedback, and I was really pleased to see a reference to the Music Ontology on the slides of Lucas Gonze, who was speaking just after me :-) Lucas (too many things to say about him - just check his website and realise that you surely use something he developed every day) gave his talk from California, over Skype, about the Semantic Album: new means of packaging and distributing complex, multi-faceted content. It was a really interesting talk, even though there were some bandwidth problems from time to time.

Slides

Finally, there was some time at the end of the workshop for discussion, which went really well. There was a lot of discussion with someone from an intellectual property agency, mostly reacting to Scott Cohen's talk. Well, I won't go into details here, because I think this discussion deserves a post of its own...

Here is a picture of the audience during the panel.