Over the last couple of years, I’ve been lucky enough to get to know Music Information Retrieval researcher Oscar Celma. Oscar and I collaborated on a tutorial on music information retrieval that we presented at ISMIR 2007. We spent many, many hours on the phone, email and IM sifting through every aspect of music recommendation.
This fall, Oscar completed his PhD thesis. Oscar asked me to be the ‘external reader’, so I spent a good part of my Christmas break reading and re-reading the 230-page thesis. Oscar really has done a phenomenal job of looking at the issues and problems in music recommendation, and in particular at how recommenders help (or, more accurately, don’t help) you find music in the long tail. His analysis looks at how far different types of recommenders can push you into the tail.
Oscar has just published his thesis along with some supplementary info and code on the web site: Oscar Celma PhD. If you are involved in Music 2.0, I highly recommend reading it.
Some cool plots:
And the abstract …
Music consumption is biased towards a few popular artists. For instance, in 2007 only 1% of all digital tracks accounted for 80% of all sales. Similarly, 1,000 albums accounted for 50% of all album sales, and 80% of all albums sold were purchased less than 100 times. There is a need to assist people to filter, discover, personalise and recommend from the huge amount of music content available along the Long Tail.
Current music recommendation algorithms try to accurately predict what people demand to listen to. However, quite often these algorithms tend to recommend popular -or well-known to the user- music, decreasing the effectiveness of the recommendations. These approaches focus on improving the accuracy of the recommendations. That is, try to make accurate predictions about what a user could listen to, or buy next, independently of how useful to the user could be the provided recommendations.
In this Thesis we stress the importance of the user’s perceived quality of the recommendations. We model the Long Tail curve of artist popularity to predict -potentially- interesting and unknown music, hidden in the tail of the popularity curve. Effective recommendation systems should promote novel and relevant material (non-obvious recommendations), taken primarily from the tail of a popularity distribution.
The main contributions of this Thesis are: (i) a novel network-based approach for recommender systems, based on the analysis of the item (or user) similarity graph, and the popularity of the items, (ii) a user-centric evaluation that measures the user’s relevance and novelty of the recommendations, and (iii) two prototype systems that implement the ideas derived from the theoretical work. Our findings have significant implications for recommender systems that assist users to explore the Long Tail, digging for content they might like.
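To make the head-versus-tail idea a bit more concrete, here is a minimal Python sketch. It is an illustration only, not code from the thesis and not the model it uses: the play counts are synthetic (a simple 1/rank curve), the cutoff of "items that account for half of all plays" is an assumed definition of the head, and the function names `items_needed_for_share` and `tail_fraction` are hypothetical.

```python
def items_needed_for_share(counts, share):
    """Smallest number of top-ranked items whose counts reach `share` of the total."""
    ranked = sorted(counts, reverse=True)
    total = sum(ranked)
    running = 0
    for n, count in enumerate(ranked, start=1):
        running += count
        if running >= share * total:
            return n
    return len(ranked)


def tail_fraction(recommended, counts_by_item, head_size):
    """Fraction of recommended items that are NOT among the `head_size` most popular."""
    if not recommended:
        return 0.0
    ranked = sorted(counts_by_item, key=counts_by_item.get, reverse=True)
    head = set(ranked[:head_size])
    in_tail = sum(1 for item in recommended if item not in head)
    return in_tail / len(recommended)


# Synthetic popularity data (assumption): plays fall off as 1/rank.
plays = {f"artist_{rank}": 1_000_000 // rank for rank in range(1, 1001)}

# Assumed definition of the head: the top artists covering 50% of all plays.
head_size = items_needed_for_share(plays.values(), 0.5)

# A made-up recommendation list mixing very popular and obscure artists.
recommendations = ["artist_1", "artist_3", "artist_250", "artist_900"]

print("artists in the head:", head_size)
print("share of recommendations from the tail:",
      tail_fraction(recommendations, plays, head_size))
```

A recommender that mostly returns well-known artists would score close to 0 on `tail_fraction`, while one that digs into the tail would score closer to 1. This says nothing about whether those tail recommendations are relevant, which is why the abstract stresses user-centric evaluation as well.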