Posts Tagged recommendation
An interesting insight:
when people rent a movie that won’t arrive for a few days, they’re making a bet on what they want at some future point. And, people tend to have a more… optimistic viewpoint of their future selves. That is, they may be willing to rent, say, an “artsy” movie that won’t show up for a few days, feeling that they’ll be in the mood to watch it a few days (weeks?) in the future, knowing they’re not in the mood immediately. But when the choice is immediate, they deal with their present selves, and that choice can be quite different.
When I was a Netflix DVD subscriber the Seven Samurai sat on top of my TV for months. My present self never matched the optimistic view I had of my future self.
Xavier’s blog post on Netflix recommendation is worth the read. It covers dealing with a household with widely different tastes and the importance of the order in which recommendations are presented.
In the Recommender Systems world there is a school of thought that says that it doesn’t matter what type of items you are recommending. For these folks, a recommender is a black box that takes in user behavior data and outputs recommendations. It doesn’t matter what you are recommending – books, music, movies, Disney vacations, or deodorant. According to this school of thought you can take the system that you use for recommending books and easily repurpose it to recommend music. This is wrong. If you try to build a recommender by taking your collaborative filtering book recommender and applying it to music, you will fail. Music is different. Music is special.
Here are 10 reasons why music is special and why your off-the-shelf collaborative filtering system won’t work so well with music.
Huge item space – There is a whole lot of music out there. Industrial-sized music collections typically have 10 million songs or more. The iTunes music store boasts 18 million songs. The algorithms that worked so wonderfully on the Netflix Dataset (one of the largest CF datasets released, containing user data for 17,770 movies) will not work so well when having to deal with a dataset that is three orders of magnitude larger.
Very low cost per item – When the cost per item is low, the risk of a bad recommendation is low. If you recommend to me a bad Disney Vacation I am out $10,000 and a week of my time. If you recommend a bad song, I hit the skip button and move on to the next.
Many item types – In the music world, there are many things to recommend: tracks, albums, artists, genres, covers, remixes, concerts, labels, playlists, radio stations, other listeners, etc.
Low consumption time – A book can take a week to read, a movie may take a few hours to watch, a song may take 3 minutes to listen to. Since I can consume music so quickly, I need lots of recommendations (perhaps 30 an hour) to keep my queue filled, whereas 30 book recommendations may keep me reading for a whole year. This has implications for scaling of a recommender. It also ties in with the low cost per item issue. Because music is so cheap and so quick to consume, the risk of a bad recommendation is very low. A music recommender can afford to be more adventurous than other types of recommenders.
Very high per-item reuse – I’ve read my favorite book perhaps half-a-dozen times, I’ve seen my favorite movie 3 times and I’ve probably listened to my favorite song thousands of times. We listen to music over and over again. We like familiar music. A music recommender has to understand the tension between familiarity and novelty. The Netflix movie recommender will never recommend The Bourne Identity to me because it knows that I already watched it, but a good music playlist recommender had better include a good mix of my old favorites along with new music.
Highly passionate users – There’s no more passionate fan than a music fan. This is a two-edged sword. If your recommender introduces a music fan to new music that they like they will transfer some of their passion to your music service. This is why Pandora has such a vocal and passionate user base. On the other hand, if your recommender adds a Nickelback track to a Led Zeppelin playlist you will have to endure the wrath of the slighted fan.
Highly contextual usage – We listen to music differently in different contexts. I may have an exercising playlist, a working playlist, a driving playlist etc. I may make a playlist to show my friends how cool I am when I have them over for a social gathering. Not too many people go to Amazon looking for a list of books that they can read while jogging. A successful music recommender needs to take context into account.
Consumed in sequences – Listening to songs in order has always been a big part of the music experience. We love playlists, mixtapes, DJ mixes, albums. Some people make their living putting songs into interesting order. Your collaborative filtering algorithm doesn’t have the ability to create coherent, interesting playlists with a mix of new music and old favorites.
Large Personal Collections – Music fans often have extremely large personal collections – making it easier for recommendation and discovery tools to understand the detailed music taste of a listener. A personalized movie recommender may start with a list of a dozen rated movies, while a music recommender may be able to recommend music based upon many thousands of plays, ratings, skips and bans.
Highly Social – Music is social. People love to share music. They express their identity to others by the music they listen to. They give each other playlists and mixtapes. Music is a big part of who we are.
Music is special – but of course, so are books, movies and Disney vacations – every type of item has its own special characteristics that should be taken into account when building recommendation and discovery tools. There’s no one-size-fits-all recommendation algorithm.
I’m excited! Next week I travel to Austin for a week-long computer+music geek-fest at SXSW. A big part of SXSW is the music – there are nearly 2,000 different artists playing at SXSW this year. But that presents a problem – there are so many bands going to SXSW (many I’ve never heard of) that I find it very hard to figure out which bands I should go and see. I need a tool to help me sift through all of the artists – a tool that will help me decide which artists I should add to my schedule and which ones I should skip. I’m not the only one who was daunted by the large artist list. Taylor McKnight, founder of SCHED*, was thinking the same thing. He wanted to give his users a better way to plan their time at SXSW. And so over a couple of weekends Taylor built (with a little backend support from us) The Unofficial Artist Discovery Guide to SXSW.
The Unofficial Artist Discovery Guide to SXSW is a tool that allows you to explore the many artists attending this year’s SXSW. It lets you search for artists, browse popularity, music style, ‘buzzworthiness’, or similarity to your favorite artists – and it will make recommendations for you based on your music taste (using your Last.fm, Sched* or Hype Machine accounts). The Artist Guide supplies enough context (bios, images, music, tag clouds, links) to help you decide if you might like an artist.
Here’s the guide:
Here’s a quick tour of some of the things you can do with the guide. First off, you can Search for artists by name, genre/tag or location. This helps you find music when you know what you are looking for.
However, you may not always be sure what you are looking for – that’s where you use Discover. This gives you recommendations based on the music you already like. Type in the name of a few artists (even artists that are not playing at SXSW) or your SCHED*, Hype Machine or Last.fm user name, and ‘Discover’ will give you a set of recommendations for SXSW artists based on your music taste. For example, I’ve been listening to Charlotte Gainsbourg lately so I can use the artist guide to help me find SXSW artists that I might like:
If I see an artist that looks interesting I can drill down and get more info about the artist:
I use Last.fm quite a bit, so I can enter my Last.fm name and get SXSW recommendations based upon my Last.fm top artists. The artist guide tries to mix things up a little bit so if I don’t like the recommendations I see, I can just ask again and I can get a different set. Here are some recommendations based on my recent listening at Last.fm:
If you’ve been using the wonderful SCHED* to keep track of your SXSW calendar you can use the guide to get recommendations based on artists that you’ve already added to your SXSW calendar.
In addition to search and discovery, the guide gives you a number of different ways to browse the SXSW Artist space. You can browse by ‘buzzworthy’ artists – these are artists that are getting the most buzz on the web:
Or the most well-known artists:
You can browse by the style of music via a tag cloud:
And by venue:
Building the guide was pretty straightforward. Taylor used the Echo Nest APIs to get the detailed artist data such as familiarity, popularity, artist bios, links, images, tags and audio. The only data that was not available at the Echo Nest was the venue and schedule info which was provided by Arkadiy (one of Taylor’s colleagues). Even though SXSW artists can be extremely long tail (some don’t even have Myspace pages), the Echo Nest was able to provide really good coverage for these sets (There was coverage for over 95% of the artists). Still there are a few gaps and I suspect there may be a few errors in the data (my favorite wrong image is for the band Abe Vigoda). If you are in a band that is going to SXSW and you see that we have some of your info wrong, send me an email (firstname.lastname@example.org) and I’ll make it right.
We are excited to see this Artist Discovery guide built on top of the Echo Nest. It’s a great showcase for the Echo Nest developer platform and working with Taylor was great. He’s one of these hyper-creative, energetic types – smart, gets things done and full of new ideas. Taylor may be adding a few more features to the guide before SXSW, so stay tuned and we’ll keep you posted on new developments.
When I test-drive a new music recommender I usually start by getting recommendations based upon ‘The Beatles’ (If you like the Beatles, you may like XX). Most recommenders give results that include artists like John Lennon, Paul McCartney, George Harrison, The Who, The Rolling Stones, Queen, Pink Floyd, Bob Dylan, Wings, The Kinks and Beach Boys. These recommendations are reasonable, but they probably won’t help you find any new music. The problem is that these recommenders rely on the wisdom of the crowds and so an extremely popular artist like The Beatles tends to get paired up with other popular artists – the result being that the recommender doesn’t tell you anything that you don’t already know. If you are trying to use a recommender to discover music that sounds like The Beatles, these recommenders won’t really help you – Queen may be an OK recommendation, but chances are good that you already know about them (and The Rolling Stones and Bob Dylan, etc.) so you are not finding any new music.
At The Echo Nest we don’t base our artist recommendations solely on the wisdom of crowds; instead we draw upon a number of different sources (including a broad and deep crawl of the music web). This helps us avoid the popularity biases that lead to ineffectual recommendations. For example, looking at some of the Echo Nest recommendations based upon the Beatles we find some artists that you may not see with a wisdom of the crowds recommender – artists that actually sound like the Beatles – not just artists that happened to be popular at the same time as the Beatles. Echo Nest recommendations include artists such as The Beau Brummels, The Dukes of Stratosphear, Flamin’ Groovies and an artist named Emitt Rhodes. I had never ever seen Emitt Rhodes occur in any recommendation based on the Beatles, so I was a bit skeptical, but I took a listen and this is what I found:
Update: Don Tillman points to this Beatle-esque track:
Emitt could be the sixth Beatle. I think it’s a pretty cool recommendation.
This year ISMIR concludes with the 1st Workshop on the Future of MIR. The workshop is organized by students who are indeed the future of MIR.
MIR, where we are, where we are going
Session Chair: Amélie Anglade Program Chair of f(MIR)
Meaningful Music Retrieval
Frans Wiering – [pdf]
- Some unfortunate tendencies: anatomical view of music – a dead body that we do autopsies on; time is the loser. Traditional production-oriented/
- Measure of similarity: relevance, surprise
- Few interesting applications for end-users
- bad fit to present-day musicological themes
- We are in the world of ‘pure applied research’ – no true interdisciplinarity between music domain knowledge and computer science.
- Music is meaningful (and the underlying personal motivation of most MIR researchers).
- Meaning in musicology – traditionally a taboo subject
- Subjectivity: an individual’s disposition to engage in social and cultural interactions
- Meaning generation process – we have a long-term memory for music –
- Can musical meaning provide the ‘big story line’ for MIR?
The Discipline Formerly Known As MIR
Perfecto Herrera, Joan Serrà, Cyril Laurier, Enric Guaus, Emilia Gómez and Xavier Serra
Intro: Our exploration is not a science-fiction essay. We do not try to imagine how music will be conceptualized, experienced and mediated by our yet-to-come research, technological achievements and music gizmos. Alternatively, we reflect on how the discipline should evolve to become consolidated as such, in order it may get an effective future instead of becoming, after a promising start, just a “would-be” discipline. Our vision addresses different aspects: the discipline’s object of study, the employed methodologies, social and cultural impacts (which are out of this long abstract because of space restrictions), and we finish with some (maybe) disturbing issues that could be taken as partial and biased guidelines for future research.
Notes: One motivation for advancing MIR – more banquets!
- MIR is no more about retrieval than computer science is about computers
- Music Information Retrieval – it’s too narrow
- Music Information or Information about Music?
- Interested in the interaction with music information
- We should be asking more profound questions
- content treasures in short musical excerpts, tracks, performances, etc.
- music understanding systems
- Most metadata will be generated in the creation / production phase (hmm.. don’t agree necessarily, all the good metadata (tags, who likes what) is based on context and use which is post-hoc)
- Instead of automatic analysis – build systems to help humans help humans
- Music like water? or Music as dog!!! – a friend – companion –
- Personalization, Findability
- Music Turing test
Good, provocative talk
Oral Session 2: Potential future MIR applications
Session Chair: Jason Hockman (McGill University), Program Chair of f(MIR)
Machine Listening to Percussion: Current Approaches and Future Directions – [pdf]
Abstract: approaches have been taken to detect and classify percussive events within music signals for a variety of purposes with differing and converging aims. In this paper an overview of those technologies is presented and a discussion of the issues still to overcome and future possibilities in the field are presented. Finally a system capable of monitoring a student drummer is envisaged which draws together current approaches and future work in the field.
- Challenges: Onset detection of isolated drum strokes
- Onset detection and classification of overlapping drum sounds
- Onset detection and classification in the presence of other instruments
- Variability in percussive sounds. Dozens of criteria affect the sounds produced (strike velocity, angle, position, etc.)
- Future Research Areas
- Extension of recognition to include the wide variety of strokes. (open hh, half-open hh, hh foot splash etc)
MIR When All Recordings Are Gone: Recommending Live Music in Real-Time – [pdf]
Marco Lüthy and Jean-Julien Aucouturier
Recommending live and short lived events. Bandsintown, Songkick, gigulate … pay attention to this paper.
- Recommendation for live music in real-time
- Coldplay -> free album when you get a ticket to a coldplay concert – give away the music
- NIN -> USB keys in the toilet – which had a strange recording on the file – strange sounds – an FFT of the sounds showed a phone number and GPS coordinates – turned into a treasure hunt to a NIN concert.
- Komuso Tokugawa – an avatar for a musician in Second Life. Plays in Second Life, twitters concert announcements (playing wake for Les Paul in 3 minutes)
- ‘How do we get there in time?’
- JJ walked through how to implement a recommender system in Second Life
- Implicit preference inferred from how long your avatar listens to a concert (Nicole Yankelovich at Sun Labs should look at this stuff)
- Great talk by JJ – full of energy – neat ideas. Good work.
Global Access to Ethnic Music: The Next Big Challenge?
Olmo Cornelis, Dirk Moelants and Marc Leman
The Future of Music IR: How Do You Know When a Problem Is Solved?
Eric Nichols and Donald Byrd
At the core of just about everything we do here at the Echo Nest is what we call “The Knowledge”. This is a big pile of data that represents everything we know about music. To build ‘The Knowledge’ we crawl the web looking for every bit of info about music. We find music blogs, artist news, album reviews, biographies, audio, images, videos, fan activity and on and on. This gives us a huge set of raw data that represents the global conversation about music. Next, we apply a set of statistical and natural language processing algorithms to this raw data to give us a deeper understanding of what all this data means. For instance, one fundamental algorithm tells us whether a particular web document is about a particular artist. This might be easy for an artist with a distinctive name like Metallica, but may not be so easy for The Rolling Stones (is it the band or the magazine?), and can be hard for bands with ambiguous names like Air and Yes, and can be extremely difficult for artists such as Torsten Pröfrock who tragically has chosen the stage name ‘Various Artists’ (what was he thinking?). Another algorithm that we apply to music reviews is sentiment analysis. This helps us decide whether or not a reviewer has a positive opinion about the music being reviewed. We can take a review like this one written by Jennie, my 14 year old daughter, and learn whether or not she likes the new album by Beyoncé and whether or not she tends to like R&B and pop music.
In addition to analyzing what people are writing about music, we also try to extract as much meaning as we can from the music itself. We apply digital signal processing and machine learning algorithms to audio allowing us to extract information such as tempo, key, song structure, loudness, energy, harmonic content and timbre from every song.
Traditionally, “The Knowledge” has helped us build tools to help music fans explore and discover music – using all this data helps us predict what type of music a listener might like. For the last year, we’ve offered artist similarity and music recommendation web services around this data. But now we are going to turn this all upside down. Instead of using this data to help listeners find new music, we are going to use this data to help artists find new fans. That is what Fanalytics is all about.
For example, music blogs and review sites are becoming an increasingly important way for an artist to build buzz around a new release. However, there are thousands of music blogs – each with its own specialty. This becomes a problem for the artist. How can she decide which blogs she should target for promoting her new album? This is one of the problems that Fanalytics tries to solve. With ‘The Knowledge’ we know quite a bit about thousands of music blogs. We know the reputation and the reach of a blog. We know what types of music a particular author tends to write about, and we know what kinds of music they tend to like. With this knowledge we can make what is essentially a recommendation engine for music promotion. For any artist we can recommend a set of blogs and writers that would most likely be interested in writing about the artist.
In addition to this recommendation engine tailored to music promotion, Fanalytics also provides a set of analytics tools that use ‘The Knowledge’ to help artists better understand their audience. For instance, an artist can track everything that is being said online about them – every blog post, news item, music review, video, as well as their online ‘buzz’ – a quantitative measure of how much attention the artist is receiving from reviewers, bloggers, fans, etc.
We have just launched Fanalytics, but apparently we are already seeing strong interest from the labels. (According to the press release, Interscope, Independent Label Group (WMG), RCA Music Group (Sony) and The Orchard are already on board.) That’s not too surprising; the labels are looking for new ways to reach out to fans. As we continue to grow “The Knowledge” here at the Echo Nest I’m sure we will be creating more interesting tools like Fanalytics that are built around the data.
One of the ways that Music 2.0 has changed how we think about music is that there is so much interesting data available about how people are listening to music. Sites like Last.fm automatically track all sorts of interesting data that just was not available before. Forty years ago, a music label like Capitol would know how many copies the album Abbey Road sold in the U.S., but the label wouldn’t know how many times people actually listened to the album. Today, however, our iPods and desktop music players keep careful track of how many times we play each song, album and artist – giving us a whole new way to look at artist popularity. It’s not just sales figures anymore; it’s how often people are actually listening to an artist. If you go to Last.fm you can see that The Beatles have over 1.75 million listeners and 168 million plays. It makes it easy for us to see how popular the Beatles are compared to another band (The Monkees, for instance, have 2.5 million plays and 285K listeners).
With all of this new data available, there are some new ways we can look at artists. Instead of just looking at artists in terms of popularity and sales rank, I think it is interesting to see which artists generate the most passionate listeners. These are artists that dominate the playlists of their fans. I think this ‘passion index’ may be an interesting metric to use to help people explore and discover music. Artists that attract passionate fans may be longer lived and worth a listener’s investment in time and money.
How can we calculate a passion index? There are probably a number of indicators: the number of edits to the band’s Wikipedia page, the average distance a fan travels to attend a show by the artist, the number of fan sites for an artist. All of these may be a bit difficult to collect, especially for a large set of artists. One simple passion metric is just the average number of artist plays per listener. Presumably if an artist’s listeners are playing an artist’s songs more than average they are more passionate about the artist. One thing that I like about this approach to the passion index is that it is extremely easy to calculate – just divide the total artist plays by the total number of artist listeners and you have the passion index. Yes, there are many confounding factors – for instance, artists with longer songs are penalized – still I think it is a pretty good measure.
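Here is a minimal sketch of that division, using the rounded Last.fm figures quoted above for The Beatles and The Monkees. The function name `passion_index` is just an illustrative label, not part of any Echo Nest or Last.fm API, and the rounded inputs mean the results are approximate.

```python
def passion_index(total_plays: int, total_listeners: int) -> float:
    """Average number of plays per listener for one artist."""
    if total_listeners <= 0:
        raise ValueError("an artist needs at least one listener")
    return total_plays / total_listeners


# Rounded Last.fm figures quoted in the text: (plays, listeners).
artists = {
    "The Beatles": (168_000_000, 1_750_000),
    "The Monkees": (2_500_000, 285_000),
}

for name, (plays, listeners) in artists.items():
    print(f"{name}: {passion_index(plays, listeners):.1f} plays per listener")
```

With these rounded inputs the Beatles come out at roughly ten times the plays per listener of the Monkees, which is the kind of gap the passion index is meant to surface.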
I calculated the passion index for a large collection of artists. I started with about a million artists (it is really nice to have all this data at the Echo Nest;), and filtered these down to the 50K most popular artists. I plotted the number of artist plays vs. the number of artist listeners for each of the 50K artists. The plot shows that most artists fall into the central band (normal passion), but some (the green points) are high passion artists and some (the blue points) are low passion artists.
For the 50K artists, the average track plays per artist/listener is just 11 plays (with a standard deviation of about 11.5). Considering that there are a substantial number of artists in my iTunes collection that I’ve played only once, this seems pretty reasonable.
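To put a single artist in context against that average, one rough option is a z-score: how many standard deviations the artist’s passion index sits from the 50K-artist mean. This is only an illustrative sketch, not how the plot was colored. The function name `passion_z_score` is hypothetical, the mean and deviation are the rounded values quoted above, and the play and listener counts come from the charts later in this post. Since the deviation is larger than the mean, the distribution is clearly skewed, so treat the score as a rough guide.

```python
# Rounded summary statistics for the 50K artists, as quoted in the text.
MEAN_PASSION = 11.0
STD_PASSION = 11.5


def passion_z_score(plays: int, listeners: int) -> float:
    """Standard deviations between an artist's plays-per-listener and the mean."""
    index = plays / listeners
    return (index - MEAN_PASSION) / STD_PASSION


# (plays, listeners) taken from the passion charts in this post.
nine_inch_nails = passion_z_score(79_135_038, 1_056_834)
isley_brothers = passion_z_score(1_685_959, 282_095)

print(f"Nine Inch Nails: {nine_inch_nails:+.2f} standard deviations")
print(f"The Isley Brothers: {isley_brothers:+.2f} standard deviations")
```

Note that an artist can never score much below about -1 here, because plays per listener cannot go below zero; a log scale would probably separate the low-passion end better.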
So who are the artists with the highest passion index? Here are the top ten:
I didn’t recognize any of these artists (and I’m not even sure if 上海アリス幻樂団 is really an artist – according to the Japanese Wikipedia it is a fan club in Japan to produce a music game coterie – whatever that means). Belo is a Brazilian pop artist that does indeed seem to have some rather passionate fans.
It is not surprising that it is hard for popular artists to rank at the very top of the passion index. Popular artists are exposed to many, many listeners which can easily reduce the passion index. Here are the top passion-ranked artists drawn from the top-1000 most popular artists:
|Passion||Listeners||Plays||Artist|
|75||269052||20293399||Mindless Self Indulgence|
|74||1056834||79135038||Nine Inch Nails|
|66||460518||30625121||Children of Bodom|
I find it interesting to see all of the heavy metal bands in the top 20. Metal fans are indeed true fans.
Going to the other end of passion, we find the 20 popular artists that have the least passionate fans:
|Passion||Listeners||Plays||Artist|
|5||282095||1685959||The Isley Brothers|
|5||388183||2244878||Kool & The Gang|
I guess people are not too passionate about Soft Cell.
Here’s a passion chart for the top 100 most popular artists. Even the artists at the bottom of this chart are way above average on the passion index.
|Passion||Listeners||Plays||Artist|
|74||1056834||79135038||Nine Inch Nails|
|61||1397442||85685015||System of a Down|
|60||1346298||81762621||Death Cab for Cutie|
|57||1060269||61127025||Fall Out Boy|
|55||1897332||104932225||Red Hot Chili Peppers|
|54||950416||52019102||My Chemical Romance|
|43||1011131||43930085||Kings of Leon|
|40||1023666||41288978||Queens of the Stone Age|
|39||1266502||49492511||The White Stripes|
|36||1326946||48738588||The Smashing Pumpkins|
|34||955876||33376744||Jimmy Eat World|
|30||1178755||35600916||Rage Against the Machine|
|29||1030982||30044419||Yeah Yeah Yeahs|
|28||985715||28485679||The Postal Service|
|28||1305984||37807059||Guns N’ Roses|
|26||1503035||40161219||The Rolling Stones|
|23||976745||22557111||3 Doors Down|
|20||1057288||22084785||The Chemical Brothers|
|19||968885||19219364||Simon & Garfunkel|
|16||996649||16234996||Black Eyed Peas|
I think it would be really interesting to incorporate the passion index into a recommender, so instead of just recommending artists that are similar to artists that a listener already likes, filter the similar artists with a passion filter and offer up the artists that listeners are most passionate about. I think these recommendations would be more valuable to the listener.
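A passion filter like this could be sketched as a simple re-ranking step. In the sketch below the `Artist` type, the `passion_filter` function and the list of "similar artists" are all hypothetical stand-ins for whatever a real recommender would return; only the listener and play counts are taken from the chart above.

```python
from typing import List, NamedTuple


class Artist(NamedTuple):
    name: str
    listeners: int
    plays: int

    @property
    def passion(self) -> float:
        """Plays per listener, the passion index described in this post."""
        return self.plays / self.listeners


def passion_filter(similar: List[Artist], top_n: int = 3) -> List[Artist]:
    """Keep the top_n similar artists, ranked by passion index (highest first)."""
    return sorted(similar, key=lambda a: a.passion, reverse=True)[:top_n]


# Hypothetical output of a similar-artist recommender; counts from the chart.
similar = [
    Artist("The Rolling Stones", 1_503_035, 40_161_219),
    Artist("The White Stripes", 1_266_502, 49_492_511),
    Artist("Queens of the Stone Age", 1_023_666, 41_288_978),
    Artist("Kings of Leon", 1_011_131, 43_930_085),
    Artist("Guns N' Roses", 1_305_984, 37_807_059),
]

for artist in passion_filter(similar):
    print(f"{artist.name}: {artist.passion:.1f} plays per listener")
```

Sorting purely by passion is the simplest possible choice; a real recommender would more likely blend the passion index with the similarity score rather than let it decide the ranking on its own.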