Archive for category recommendation
Brian just posted ‘How Music Recommendation works – and doesn’t work’ over at his Variogr.am blog. It is a must-read for anyone interested in the state of the art in music recommendation. Here’s an excerpt:
Try any hot new artist in Pandora and you’ll get the dreaded:
Pandora not knowing about YUS
This is Pandora showing its lack of scale. They won’t have any information for YUS for some time and may never unless the artist sells well. This is bad news and should make you angry: why would you let a third party act as a filter on top of your very personal experiences with music? Why would you ever use something that “hid” things from you?
Grab a coffee, sit back and read Brian’s post. Highly recommended.
Oscar and I just finished giving our tutorial on music recommendation and discovery at ACM RecSys 2011. Here are the slides:
In the Recommender Systems world there is a school of thought that says that it doesn’t matter what type of items you are recommending. For these folks, a recommender is a black box that takes in user behavior data and outputs recommendations. It doesn’t matter what you are recommending – books, music, movies, Disney vacations, or deodorant. According to this school of thought you can take the system that you use for recommending books and easily repurpose it to recommend music. This is wrong. If you try to build a recommender by taking your collaborative filtering book recommender and applying it to music, you will fail. Music is different. Music is special.
Here are 10 reasons why music is special and why your off-the-shelf collaborative filtering system won’t work so well with music.
Huge item space – There is a whole lot of music out there. Industrial sized music collections typically have 10 million songs or more. The iTunes music store boasts 18 million songs. The algorithms that worked so wonderfully on the Netflix Dataset (one of the largest CF datasets released, containing user data for 17,770 movies) will not work so well when having to deal with a dataset that is three orders of magnitude larger.
Very low cost per item – When the cost per item is low, the risk of a bad recommendation is low. If you recommend to me a bad Disney Vacation I am out $10,000 and a week of my time. If you recommend a bad song, I hit the skip button and move on to the next.
Many item types – In the music world, there are many things to recommend: tracks, albums, artists, genres, covers, remixes, concerts, labels, playlists, radio stations, other listeners, etc.
Low consumption time – A book can take a week to read, a movie may take a few hours to watch, a song may take 3 minutes to listen to. Since I can consume music so quickly, I need lots of recommendations (perhaps 30 an hour) to keep my queue filled, whereas 30 book recommendations may keep me reading for a whole year. This has implications for scaling of a recommender. It also ties in with the low cost per item issue. Because music is so cheap and so quick to consume, the risk of a bad recommendation is very low. A music recommender can afford to be more adventurous than other types of recommenders.
Very high per-item reuse - I’ve read my favorite book perhaps half-a-dozen times, I’ve seen my favorite movie 3 times and I’ve probably listened to my favorite song thousands of times. We listen to music over and over again. We like familiar music. A music recommender has to understand the tension between familiarity and novelty. The Netflix movie recommender will never recommend The Bourne Identity to me because it knows that I already watched it, but a good music playlist recommender had better include a good mix of my old favorites along with new music.
Highly passionate users – There’s no more passionate fan than a music fan. This is a two-edged sword. If your recommender introduces a music fan to new music that they like, they will transfer some of their passion to your music service. This is why Pandora has such a vocal and passionate user base. On the other hand, if your recommender adds a Nickelback track to a Led Zeppelin playlist you will have to endure the wrath of the slighted fan.
Highly contextual usage - We listen to music differently in different contexts. I may have an exercising playlist, a working playlist, a driving playlist etc. I may make a playlist to show my friends how cool I am when I have them over for a social gathering. Not too many people go to Amazon looking for a list of books that they can read while jogging. A successful music recommender needs to take context into account.
Consumed in sequences – Listening to songs in order has always been a big part of the music experience. We love playlists, mixtapes, DJ mixes, albums. Some people make their living putting songs into interesting order. Your collaborative filtering algorithm doesn’t have the ability to create coherent, interesting playlists with a mix of new music and old favorites.
Large Personal Collections – Music fans often have extremely large personal collections – making it easier for recommendation and discovery tools to understand the detailed music taste of a listener. A personalized movie recommender may start with a list of a dozen rated movies, while a music recommender may be able to recommend music based upon many thousands of plays, ratings, skips and bans.
Highly Social – Music is social. People love to share music. They express their identity to others by the music they listen to. They give each other playlists and mixtapes. Music is a big part of who we are.
Music is special – but of course, so are books, movies and Disney vacations – every type of item has its own special characteristics that should be taken into account when building recommendation and discovery tools. There’s no one-size-fits-all recommendation algorithm.
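To make the familiarity-versus-novelty tension from the list above a little more concrete, here is a minimal sketch of a playlist builder that blends old favorites with new tracks. It is only an illustration: the function name `mixed_playlist`, the `novelty` parameter and its default of 0.3 are all assumptions made for this example, and a real playlisting engine would also care about context and song order.

```python
import random


def mixed_playlist(favorites, new_tracks, length, novelty=0.3, seed=None):
    """Build a playlist that mixes familiar tracks with unheard ones.

    `novelty` is the hypothetical share of the playlist drawn from
    `new_tracks`; the rest comes from `favorites`.
    """
    rng = random.Random(seed)
    # Never ask for more tracks than either pool can supply.
    n_new = min(round(length * novelty), len(new_tracks))
    n_old = min(length - n_new, len(favorites))
    picks = rng.sample(new_tracks, n_new) + rng.sample(favorites, n_old)
    rng.shuffle(picks)  # interleave old and new instead of grouping them
    return picks


# Made-up track identifiers, purely for illustration.
favorites = [f"favorite-{i}" for i in range(50)]
new_tracks = [f"new-{i}" for i in range(50)]
playlist = mixed_playlist(favorites, new_tracks, length=10, novelty=0.3, seed=1)
```

Turning `novelty` up makes the recommender more adventurous, which the low cost per item makes affordable; turning it down leans on the high per-item reuse that music enjoys.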
I’m off to Chicago to attend the 5th ACM Conference on Recommender Systems. I’m giving a talk with Òscar Celma called Music Recommendation and Discovery Revisited. It is a reprise of the talk we gave 4 years ago at ISMIR 2007 in Austria. Quite a bit has happened in the music discovery space since then so there’s quite a bit of new material. Here’s one of my favorite new slides. 10 points if you can figure out what this slide is all about.
It should be a fun talk, and it is always great working with Oscar. We’ll post the slides on Monday.
I’m interested in learning more about how people are discovering new music. I hope that you will spend 2 mins and take this 3 question poll. I’ll publish the results in a few weeks.
There’s been quite a bit of turmoil around how iOS developers can sell products and subscriptions within their iOS applications. Apple says, essentially, that if you sell stuff within your app you have to give Apple a 30% cut, and you can’t try to pass costs onto the customer by charging more for items purchased within an app. The cost for an item must be the same whether it was purchased through the app or through some other means. Update: In June, MacRumors reported that Apple updated its TOS so that content providers are now also free to charge whatever price they wish for content purchased outside of an app. Apple also says that you can no longer have a button or a link in your app to a website where a user can purchase content without giving Apple their 30% cut.
For most media industries, there is not enough left of the profit pie to allow Apple to take 30% of it. This has left most media companies in a quandary over how to continue to give their users a good experience without bankrupting their company. Many folks looked toward Amazon to see how they would react. Amazon’s Kindle reader is used by millions of iPad and iPhone readers to purchase and read digital books. Amazon’s solution was simple. Last week they issued an update to their Kindle Reader iOS app that removed the Kindle Store button. This means that users of the Kindle iOS app can no longer launch a book shopping session from within the Kindle app. Here’s the update:
Before the update, the Kindle app looks like this, with a very visible Kindle Store button that will take you to the Kindle web store, where you can buy Kindle books:
After the update, the Kindle App looks like this. The Kindle Store button is gone.
What are music services doing?
I was curious to see how various music subscription services were dealing with the same issue. I fired up the apps, checked for updates and this is what I found.
Spotify updated their app to get rid of any in-app purchases or subscription links just like the Kindle. You can only listen to Spotify mobile if you already have a Spotify mobile account.
When you log in to Spotify there is no option to register an account. Spotify just assumes that you have already registered and are ready to log in and start using the app:
Curiously, there is a ‘Get help at Spotify.com’ button on the More page of the app. This will open a web browser and bring you to the Spotify Help page, which puts you two clicks away from a ‘subscribe’ button. This must cut pretty close to Apple’s rules about links to web sites.
Same story for Rhapsody: there’s no way to get a subscription for Rhapsody within the Rhapsody application.
MOG issued an update in July that removed links to the MOG subscription portal.
Interestingly enough, the very latest version of Napster happily allows you to register for Napster through the application. On the Sign In page there is a prominent Register for Napster button.
Pressing the button brings you to a Registration page where you can sign up for a 7-day free trial.
I wonder what happens if a 7-day free trial user converts to a paying subscriber. Does Apple get 30% or is Napster hoping that no one notices?
Update - A Napster update was released one day after this post was published that eliminates the direct signup link:
Slacker’s $3.99 a month Radio Plus product is included as a prominent upgrade in the Slacker app. If you hit the upgrade button you will get a form to fill out with all of your credit card info so they can start charging you the 4 bucks. The question is whether or not Apple is getting $1.20 of that 4 bucks.
With Pandora you can create a free account through the mobile app, but there is no mention of a premium account, nor are there any links to Pandora.com as far as I can tell.
Just like Pandora, the Last.fm app will let you sign up for a non-premium account via the app and makes no mention or attempt to upsell you to a paid account:
Rdio takes a similar approach to Pandora and Last.fm. It allows users to sign up for a 7 day free trial account via the app. It makes no mention and has no links to a premium subscription page. It is not clear to me what happens at the end of the trial period, whether they will prompt you to visit Rdio, or if they just say “Your free trial is over, thanks for listening”.
Update - It is a moving target out there. Rdio issued an update yesterday that now allows you to purchase a monthly subscription in the app. With the new version you can now click on the ‘Subscribe to Rdio Unlimited’ button. When you do, you receive this confirmation dialog:
This allows you to purchase the Rdio subscription for $14.99, which just happens to be 33% more than an Rdio Unlimited subscription would cost if purchased directly from the web. Rdio is taking advantage of Apple’s recent relaxation of the rules and seeing how in-app subscription purchases stack up against cheaper out-of-app purchases. There’s a good LA Times article, ‘Rdio attempts to survive Apple’s subscription tax’, that describes Rdio’s approach to dealing with this issue.
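For a sense of the sums involved, here is a small sketch that splits an in-app price into Apple’s 30% and what is left for the service, using the Slacker ($3.99) and Rdio ($14.99) prices mentioned in this post. Working in whole cents and rounding Apple’s share to the nearest cent are assumptions made for the example, and whether Apple actually collects its cut on these particular purchases is exactly the open question.

```python
APPLE_CUT_PERCENT = 30  # the cut described in this post


def split_in_app_price(price_cents):
    """Return (apple_share, service_share) in cents for an in-app price.

    Rounding Apple's share to the nearest cent is an assumption of
    this sketch, not a documented rule.
    """
    apple_share = (price_cents * APPLE_CUT_PERCENT + 50) // 100
    return apple_share, price_cents - apple_share


slacker_apple, slacker_net = split_in_app_price(399)   # Slacker Radio Plus, $3.99
rdio_apple, rdio_net = split_in_app_price(1499)        # Rdio in-app price, $14.99

print(f"Slacker: Apple ${slacker_apple / 100:.2f}, Slacker ${slacker_net / 100:.2f}")
print(f"Rdio:    Apple ${rdio_apple / 100:.2f}, Rdio ${rdio_net / 100:.2f}")
```

On these assumptions Apple’s share of the $3.99 Slacker upgrade comes to the $1.20 mentioned above, and a $14.99 in-app Rdio subscription would leave Rdio with $10.49 a month.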
The latest version of Playme doesn’t have a button or link that brings you to the Playme subscription page. It does, however, display http://www.playme.com prominently on the sign in page so you can type the URL directly into your browser. I guess technically the words http://www.playme.com are not a link if you can’t click or tap it to go there.
Grooveshark has never been timid about walking up to the line and stepping across it. The only way to get Grooveshark on an iOS device is to jailbreak your device. With a jailbroken version, Grooveshark doesn’t need to pay anyone for anything.
Apple has always been a company that prides itself on encouraging an excellent user experience. However, when Apple had to weigh a good user experience against potentially making 30% of every music subscription, they decided to screw over the user and go for the pot of money. The reality, however, is that no music streaming company will ever be able to afford to give Apple a 30% cut. As a result, these apps have to work around Apple’s rules, which means a poor user experience and no money for Apple. Hopefully, by the end of the year, Apple will look at the bottom line and realize that they’ve made no extra money from the 30% rule, and instead have encouraged the creation of a big streaming pile of music apps that make the user jump through all sorts of unnecessary hoops for no good reason. Note, however, that the story isn’t over. Rdio is experimenting with in-app subscription purchases. If they are successful at this, in a few months’ time, perhaps we’ll see Spotify, Mog, Rhapsody and the others try the same thing.
Ethan Kaplan over at hypebot had a problem with how hard it is to find soundtracks by John Williams on music services like Spotify and Rdio. Here’s what he said:
Try going to Spotify and browsing movie soundtracks. I’ll wait.
Try searching for John Williams. He is not a guitarist, but that is what comes up mixed in with all of the soundtrack work he has done.
And this is not something unique to Spotify, but also endemic to Rdio and Mog. Mog at least has a page of curated soundtracks, but its just as hard to find them “in the wild” as it is on Spotify. The same applies to Rdio.
Well, of course, if you search for John Williams you’ll get music by both the movie composer and the guitarist. That is only natural, because you may really want the music by the guitarist and not the music by the composer. Let’s see what happens if you go one step further than Ethan did and search for “john williams soundstracks”. Here are the results on Spotify:
Not surprisingly, there are hundreds of matches of John Williams and soundtracks. Similar results with Rdio:
Lots of John Williams soundtrack results. Rdio even offers human curated playlists filled with soundtracks. What could be better? Likewise, if you just search for soundtracks there are lots of hits:
So I don’t buy Ethan’s premise that it is hard to find soundtracks or music by the movie composer John Williams. However, Ethan’s point still stands: finding new music on current generation music services really sucks. The next generation music services need to do much better to help people explore and discover new music. Music exploration should be fun and yet we are doomed to try to explore and discover music using a tool that looks like an accountant’s spreadsheet.
Back in February I wrote a post about the KDD Cup (an annual Data Mining and Knowledge Discovery competition), asking whether this year’s cup was really music recommendation since all the data identifying the music had been anonymized. The post received a number of really interesting comments about the nature of recommendation and whether or not context and content were really necessary for music recommendation, or whether user behavior was all you really needed. A few commenters suggested that it might be possible to de-anonymize the data using a constraint propagation technique.
Many voiced an opinion that such de-anonymizing of the data to expose user listening habits would indeed be unethical. Malcolm Slaney, the researcher at Yahoo! who prepared the dataset offered the plea:
As far as I know, no one has de-anonymized the KDD Cup dataset; however, researcher Matthew J. H. Rattigan of The University of Massachusetts at Amherst has done the next best thing. He has published a paper called Reidentification of artists and genres in the KDD cup that shows that by analyzing the relational structures within the dataset it is possible to identify the artists, albums, tracks and genres that are used in the anonymized dataset. Here’s an excerpt from the paper that gives an intuitive description of the approach:
For example, consider Artist 197656 from the Track 1 data. This artist has eight albums described by different combinations of ten genres. Each album is associated with several tracks, with track counts ranging from 1 to 69. We make the assumption that these albums and tracks were sampled without replacement from the discography of some real artist on the Yahoo! Music website. Furthermore, we assume that the connections between genres and albums are not sampled; that is, if an album in the KDD Cup dataset is attached to three genres, its real-world counterpart has exactly three genres (or “Categories”, as they are known on the Yahoo! Music site).
Under the above assumptions, we can compare the unlabeled KDD Cup artist with real-world Yahoo! Music artists in order to find a suitable match. The band Fischer Z, for example, is an unsuitable match, as their online discography only contains seven albums. An artist such as Meatloaf certainly has enough albums (56) to be a match, but none of those albums contain more than 31 tracks. The entry for Elvis Presley contains 109 albums, 17 of which boast 69 or more tracks; however, there is no consistent assignment of genres that satisfies our assumptions. The band Tool, however, is compatible with Artist 197656. The Tool discography contains 19 albums containing between 0 and 69 tracks. These albums are described by exactly 10 genres, which can be assigned to the unlabeled KDD Cup genres in a consistent manner. Furthermore, the match is unique: of the 134k artists in our labeled dataset, Tool is the only suitable match for Artist 197656.
Of course it is impossible for Matthew to evaluate his results directly, but he did create a number of synthetic, anonymized datasets drawn from Yahoo and was able to demonstrate very high accuracy for the top artists and a 62% overall accuracy.
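The pruning idea in the excerpt can be sketched in a few lines. This is not the paper’s algorithm, only a toy version of its first filter: it checks album and track counts under the sampling-without-replacement assumption and ignores the genre-consistency test entirely. The function name, the candidate names and every track count below are made up for illustration.

```python
def could_be_sampled_from(anon_track_counts, real_track_counts):
    """Could the anonymized albums be a sample of the real discography?

    Judged by track counts alone: every anonymized album must map to a
    distinct real album that has at least as many tracks.
    """
    if len(anon_track_counts) > len(real_track_counts):
        return False  # not enough real albums to sample from
    anon = sorted(anon_track_counts, reverse=True)
    real = sorted(real_track_counts, reverse=True)
    # Pair the k-th largest anonymized album with the k-th largest real
    # album; a valid assignment exists exactly when every pair fits.
    return all(a <= r for a, r in zip(anon, real))


# Hypothetical anonymized artist: eight albums with 1 to 69 tracks.
anon_artist = [69, 40, 22, 15, 12, 9, 4, 1]

# Hypothetical labeled discographies (track count per album).
candidates = {
    "seven_album_band": [12] * 7,    # too few albums
    "many_short_albums": [31] * 56,  # enough albums, none long enough
    "compatible_band": [69, 55, 41, 30, 25, 20, 18, 16, 14, 12,
                        10, 9, 8, 6, 5, 3, 2, 1, 0],
}

matches = [name for name, albums in candidates.items()
           if could_be_sampled_from(anon_artist, albums)]
print(matches)
```

Only the candidates that survive a cheap filter like this would need the more expensive check of whether the genres can be assigned consistently.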
The motivation for this type of work is not to turn the KDD cup dataset into something that music recommendation researchers could use, but instead to get a better understanding of data privacy issues. By understanding how large datasets can be de-anonymized, it will be easier for researchers in the future to create datasets that won’t easily yield their hidden secrets. The paper is an interesting read – so since you are done doing all of your reviews for RecSys and ISMIR, go ahead and give it a read:
Thanks to @ocelma for the tip.
For the last year we’ve heard rumors of how both Apple and Google were getting close to releasing music locker services that allow music listeners to upload their music collection to the cloud giving them the ability to listen to their music everywhere. So it was a big surprise when the first major Internet player to launch a music locker service wasn’t Google or Apple, but instead was Amazon. Last week, with little fanfare, Amazon released its Amazon Cloud Drive, a cloud-based music locker that includes the Amazon Cloud Player allowing people to listen to their music anywhere. Amazon’s entry into the music locker is a big deal and should be particularly worrisome for Google and Apple. Amazon brings some special sauce to the music locker world that will make them a formidable competitor:
- Amazon can keep a secret - For the last year, we’ve heard much about the rumored Google and Apple locker services, but not a peep about the Amazon service. The first time people heard about the Amazon Locker service was when Amazon announced it on its front page. It says a lot about a large organization that can launch a major new product without rumors circulating in the industry.
- Amazon isn’t afraid to say “F*ck You” to the labels. While Apple and Google are negotiating licensing rights for the locker service, Amazon just went ahead and released their locker without any special music license. Amazon Director of Music Craig Pape told Billboard.biz “We don’t believe we need licenses to store the customers’ files. We look at it the same way as if someone bought an external hard drive and copy files on there for backup.”
- Amazon knows how to do the ‘cloud thing’ – Amazon has been leading the pack in cloud computing for years. They know how to build reliable, cost-effective cloud-based solutions, and they’ve been doing it longer than anyone. Thousands of applications have been deployed in the Amazon cloud, from big corporations to successful startups like Dropbox. Compare that to Apple’s track record for MobileMe. Of course Google knows how to do this stuff too, but they haven’t been immune to problems.
- Amazon knows about discovery – Amazon’s focus on discovery makes them a much better online bookstore than any other bookstore. They use all sorts of ways to connect a reader with a book: collaborative filtering, book reviews, customer lists, content search, best seller lists, special deals. These techniques help get their readers deep into the long tail of books. Discovery is in Amazon’s genes. Contrast that to how YouTube helps you find videos, or how well Apple’s Genius helps you find music. Amazon is not yet providing any discovery tools with the Amazon Cloud Music Player, but you can bet that they will be adding these features soon.
- Amazon understands the importance of metadata – Amazon has always placed a premium on collecting high quality metadata about their media. That’s why they bought IMDB, and created SoundUnwound. That’s why when I uploaded 700 albums to the Amazon cloud, Amazon found album art and metadata for every single one of them. Compare that to iTunes which after nearly 10 years, still can’t seem to find album art for 90% of my music collection.
- Amazon does APIs – this is what I’m most excited about. Imagine if and when Amazon releases the Amazon Cloud Music API that lets a developer build applications around the content stored in a music locker. This will open the door for a myriad of applications, from music visualizers, playlisting engines and event recommenders to taste sharing, on our phones, on our set top boxes, on our computers. Amazon has led the way in making everything they do available via APIs. When they release the Amazon Cloud Music API, I think we’ll see a new level of creativity around music exploration, discovery, organization and listening.
- Amazon has done this before – The Kindle platform has already allowed you to do for books what the Amazon music locker does for music. You can buy content in the Amazon store, keep it in your locker and consume it on any device. This is not new tech for Amazon, they’ve been doing this for years already.
- Amazon has lots of customers – Last month Steve said he thought that Apple had more customer accounts than Amazon. Of course that was just a guess and Steve is not impartial. Amazon doesn’t say how many customer accounts they have, but we know it’s a lot. Amazon is clever in how they use the Music Locker to promote music purchases. Music you purchase from Amazon is stored for free in your locker, and when you buy an album your locker storage gets upgraded to 20GB for free.
- Amazon seems to care - Google has accidentally built the largest music destination on the Internet, but try to use YouTube as a place to go and find music and you are faced with the challenge of separating the good music from the many covers, remixes, parodies and just plain crap that seem to fill the channel. iTunes has gone from a pretty good way to play music to becoming something that I only use to sync new content to my phone. It is bloated, slow and painful to use. In the ten years that Apple has been king of the digital music hill they’ve done little to help improve the music listening experience. Apple has moved on to video and Apps. Music is just another feature. Contrast that with what Amazon has done with the Kindle – they’ve made a device that arguably improves the reading experience. They chose eInk over color display, they keep the non-reading features to a minimum, they give a reader great discovery tools like the ability to sample the first few chapters of any book. I’m hopeful that Amazon will apply the same sense of care they have for books to the world of music.
Amazon’s music locker is not perfect by any means. There’s no iPhone app. The storage is too expensive, and there are no discovery or automatic playlisting features in the player. But what they’ve built is solid and usable. I’m also not bullish on music lockers. I’d rather pay 10 bucks a month to listen to any of 5 million tracks than buy tracks at a dollar each. But I’m glad to see Amazon position itself so aggressively in this space. The competition between Google, Apple and Amazon will lead to a better music experience for us all.
Kurt Jacobson is a recent addition to the staff here at The Echo Nest. Kurt has built a music exploration site called catfish smooth that allows you to explore the connections between artists. Kurt describes it as: all about connections between music artists. In a sense, it is a music artist recommendation system but more. For each artist, you will see the type of “similar artist” recommendations to which you are accustomed – we use last.fm and The Echo Nest to get these. But you will also see some other inter-artist connections catfish has discovered from the web of linked data. These include things like “artists that are also English Male Singers” or “artists that are also Converts To Islam” or “artists that are also People From St.Louis, Missouri”. And, hopefully, you’ll get some media for each artist so you can have a listen.
It’s a really interesting way to explore the music space, allowing you to stumble upon new artists based on a wide range of parameters.
For example take a look at the many categories and connections catfish smooth exposes for James Brown.
Kurt is currently conducting a usability survey for catfish smooth, so take a minute to kick the tires and then help Kurt finish his PhD and take the survey.