Archive for category recommendation
Here at the Echo Nest we just added a new feature to our APIs called Personal Catalogs. This feature lets you make all of the Echo Nest features work in your own world of music. With Personal Catalogs (PCs) you can define application- or user-specific catalogs (in terms of artists or songs) and then use these catalogs to drive the behavior of other Echo Nest APIs. PCs open the door to all sorts of custom apps built on the Echo Nest platform. Here are some examples:
Create better genius-style playlists – With PCs I can create a catalog that contains all of the songs in my iTunes collection. I can then use this catalog with the Echo Nest Playlist API to generate interesting playlists based upon my own personal collection. I can create a playlist of my favorite, most danceable songs for a party, or I can create a playlist of slow, low energy, jazz songs for late night reading music.
Create hyper-targeted recommendations – With PCs I can make a catalog of artists and then use the artist/similar APIs to generate recommendations within this catalog. For instance, I could create an artist catalog of all the bands that are playing this weekend in Boston and then create a Music Hack Day recommender that tells each visitor to Boston which bands they should see based upon their musical tastes.
Get info on lots of stuff – people often ask questions about their whole music collection. Like, ‘what are all the songs that I have that are at 113 BPM?‘, or ‘what are the softest songs?’ Previously, to answer these sorts of questions, you’d have to query our APIs one song at a time – a rather tedious and potentially lengthy operation (if you had, say, 10K tracks). With PCs, you can make a single catalog for all of your tracks and then make bulk queries against this catalog. Once you’ve created the catalog, it is very quick to read back all the tempos in your collection.
Represent your music taste – since a Personal Catalog can contain info such as playcounts, skips, and ratings for all of the artists and songs in your collection, it can serve as an excellent proxy for your music taste. Current and soon-to-be-released APIs will use personal catalogs as a representation of your taste to give you personalized results: playlisting, artist similarity and music recommendations, all personalized based on your listening history.
These examples just scratch the surface. We hope to see lots of novel applications of Personal Catalogs. Check out the APIs, and start writing some code.
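To make the "bulk query" idea concrete, here is a minimal local sketch of the kind of questions mentioned above (songs at 113 BPM, softest songs). It does not call the Echo Nest APIs; the `Song` fields, the function names and the sample tracks are all hypothetical, and it assumes the per-song attributes have already been read back from a catalog into memory.

```python
from dataclasses import dataclass


@dataclass
class Song:
    # Hypothetical record; field names are assumptions, not the API's schema.
    title: str
    tempo: float     # beats per minute
    loudness: float  # decibels; more negative means softer


def songs_at_tempo(catalog, bpm, tolerance=0.5):
    """Return songs whose tempo is within `tolerance` BPM of `bpm`."""
    return [s for s in catalog if abs(s.tempo - bpm) <= tolerance]


def softest_songs(catalog, count=3):
    """Return the `count` quietest songs, softest first."""
    return sorted(catalog, key=lambda s: s.loudness)[:count]


# Made-up sample data standing in for a personal collection.
catalog = [
    Song("Track A", 113.0, -7.5),
    Song("Track B", 98.2, -21.0),
    Song("Track C", 113.3, -12.1),
    Song("Track D", 140.0, -4.8),
]

print([s.title for s in songs_at_tempo(catalog, 113)])
print([s.title for s in softest_songs(catalog, 2)])
```

Once the attributes for the whole collection are in one place, each question becomes a single pass over the list rather than one request per song.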
Here’s a ‘sponsored link’ purchased by Amazon on the popular social news site Reddit. The text of the ad is an excerpt from Roger Ebert’s scathing review of the movie Caligula (the review opens with “Caligula is sickening, utterly worthless, shameful trash” and it goes downhill from there).
I found it a bit curious to see Amazon using such a horrendous review in an ad, but those folks at Amazon are clever. The ad has over 300 comments by Reddit readers, meaning that many thousands have probably clicked on the ad to see which movie Ebert was talking about. Hundreds of comments, thousands of visitors, all from a 10-word excerpt of a scathing review of the movie. Not too shabby.
Update – the commenters point out that the sponsored link is not purchased by Amazon but by Reddit user qgyh2 who makes money via Amazon’s affiliate program. As Dan says – “he picks headlines that are likely to encourage people to click on the link and then he makes money from whatever they buy while they are at Amazon.” So, qgyh2 is the clever one (but Amazon gets cleverage points for encouraging this kind of stuff via their affiliate program).
Update 2 – flx points out that qgyh2 actually works for Reddit. Here’s more info – ‘He’s helping us experiment with new ways of supporting the site. We weren’t really ready to announce this one yet, or even decide if it’s going to be a permanent fixture. When we do, there will be a blog post about it.’
There’s an interesting piece in the New Yorker about the future of listening. The article focuses on Pandora and MOG and the challenges of making the online listening experience. Author Sasha Frere-Jones concludes with this:
While using these services, I kept thinking about an early-eighties drum machine called the Roland TR-808, which has seduced generations of musicians with its heavy kick-drum sound and the oddly human swing of its clock. Whoever programmed this box had more impact on dance music than the hundreds of better-known musicians who used the device. Similarly, the anonymous programmers who write the algorithms that control the series of songs in these streaming services may end up having a huge effect on the way that people think of musical narrative—what follows what, and who sounds best with whom. Sometimes we will be the d.j.s, and sometimes the machines will be, and we may be surprised by which we prefer.
Read the article:
Playlists have long been a big part of the music experience. But making a good playlist is not always easy. We can spend lots of time crafting the perfect mix, but more often than not, in this iPod age, we are likely to toss on a pre-made playlist (such as an album), have the computer generate a playlist (with something like iTunes Genius) or (more likely) we’ll just hit the shuffle button and listen to songs at random. I pine for the old days when Radio DJs would play well-crafted sets – mixes of old favorites and the newest, undiscovered tracks – connected in interesting ways. These professionally created playlists magnified the listening experience. The whole was indeed greater than the sum of its parts.
The tradition of the old-style Radio DJ continues on Internet Radio sites like Radio Paradise. RP founder/DJ Bill Goldsmith says of Radio Paradise: “Our specialty is taking a diverse assortment of songs and making them flow together in a way that makes sense harmonically, rhythmically, and lyrically — an art that, to us, is the very essence of radio.” Anyone who has listened to Radio Paradise will come to appreciate the immense value that a professionally curated playlist brings to the listening experience.
I wish I could put Bill Goldsmith in my iPod and have him craft personalized playlists for me – playlists that make sense harmonically, rhythmically and lyrically, and customized to my music taste, mood and context. That, of course, will never happen. Instead I’m going to rely on computer algorithms to generate my playlists. But how good are computer generated playlists? Can a computer really generate playlists as good as Bill Goldsmith, with his decades of knowledge about good music and his understanding of how to fit songs together?
To help answer this question, I’ve created a Playlist Survey – that will collect information about the quality of playlists generated by a human expert, a computer algorithm and a random number generator. The survey presents a set of playlists and the subject rates each playlist in terms of its quality and also tries to guess whether the playlist was created by a human expert, a computer algorithm or was generated at random.
Bill Goldsmith and Radio Paradise have graciously contributed 18 months of historical playlist data from Radio Paradise to serve as the expert playlist data. That’s nearly 50,000 playlists and a quarter million song plays spread over nearly 7,000 different tracks.
The Playlist Survey also serves as a Radio DJ Turing test. Can a computer algorithm (or a random number generator for that matter) create playlists that people will think are created by a living and breathing music expert? What will it mean, for instance, if we learn that people really can’t tell the difference between expert playlists and shuffle play?
Ben Fields and I will offer the results of this Playlist Survey when we present Finding a path through the Jukebox – The Playlist Tutorial – at ISMIR 2010 in Utrecht in August. I’ll also follow up with detailed posts about the results here in this blog after the conference. I invite all of my readers to spend 10 to 15 minutes to take The Playlist Survey. Your efforts will help researchers better understand what makes a good playlist.
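As a rough illustration of how responses to a survey like this could be scored, the sketch below compares how often raters guess a playlist's source correctly against the one-in-three rate expected from pure guessing, and averages quality ratings per source. The response format, the function names and the sample responses are assumptions made for illustration; they are not the survey's actual data or analysis.

```python
from collections import Counter

SOURCES = ("human", "algorithm", "random")


def guess_accuracy(responses):
    """Fraction of responses where the guessed source matches the true one."""
    if not responses:
        return 0.0
    correct = sum(1 for true, guess, _ in responses if true == guess)
    return correct / len(responses)


def mean_rating_by_source(responses):
    """Average quality rating for each true playlist source."""
    totals, counts = Counter(), Counter()
    for true, _, rating in responses:
        totals[true] += rating
        counts[true] += 1
    return {src: totals[src] / counts[src] for src in counts}


# Hypothetical responses: (true source, guessed source, rating from 1 to 5).
responses = [
    ("human", "human", 4),
    ("human", "algorithm", 3),
    ("algorithm", "human", 4),
    ("algorithm", "algorithm", 2),
    ("random", "human", 3),
    ("random", "random", 1),
]

accuracy = guess_accuracy(responses)
chance = 1 / len(SOURCES)  # expected accuracy if raters guessed blindly
print(f"accuracy={accuracy:.2f}, chance={chance:.2f}")
print(mean_rating_by_source(responses))
```

If accuracy on real responses sat close to chance, that would suggest raters cannot reliably tell the three sources apart.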
Save the date: 26th September 2010 for The Workshop on Music Recommendation and Discovery being held in conjunction with ACM RecSys in Barcelona, Spain. At this workshop, members of the Recommender Systems, Music Information Retrieval, User Modeling, Music Cognition, and Music Psychology communities can meet, exchange ideas and collaborate.
Topics of interest
Topics of interest for Womrad 2010 include:
- Music recommendation algorithms
- Theoretical aspects of music recommender systems
- User modeling in music recommender systems
- Similarity Measures, and how to combine them
- Novel paradigms of music recommender systems
- Social tagging in music recommendation and discovery
- Social networks in music recommender systems
- Novelty, familiarity and serendipity in music recommendation and discovery
- Exploration and discovery in large music collections
- Evaluation of music recommender systems
- Evaluation of different sources of data/APIs for music recommendation and exploration
- Context-aware, mobile, and geolocation in music recommendation and discovery
- Case studies of music recommender system implementations
- User studies
- Innovative music recommendation applications
- Interfaces for music recommendation and discovery systems
- Scalability issues and solutions
- Semantic Web, Linking Open Data and Open Web Services for music recommendation and discovery
I’ve been reading all my books lately using Kindle for iPhone. It is a great way to read – and having a library of books in my pocket at all times means I’m never without a book. One feature of the Kindle software is called Whispersync. It keeps track of where you are in a book so that if you switch devices (from an iPhone to a Kindle or an iPad or desktop), you can pick up exactly where you left off. Kindle also stores any bookmarks, notes, highlights, or similar markings you make in the cloud so they can be shared across devices. Whispersync is a useful feature for readers, but it is also a goldmine of data for Amazon. With Whispersync data from millions of Kindle readers Amazon can learn not just what we are reading but how we are reading. In brick-and-mortar bookstore days, the only thing a bookseller, author or publisher could really know about a book was how many copies it sold. But now with Whispersync Amazon can learn all sorts of things about how we are reading. With the insights that they gain from this data, they will, no doubt, find better ways to help people find the books they like to read.
I hope Amazon aggregates their Whispersync data and gives us some Last.fm-style charts about how people are reading. Some charts I’d like to see:
- Most Abandoned – the books and/or authors that are most frequently left unfinished. What book is the most abandoned book of all time? (My money is on ‘A Brief History of Time’) A related metric – for any particular book where is it most frequently abandoned? (I’ve heard of dozens of people who never got past ‘The Council of Elrond’ chapter in LOTR).
- Pageturner – the top books ordered by average number of words read per reading session. Does the average Harry Potter fan read more of the book in one sitting than the average Twilight fan?
- Burning the midnight oil – books that keep people up late at night.
- Read Speed – which books/authors/genres have the lowest words-per-minute average reading rate? Do readers of Glenn Beck read faster or slower than readers of Jon Stewart?
- Most Re-read – which books are read over and over again? A related metric – which are the most re-read passages? Is it when Frodo claims the ring, or when Bella almost gets hit by a car?
- Mystery cheats – which books have their last chapter read before other chapters.
- Valuable reference – which books are not read in order, but are visited very frequently? (I’ve not read my Python in a nutshell book from cover to cover, but I visit it almost every day).
- Biggest Slogs – the books that take the longest to read.
- Back to the start – Books that are most frequently re-read immediately after they are finished.
- Page shufflers – books that most often send their readers to the glossary, dictionary, map or the elaborate family tree. (xkcd offers some insights)
- Trophy Books – books that are most frequently purchased, but never actually read.
- Dishonest rater – books that are most frequently rated highly by readers who never actually finished reading the book.
- Most efficient language – the average time to read books by language. Do native Italians read ‘Il nome della rosa‘ faster than native English speakers can read ‘The name of the rose‘?
- Most attempts – which books are restarted most frequently? (It took me 4 attempts to get through Cryptonomicon, but when I did I really enjoyed it).
- A turn for the worse – which books are most frequently abandoned in the last third of the book? These are the books that go bad.
- Never at night – books that are read less in the dark than others.
- Entertainment value – the books with the lowest overall cost per hour of reading (including all re-reads)
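Charts like these would mostly be simple aggregations over reading-progress records. As a sketch, here is how a "Most Abandoned" chart could be computed, assuming one record per reader and book with the furthest fraction of the book reached. The record layout, the 95% "finished" threshold and the sample data are all made up for illustration; nothing here reflects what Whispersync actually stores.

```python
from collections import defaultdict

# Hypothetical records: (reader, book, furthest fraction of the book reached).
progress = [
    ("ann", "Book A", 0.15),
    ("bob", "Book A", 0.30),
    ("cat", "Book A", 1.00),
    ("ann", "Book B", 1.00),
    ("bob", "Book B", 0.40),
    ("cat", "Book C", 1.00),
]


def abandonment_rates(records, finished_at=0.95):
    """Map each book to the share of its readers who stopped before `finished_at`."""
    started = defaultdict(int)
    abandoned = defaultdict(int)
    for _, book, fraction in records:
        started[book] += 1
        if fraction < finished_at:
            abandoned[book] += 1
    return {book: abandoned[book] / started[book] for book in started}


def most_abandoned(records):
    """Books ordered from most to least frequently abandoned."""
    rates = abandonment_rates(records)
    return sorted(rates, key=lambda book: rates[book], reverse=True)


print(most_abandoned(progress))
```

Several of the other charts (where a book is most often abandoned, most attempts, a turn for the worse) would need per-session positions rather than a single furthest point, but the aggregation pattern is the same.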
Whispersync is to books as the audioscrobbler is to music. It is an implicit way to track what you are really paying attention to. The data from Whispersync will give us new insights into how people really read books. A chart that shows that the most abandoned author is James Patterson may steer readers away from Patterson and toward books by better authors. I’d rather not turn to the New York Times Best Seller list to decide what to read. I want to see the Amazon Most Frequently Finished book list instead.
I’m excited! Next week I travel to Austin for a week-long computer+music geek-fest at SXSW. A big part of SXSW is the music – there are nearly 2,000 different artists playing at SXSW this year. But that presents a problem – there are so many bands going to SXSW (many I’ve never heard of) that I find it very hard to figure out which bands I should go and see. I need a tool to help me sift through all of the artists – a tool that will help me decide which artists I should add to my schedule and which ones I should skip. I’m not the only one who was daunted by the large artist list. Taylor McKnight, founder of SCHED*, was thinking the same thing. He wanted to give his users a better way to plan their time at SXSW. And so over a couple of weekends Taylor built (with a little backend support from us) The Unofficial Artist Discovery Guide to SXSW.
The Unofficial Artist Discovery Guide to SXSW is a tool that allows you to explore the many artists attending this year’s SXSW. It lets you search for artists, browse by popularity, music style, ‘buzzworthiness’, or similarity to your favorite artists – and it will make recommendations for you based on your music taste (using your Last.fm, Sched* or Hype Machine accounts). The Artist Guide supplies enough context (bios, images, music, tag clouds, links) to help you decide if you might like an artist.
Here’s the guide:
Here’s a quick tour of some of the things you can do with the guide. First off, you can Search for artists by name, genre/tag or location. This helps you find music when you know what you are looking for.
However, you may not always be sure what you are looking for – that’s where you use Discover. This gives you recommendations based on the music you already like. Type in the name of a few artists (even artists that are not playing at SXSW) or your SCHED*, Hype Machine or Last.fm user name, and ‘Discover’ will give you a set of recommendations for SXSW artists based on your music taste. For example, I’ve been listening to Charlotte Gainsbourg lately so I can use the artist guide to help me find SXSW artists that I might like:
If I see an artist that looks interesting I can drill down and get more info about the artist:
I use Last.fm quite a bit, so I can enter my Last.fm name and get SXSW recommendations based upon my Last.fm top artists. The artist guide tries to mix things up a little bit so if I don’t like the recommendations I see, I can just ask again and I can get a different set. Here are some recommendations based on my recent listening at Last.fm:
If you’ve been using the wonderful SCHED* to keep track of your SXSW calendar you can use the guide to get recommendations based on artists that you’ve already added to your SXSW calendar.
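One way to picture recommendations that are limited to a fixed artist list is to score every festival artist against a listener's taste and keep the best matches. The sketch below uses Jaccard overlap of tags as a stand-in similarity measure; this is an assumption for illustration only, not how the guide or the Echo Nest computes similarity, and the artist names, tags and function names are invented.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(seed_tags, festival_artists, top_n=2):
    """Rank festival artists by tag overlap with the listener's seed tags."""
    scored = [
        (jaccard(seed_tags, tags), name)
        for name, tags in festival_artists.items()
    ]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_n] if score > 0]


# Hypothetical festival line-up with made-up tags.
festival_artists = {
    "Band One": {"indie", "folk", "female vocalists"},
    "Band Two": {"metal", "doom"},
    "Band Three": {"french", "indie", "pop"},
}

# Tags standing in for the listener's taste (for example, from seed artists).
seed_tags = {"french", "pop", "female vocalists", "indie"}

print(recommend(seed_tags, festival_artists))
```

Because only festival artists are scored, seed artists that are not playing can still shape the results without ever appearing in them.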
In addition to search and discovery, the guide gives you a number of different ways to browse the SXSW Artist space. You can browse by ‘buzzworthy’ artists – these are artists that are getting the most buzz on the web:
Or the most well-known artists:
You can browse by the style of music via a tag cloud:
And by venue:
Building the guide was pretty straightforward. Taylor used the Echo Nest APIs to get the detailed artist data such as familiarity, popularity, artist bios, links, images, tags and audio. The only data that was not available at the Echo Nest was the venue and schedule info which was provided by Arkadiy (one of Taylor’s colleagues). Even though SXSW artists can be extremely long tail (some don’t even have Myspace pages), the Echo Nest was able to provide really good coverage for these sets (There was coverage for over 95% of the artists). Still there are a few gaps and I suspect there may be a few errors in the data (my favorite wrong image is for the band Abe Vigoda). If you are in a band that is going to SXSW and you see that we have some of your info wrong, send me an email (firstname.lastname@example.org) and I’ll make it right.
We are excited to see this Artist Discovery guide built on top of the Echo Nest. It’s a great showcase for the Echo Nest developer platform and working with Taylor was great. He’s one of these hyper-creative, energetic types – smart, gets things done and full of new ideas. Taylor may be adding a few more features to the guide before SXSW, so stay tuned and we’ll keep you posted on new developments.