Archive for category code
The Labyrinth of Genre
Posted by Paul in code, data, tags, The Echo Nest, visualization on January 16, 2011
I’m fascinated with how music genres relate to each other, especially how one can use different genres as stepping stones through the vast complexities of music. There are thousands of genres: some, like rock or pop, represent thousands of artists, while others, like Celtic Metal or Humppa, may represent only a handful. Building a map by hand that represents the relationships of all of these genres is a challenge. Is Thrash Metal more closely related to Speed Metal or to Power Metal? To sort this all out I’ve built a Labyrinth of Genre that lets you explore the many genres. The Labyrinth lets you wander through about 1,000 genres, listening to samples from representative artists.
Click on a genre and the labyrinth will expand to show half a dozen similar genres, and you’ll hear songs in the genre.
I built the labyrinth by analyzing a large collection of last.fm tags. Each tag is represented by the tf-idf weighted set of artists it has been applied to, and I used the cosine distance between those weighted sets as the distance metric for tags. When you click on a node, I attach the six closest tags that haven’t already been attached to the graph. I then use the Echo Nest APIs to get all the media.
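Here is a minimal sketch of that tag distance, not the code behind the Labyrinth. It assumes each tag is a vector over artists, with the raw tag count as the term frequency and log(N/df) as the idf, where df is the number of tags applied to an artist. The post doesn't state the exact weighting, and the function names and toy data below are made up for illustration.

```python
import math
from collections import defaultdict

def tfidf_vectors(tag_counts):
    """tag_counts: {tag: {artist: raw_count}} -> {tag: {artist: weight}}."""
    n_tags = len(tag_counts)
    # Document frequency: how many distinct tags were applied to each artist.
    df = defaultdict(int)
    for artists in tag_counts.values():
        for artist in artists:
            df[artist] += 1
    return {
        tag: {a: count * math.log(n_tags / df[a]) for a, count in artists.items()}
        for tag, artists in tag_counts.items()
    }

def cosine_distance(u, v):
    """1 - cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[a] for a, w in u.items() if a in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)

def closest_tags(tag, vectors, attached, k=6):
    """The k nearest tags that are not already attached to the graph."""
    candidates = [t for t in vectors if t != tag and t not in attached]
    candidates.sort(key=lambda t: cosine_distance(vectors[tag], vectors[t]))
    return candidates[:k]

# Made-up counts, just to exercise the functions.
toy = {
    "thrash metal": {"artist A": 10, "artist B": 5},
    "speed metal": {"artist A": 8, "artist B": 4, "artist C": 1},
    "humppa": {"artist D": 7},
    "polka": {"artist D": 3, "artist E": 2},
}
print(closest_tags("thrash metal", tfidf_vectors(toy), attached=set(), k=1))
# prints ['speed metal']
```

Passing the set of tags already on screen as `attached` is what keeps a click from re-adding nodes that are already in the graph.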
Even though it’s a pretty simple algorithm, it is quite effective in grouping similar genres. If you are interested in wandering around a maze of music, give the Labyrinth of Genre a try.
A Genre Map
Inspired by an email exchange with Samuel Richardson, creator of ‘Know your genre‘ I created a genre map that might serve as a basis for a visual music explorer (perhaps something to build at one of the upcoming music hack days). The map is big and beautiful (in a geeky way). Here’s an excerpt, click on it to see the whole thing.
Update – I’ve made an interactive exploration tool that lets you wander through the genre graph. See the Labyrinth of Genre
Update 2 – Colin asked the question “What’s the longest path between two genres?” – If I build the graph by using the 12 nearest neighbors to each genre, find the minimum spanning tree for that graph and then find the longest path, I find this 31 step wonder:
Of course there are lots of ways to skin this cat – if I build the graph with just the nearest 6 neighbors, and don’t extract the minimum spanning tree, the longest path through the graph is 10 steps:
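The first variant above (nearest-neighbor graph, then minimum spanning tree, then longest path) can be sketched roughly like this. It is an illustration under assumptions, not the code used for the map: distances are taken to be symmetric, the tree is built with Kruskal's algorithm, and the longest path in the tree is found with two breadth-first searches. All function names are hypothetical. If the neighbor graph is disconnected, the result only covers the component containing the starting node.

```python
from collections import defaultdict, deque

def knn_edges(dist, k):
    """dist: {node: {other: distance}} -> sorted undirected (weight, a, b) edges."""
    edges = set()
    for a, others in dist.items():
        for b in sorted(others, key=others.get)[:k]:
            lo, hi = sorted((a, b))
            edges.add((others[b], lo, hi))
    return sorted(edges)

def minimum_spanning_forest(nodes, edges):
    """Kruskal's algorithm with a small union-find; returns an adjacency map."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    tree = defaultdict(list)
    for _, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree[a].append(b)
            tree[b].append(a)
    return tree

def farthest(tree, start):
    """Breadth-first search; returns the path from start to a farthest node."""
    prev = {start: None}
    queue = deque([start])
    last = start
    while queue:
        last = queue.popleft()
        for nxt in tree[last]:
            if nxt not in prev:
                prev[nxt] = last
                queue.append(nxt)
    path = []
    node = last
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]

def longest_path(tree, any_node):
    """In a tree, the farthest node from any node is one end of a longest path."""
    end = farthest(tree, any_node)[-1]
    return farthest(tree, end)
```

The number of steps is `len(path) - 1`. Without the spanning tree, the "longest path" is presumably better read as the longest of the shortest paths between genre pairs, which would need a search from every node rather than just two.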
The Music Maze
Posted by Paul in code, fun, Music, The Echo Nest, video, visualization, web services on December 20, 2010
I wrote an application over the weekend called Music Maze. The Music Maze lets you wander through the maze of similar artists until you find something you like. You can give it a try here: The Music Maze (be forewarned, the app plays music upon loading).
We’ve seen the idea behind the Music Maze in other apps like Musicovery and Tuneglue’s Music Map. The nifty thing about the Music Maze is that I didn’t have to write a single line of server code to make it all happen. The Music Maze web app talks directly to The Echo Nest API. There’s no middle man. The artist graph, the album art, the links to audio – everything is pulled on demand from the Echo Nest API. This is possible because the Echo Nest API now supports JSONP requests (in beta, full release coming soon!). With JSONP an AJAX app can escape the JavaScript sandbox and make calls to 3rd party web services. No need for me to set up a server to proxy calls to the Echo Nest, no Apache or Tomcat, no MySQL, no worries about scaling. This makes it incredibly easy for me to host and deploy this app. I just toss my HTML, JavaScript and CSS files into an Amazon S3 bucket, make them world readable, and I’m done. It really has never been easier to create music apps. This whole app is less than 500 lines of JavaScript, written in a few hours on a Sunday morning while the rest of the family was still asleep. It is great to see all of these technologies coming together to make it easy to create music apps.
(Be sure to check out the JavaScript InfoVis Toolkit. It does all of the graphical heavy lifting in this app. It’s pretty neat.)
LastFM-ArtistTags2007
A few years back I created a data set of social tags from Last.fm. RJ at Last.fm graciously gave permission for me to distribute the dataset for research use. I hosted the dataset on the media server at Sun Labs. However, with the Oracle acquisition, the media server is no longer serving up the data, so I thought I would post the data elsewhere.
The dataset is now available for download here: Lastfm-ArtistTags2007
Here are the details as told in the README file:
The LastFM-ArtistTags2007 Data set
Version 1.0
June 2008
What is this?
This is a set of artist tag data collected from Last.fm using
the Audioscrobbler webservice during the spring of 2007.
The data consists of the raw tag counts for the 100 most
frequently occurring tags that Last.fm listeners have applied
to over 20,000 artists.
An undocumented (and deprecated) option of the audioscrobbler
web service was used to bypass the Last.fm normalization of tag
counts. This data set provides raw tag counts.
Data Format:
The data is formatted one entry per line as follows:
musicbrainz-artist-id<sep>artist-name<sep>tag-name<sep>raw-tag-count
Example:
11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>american<sep>14
11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>animals<sep>5
11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>art punk<sep>21
11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>art rock<sep>18
11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>atmospheric<sep>4
11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>avantgarde<sep>3
Data Statistics:
Total Lines: 952810
Unique Artists: 20907
Unique Tags: 100784
Total Tags: 7178442
Filtering:
Some minor filtering has been applied to the tag data. Last.fm will
report tags with counts of zero or less on occasion. These tags have
been removed.
Artists with no tags have not been included in this data set.
Of the nearly quarter million artists that were inspected, 20,907
artists had 1 or more tags.
Files:
ArtistTags.dat - the tag data
README.txt - this file
artists.txt - artists ordered by tag count
tags.txt - tags ordered by tag count
License:
The data in LastFM-ArtistTags2007 is distributed with permission of
Last.fm. The data is made available for non-commercial use only under
the Creative Commons Attribution-NonCommercial-ShareAlike UK License.
Those interested in using the data or web services in a commercial
context should contact partners at last dot fm. For more information
see http://www.audioscrobbler.net/data/
Acknowledgements:
Thanks to Last.fm for providing the access to this tag data via their
web services
Contact:
This data was collected and filtered by Paul Lamere of The Echo Nest. Send
questions or comments to Paul.Lamere@gmail.com
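For anyone who wants to load the data, here is a short sketch of a reader for the format described in the README. It assumes the separator in ArtistTags.dat is the literal string `<sep>`, as the README example shows it, and that each line has exactly four fields. The function names are my own and the file encoding is not specified in the README.

```python
from collections import defaultdict

SEP = "<sep>"  # assumption: the literal separator, as in the README example

def parse_line(line):
    """Split one entry into (musicbrainz id, artist name, tag name, raw count)."""
    mbid, artist, tag, count = line.rstrip("\n").split(SEP)
    return mbid, artist, tag, int(count)

def load_tag_counts(lines):
    """Build {tag: {musicbrainz id: raw count}} from an iterable of lines."""
    counts = defaultdict(dict)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        mbid, _artist, tag, count = parse_line(line)
        counts[tag][mbid] = count
    return counts

# Two of the example lines from the README.
sample = [
    "11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>art punk<sep>21",
    "11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>art rock<sep>18",
]
print(parse_line(sample[0])[1:])  # prints ('Deerhoof', 'art punk', 21)
```

To read the real file, pass an open file object to `load_tag_counts`. Keying by MusicBrainz id rather than artist name avoids merging two artists that happen to share a name.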
Jennie’s ultimate road trip
Posted by Paul in code, fun, Music, The Echo Nest on October 20, 2010
Last weekend at Music Hack Day Boston, I teamed up with Jennie, my 15-year-old daughter, to build her idea for a music hack which we’ve called Jennie’s Ultimate Road Trip. The hack helps you plan a road trip so that you’ll maximize the number of great concerts you can attend along the way. You give the app your starting and ending city, your starting and ending dates, and the names of some of your favorite artists, and Jennie’s Ultimate Road Trip will search through the many events to find the ones you’d like to see that fit your route and schedule, and give you an itinerary and map.
We used the wonderful SongKick API to grab events for all the nearby cities. I was quite surprised at how many events SongKick would find. For just a single week, in the geographic area between Boston and New York City, SongKick found 1,161 events with 2,168 different artists. More events and more artists make it easier to find a route that will give a satisfying set of concerts – but they can also make finding a route a bit more computationally challenging (more on that later). Once we had the set of possible artists that we could visit, we needed to narrow down the list to the ones that would be of most interest to the user. To do this we used the new Personal Catalogs feature of the Echo Nest API. We created a personal catalog containing all of the potential artists (so for our trip to NYC from Boston, we’d create a catalog of 2,168 artists). We then used the Echo Nest artist similarity APIs to get recommendations for artists within this catalog. This yielded a set of 200 artists playing in the area that best match the user’s taste.
The next bit was the tricky bit – first, we subsetted the events to include only those for the recommended set of artists. Then we had to build the optimal route through the events, considering the date and time of the event, the preference the user has for the artist, whether or not we’ve already been to an event for this artist on the trip, how far out of our way the venue is from our ultimate destination and how far the event is from our previous destination. If you saw me looking grouchy on Sunday morning during the hack day, it was because it was hard to figure out a good cost function that would weigh all of these factors: artist preference, travel time and distance between shows, event history. The computer science folks who read this blog will recognize that this route finding is similar to the ‘travelling salesman problem‘ – but with a twist: instead of finding a route between cities, which don’t tend to move around too much, we have to find a path through a set of artist concerts where every night, the artists are in different places. I call this the ‘travelling rock star’ problem. Ultimately I was pretty happy with how the routing algorithm turned out: it can find a decent route through a thousand events in less than 30 seconds.
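To give a flavor of what such a cost function might look like, here is a rough sketch. It is not the function from the hack: the weights, the `Event` fields, and the greedy one-show-per-night strategy are all assumptions made up for illustration. It trades off artist preference against travel from the previous stop, distance from the final destination, and a penalty for seeing the same artist twice.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:          # hypothetical event record
    artist: str
    day: int          # day index within the trip
    lat: float
    lon: float

def km_between(a, b):
    """Great-circle (haversine) distance in km between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def event_cost(event, here, destination, preference, seen,
               w_pref=100.0, w_travel=1.0, w_detour=0.5, repeat_penalty=50.0):
    """Lower is better. All weights are arbitrary placeholders."""
    where = (event.lat, event.lon)
    cost = w_travel * km_between(here, where)          # drive from the last stop
    cost += w_detour * km_between(where, destination)  # distance left to the goal
    cost -= w_pref * preference.get(event.artist, 0.0) # reward liked artists
    if event.artist in seen:
        cost += repeat_penalty                         # already saw this artist
    return cost

def greedy_route(events, start, destination, preference, days):
    """Pick the cheapest event each night, moving to its venue afterwards."""
    here, seen, route = start, set(), []
    for day in range(days):
        tonight = [e for e in events if e.day == day]
        if not tonight:
            continue
        best = min(tonight, key=lambda e: event_cost(
            e, here, destination, preference, seen))
        route.append(best)
        seen.add(best.artist)
        here = (best.lat, best.lon)
    return route
```

A greedy pass like this is fast but can paint itself into a corner, since a cheap choice tonight may rule out a great show tomorrow. A fuller search over several nights at once would do better at the price of more computation.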
Jennie joined me for a few hours at the Music Hack Day – she coded up the HTML for the webform and made the top banner – (it was pretty weird to look over on her computer and see her typing in raw HTML tags with attached CSS attributes – kids these days). We got the demo done in time – and with the power of caching it will generate routes and plot them on a map using the Google API. Unfortunately, if your route doesn’t happen to be in the cache, it can take quite a bit of time to get a route out of the app – gathering events from SongKick, getting the recommendations from the Echo Nest, and finding the optimal route all add up to an app that can take 5 minutes before you get your answer. When I get a bit of time, I’ll take another pass to speed things up. When it is fast enough, I’ll put it online.
It was a fun demo to write. I especially enjoyed working on it with my daughter. And we won the SongKick prize, which was pretty fantastic.
The Echo Nest gets Personal
Posted by Paul in code, Music, playlist, recommendation, remix, The Echo Nest, web services on October 15, 2010
Here at the Echo Nest we’ve just added a new feature to our APIs called Personal Catalogs. This feature lets you make all of the Echo Nest features work in your own world of music. With Personal Catalogs (PCs) you can define application or user specific catalogs (in terms of artists or songs) and then use these catalogs to drive the behavior of other Echo Nest APIs. PCs open the door to all sorts of custom apps built on the Echo Nest platform. Here are some examples:
Create better genius-style playlists – With PCs I can create a catalog that contains all of the songs in my iTunes collection. I can then use this catalog with the Echo Nest Playlist API to generate interesting playlists based upon my own personal collection. I can create a playlist of my favorite, most danceable songs for a party, or I can create a playlist of slow, low energy, jazz songs for late night reading music.
Create hyper-targeted recommendations – With PCs I can make a catalog of artists and then use the artist/similar APIs to generate recommendations within this catalog. For instance, I could create an artist catalog of all the bands that are playing this weekend in Boston and then create a Music Hack Day recommender that tells each visitor to Boston what bands they should see based upon their musical tastes.
Get info on lots of stuff – people often ask questions about their whole music collection. Like, ‘what are all the songs that I have that are at 113 BPM?‘, or ‘what are the softest songs?’ Previously, to answer these sorts of questions, you’d have to query our APIs one song at a time – a rather tedious and potentially lengthy operation (if you had, say, 10K tracks). With PCs, you can make a single catalog for all of your tracks and then make bulk queries against this catalog. Once you’ve created the catalog, it is very quick to read back all the tempos in your collection.
Represent your music taste – since a Personal Catalog can contain info such as playcounts, skips, and ratings for all of the artists and songs in your collection, it can serve as an excellent proxy for your music taste. Current and soon to be released APIs will use personal catalogs as a representation of your taste to give you personalized results. Playlisting, artist similarity, music recommendations: all personalized based on your listening history.
These examples just scratch the surface. We hope to see lots of novel applications of Personal Catalogs. Check out the APIs, and start writing some code.
Hacking on the Echo Nest at Boston Music Hack Day
Posted by Paul in code, events, fun, The Echo Nest on October 13, 2010
If you are going to the Music Hack Day Boston this weekend, you may want to consider creating a hack based on the Echo Nest APIs. The Echo Nest is offering a prize for the best hack built on Echo Nest technology. The prize is the much coveted Echo Nest Sweatsedo. The softness, the coolness and the ‘blueness’ of this casual attire is unsurpassed by the clothing offered by any other music technology company. However, we realize that not everyone can wear the sweatsedo with proper style. Those who are not cool enough to wear the Echo Nest sweatsedo can opt for the alternate prize of $1,000 cash. So your choice for a prize is a Kind of Blue, or a Kind of Green.
But, wait! There’s more. Since we are unveiling two new APIs at Music Hack Day weekend, we are going to offer not one, but two prizes, one to each of the two best hacks that use the Echo Nest APIs. If you create one of the two best hacks that use the Echo Nest, you will get to choose from the ‘Kind of Blue’ or the ‘Kind of Green’ prize. So get hacking!
Some like it loud …
Posted by Paul in code, The Echo Nest, web services on October 6, 2010
One of the nifty features that we’ve rolled out in the last 6 months here at the Echo Nest is an extremely flexible song search API. With this API you can search for songs based upon all sorts of criteria, such as tempo, key, mode, and duration. You can use this API to do things that would otherwise be really hard to do. For example, here’s a bit of Python that will show you the loudest songs for an artist:
from pyechonest import song as songAPI
from pyechonest import artist as artistAPI

def find_loudest_songs(artist_name):
    # Look up the best-matching artist for the given name.
    artists = artistAPI.search(artist_name, results=1)
    if artists:
        # Ask the song search API for this artist's songs, loudest first.
        songs = songAPI.search(artist_id=artists[0].id, sort='loudness-desc')
        for song in songs:
            print("%s %s" % (song.get_audio_summary().loudness, song.title))
Here are the loudest songs for some sample artists:
- The Beatles: Helter Skelter, Sgt Peppers Lonely Hearts Club Band
- Metallica: Cyanide, All Nightmare Long
- The White Stripes: Broken Bricks, Fell in love with a girl
- Led Zeppelin: Rock and Roll, Black Dog
We can easily change the code to help us find the softest songs for an artist, or the fastest, or the shortest. Some more examples:
- Shortest Beatles song: Her Majesty at 23.2 seconds
- Longest Beatles song: Revolution #9 at 8:35
- Slowest Beatles song: Julia at 57 BPM
- Softest Beatles song: Julia at -27 dB (Blackbird is at -25 dB)
I think it is interesting to find the outliers. For instance, here’s the softest song by Muse (which is usually a very loud artist):
We can combine these attributes too, so we can find the fastest loud Beatles song (I feel fine, at -7.5 dB and 180 BPM), or the slowest loud Beatles song (Don’t let me down, at -6.6 dB and 65 BPM).
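Combining attributes boils down to filtering on one and sorting on another. The sketch below shows that idea on a plain local list rather than through the Echo Nest API. The tempo and loudness values are the ones quoted in this post; the -8 dB cutoff for "loud" and the function names are assumptions picked for the example.

```python
# Tempo (BPM) and loudness (dB) values as quoted in the post.
songs = [
    {"title": "Julia", "tempo": 57.0, "loudness": -27.0},
    {"title": "I Feel Fine", "tempo": 180.0, "loudness": -7.5},
    {"title": "Don't Let Me Down", "tempo": 65.0, "loudness": -6.6},
]

LOUD_DB = -8.0  # hypothetical cutoff for "loud"; not from the post

def loud(song_list, threshold=LOUD_DB):
    """Keep only songs at or above the loudness threshold."""
    return [s for s in song_list if s["loudness"] >= threshold]

def fastest(song_list):
    return max(song_list, key=lambda s: s["tempo"])

def slowest(song_list):
    return min(song_list, key=lambda s: s["tempo"])

print(fastest(loud(songs))["title"])  # prints I Feel Fine
print(slowest(loud(songs))["title"])  # prints Don't Let Me Down
```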
The song search API is a good example of the power of the Echo Nest platform. We have data on millions of songs that you can use to answer questions about music that have traditionally been very hard to answer.










