Posts Tagged last.fm
The Music Matrix – Exploring tags in the Million Song Dataset
Posted by Paul in code, data, Music, The Echo Nest on November 27, 2011
Last month Last.fm contributed a massive set of tag data to the Million Song Dataset. The data set includes:
- 505,216 tracks with at least one tag
- 522,366 unique tags
- 8,598,630 (track – tag) pairs
A popular track like Led Zep’s Stairway to Heaven has dozens of unique tags applied hundreds of times.
There is no end to the number of interesting things you can do with these tags: track similarity for recommendation and playlisting, faceted browsing of the music space, ground truth for training autotagging systems, and so on.
I think there’s quite a bit to be learned about music itself by looking at these tags. We live in a post-genre world where most music no longer fits into nice, tidy genre categories. There are hundreds of overlapping subgenres and styles. By looking at how the tags overlap we can get a sense for the structure of the new world of music. I took the set of tags and measured how often each pair of tags co-occurs on the same track. Tags that have high co-occurrence represent overlapping genre space. For example, among the 500 thousand tracks the tags that co-occur the most are:
- rap co-occurs with hip hop 100% of the time
- alternative rock co-occurs with rock 76% of the time
- classic rock co-occurs with rock 76% of the time
- hard rock co-occurs with rock 72% of the time
- indie rock co-occurs with indie 71% of the time
- electronica co-occurs with electronic 69% of the time
- indie pop co-occurs with indie 69% of the time
- alternative rock co-occurs with alternative 68% of the time
- heavy metal co-occurs with metal 68% of the time
- alternative co-occurs with rock 67% of the time
- thrash metal co-occurs with metal 67% of the time
- synthpop co-occurs with electronic 66% of the time
- power metal co-occurs with metal 65% of the time
- punk rock co-occurs with punk 64% of the time
- new wave co-occurs with 80s 63% of the time
- emo co-occurs with rock 63% of the time
It is interesting to see how subgenres like hard rock or synthpop overlap with the main genre and how all rap overlaps with hip hop. Using simple overlap we can also see which tags are the least informative. These are the tags that overlap the most with other tags, meaning that they are the least descriptive. Some of the least distinctive tags are: Rock, Pop, Alternative, Indie, Electronic and Favorites. So when you tell someone you like ‘rock’ or ‘alternative’ you are not really saying too much about your musical taste.
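Here is a minimal sketch of how overlap percentages like the ones above could be computed. It assumes that "A co-occurs with B X% of the time" means the share of tracks tagged A that are also tagged B; the post does not spell out the normalization, so treat that as a guess. The function names and the toy track data are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(track_tags):
    """Count single tags and unordered tag pairs over {track_id: tags}."""
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in track_tags.values():
        tags = set(tags)  # each tag counts once per track
        tag_counts.update(tags)
        # Store each pair once, in sorted order, so (a, b) == (b, a).
        for a, b in combinations(sorted(tags), 2):
            pair_counts[(a, b)] += 1
    return tag_counts, pair_counts


def overlap(a, b, tag_counts, pair_counts):
    """Fraction of tracks tagged `a` that are also tagged `b` (assumed measure)."""
    if tag_counts[a] == 0:
        return 0.0
    key = (a, b) if a < b else (b, a)
    return pair_counts[key] / tag_counts[a]


# Toy data, invented for illustration only.
tracks = {
    "t1": {"rap", "hip hop"},
    "t2": {"hip hop"},
    "t3": {"hard rock", "rock"},
    "t4": {"rock"},
}
tag_counts, pair_counts = cooccurrence(tracks)
print(overlap("rap", "hip hop", tag_counts, pair_counts))
print(overlap("hip hop", "rap", tag_counts, pair_counts))
```

Under this definition the measure is asymmetric: every rap track in the toy data is also hip hop, but only half of the hip hop tracks are rap, which would be one way to read the 100% figure in the list.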
The Music Matrix
I thought it might be interesting to explore the world of music via overlapping tags, and so I built a little web app called The Music Matrix. The Music Matrix shows the overlapping tags for a tag neighborhood or an artist via a heat map. You can explore the matrix, looking at how tags overlap and listening to songs that fit the tags.
With this app you can enter a genre, style, mood or other type of tag. The app will then find the 24 tags with the highest overlap with the seed and show the overlap matrix. Hotter colors indicate high overlap. Mousing over a cell will show you the percentage overlap between the two corresponding tags and clicking on a cell will play a track that has high tag counts for the two tags. I find that I can learn a lot about a genre of music by looking at the 24-tag neighborhood for a genre and listening to examples. Some interesting neighborhoods to explore are:
You can also explore by moods:
If you are not sure what genre or style is for an artist, you can just start with the top tags for the artist like so:
Use the Music Matrix to explore a new genre of music or to find music that matches a set of styles. Find out how genres overlap. Listen to prototypical examples of different styles. Click on things, have fun. Check it out:
The code for the Music Matrix is on Github. Thanks to Thierry for creating the Million Song Dataset (the best research data set ever created) and thanks to Last.fm for contributing a very nice set of tag data to the data set.
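For readers who want to experiment before digging into the real code, the tag neighborhood idea (a seed tag plus the tags that overlap with it the most, laid out as a matrix) can be sketched roughly as follows. This is not the app's actual implementation. The overlap definition, the choice to keep the seed as the first row, and all names and toy data are assumptions made for illustration.

```python
def overlap(a, b, tag_tracks):
    """Fraction of tracks tagged `a` that are also tagged `b` (assumed measure)."""
    tracks_a = tag_tracks.get(a, set())
    if not tracks_a:
        return 0.0
    return len(tracks_a & tag_tracks.get(b, set())) / len(tracks_a)


def neighborhood(seed, tag_tracks, size=24):
    """Return the seed followed by the `size` tags that overlap it the most."""
    others = [t for t in tag_tracks if t != seed]
    # Highest overlap first; ties broken alphabetically for stable output.
    others.sort(key=lambda t: (-overlap(seed, t, tag_tracks), t))
    return [seed] + others[:size]


def overlap_matrix(tags, tag_tracks):
    """Square matrix where cell [i][j] is overlap(tags[i], tags[j])."""
    return [[overlap(a, b, tag_tracks) for b in tags] for a in tags]


# Toy index of tag -> track ids, invented for illustration only.
tag_tracks = {
    "metal": {1, 2, 3, 4},
    "heavy metal": {1, 2, 3},
    "thrash metal": {4},
    "jazz": {5},
}
tags = neighborhood("metal", tag_tracks, size=2)
matrix = overlap_matrix(tags, tag_tracks)
print(tags)
print(matrix[0][1], matrix[1][0])
```

Each cell of the matrix would map to a color in the heat map. Because the assumed measure is directional, cell [i][j] and cell [j][i] can differ.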
LastFM-ArtistTags2007
A few years back I created a data set of social tags from Last.fm. RJ at Last.fm graciously gave permission for me to distribute the dataset for research use. I hosted the dataset on the media server at Sun Labs. However, with the Oracle acquisition, the media server is no longer serving up the data, so I thought I would post the data elsewhere.
The dataset is now available for download here: Lastfm-ArtistTags2007
Here are the details as told in the README file:
The LastFM-ArtistTags2007 Data set
Version 1.0
June 2008

What is this?

This is a set of artist tag data collected from Last.fm using the Audioscrobbler webservice during the spring of 2007. The data consists of the raw tag counts for the 100 most frequently occurring tags that Last.fm listeners have applied to over 20,000 artists. An undocumented (and deprecated) option of the audioscrobbler web service was used to bypass the Last.fm normalization of tag counts. This data set provides raw tag counts.

Data Format:

The data is formatted one entry per line as follows:

musicbrainz-artist-id<sep>artist-name<sep>tag-name<sep>raw-tag-count

Example:

11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>american<sep>14
11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>animals<sep>5
11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>art punk<sep>21
11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>art rock<sep>18
11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>atmospheric<sep>4
11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>avantgarde<sep>3

Data Statistics:

Total Lines: 952810
Unique Artists: 20907
Unique Tags: 100784
Total Tags: 7178442

Filtering:

Some minor filtering has been applied to the tag data. Last.fm will report tags with counts of zero or less on occasion. These tags have been removed. Artists with no tags have not been included in this data set. Of the nearly quarter million artists that were inspected, 20,907 artists had 1 or more tags.

Files:

ArtistTags.dat - the tag data
README.txt - this file
artists.txt - artists ordered by tag count
tags.txt - tags ordered by tag count

License:

The data in LastFM-ArtistTags2007 is distributed with permission of Last.fm. The data is made available for non-commercial use only under the Creative Commons Attribution-NonCommercial-ShareAlike UK License. Those interested in using the data or web services in a commercial context should contact partners at last dot fm. For more information see http://www.audioscrobbler.net/data/

Acknowledgements:

Thanks to Last.fm for providing the access to this tag data via their web services.

Contact:

This data was collected and filtered by Paul Lamere of The Echo Nest. Send questions or comments to Paul.Lamere@gmail.com
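As a quick illustration of the data format described in the README, here is a small parsing sketch. It assumes the separator is the literal string `<sep>`, exactly as shown in the example lines, and that artist and tag names never contain that string. The function names are hypothetical, and the file encoding in the usage note is a guess.

```python
from collections import defaultdict

SEP = "<sep>"  # literal separator as shown in the README example


def parse_line(line):
    """Split one entry into (mbid, artist, tag, raw_count)."""
    mbid, artist, tag, count = line.rstrip("\n").split(SEP)
    return mbid, artist, tag, int(count)


def load_artist_tags(lines):
    """Group raw tag counts by (mbid, artist name)."""
    artists = defaultdict(dict)
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        mbid, artist, tag, count = parse_line(line)
        artists[(mbid, artist)][tag] = count
    return artists


# Three of the example lines from the README.
sample = [
    "11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>american<sep>14\n",
    "11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>animals<sep>5\n",
    "11eabe0c-2638-4808-92f9-1dbd9c453429<sep>Deerhoof<sep>art punk<sep>21\n",
]
artist_tags = load_artist_tags(sample)
deerhoof = artist_tags[("11eabe0c-2638-4808-92f9-1dbd9c453429", "Deerhoof")]
print(sorted(deerhoof.items(), key=lambda kv: -kv[1]))
```

To read the real file, something like `load_artist_tags(open("ArtistTags.dat", encoding="utf-8"))` should work, assuming the data is UTF-8 encoded (the README does not say).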
Last.FM’s Listening clock
Posted by Paul in Music, visualization on September 7, 2010
Nifty new visualization at Last.fm that shows the time of day when you listen to music:
MeToo – a scrobbler for the room
Posted by Paul in code, fun, Music, The Echo Nest, web services on June 11, 2010
One of the many cool things about working at the Echo Nest is that we have a Sonos audio system with a single group playlist for the office. Anyone from the CEO to the greenest intern can add music to the listening queue for everyone to listen to. The office as a whole has a rather diverse taste in music, and as a result I’ve been exposed to lots of interesting music. However, the downside of this is that since I’m not listening to music being played on my personal computer, every day I have 10 hours of music listening that is never scrobbled, and as they say, if it doesn’t scrobble, it doesn’t count. Sure, the Sonos system scrobbles all of the plays to the Echo Nest account on Last.fm, but I’d also like it to scrobble to my account so I can use nifty apps like Lee Byron’s Last.fm Listening History or Matt Ogle’s Bragging Rights on my own scrobbles.
This morning while listening to that nifty Emeralds album, I decided that I’d deal with those scrobble gaps once and for all. So I wrote a little python script called MeToo that keeps my scrobbles up to date. It’s really quite simple. Whenever I’m in the office, I fire up MeToo. MeToo watches the most recent tracks played on The Echo Nest account and whenever a new track is played, it scrobbles it to my personal account. In effect, my scrobbles will track the office scrobbles. When I’m not listening I just close my laptop and the scrobbling stops.
The script itself is pretty simple – I used pylast to interface with Last.fm – and the bulk of the logic is less than 20 lines of code. I start the script like so:
% python metoo.py TheEchoNest lamere
When I do that, MeToo will continuously monitor the most recently played tracks on TheEchoNest and scrobble the plays on my account. When I close my laptop, the script is naturally suspended – so even though music may continue to play in the office, my laptop won’t scrobble it.
I suspect that this use case is relatively rare, and so there’s probably not a big demand for something like MeToo, but if you are interested in it, leave a comment. If I see some interest, I’ll toss it up on Google Code so anyone can use it.
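The watch-and-mirror loop described above could look roughly like the sketch below. This is not the actual metoo.py. To keep it self-contained, the Last.fm calls are passed in as two plain functions, `fetch_recent` and `scrobble`, which are hypothetical stand-ins for whatever the pylast wrapper would do. The tuple layout, parameter names, and polling interval are also assumptions.

```python
import time


def new_plays(recent, last_seen):
    """Return plays newer than `last_seen`, oldest first.

    `recent` is a list of (timestamp, artist, title) tuples.
    """
    return sorted(p for p in recent if p[0] > last_seen)


def mirror(fetch_recent, scrobble, last_seen=0, poll_seconds=60,
           max_polls=None, sleep=time.sleep):
    """Poll the source account and re-scrobble anything new.

    fetch_recent() -> list of (timestamp, artist, title)   (hypothetical)
    scrobble(artist, title, timestamp)                      (hypothetical)
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        for timestamp, artist, title in new_plays(fetch_recent(), last_seen):
            scrobble(artist, title, timestamp)
            last_seen = timestamp  # remember the newest play mirrored so far
        polls += 1
        sleep(poll_seconds)
    return last_seen


# Simulated office feed: two polls, the second one has a new track.
feed = iter([
    [(100, "Artist A", "Track 1")],
    [(100, "Artist A", "Track 1"), (160, "Artist B", "Track 2")],
])
sent = []
last = mirror(lambda: next(feed),
              lambda artist, title, ts: sent.append((ts, artist, title)),
              max_polls=2, sleep=lambda seconds: None)
print(sent, last)
```

A real run would presumably start `last_seen` at the current time so that older office plays are not replayed onto the personal account. Because the loop only acts when it wakes up and polls, suspending the laptop pauses the mirroring, which matches the behavior described in the post.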
It feels great to be scrobbling again!
Which band has the hotttnesss?
Posted by Paul in Music, The Echo Nest on April 9, 2010
Developer/musician Paul Barrett (aka echodeck) has created pop.ularity, a nifty web-based music quiz based on the Last.fm and Echo Nest APIs. In the quiz you try to guess which band is hotter on the web. The quiz uses Last.fm plays, Last.fm listeners, Echo Nest hotttnesss and Echo Nest familiarity to measure popularity for each band.
It’s a fun game – give it a whirl! http://pop.ularity.co.uk/
Unofficial Artist Guide to SXSW
Posted by Paul in events, Music, recommendation, The Echo Nest on March 4, 2010
I’m excited! Next week I travel to Austin for a week-long computer+music geek-fest at SXSW. A big part of SXSW is the music – there are nearly 2,000 different artists playing at SXSW this year. But that presents a problem – there are so many bands going to SXSW (many I’ve never heard of) that I find it very hard to figure out which bands I should go and see. I need a tool to help me sift through all of the artists – a tool that will help me decide which artists I should add to my schedule and which ones I should skip. I’m not the only one who was daunted by the large artist list. Taylor McKnight, founder of SCHED*, was thinking the same thing. He wanted to give his users a better way to plan their time at SXSW. And so over a couple of weekends Taylor built (with a little backend support from us) The Unofficial Artist Discovery Guide to SXSW.
The Unofficial Artist Discovery Guide to SXSW is a tool that allows you to explore the many artists attending this year’s SXSW. It lets you search for artists, browse by popularity, music style, ‘buzzworthiness’, or similarity to your favorite artists – and it will make recommendations for you based on your music taste (using your Last.fm, Sched* or Hype Machine accounts). The Artist Guide supplies enough context (bios, images, music, tag clouds, links) to help you decide if you might like an artist.
Here’s the guide:
Here’s a quick tour of some of the things you can do with the guide. First off, you can Search for artists by name, genre/tag or location. This helps you find music when you know what you are looking for.
However, you may not always be sure what you are looking for – that’s where you use Discover. This gives you recommendations based on the music you already like. Type in the name of a few artists (even artists that are not playing at SXSW) or your SCHED*, Hype Machine or Last.fm user name, and ‘Discover’ will give you a set of recommendations for SXSW artists based on your music taste. For example, I’ve been listening to Charlotte Gainsbourg lately so I can use the artist guide to help me find SXSW artists that I might like:
If I see an artist that looks interesting I can drill down and get more info about the artist:
From here I can read the artist bio, listen to some audio, explore other similar SXSW artists or add the event to my SCHED* schedule.
I use Last.fm quite a bit, so I can enter my Last.fm name and get SXSW recommendations based upon my Last.fm top artists. The artist guide tries to mix things up a little bit so if I don’t like the recommendations I see, I can just ask again and I can get a different set. Here are some recommendations based on my recent listening at Last.fm:
If you’ve been using the wonderful SCHED* to keep track of your SXSW calendar you can use the guide to get recommendations based on artists that you’ve already added to your SXSW calendar.
In addition to search and discovery, the guide gives you a number of different ways to browse the SXSW Artist space. You can browse by ‘buzzworthy’ artists – these are artists that are getting the most buzz on the web:
Or the most well-known artists:
You can browse by the style of music via a tag cloud:
And by venue:
Building the guide was pretty straightforward. Taylor used the Echo Nest APIs to get the detailed artist data such as familiarity, popularity, artist bios, links, images, tags and audio. The only data that was not available at the Echo Nest was the venue and schedule info, which was provided by Arkadiy (one of Taylor’s colleagues). Even though SXSW artists can be extremely long tail (some don’t even have Myspace pages), the Echo Nest was able to provide really good coverage for them (there was coverage for over 95% of the artists). Still, there are a few gaps and I suspect there may be a few errors in the data (my favorite wrong image is for the band Abe Vigoda). If you are in a band that is going to SXSW and you see that we have some of your info wrong, send me an email (paul@echonest.com) and I’ll make it right.
We are excited to see this Artist Discovery guide built on top of the Echo Nest. It’s a great showcase for the Echo Nest developer platform and working with Taylor was great. He’s one of these hyper-creative, energetic types – smart, gets things done and full of new ideas. Taylor may be adding a few more features to the guide before SXSW, so stay tuned and we’ll keep you posted on new developments.
Normalisr – Time-based charts of your last.fm data
Posted by Paul in Music, web services on December 14, 2009
Worth checking out: Normalisr
Genre of the week: whalecore
Saw this post by Nackster on the Last.fm brutal death metal forum
whalecore! Oh Yeah! Here it is:
And don’t forget this whalecore classic – really, it started the whole genre:
Top whalecore bands are:
- Gojira
- Mastodon
- Ahab
- Giant Squid
Yep, there’s a Wikipedia page on whalecore. Listen to Whalecore at Last.fm