Archive for category data
What really was the Song of the Summer?
It’s the time of the year when everyone is crowning the Song of the Summer. Billboard has picked Robin Thicke’s Blurred Lines as their choice based upon radio airplay, audience impressions, sales data and streaming activity, but that’s not the final word. Others have chimed in with their own picks. The MTV Video Music Awards Best Song of the Summer, based on online voting, went to One Direction’s Best Song Ever, while Paste Magazine’s editors picked Daft Punk’s Get Lucky.
But do any of these songs really deserve the Song of the Summer crown? I really don’t like a metric like Billboard’s that uses radio airplay or sales data – that’s really a measure of how well a label’s marketing department is performing, not a measure of how well the song is liked. Online voting, such as is used to select the MTV Video Music Award winner, is easily hacked, manipulated and subject to the Tyranny of the Bored, while an editorial pick is just the opinion of a couple of writers on a deadline.
I think the best way to pick the Song of the Summer is to see which song is actually played more by music listeners. Forget the song that is getting the most buzz, the Song of the Summer is the song that is getting the most plays. So, let’s look at song plays and pick our own Song of the Summer.
The following chart shows a plot of the top 750 songs played over the summer. The plot represents the song plays vs the song fans. Songs on the upper right are the songs that have the most fans and are getting the most plays.
You can click on the above image to open an interactive version of the chart. You can mouse over the songs to see what they are, you can click on a song to hear it, and you can click on a genre in the legend to highlight songs within a particular genre.
Using this chart we can see that the top songs of the summer based on play data are:
- Can’t Hold Us – Macklemore & Ryan Lewis
- Radioactive – Imagine Dragons
- Blurred Lines – Robin Thicke
- When I Was Your Man – Bruno Mars
- Thrift Shop – Macklemore & Ryan Lewis
- Holy Grail – Jay Z
- Just Give Me A Reason – P!nk
- Treasure – Bruno Mars
- Mirrors – Justin Timberlake
- We Can’t Stop – Miley Cyrus
Daft Punk’s Get Lucky is at #13, and One Direction’s rank is way down at #74.
Blurred Lines is close at number three, but the clear winner of the Song of the Summer crown, based on play data is Macklemore’s Can’t Hold Us.
http://www.rdio.com/artist/Macklemore__Ryan_Lewis/album/The_Heist_1/track/Can%27t_Hold_Us_(feat._Ray_Dalton)/
The songs with the most passionate fans
I like plotting songs on a plays vs fans plot. It not only shows what songs are most popular in terms of plays and fans, but it also helps us find songs that are attracting the most passionate fans. For example, in the plot below, I’ve highlighted certain songs that are getting more than their fair share of song plays:
These are songs that fans are listening to over and over – a good indicator that the song is destined for greatness. Avicii and Lorde are already on the Billboard top 10. The Fifth Harmony Song Miss Movin’ On has an extremely high passion score. I expect we’ll be hearing a lot about Fifth Harmony over the next year.
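One simple way to put a number on "more than their fair share" is plays per fan. The sketch below is only an illustration of that idea: the plays-per-fan ratio is an assumed stand-in for the passion score, not the formula behind the plot, and the song names and counts are made up.

```python
def passion_scores(songs):
    """Return (title, plays_per_fan) pairs, highest ratio first.

    `songs` is a list of (title, plays, fans) tuples. Plays per fan is an
    assumed proxy for passion, not the score used for the plot.
    """
    scored = [(title, plays / fans) for title, plays, fans in songs if fans > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up numbers, for illustration only.
example = [
    ("Song A", 9000, 3000),
    ("Song B", 5000, 500),
    ("Song C", 1200, 1000),
]

for title, score in passion_scores(example):
    print(f"{title}: {score:.1f} plays per fan")
```

A song with relatively few fans but many plays per fan floats to the top, which is also exactly the kind of score an organized streaming campaign can inflate.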
Update – it turns out that the Fifth Harmony high passion score is not an honest score. The fans of Fifth Harmony (aka Harmonizers) have been organizing a continuous streaming of Fifth Harmony’s Miss Movin’ On to push it up the charts. Here’s a peek into the twitter campaign:

This campaign explains why the Fifth Harmony track is such an outlier, and is a reminder that any single metric used to pick winners can and will be manipulated. sigh.
Perhaps Blurred Lines is the Song of the Summer in that it best captured the vibe of 2013, but my vote, and the data, say that the real Song of the Summer was Macklemore’s Can’t Hold Us. Now, since it is after Labor Day, we can put this topic to rest, and start thinking about how we feel about the Song of the Summer 2014 being by Fifth Harmony.
Which music services are growing, which are shrinking
Here’s a quick tour of Google Trends output for a number of music services with an eye for identifying which are growing and which are shrinking. Google Trends tracks search interest. The number 100 represents the peak search interest in these graphs.
Updated (1) (2) (3) – added a number of new charts. Updated (4) – added a summary list
iTunes – iTunes looks relatively flat since 2010. Perhaps things will change with their Pandora competitor to be launched this month.
last.fm – Peaked in 2009, has now fallen back to where it was in 2006. The golden age of last.fm is over, sad to say.
Spotify – steady growth since launch in 2009
Pandora – steady growth since 2006. Perhaps leveling off.
Rhapsody – slow but steady shrinking interest
Rdio – steady growth since 2011 launch, steep growth in the last year
Deezer – steady shrinkage since 2009
Grooveshark – peaked in 2012, now shrinking
siriusxm – strong growth since 2011
iheartradio – strong growth since 2011
Google Music – slow steady growth
Slacker – slight decline in interest since its peak in 2009
Soundcloud – strong increase since 2009
Youtube – Youtube has always been one of the most popular destinations for music listeners
Songza – After a pivot in 2011, very strong growth
8tracks – strong growth since 2011
Bandcamp
Turntable – after the initial buzz, interest in turntable has declined dramatically.
Mixcloud – strong steady growth since 2009
MOG – peaked in 2012
Jango – peaked in January of this year, but has since dropped to 2010 interest levels
Playlist.com – peaked in 2009, now at its lowest interest since 2007.
soundhound – slightly off from its 2012 peak interest.
shazam – strong steady, rising interest
Beatport – holding steady at 2009 levels
Muve – steady growth since 2011
The Hype Machine – six years of decline
ex.fm – a jagged two year climb
Amazon MP3 – growing until 2011, when it flattens out, and perhaps drops a bit.
Walmart Music – at its lowest point ever
Yahoo Music – Once the biggest destination on the web, now at its lowest point.
Myspace Music – steady decline until there’s nothing left
Facebook Music – the only service where the downward trend started before the product was announced.
Twitter Music – perhaps the strangest graph of all. Lots of excitement at launch and then, almost instantly … meh.
Zune – bursts of activity with every Zune update, but a steady decline to irrelevance.
xbox music – modest decline since the October 2012 release, but too early to tell.
Radionomy – I’d never heard of them before, but they are gaining interest, especially in France.
Sony’s Music Unlimited – growing since 2010
Radio.com – Waning interest since 2009
Of course, these search trends are not the same as having an actual measure of activity. Millions of people play music on Spotify or iTunes every day without performing a search. However, until we can get raw user numbers from every music service, this is probably about the closest we can get to understanding which services are growing and which are shrinking.
Leave a comment if you think there are some music listening services that I’ve missed that I should include.
Are these the angriest tracks on the web?
I built a playlist of songs that most frequently appear in playlists with the words angry or mad with the Smart Playlist Builder. These are arguably some of the angriest tracks on the web.
http://www.rdio.com/people/plamere/playlists/5779446/Top_angry_songs_created_with_SPB/
It is interesting to compare these angry tracks to the top tracks tagged with angry at Last.fm.
I can’t decide whether the list derived from angry playlists is better or worse than the list driven by social tags. I’d love to hear your opinion. Take a look at these two lists and tell me which list is a better list of angry tracks and why.
Yep, this is a totally unscientific poll, but I’m still interested in what you think.
Using the wisdom of the crowds to build better playlists
At music sites like Rdio and Spotify, music fans have been creating and sharing music playlists for years. Sometimes these playlists are carefully crafted sets of songs for particular contexts like gaming or sleep and sometimes they are just random collections of songs. If I am looking for music for a particular context, it is easy to just search for a playlist that matches that context. For instance, if I am going on a road trip there are hundreds of roadtrip playlists on Rdio for me to choose from. Similarly, if I am going for a run, there’s no shortage of running playlists to choose from. However, if I am going for a run, I will need to pick one of those hundreds of playlists, and I don’t really know if the one I pick is going to be of the carefully crafted variety or if it was thrown together haphazardly, leaving me with a lousy playlist for my run. Thus I have a problem – What is the best way to pick a playlist for a particular context?
Naturally, we can solve this problem with data. We can take a wisdom of the crowds approach to solving this problem. To create a running playlist, instead of relying on a single person to create the playlist, we can enlist the collective opinion of everyone who has ever created a running playlist to create a better list.
I’ve built a web app to do just this. It lets you search through Rdio playlists for keywords. It will then aggregate all of the songs in the matching playlists and surface the songs that appear in the most playlists. So if Kanye West’s Stronger appears in more running playlists than any other song, it will appear first in the resulting playlist. Thus songs that the collective agrees are good songs for running get pushed to the top of the list. It’s a simple idea that works quite well. Here are some example playlists created with this approach:
Best Running Songs
http://www.rdio.com/people/plamere/playlists/5773579/Top_best_running_songs_via_SPB/
Coding
http://www.rdio.com/people/plamere/playlists/5773559/Top_coding_songs_via_SPB/
Sad Love Songs
http://www.rdio.com/people/plamere/playlists/5773508/Top_sad_love_songs_songs_via_SPB/
Chillout
http://www.rdio.com/people/plamere/playlists/5773867/Top_chillout_songs_via_SPB/
Date Night
http://www.rdio.com/people/plamere/playlists/5773474/Top_date_night_songs_via_SPB/
Sexy Time
http://www.rdio.com/people/plamere/playlists/5773535/Top_sexytime_songs_via_SPB/
This wisdom of the crowds approach to playlisting isn’t limited to contexts like running or coding. You can also use it to give you an introduction to a genre or artist.
Country
http://www.rdio.com/people/plamere/playlists/5773544/Top_country_songs_via_SPB/
Post Rock
http://www.rdio.com/people/plamere/playlists/5773642/Top_post_rock_songs_via_SPB/
Weezer
http://www.rdio.com/people/plamere/playlists/5773606/Top_weezer_songs_via_SPB/
The Smart Playlist Builder
The app that builds these nifty playlists is called The Smart Playlist Builder. You type in a few keywords and it will search Rdio for all the matching playlists. It will show you the matching playlists, giving you a chance to refine your query. You can search for words, phrases and you can exclude terms as well. The query sad “love songs” -country will search for playlists with the word sad, and the phrase love songs in the title, but will exclude any that have the word country.
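The query syntax is easy to sketch. The code below is a minimal illustration of how such a query could be parsed and matched against playlist titles; the function names are hypothetical and this is not the app's actual code, which searches through the Rdio API. Straight quotes mark phrases and a leading minus sign marks an excluded term.

```python
import re


def parse_query(query):
    """Split a query into (required, excluded) lists of lowercase terms.

    A term is a bare word or a "quoted phrase"; a leading '-' excludes it.
    """
    required, excluded = [], []
    for minus, phrase, word in re.findall(r'(-?)(?:"([^"]+)"|(\S+))', query):
        term = (phrase or word).lower()
        (excluded if minus else required).append(term)
    return required, excluded


def contains(title, term):
    """True if the term appears in the title as whole words."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, title.lower()) is not None


def matches(title, query):
    """True if the title has every required term and no excluded term."""
    required, excluded = parse_query(query)
    return (all(contains(title, term) for term in required)
            and not any(contains(title, term) for term in excluded))


query = 'sad "love songs" -country'
for title in ["Sad Love Songs for Rainy Days", "Sad Country Love Songs"]:
    print(title, "->", matches(title, query))
```

Matching on whole words rather than substrings keeps a query for sad from matching a title like Crusade, at the cost of missing plurals and misspellings.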
When you are happy with your query you can aggregate the tracks from the matching playlists. This will give you a list of the top 100 songs that appeared in the matching playlists.
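The aggregation step can be sketched in a few lines. This is a minimal illustration under some assumptions: playlists are plain lists of song titles, a song counts once per playlist even if it appears twice, and ties are broken alphabetically. The playlists here are made up, and the real app works on Rdio track data rather than strings.

```python
from collections import Counter


def top_songs(playlists, limit=100):
    """Rank songs by how many playlists they appear in, most common first."""
    counts = Counter()
    for tracks in playlists:
        counts.update(set(tracks))  # count each song once per playlist
    # Sort by descending count, then by title so ties come out in a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [song for song, _ in ranked[:limit]]


# Made-up running playlists, for illustration only.
running_playlists = [
    ["Stronger", "Track B", "Track C"],
    ["Stronger", "Track C"],
    ["Stronger", "Stronger", "Track D"],
]

print(top_songs(running_playlists, limit=3))
```

Counting playlists rather than raw appearances keeps one playlist that repeats a song many times from outvoting everyone else.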
If you are happy with the resulting playlist, you can save it to Rdio, where you can do all the fine tuning of the playlist such as re-ordering, adding and deleting songs.
The Smart Playlist Builder uses the really nifty Rdio API. The Rdio folks have done a fantastic job of giving developers access to their music and data. Well done Rdio team!
Go ahead and give The Smart Playlist Builder a try to see how the wisdom of the crowds can help you make playlists.
The Most Replayed Songs
Posted by Paul in data, Music, The Echo Nest on August 27, 2013
I still remember the evening well. It was midnight during the summer of 1982. I was living in a thin-walled apartment, trying unsuccessfully to go to sleep while the people who lived upstairs were music bingeing on The B-52’s Rock Lobster. They listened to the song continuously on repeat for hours, giving me the chance to ponder the rich world of undersea life, filled with manta rays, narwhals and dogfish.
We tend to binge on things we like – potato chips, Ben & Jerry’s, and Battlestar Galactica. Music is no exception. Sometimes we like a song so much, that as soon as it’s over, we want to hear it again. But not all songs are equally replayable. There are some songs that have some secret mysterious ingredients that makes us want to listen to the song over and over again. What are these most replayed songs? Let’s look at some data to find out.
The Data – For this experiment I used a week’s worth of song play data from the summer of 2013 that consists of user / song / play-timestamp triples. This data set has on the order of 100 million of these triples for about a half million unique users and 5 million unique songs. To find replays I looked for consecutive plays by a user of a song within a time window (to ensure that the replays are in the same listening session). Songs with low numbers of plays or fans were filtered out.
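Here is a minimal sketch of that replay detection, assuming the triples are (user, song, timestamp in seconds) tuples. The 10 minute window is an assumption chosen for illustration, not the value used in the experiment, and the sample plays are made up.

```python
from collections import defaultdict


def count_replays(plays, window_seconds=600):
    """Count total plays and replays per song.

    `plays` is an iterable of (user, song, timestamp_seconds) triples. A replay
    is a play of the same song the user played immediately before, starting
    within `window_seconds` of that previous play.
    """
    by_user = defaultdict(list)
    for user, song, timestamp in plays:
        by_user[user].append((timestamp, song))

    total_plays = defaultdict(int)
    replays = defaultdict(int)
    for events in by_user.values():
        events.sort()  # put each user's plays in time order
        prev_song, prev_time = None, None
        for timestamp, song in events:
            total_plays[song] += 1
            if song == prev_song and timestamp - prev_time <= window_seconds:
                replays[song] += 1
            prev_song, prev_time = song, timestamp
    return dict(total_plays), dict(replays)


# Made-up plays, for illustration only.
sample = [
    ("u1", "Rock Lobster", 0),
    ("u1", "Rock Lobster", 300),    # replay
    ("u1", "Rock Lobster", 600),    # replay
    ("u1", "Other Song", 900),
    ("u1", "Rock Lobster", 1200),   # not consecutive, so not a replay
    ("u2", "Rock Lobster", 0),
    ("u2", "Rock Lobster", 5000),   # outside the window, so not a replay
]

totals, replay_counts = count_replays(sample)
print(totals, replay_counts)
```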
For starters, I simply counted up the most replayed songs. As expected, this yields very boring results – the list of the top most replayed songs is exactly the same as the most played songs. No surprise here. The most played songs are also the most replayed songs.
Top Most Replayed Songs – (A boring result)
- Robin Thicke — Blurred Lines featuring T.I., Pharrell
- Jay-Z — Holy Grail featuring Justin Timberlake
- Miley Cyrus — We Can’t Stop
- Imagine Dragons — Radioactive
- Macklemore — Can’t Hold Us (feat. Ray Dalton)
To make this more interesting, instead of looking at the absolute number of replays, I adjusted for popularity by looking at the ratio of replays to the total number of plays for each song. This replay ratio tells us what percentage of plays of a song are replays. If we plot the replay ratio vs. the number of fans a song has, the outliers become quite clear. Some songs are replayed at a higher rate than others.
I made an interactive version of this graph. You can mouse over the songs to see what they are and click on the songs to listen to them.
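The replay ratio itself is just replays divided by total plays. A minimal sketch, assuming per-song counts are already available as dictionaries; the `min_plays` cutoff stands in for the low-play filter and its value is an assumption, as are the counts.

```python
def replay_ratios(total_plays, replays, min_plays=10):
    """Return {song: replays / total plays} for songs with enough plays."""
    ratios = {}
    for song, plays in total_plays.items():
        if plays >= min_plays:
            ratios[song] = replays.get(song, 0) / plays
    return ratios


# Made-up counts, for illustration only.
total_plays = {"Noise Track": 100, "Hit Song": 1000, "Rare Song": 2}
replays = {"Noise Track": 91, "Hit Song": 50, "Rare Song": 2}

ranked = sorted(replay_ratios(total_plays, replays).items(),
                key=lambda item: item[1], reverse=True)
for song, ratio in ranked:
    print(f"{ratio:.0%} replays  {song}")
```

Dividing by total plays is what stops the most popular songs from automatically topping the list, and the cutoff keeps a song with two plays from scoring 100%.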
Sorting the results by the replay ratio yields a much more interesting result. It surfaces up a few classes of frequently replayed songs: background noise, children’s music, soft and smooth pop and friday night party music. Here’s the color coded list of the top 20:
Top Replayed songs by percentage
- 91% replays White Noise For Baby Sleep — Ocean Waves
- 86% replays Eric West — Reckless (From Playing for Keeps)
- 86% replays Soundtracks For The Masters — Les Contes D’hoffmann: Barcarole
- 83% replays White Noise For Baby Sleep — Warm Rain
- 83% replays Rain Sounds — Relax Ocean Waves
- 82% replays Dennis Wilson — Friday Night
- 81% replays Sleep — Ocean Waves for Sleep – White Noise
- 74% replays White Noise Sleep Relaxation White Noise Relaxation: Ocean Waves 7hz
- 74% replays Ween — Ocean Man
- 73% replays Children’s Songs Music — Whole World In His Hands
- 71% replays Glee Cast — Friday (Glee Cast Version)
- 63% replays Rain Sounds — Rain On the Window
- 63% replays Rihanna — Cheers (Drink To That)
- 60% replays Group 1 Crew — He Said (feat. Chris August)
- 59% replays Karsten Glück Simone Sommerland — Schlaflied für Anne
- 56% replays Monica — With You
- 54% replays Jessie Ware — Wildest Moments
- 53% replays Tim McGraw — I Like It, I Love It
- 53% replays Rain Sounds — Morning Rain In Sedona
- 52% replays Rain Sounds — Rain Sounds
It is no surprise that the list is dominated by background noise. There’s nothing like ambient ocean waves or rain sounds to help baby go to sleep in the noisy city. A five minute track of ambient white noise may be played dozens of times during every nap. It is not uncommon to find 8 hour long stretches of the same five minute white noise audio track played on auto repeat.
The top most replayed song is Reckless by Eric West from the ‘shamelessly sentimental’ 2012 movie Playing for Keeps (4% rotten). 86% of the time this song is played it is a replay. This is the song that you can’t listen to just once. It is the Lay’s potato chip of music. Beware, if you listen to it, you may be caught in its web and you’ll never be able to escape. Listen at your own risk:
Luckily, most people don’t listen to this song even once. It is only part of the regular listening rotation of a couple hundred listeners. Still, it points to a pattern that we’ll see more of – overly sentimental music has high replay value.
Top Replayed Popular Songs
Perhaps even more interesting is to look at the top most replayed popular songs. We can do this by restricting the songs in the results to those that are by artists that have a significant fan base:
- 31% replays Miley Cyrus — The Climb
- 16% replays August Alsina — I Luv This sh*t featuring Trinidad James
- 15% replays Brad Paisley — Whiskey Lullaby
- 14% replays Tamar Braxton — The One
- 14% replays Chris Brown — Love More
- 14% replays Anna Kendrick — Cups (Pitch Perfect’s “When I’m Gone”)
- 13% replays Avenged Sevenfold — Hail to the King
- 13% replays Jay-Z — Big Pimpin’
- 13% replays Labrinth — Beneath Your Beautiful
- 13% replays Karmin — Acapella
- 12% replays Lana Del Rey — Summertime Sadness [Lana Del Rey vs. Cedric Gervais]
- 12% replays MGMT — Electric Feel
- 12% replays One Direction — Best Song Ever
- 12% replays Big Sean — Beware featuring Lil Wayne, Jhené Aiko
- 12% replays Chris Brown — Don’t Think They Know
- 11% replays Justin Bieber — Boyfriend
- 11% replays Avicii — Wake Me Up
- 11% replays 2 Chainz — Feds Watching featuring Pharrell
- 10% replays Paramore — Still Into You
- 10% replays Alicia Keys — Fire We Make
- 10% replays Lorde — Royals
- 10% replays Miley Cyrus — We Can’t Stop
- 10% replays Ciara — Body Party
- 9% replays Marc Anthony — Vivir Mi Vida
- 9% replays Ellie Goulding — Burn
- 9% replays Fantasia — Without Me
- 9% replays Rich Homie Quan — Type of Way
- 9% replays The Weeknd — Wicked Games (Explicit)
- 9% replays A$AP Ferg — Work REMIX
- 9% replays Jay-Z — Part II (On The Run) featuring Beyoncé
It is hard to believe, but the data doesn’t lie – More than 30% of the time after someone listens to Miley Cyrus’s The Climb they listen to it again right away – proving that there is indeed always going to be another mountain that you are going to need to climb. Miley Cyrus is well represented – her aptly named song We Can’t Stop is the most replayed song of the top ten most popular songs.
Here are the top 30 most replayed popular songs in Spotify and Rdio playlists for you to enjoy, but I’m sure you’ll never get to the end of the playlist, you’ll just get stuck repeating The Best Song Ever or Boyfriend forever.
Here’s the Rdio version of the Top 30 Most Replayed popular songs:
http://www.rdio.com/people/plamere/playlists/5733386/Most_replayed/
Most Manually Replayed
More than once I’ve come back from lunch to find that I left my music player on auto repeat and it has played the last song 20 times while I was away. The song was playing, but no one was listening. It is more interesting to find song replays in which the replay is manually initiated. These are the songs that grabbed the attention of the listener enough to make them interact with their player and actually queue the song up again. We can find manually replayed songs by looking at replay timestamps. Replays generated by autorepeat will have a very regular timestamp delta, while manual replay timestamps will have a more random delta between timestamps.
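One way to sketch that test is to look at the spread of the gaps between consecutive replays. This is a minimal illustration, not the method actually used: the standard deviation check and the two second tolerance are assumptions, and the timestamps are made up.

```python
from statistics import pstdev


def looks_like_autorepeat(timestamps, tolerance_seconds=2.0):
    """Guess whether a run of replays came from auto repeat.

    `timestamps` are the start times (in seconds) of consecutive plays of one
    song by one user. Nearly identical gaps suggest the player was looping.
    """
    deltas = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if len(deltas) < 2:
        return False  # too few replays to judge regularity
    return pstdev(deltas) <= tolerance_seconds


# Made-up timestamps, for illustration only.
auto_run = [0, 215, 430, 645, 860]      # a new play every 215 seconds
manual_run = [0, 230, 520, 741, 1100]   # irregular gaps

print(looks_like_autorepeat(auto_run))
print(looks_like_autorepeat(manual_run))
```

A run of only two plays has a single gap, so there is nothing to compare it against; the sketch treats such runs as not auto repeat.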
Here are the top manually replayed songs:
- Body Party by Ciara
- Still Into You by Paramore
- Tapout featuring Lil Wayne, Birdman, Mack Maine, Nicki Minaj, Future by Rich Gang
- Part II (On The Run) featuring Beyoncé by Jay-Z
- Feds Watching featuring Pharrell by 2 Chainz
- Royals by Lorde
- V.S.O.P. by K. Michelle
- Just Give Me A Reason by Pink
- Don’t Think They Know by Chris Brown
- Wake Me Up by Avicii
There’s an Rdio playlist of these songs: Most Manually Replayed
So what?
Why do we care which songs are most replayed? It’s part of our never ending goal to try to better understand how people interact with music. For instance, recognizing when music is being used in a context like helping the baby go to sleep is important – without taking this context into account, the thousands of plays of Ocean Waves and Warm Rain would dominate the taste profile that we build for that new mom and dad. We want to make sure that when that mom and dad are ready to listen to music, we can recommend something besides white noise.
Looking at replays can help us identify new artists for certain audiences. For instance, parents looking for an alternative to Miley Cyrus for their pre-teen playlists after Miley’s recent VMA performance, may look to an artist like Fifth Harmony. Their song Miss Movin’ On has similar replay statistics to the classic Miley songs:
http://www.rdio.com/artist/Fifth_Harmony/album/Miss_Movin%27_On/track/Miss_Movin%27_On/
Finally, looking at replays is another tool to help us understand the music that people really like. If the neighbors play Rock Lobster 20 times in a row, you can be sure that they really, really like that song. (And despite, or perhaps because of, that night 30 years ago, I like the song too). You should give it a listen, or two…
http://www.rdio.com/artist/The_B-52%27s/album/Rock_Lobster_/_6060-842_(Digital_45)/track/Rock_Lobster/
Using speechiness to make stand-up comedy playlists
Posted by Paul in code, data, The Echo Nest on March 20, 2013
One of the Echo Nest attributes calculated for every song is ‘speechiness’. This is an estimate of the amount of spoken word in a particular track. High values indicate that there’s a good deal of speech in the track, and low values indicate that there is very little speech. This attribute can be used to help create interesting playlists. For example, a music service like Spotify has hundreds of stand-up comedy albums in their collection. If you wanted to use the Echo Nest API to create a playlist of these routines you could create an artist-description playlist with a call like so:
However, this call wouldn’t generate the playlist that you want. Intermixed with stand-up routines would be comedy musical numbers by Tenacious D, The Lonely Island or “Weird Al”. That’s where the ‘speechiness’ attribute comes in. We can add a speechiness filter to our playlist call to give us spoken-word comedy tracks like so:
It is a pretty effective way to generate comedy playlists.
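The filtering idea is easy to sketch offline. The code below is not the Echo Nest API call; it only illustrates filtering a list of tracks by a speechiness threshold. The 0 to 1 scale, the 0.8 cutoff, the track titles and the speechiness values are all assumptions made for the example.

```python
def spoken_word_tracks(tracks, min_speechiness=0.8):
    """Keep only tracks whose speechiness is at or above the threshold.

    Each track is a dict with 'title' and 'speechiness' keys. The 0 to 1
    scale and the default threshold are assumptions for this sketch.
    """
    return [track for track in tracks if track["speechiness"] >= min_speechiness]


# Hypothetical tracks and values, for illustration only.
comedy_tracks = [
    {"title": "Stand-up routine", "speechiness": 0.95},
    {"title": "Comedy musical number", "speechiness": 0.20},
    {"title": "Spoken intro with band", "speechiness": 0.60},
]

for track in spoken_word_tracks(comedy_tracks):
    print(track["title"])
```

Raising the threshold trades recall for precision: fewer musical numbers slip through, but routines with a lot of background music may get dropped.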
I made a demo app that shows this called The Comedy Playlister. It generates a Spotify playlist of comedy routines.
It does a pretty good job of finding comedy. Now I just need some way of filtering out Larry The Cable Guy. The app is online here: The Comedy Playlister. The source is on github.
Girl Talk in a Box
Here’s my music hack from Midem Music Hack Day: Girl Talk in a Box. It continues the theme of apps like Bohemian Rhapsichord and Bangarang Boomerang. It’s an app that lets you play with a song in your browser. You can speed it up and slow it down, you can skip beats, you can play it backwards, beat by beat. You can make it swing. You can make breaks and drops. It’s a lot of fun. With Girl Talk in a Box, you can play with any song you upload, or you can select songs from the Gallery.
My favorite song to play with today is AWOLNATION’s Sail. Have a go. There’s a whole bunch of keyboard controls (that I dedicate to @eelstretching). When you are done with that you can play with the code.
The Stockholm Python User Group
Posted by Paul in code, data, events, The Echo Nest on January 25, 2013
In a lucky coincidence I happened to be in Stockholm yesterday, which allowed me to give a short talk at the Stockholm Python User Group. About 80 or so Pythonistas gathered at Campanja to hear talks about Machine Learning and Python. The first two talks were deep dives into particular aspects of machine learning and Python. My talk was quite a bit lighter. I described the Million Song Data Set and suggested that it would be a good source of data for anyone doing machine learning research. I then went on to show a half a dozen or so demos that were (or could be) built on top of the Million Song Data Set. A few folks at the event asked for links, so here you go:
Core Data: Echo Nest analysis for a million songs
Complementary Data
- Second Hand Songs – 20K cover songs
- MusixMatch – 237K bucket-of-words lyric sets
- Last.fm tags – song level tags for 500K tracks. plus 57 million sim. track pairs
- Echo Nest Taste profile subset – 1M users, 48M user/song/play count triples
Data Mining Listening Data: The Passion Index
Fun with Artist Similarity Graphs: Boil the Frog
Post about In Search of the Click Track and a web app for exploring click tracks
Turning music into silly putty – Echo Nest Remix
[youtube http://www.youtube.com/watch?v=2oofdoS1lDg]
Interactive Music
I really enjoyed giving the talk. The audience was really into the topic and remained engaged throughout. Afterwards I had lots of stimulating discussions about the music tech world. The Stockholm Pythonistas were a very welcoming bunch. Thanks for letting me talk. Here’s a picture I took at the very end of the talk:
Joco vs. Glee
With all the controversy surrounding Glee’s ripoff of Jonathan Coulton’s Baby Got Back I thought I would make a remix that combines the two versions. The remix alternates between the two songs, beat by beat.
[audio http://static.echonest.com.s3.amazonaws.com/audio/combo.mp3]
At first I thought I had a bug and only one of the two songs was making it into the output, but nope, they are both there. To prove it I made another version that alternates the same beat between the two songs – sort of a call and answer. You can hear the subtle differences, and yes, they are very subtle.
[audio http://static.echonest.com.s3.amazonaws.com/audio/combo-t1.mp3]
The audio speaks for itself.
Here’s the code.
[gist https://gist.github.com/4632416]
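For anyone who just wants the shape of the two remixes without opening the gist, here is a minimal sketch of the beat ordering. It is not the gist's code: it treats each song as a plain list of beat labels instead of audio, the function names are hypothetical, and it assumes the two versions line up beat for beat.

```python
def alternate_beats(beats_a, beats_b):
    """First remix: beat 0 from A, beat 1 from B, beat 2 from A, and so on."""
    length = min(len(beats_a), len(beats_b))
    return [(beats_a if i % 2 == 0 else beats_b)[i] for i in range(length)]


def call_and_answer(beats_a, beats_b):
    """Second remix: each beat is played from A, then the same beat from B."""
    remix = []
    for beat_a, beat_b in zip(beats_a, beats_b):
        remix.extend([beat_a, beat_b])
    return remix


# Beat labels stand in for slices of audio.
version_a = ["A0", "A1", "A2", "A3"]
version_b = ["B0", "B1", "B2", "B3"]

print(alternate_beats(version_a, version_b))
print(call_and_answer(version_a, version_b))
```

The first remix is the same length as the originals, while the call and answer version is twice as long because every beat is heard from both recordings.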