Archive for category data
One of the Echo Nest attributes calculated for every song is ‘speechiness’. This is an estimate of the amount of spoken word in a particular track. High values indicate that there’s a good deal of speech in the track, and low values indicate that there is very little speech. This attribute can be used to help create interesting playlists. For example, a music service like Spotify has hundreds of stand-up comedy albums in its collection. If you wanted to use the Echo Nest API to create a playlist of these routines, you could create an artist-description playlist with a call like so:
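A sketch of that first call, built against the v4 playlist/static endpoint (the API key is a placeholder):

```python
# Build the naive artist-description playlist call (a sketch; the
# api_key value is a placeholder, not a real key).
from urllib.parse import urlencode

params = {
    "api_key": "YOUR_API_KEY",
    "type": "artist-description",
    "description": "comedy",
    "results": 20,
}
url = "http://developer.echonest.com/api/v4/playlist/static?" + urlencode(params)
print(url)
```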
However, this call wouldn’t generate the playlist that you want. Intermixed with stand-up routines would be comedy musical numbers by Tenacious D, The Lonely Island or “Weird Al”. That’s where the ‘speechiness’ attribute comes in. We can add a speechiness filter to our playlist call to give us spoken-word comedy tracks like so:
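Something like this, sketched against the same v4 playlist/static endpoint (the `min_speechiness` parameter name and the 0.5 threshold are my assumptions; check them against the docs):

```python
# Same playlist call as before, with an assumed speechiness floor added
# to filter out musical comedy and keep spoken-word tracks.
from urllib.parse import urlencode

params = {
    "api_key": "YOUR_API_KEY",       # placeholder
    "type": "artist-description",
    "description": "comedy",
    "results": 20,
    "min_speechiness": 0.5,          # assumed parameter name and threshold
}
url = "http://developer.echonest.com/api/v4/playlist/static?" + urlencode(params)
print(url)
```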
It is a pretty effective way to generate comedy playlists.
I made a demo app called The Comedy Playlister that shows this in action. It generates a Spotify playlist of comedy routines.
Here’s my music hack from Midem Music Hack Day: Girl Talk in a Box. It continues the theme of apps like Bohemian Rhapsichord and Bangarang Boomerang. It’s an app that lets you play with a song in your browser. You can speed it up and slow it down, you can skip beats, you can play it backwards, beat by beat. You can make it swing. You can make breaks and drops. It’s a lot of fun. With Girl Talk in a Box, you can play with any song you upload, or you can select songs from the Gallery.
My favorite song to play with today is AWOLNATION’s Sail. Have a go. There’s a whole bunch of keyboard controls (that I dedicate to @eelstretching). When you are done with that you can play with the code.
In a lucky coincidence I happened to be in Stockholm yesterday which allowed me to give a short talk at the Stockholm Python User Group. About 80 or so Pythonistas gathered at Campanja to hear talks about Machine Learning and Python. The first two talks were deep dives into particular aspects of machine learning and Python. My talk was quite a bit lighter. I described the Million Song Data Set and suggested that it would be a good source of data for anyone looking for a machine learning research project. I then went on to show a half a dozen or so demos that were (or could be) built on top of the Million Song Data Set. A few folks at the event asked for links, so here you go:
Core Data: Echo Nest analysis for a million songs
- Second Hand Songs – 20K cover songs
- MusixMatch – 237K bucket-of-words lyric sets
- Last.fm tags – song-level tags for 500K tracks, plus 57 million similar-track pairs
- Echo Nest Taste profile subset – 1M users, 48M user/song/play count triples
Data Mining Listening Data: The Passion Index
Fun with Artist Similarity Graphs: Boil the Frog
Turning music into silly putty - Echo Nest Remix
I really enjoyed giving the talk. The audience was really into the topic and remained engaged throughout. Afterwards I had lots of stimulating discussions about the music tech world. The Stockholm Pythonistas were a very welcoming bunch. Thanks for letting me talk. Here’s a picture I took at the very end of the talk:
With all the controversy surrounding Glee’s ripoff of Jonathan Coulton’s Baby Got Back, I thought I would make a remix that combines the two versions. The remix alternates between the two songs, beat by beat.
At first I thought I had a bug and only one of the two songs was making it into the output, but nope, they are both there. To prove it I made another version that alternates the same beat between the two songs – sort of a call and answer. You can hear the subtle differences, and yes, they are very subtle.
The audio speaks for itself.
Here’s the code.
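The heart of both remixes is just two ways of zipping the two songs’ beat lists together. A toy sketch, with plain lists standing in for remix’s beat objects (in the real script they’d come from each track’s `analysis.beats`):

```python
def alternate(beats_a, beats_b):
    """Version 1: beat 1 from song A, beat 2 from song B, beat 3 from A..."""
    return [(beats_a if i % 2 == 0 else beats_b)[i]
            for i in range(min(len(beats_a), len(beats_b)))]

def call_and_answer(beats_a, beats_b):
    """Version 2: each beat position played twice, once from each song."""
    out = []
    for a, b in zip(beats_a, beats_b):
        out.extend([a, b])
    return out

glee = ["g0", "g1", "g2", "g3"]
joco = ["j0", "j1", "j2", "j3"]
print(alternate(glee, joco))        # ['g0', 'j1', 'g2', 'j3']
print(call_and_answer(glee, joco))  # ['g0', 'j0', 'g1', 'j1', 'g2', 'j2', 'g3', 'j3']
```

In the actual remix script the resulting beat list gets rendered back to audio; the zipping logic is the whole trick.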
My Music Hack Day Stockholm hack is ‘Going Undercover‘. This hack uses the extensive cover song data from SecondHandSongs to construct paths between artists by following chains of cover songs. Type in the name of a couple of your favorite artists and Going Undercover will try to find a chain of cover songs that connects the two artists. The resulting playlist will likely contain familiar songs played by artists you’ve never heard of before. Here’s a Going Undercover playlist from Carly Rae Jepsen to Johnny Cash:
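Under the hood, this kind of path finding is essentially breadth-first search over a graph whose nodes are artists and whose edges are cover relationships. A toy sketch (the little graph below is made up for illustration, not the actual Second Hand Songs data):

```python
from collections import deque

def cover_chain(graph, start, goal):
    """Shortest chain of cover relationships from start to goal."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append(path + [neighbor])
    return None  # no chain of covers connects the two artists

# hypothetical cover relationships, for illustration only
toy = {
    "Carly Rae Jepsen": ["Owl City"],
    "Owl City": ["Johnny Cash"],
}
print(cover_chain(toy, "Carly Rae Jepsen", "Johnny Cash"))
```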
For this hack I stole a lot of code from my recent Boil the Frog hack – a good thing, too, since otherwise I would never have finished the hack in time. I spent many hours working to reconcile the Second Hand Songs data with The Echo Nest and Rdio data (Second Hand Songs is not part of Rosetta Stone, so I had to write lots of code to line all the IDs up). Even with leveraging the Boil the Frog code, I had a very late night trying to get all the pieces working (and of course, the bug that I spent 2 hours banging my head on at 3AM was 5 minutes of work after a bit of sleep).
I am pretty pleased with the results of the hack. It is fun to build a path between a couple of artists and listen to a really interesting mix of music. Cover songs are great for music discovery, they give you something familiar to hold on to while listening to a new artist.
It’s the time of the year when music critics make their lists of artists that we passed away or bands that broke up in the last twelve months. To help them write their retrospectives I’ve put together two lists: One of well-known musicians that died during 2012, and one of well-known bands that called it quits during the year.
I made the list using the Echo Nest Artist Search API, restricting the results to artists with an end year of 2012.
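Such a search might look like this (the `artist_end_year_after`/`artist_end_year_before` parameter names and the familiarity sort are my assumptions about the API; verify them against the docs):

```python
# Sketch of an artist search restricted to artists whose end year is
# 2012, sorted so the best-known come first. API key is a placeholder.
from urllib.parse import urlencode

params = {
    "api_key": "YOUR_API_KEY",
    "artist_end_year_after": 2012,   # ended no earlier than 2012...
    "artist_end_year_before": 2013,  # ...and no later
    "sort": "familiarity-desc",      # well-known artists first
    "results": 100,
}
url = "http://developer.echonest.com/api/v4/artist/search?" + urlencode(params)
print(url)
```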
Lots of great music left us in 2012. See the lists here: 2012 Music Memoriam
I spent Saturday at the New York Times attending the TimesOpen Hack Day. It was great fun, with lots of really smart folks creating neat (and not always musical) stuff. Some great shots here:
Check out the wrap up at the Times: TimesOpen 2012 Hack Day Wrap-Up
The Infinite Jukebox generates plots of songs in which the most similar beats are connected by arcs. I call these plots cantograms. For instance, below is a labeled cantogram for the song Rolling in the Deep by Adele. The song starts at 3:00 on the circle and proceeds clockwise, beat by beat completely around the circle. I’ve labeled the plot so you can see how it aligns with the music. There’s an intro, a first verse, a chorus, a second verse, etc. until the outro and the end of the song.
One thing that’s interesting is that most of the beat-similarity connections occur between the beats in the three instances of the chorus. This certainly makes intuitive sense. The verses have different lyrics, so for the most part they won’t be too similar to each other, but the choruses have the same lyrics, the same harmony, the same instrumentation. For all we know, they may even be exactly the same audio – that perfect performance, cut and pasted three times by the audio engineer to make the best-sounding version of the song.
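For the curious, the arcs boil down to a simple distance threshold between per-beat feature vectors (in the Jukebox these are built from the Echo Nest’s per-segment timbre, pitch and loudness data; the vectors and threshold below are toy values for illustration):

```python
import math

def distance(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def similar_beat_pairs(beat_features, threshold):
    """Connect every pair of beats whose feature distance is under threshold."""
    pairs = []
    for i in range(len(beat_features)):
        for j in range(i + 1, len(beat_features)):
            if distance(beat_features[i], beat_features[j]) < threshold:
                pairs.append((i, j))
    return pairs

# three near-identical "chorus" beats (0, 1, 3) and one odd "verse" beat (2)
beats = [[0.0, 1.0], [0.1, 1.0], [5.0, 3.0], [0.05, 1.1]]
print(similar_beat_pairs(beats, threshold=0.5))  # [(0, 1), (0, 3), (1, 3)]
```

With real data, the dense clusters of qualifying pairs are exactly the chorus-to-chorus arcs in the plots above.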
Now take a look at the cantogram for another popular song. The plot below shows the beat similarities for the song Tik Tok by Ke$ha. What strikes me the most about this plot is how similar it looks to the plot for Rolling in the Deep. It has the characteristic longer intro+first verse, some minor inter-verse similarities and the very strong similarities between the three choruses.
As we look at more plots for modern pop music we see the same pattern over and over again. In this cantogram for Lady Gaga’s Paparazzi we again see the same pattern.
We see it in the plot for Justin Bieber’s Baby:
Taylor Swift’s Fearless has two verses before the first chorus, shifting it further around the circle, but other than that the pattern holds:
Now compare and contrast the pop cantograms with those from other styles of music. First up is Led Zeppelin’s Stairway to Heaven. There’s no discernible repeating chorus or global song repetition; the only real long-arc repetition occurs during the guitar solo in the last quarter of the song.
Here’s another style of music: Deadmau5’s Raise Your Weapon. This is electronica (and maybe some dubstep). Clearly from the cantogram we can see that it is not a traditional pop song. There’s very little long-arc repetition, with the densest cluster being the final dubstep break.
Dave Brubeck’s Take Five shows lots of short-term repetition during the first half of the song, while the second half, with Joe Morello’s drum solo, follows a very different pattern.
Green Grass and High Tides has yet another pattern – no ‘three choruses and out’ here. (By the way, the final guitar solo is well worth listening to in the Infinite Jukebox. It is the guitar solo that never ends.)
The progressive rock anthem Roundabout doesn’t have the Pop Pattern.
Nor does Yo-Yo Ma’s performance of the Cello suite No. 1.
Looking at the pop plots, one begins to understand that pop music really could be made in a factory. Each song is cut from the same mold. In fact, one of the most successful pop songs in recent years was produced by a label with ‘Factory’ in its name. Looking at Rebecca Black’s Friday we can tell right away that it is a pop song:
Compare that plot to this year’s YouTube breakout, Thanksgiving by Nicole Westbrook (another Ark Music Factory assembly):
The plot has all the makings of the standard pop song for the 2010s.
In the music information retrieval research community there has been quite a bit of research into algorithmically extracting song structure, and visualizations are often part of this work. If you are interested in learning more about this research, I suggest looking at some of the publications by Meinard Müller and Craig Sapp.
Of course, not every pop song will follow the pattern that I’ve shown here. Nevertheless, I find it interesting that this very simple visualization is able to show us something about the structure of the modern pop song, and how similar this structure is across many of the top pop songs.
Update: since publishing this post I’ve updated the layout algorithm in the Infinite Jukebox so that songs start and end at 12 noon and not 3PM, so the plots you see in this post are rotated 90 degrees clockwise from what you would see in the jukebox.
More weekend programming on ‘Hear Here’, the mobile version of the Road Trip mixtape. I’ve added a familiarity filter so you can choose whether you want to listen to only the most recognizable artists in a region, or listen to everything. Screenshots:
I’ve taken the app on a few road tests. There were a few crashes (of the app kind, not the car kind, luckily). Stability and responsiveness with unreliable networking is a fun challenge. I’m taking a 1000 mile roadtrip in a week or two, hopefully will have it 100% ready by then.
Over the last few years I’ve made a number of 1,000+ mile road trips as I shuttle kids to colleges in far away places. Listening to music has always been a big part of these trips. I thought it’d be nice to be able to listen to music by local artists when driving through a particular region, so I spent a few weekends creating an app called Roadtrip Mixtape that populates a roadtrip playlist with artists that are from the region you are driving through.
To create a playlist, type your starting and ending cities for your roadtrip. The app will use Google’s directions to plan the best route between the two cities. The route will then be broken into 15 minute playlist legs. Each playlist leg is populated by 15 minutes worth of music by nearby artists.
The beginning of each leg is represented by a green ball. You can click on the ball to see what artists will be played during that leg. The app plays music via Rdio using their nifty Web Player API. If you are an Rdio subscriber you can listen to full streams, and if not you get to hear 30 second samples. One bit of interesting info that I show for a route is the ‘Avg distance’. This shows the average distance to each artist on the roadtrip. If this number is low, you are traveling through a musically dense part of the world, and if it is high, you are traveling in a sparse musical region. For instance, for a roadtrip from Boston to New York the average artist distance is 3 miles (about as low as it goes). However, if you are traveling from Omaha to Denver, the average artist distance is 81 miles.
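The ‘Avg distance’ number is essentially the mean great-circle distance from each chosen artist’s hometown to the nearest point on the route. A sketch of that computation using the haversine formula (the helper names are mine, not the app’s):

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in miles."""
    r = 3959.0  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def average_artist_distance(route_points, artist_locations):
    """For each artist, distance to the nearest route point; then average."""
    total = 0.0
    for alat, alon in artist_locations:
        total += min(haversine_miles(alat, alon, rlat, rlon)
                     for rlat, rlon in route_points)
    return total / len(artist_locations)

# Boston to New York is roughly 190 miles as the crow flies
print(haversine_miles(42.36, -71.06, 40.71, -74.01))
```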
You can also click anywhere on the map to see and listen to nearby artists. For example, if you click on Shreveport you’ll see something like this:
When you click the ‘Hear here’ button, you’ll get a playlist of the hotttest artists from Shreveport.
Listening to nearby artists is quite fun. There’s potential for some extreme sonic whiplash as you drive near a brutal death metal band and then a pop vocalist from the 1950s.
The Technical Bits
To build the app I used the new artist location data from The Echo Nest. This (still in beta) feature allows you to retrieve the location of any artist. Here’s an example API call that retrieves the artist location for Radiohead:
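A sketch of that call against the v4 artist/profile endpoint with the artist_location bucket (the API key is a placeholder, and the response snippet is illustrative):

```python
# Build the artist/profile request that asks for Radiohead's location.
from urllib.parse import urlencode

params = {
    "api_key": "YOUR_API_KEY",
    "name": "Radiohead",
    "bucket": "artist_location",
}
url = "http://developer.echonest.com/api/v4/artist/profile?" + urlencode(params)
print(url)
# The JSON response includes something like:
#   "artist_location": {"location": "Oxford, England", ...}
```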
For this app, I collected the locations for the top 100,000 or so most popular artists in the Rdio catalog. These artists were from about 15,000 different cities. I used geopy along with the Yahoo Placefinder geocoder to find the latitude and longitude for each of these cities. For the mapping and route finding, I used version 3.9 of the Google maps API. For music playback I used the Rdio Web Playback API. With the tight integration between the Echo Nest and Rdio ID spaces it was easy to go from a geolocated Echo Nest artist to a list of Rdio track IDs for songs by that artist.
The Bad Bits
As a web app that relies on the flash-based Rdio web player, Roadtrip Mixtape is not really a mobile app. It won’t play music on an iPhone or iPad, so the best way to actually use this app on the road is probably to bring along your tethered laptop. Not the best user experience. Thus, my next weekend project will be to learn a little bit of iOS programming and make a version of this app that runs on an iPhone or iPad. Stay tuned for the next version.