Posts Tagged fanalytics
Over the last 15 years or so, music listening has moved online. Now instead of putting a record on the turntable or a CD in the player, we fire up a music application like iTunes, Pandora or Spotify to listen to music. One interesting side-effect of this transition to online music is that there is now a lot of data about music listening behavior. Sites like Last.fm can keep track of every song that you listen to and offer you all sorts of statistics about your play history. Applications like iTunes can phone home your detailed play history. Listeners leave footprints on P2P networks, in search logs and every time they hit the play button on hundreds of music sites. People are blogging, tweeting and IMing about the concerts they attend, and the songs that they love (and hate). Every day, gigabytes of data about our listening habits are generated on the web.
With this new data come the entrepreneurs who sample the data, churn though it and offer it to those who are trying to figure out how best to market new music. Companies like Big Champagne, Bandmetrics, Musicmetric, Next Big Sound and The Echo Nest among others offer windows into this vast set of music data. However, there’s still a gap in our understanding of how to interpret this data. Yes, we have vast amounts data about music listening on the web, but that doesn’t mean we know how to interpret this data- or how to tie it to the music marketplace. How much is a track play on a computer in London related to a sale of that track in a traditional record store in Iowa? How do searches on a P2P network for a new album relate to its chart position? Is a track illegally made available for free on a music blog hurting or helping music sales? How much does a twitter mention of my song matter? There are many unanswered questions about how online music activity correlates with the more traditional ways of measuring artist success such as music sales and chart position. These are important questions to ask, yet they have been impossible to answer because the people who have the new data (data from the online music world) generally don’t talk to the people who own the old data and vice versa.
We think that understanding this relationship is key and so we are working to answer these questions via a research consortium between The Echo Nest, Yahoo Research and UMG unit Island Def Jam. In this consortium, three key elements are being brought together. Island Def Jam is contributing deep and detailed sales data for its music properties – sales data that is not usually released to the public, Yahoo! Research brings detailed search data (with millions and millions of queries for music) along with deep expertise in analyzing and understanding what search can predict while The Echo Nest brings our understanding of Intenet music activity such as playcount data, friend and fan counts, blog mentions, reviews, mp3 posts, p2p activity as well as second generation metrics as sentiment analysis, audio feature analysis and listener demographics . With the traditional sales data, combined with the online music activity and search data the consortium hopes to develop a predictive model for music by discovering correlations between Internet music activity and market reaction. With this model, we would be able to quantify the relative importance of a good review on a popular music website in terms of its predicted effect on sales or popularity. We would be able to pinpoint and rank various influential areas and platforms on the music web that artists should spend more of their time and energy to reach a bigger fanbase. Combining anonymously observable metrics with internal sales and trend data will give keen insight into the true effects of the internet music world.
There are some big brains working on building this model. Echo Nest co-founder Brian Whitman (He’s a Doctor!) and the team from Yahoo! Research that authored the paper “What Can Search Predict” which looks closely at how to use query volume to forecast openining box-office revenue for feature films. The Yahoo! research team includes a stellar lineup: Yahoo! Principal research scientist Duncan Watts whose research on the early-rater effect is a must read for anyone interested in recommendation and discovery; Yahoo! Principal Research Scientist David Pennock who focuses on algorithmic economics (be sure to read Greg Linden’s take on Seung-Taek Park and David’s paper Applying Collaborative Filtering Techniques to Movie Search for Better Ranking and Browsing); Jake Hoffman, expert in machine learning and data-driven modeling of complex systems; Research Scientist Sharad Goel (see his interesting paper on Anatomy of the Long Tail: Ordinary People with Extraordinary Tastes) and Research Scientist Sébastien Lahaie, expert in marketplace design, reputation systems (I’ve just added his paper Applying Learning Algorithms to Preference Elicitation to my reading list). This is a top-notch team
I look forward to the day when we have a predictive model for music that will help us understand how this:
At the core of just about everything we do here at the Echo Nest is what we call “The Knowledge”. This is big pile of data that represents everything we know about music. To build ‘The Knowledge’ we crawl the web looking for every bit of info about music. We find music blogs, artist news, album reviews, biographies, audio, images, videos, fan activity and on and on. This gives us a huge set of raw data that represents the global conversation about music. Next, we apply a set of statistical and natural language processing algorithms to this raw data to give us a deeper understanding of what all this data means. For instance, one fundamental algorithm tells us whether a particular web document is about a particular artist. This might be easy for an artist with a distinctive name like Metallica, but may not be so easy for The Rolling Stones (is it the band or the magazine?), and can be hard for bands with ambiguous names like Air and Yes, and can be extremely difficult for artists such as Torsten Pröfrock who tragically has chosen the stage name ‘Various Artists‘ (what was he thinking?). Another algorithm that we apply to music reviews is sentiment analysis. This helps us decide whether or not a reviewer has a positive opinion about the music being reviewed. We can take a review like this one written by Jennie, my 14 year old daughter, and learn whether or not she likes the new album by Beyoncé and whether or not she tends to like R&B and pop music.
In addition to analyzing what people are writing about music, we also try to extract as much meaning as we can from the music itself. We apply digital signal processing and machine learning algorithms to audio allowing us to extract information such as tempo, key, song structure, loudness, energy, harmonic content and timbre from every song.
Traditionally, “The Knowledge” has helped us build tools to help music fans explore and discover music – using all this data helps us predict what type of music a listener might like. For the last year, we’ve offered artist similarity and music recommendation web services around this data. But now we are going to turn this all upside down. Instead of using this data to help listeners find new music, we are going to use this data to help artists find new fans. That is what Fanalytics is all about.
For example, music blogs and review sites are becoming increasingly important way for an artist to build buzz around a new release. However, there are thousands of music blogs – each with its own specialty. This becomes a problem for the artist. How can she decide which blogs she should target for promoting her new album? This is one of the problems that Fanalytics tries to solve. With ‘The Knowledge’ we know quite a bit about thousands of music blogs. We know the reputation and the reach of a blog. We know what types of a music a particular author tends to write about, and we know what kinds of music they tend to like. With this knowledge we can make what is essentially a recommendation engine for music promotion. For any artist we can recommend a set blogs and writers that would most likely be interested in writing about the artist.
In addition to this recommendation engine tailored to music promotion, Fanalytics also provides a set of analytics tools that use ‘The Knowledge’ to help artists better understand their audience. For instance, an artist can track everything that is being said online about them – every blog post, news item, music review, video, as well as their online ‘buzz’ – a quantitative measure of how much attention the artist is receiving from reviewers, bloggers, fans, etc.
We have just launched Fanalytics, but apparently we are already seeing strong interest from the labels. (According the press release Interscope, Independent Label Group (WMG), RCA Music Group (Sony) and The Orchard are already on board). That’s not too surprising, the labels are looking for new ways to reach out to fans. As we continue to grow “The Knowledge” here at the Echo Nest I’m sure we will be creating more interesting tools like Fanalytics that are built around the data .