At the core of just about everything we do here at the Echo Nest is what we call “The Knowledge”. This is big pile of data that represents everything we know about music. To build ‘The Knowledge’ we crawl the web looking for every bit of info about music. We find music blogs, artist news, album reviews, biographies, audio, images, videos, fan activity and on and on. This gives us a huge set of raw data that represents the global conversation about music. Next, we apply a set of statistical and natural language processing algorithms to this raw data to give us a deeper understanding of what all this data means. For instance, one fundamental algorithm tells us whether a particular web document is about a particular artist. This might be easy for an artist with a distinctive name like Metallica, but may not be so easy for The Rolling Stones (is it the band or the magazine?), and can be hard for bands with ambiguous names like Air and Yes, and can be extremely difficult for artists such as Torsten Pröfrock who tragically has chosen the stage name ‘Various Artists‘ (what was he thinking?). Another algorithm that we apply to music reviews is sentiment analysis. This helps us decide whether or not a reviewer has a positive opinion about the music being reviewed. We can take a review like this one written by Jennie, my 14 year old daughter, and learn whether or not she likes the new album by Beyoncé and whether or not she tends to like R&B and pop music.
In addition to analyzing what people are writing about music, we also try to extract as much meaning as we can from the music itself. We apply digital signal processing and machine learning algorithms to audio allowing us to extract information such as tempo, key, song structure, loudness, energy, harmonic content and timbre from every song.
Traditionally, “The Knowledge” has helped us build tools to help music fans explore and discover music – using all this data helps us predict what type of music a listener might like. For the last year, we’ve offered artist similarity and music recommendation web services around this data. But now we are going to turn this all upside down. Instead of using this data to help listeners find new music, we are going to use this data to help artists find new fans. That is what Fanalytics is all about.
For example, music blogs and review sites are becoming increasingly important way for an artist to build buzz around a new release. However, there are thousands of music blogs – each with its own specialty. This becomes a problem for the artist. How can she decide which blogs she should target for promoting her new album? This is one of the problems that Fanalytics tries to solve. With ‘The Knowledge’ we know quite a bit about thousands of music blogs. We know the reputation and the reach of a blog. We know what types of a music a particular author tends to write about, and we know what kinds of music they tend to like. With this knowledge we can make what is essentially a recommendation engine for music promotion. For any artist we can recommend a set blogs and writers that would most likely be interested in writing about the artist.
In addition to this recommendation engine tailored to music promotion, Fanalytics also provides a set of analytics tools that use ‘The Knowledge’ to help artists better understand their audience. For instance, an artist can track everything that is being said online about them – every blog post, news item, music review, video, as well as their online ‘buzz’ – a quantitative measure of how much attention the artist is receiving from reviewers, bloggers, fans, etc.
We have just launched Fanalytics, but apparently we are already seeing strong interest from the labels. (According the press release Interscope, Independent Label Group (WMG), RCA Music Group (Sony) and The Orchard are already on board). That’s not too surprising, the labels are looking for new ways to reach out to fans. As we continue to grow “The Knowledge” here at the Echo Nest I’m sure we will be creating more interesting tools like Fanalytics that are built around the data .