Posts Tagged charts

Spying on how we read

I’ve been reading all my books lately using Kindle for iPhone.  It is a great way to read – and having a library of books in my pocket at all times means I’m never without a book.  One feature of the Kindle software is called Whispersync.  It keeps track of where you are in a book so that if you switch devices (from an iPhone to a Kindle or an iPad or desktop), you can pick up exactly where you left off.  Kindle also stores any bookmarks, notes, highlights, or similar markings you make in the cloud so they can be shared across devices.  Whispersync is a useful feature for readers, but it is also a goldmine of data for Amazon.  With Whispersync data from millions of Kindle readers, Amazon can learn not just what we are reading but how we are reading.  In brick-and-mortar bookstore days, the only thing a bookseller, author or publisher could really know about a book was how many copies it sold.  But now with Whispersync, Amazon can learn all sorts of things about how we are reading.  With the insights that they gain from this data, they will, no doubt, find better ways to help people find the books they like to read.

I hope Amazon aggregates its Whispersync data and gives us some Last.fm-style charts about how people are reading.  Some charts I’d like to see:

  • Most Abandoned - the books and/or authors that are most frequently left unfinished.  What book is the most abandoned book of all time? (My money is on ‘A Brief History of Time’) A related metric – for any particular book where is it most frequently abandoned?  (I’ve heard of dozens of people who never got past ‘The Council of Elrond’ chapter in LOTR).
  • Pageturner – the top books ordered by average number of words read per reading session.  Does the average Harry Potter fan read more of the book in one sitting than the average Twilight fan?
  • Burning the midnight oil – books that keep people up late at night.
  • Read Speed – which books/authors/genres have the lowest words-per-minute average reading rate?  Do readers of Glenn Beck read faster or slower than readers of Jon Stewart?
  • Most Re-read – which books are read over and over again?  A related metric – which are the most re-read passages?  Is it when Frodo claims the ring,  or when Bella almost gets hit by a car?
  • Mystery cheats – which books have their last chapter read before other chapters.
  • Valuable reference – which books are not read in order, but are visited very frequently? (I’ve not read my Python in a nutshell book from cover to cover, but I visit it almost every day).
  • Biggest Slogs – the books that take the longest to read.
  • Back to the start – Books that are most frequently re-read immediately after they are finished.
  • Page shufflers – books that most often send their readers to the glossary, dictionary, map or the elaborate family tree.  (xkcd offers some insights)
  • Trophy Books – books that are most frequently purchased, but never actually read.
  • Dishonest rater – books that are most frequently rated highly by readers who never actually finished reading the book.
  • Most efficient language – the average time to read books by language.  Do native Italians read ‘Il nome della rosa’ faster than native English speakers can read ‘The Name of the Rose’?
  • Most attempts – which books are restarted most frequently?  (It took me 4 attempts to get through Cryptonomicon, but when I did I really enjoyed it).
  • A turn for the worse – which books are most frequently abandoned in the last third of the book?  These are the books that go bad.
  • Never at night – books that are read less in the dark than others.
  • Entertainment value – the books with the lowest overall cost per hour of reading (including all re-reads)
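Here is a rough sketch of how a chart like Most Abandoned might be computed. Amazon has not published what Whispersync actually records, so the record shape (reader, book, furthest fraction read), the sample rows and the 95% "finished" cutoff are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical per-reader records: (reader_id, book, furthest_fraction_read).
# The real Whispersync data format is not public; this shape is assumed.
progress = [
    ("r1", "A Brief History of Time", 0.20),
    ("r2", "A Brief History of Time", 0.35),
    ("r3", "A Brief History of Time", 1.00),
    ("r1", "Cryptonomicon", 1.00),
    ("r2", "Cryptonomicon", 0.10),
    ("r3", "Cryptonomicon", 1.00),
]

FINISHED = 0.95  # assumed cutoff: reaching 95% of the book counts as finishing


def most_abandoned(records, finished=FINISHED):
    """Rank books by the share of readers who never reached `finished`."""
    started = defaultdict(int)
    abandoned = defaultdict(int)
    for _reader, book, fraction in records:
        started[book] += 1
        if fraction < finished:
            abandoned[book] += 1
    rates = {book: abandoned[book] / started[book] for book in started}
    # Highest abandonment rate first.
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


chart = most_abandoned(progress)
for book, rate in chart:
    print(f"{book}: {rate:.0%} abandoned")
```

Most of the other charts in the list would be variations on the same grouping, with session length, time of day or re-read counts in place of the furthest position.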

Whispersync is to books as the audioscrobbler is to music.  It is an implicit way to track what you are really paying attention to.  The data from Whispersync will give us new insights into how people really read books.  A chart that shows that the most abandoned author is James Patterson may steer readers away from Patterson and toward books by better authors.  I’d rather not turn to the New York Times Best Seller list to decide what to read.  I want to see the Amazon Most Frequently Finished book list instead.

Normalisr – Time-based charts of your last.fm data

Worth checking out: Normalisr

TechCrunch rickrolls the Hype Machine

Last week, on the Hype Machine blog, Anthony indicated his increasing frustration at how easily charts could be manipulated – Anthony wanted a better way, one that was transparent, and gave more influence to the influential.  Anthony’s solution was to create a Twitter chart that is based on the twittering activity of Hype Machine songs.  In this new chart, Twitterers with more followers have more influence than those with few.

A number of commenters on Anthony’s blog pointed out how it would be easy for a single very popular twitter user to influence the charts.  And that is exactly what Erick Schonfeld of TechCrunch did. Erick used the power of TechCrunch for evil.

Evidence of Erick Schonfeld's rickroll

With one tweet from the TechCrunch twitter account (with its nearly 1 million-person reach) he was able to put Rick Astley’s Never Gonna Give You Up at the top of the Hype Machine Twitter chart.  Erick writes: “The Hype Machine’s formula is flawed. No single person should be able to affect the rankings so easily.”

It’s arguable whether or not this is a dishonest manipulation of the charts.  TechCrunch really does have a reach of nearly 1 million people – and so by tweeting Rick Astley they are potentially exposing that whole audience to this song.  However, in reality, people don’t read TechCrunch for music recommendations – TechCrunch is just not a music tastemaker (sorry Erick).  A tweet by TechCrunch should count for much less than a tweet by indie music guide Pitchfork.

Update: Note that spammers are now starting to recognize the twitterverse as a place that they can target.  If you have $27, you can get the twittertrafficmachine to get you 20K followers in a month:

pay-for-followers

Anthony should adjust how he scores a tweet to include not only the reach of the tweet but also the music reputation of the source.  It is not as easy to determine the music reputation of a source as its number of followers, but it is much more important.  Some indicators that a tweet has real influence are whether people actually click on the link and listen to the song, and whether the poster actually listens to music, especially new music, before it gets popular.
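To make the idea concrete, here is a minimal sketch of a score that combines reach with music reputation. This is not the Hype Machine's formula: the log damping, the equal weighting of the two reputation signals and all of the sample numbers are assumptions chosen only to illustrate the shape of such a score.

```python
import math


def music_reputation(click_through_rate, early_listen_share):
    """Assumed reputation in [0, 1]: the mean of two hypothetical signals.

    click_through_rate: share of followers who click through and listen.
    early_listen_share: share of the poster's listening that is new music
    heard before it became popular.
    """
    return (click_through_rate + early_listen_share) / 2


def tweet_score(followers, click_through_rate, early_listen_share):
    """Reach (log-damped follower count) scaled by music reputation."""
    reach = math.log10(1 + followers)
    return reach * music_reputation(click_through_rate, early_listen_share)


# Illustrative inputs only: a huge general-interest account versus a
# smaller account with a strong music following.
mass_audience = tweet_score(1_000_000, click_through_rate=0.01, early_listen_share=0.0)
tastemaker = tweet_score(100_000, click_through_rate=0.20, early_listen_share=0.8)
print(f"mass audience: {mass_audience:.2f}, tastemaker: {tastemaker:.2f}")
```

With a score of this kind, a single tweet from an account with a million followers and no music credibility moves the chart far less than a tweet from a smaller account that listeners actually follow for music.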

I suspect Anthony will be tweaking his scoring algorithms soon to make the charts better reflect what real music listeners are listening to, not just what popular people are listening to.

Update: Anthony has responded in the comments.

The Billboard API

Billboard, the venerable maintainer of the Billboard Hot 100 and a bevy of other music charts, is now making this data available via an API.  The API “puts the entire rich history of the Billboard charts at your fingertips to sample and mix into your web pages and applications.”  The API is in public beta, but already it is supplying some really good information.

The first service that they’ve rolled out is the ‘Chart’ service, which lets you search and retrieve Billboard chart information.

For example, to find all appearances of The Beatles  on any of the Billboard charts during the first week of June in 1964, you could make the call:

http://api.billboard.com/apisvc/chart/v1/list?artist=The+Beatles&sdate=1964-06-01&edate=1964-06-08&api_key=your_key

With results:

<?xml version='1.0' encoding='UTF-8'?>
<searchResults firstPosition='1' totalReturned='6' totalRecords='6'>
    <chartItem id='8807769' rank='2' exrank='0'>
        <chart id='3070264'>
            <name>The Billboard Hot 100</name>
            <issueDate>1964-06-06</issueDate>
            <specId>379</specId>
            <specType>Singles</specType>
        </chart>
        <artist>The Beatles</artist>
        <writer />
        <song>Love Me Do</song>
        <producer />
        <catalogNo>9008</catalogNo>
        <promotion />
        <distribution>Tollie</distribution>
        <peak>1</peak>
        <weeksOn>14</weeksOn>
    </chartItem>
    <chartItem id='8715479' rank='4' exrank='0'>
        <chart id='3068613'>
            <name>The Billboard 200</name>
            <issueDate>1964-06-06</issueDate>
            <specId>305</specId>
            <specType>Albums</specType>
        </chart>
        <artist>The Beatles</artist>
        <writer />
        <song>The Beatles' Second Album</song>
        <producer />
        <catalogNo>2080</catalogNo>
        <promotion />
        <distribution>Capitol</distribution>
        <peak>1</peak>
        <weeksOn>55</weeksOn>
    </chartItem>
    <chartItem id='8715481' rank='6' exrank='0'>
        <chart id='3068613'>
            <name>The Billboard 200</name>
            <issueDate>1964-06-06</issueDate>
            <specId>305</specId>
            <specType>Albums</specType>
        </chart>
        <artist>The Beatles</artist>
        <writer />
        <song>Meet The Beatles!</song>
        <producer />
        <catalogNo>2047</catalogNo>
        <promotion />
        <distribution>Capitol</distribution>
        <peak>1</peak>
        <weeksOn>71</weeksOn>
    </chartItem>
    <chartItem id='8807803' rank='36' exrank='0'>
        <chart id='3070264'>
            <name>The Billboard Hot 100</name>
            <issueDate>1964-06-06</issueDate>
            <specId>379</specId>
            <specType>Singles</specType>
        </chart>
        <artist>The Beatles</artist>
        <writer />
        <song>Do You Want To Know A Secret</song>
        <producer />
        <catalogNo>587</catalogNo>
        <promotion />
        <distribution>Vee-Jay</distribution>
        <peak>2</peak>
        <weeksOn>11</weeksOn>
    </chartItem>
    <chartItem id='8715486' rank='11' exrank='0'>
        <chart id='3068613'>
            <name>The Billboard 200</name>
            <issueDate>1964-06-06</issueDate>
            <specId>305</specId>
            <specType>Albums</specType>
        </chart>
        <artist>The Beatles</artist>
        <writer />
        <song>Introducing...The Beatles</song>
        <producer />
        <catalogNo>1062</catalogNo>
        <promotion />
        <distribution>Vee-Jay</distribution>
        <peak>2</peak>
        <weeksOn>49</weeksOn>
    </chartItem>
    <chartItem id='8807777' rank='10' exrank='0'>
        <chart id='3070264'>
            <name>The Billboard Hot 100</name>
            <issueDate>1964-06-06</issueDate>
            <specId>379</specId>
            <specType>Singles</specType>
        </chart>
        <artist>The Beatles</artist>
        <writer />
        <song>P.S. I Love You</song>
        <producer />
        <catalogNo>9008</catalogNo>
        <promotion />
        <distribution>Tollie</distribution>
        <peak>10</peak>
        <weeksOn>8</weeksOn>
    </chartItem>
</searchResults>
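A response like this is easy to pick apart with a standard XML parser. The sketch below uses Python's built-in ElementTree on a trimmed copy of the first two items above; the element and attribute names come from the response, while the function name and the dictionary keys are my own choices.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response above (two items, fewer fields).
SAMPLE = """
<searchResults firstPosition='1' totalReturned='2' totalRecords='2'>
    <chartItem id='8715479' rank='4' exrank='0'>
        <chart id='3068613'>
            <name>The Billboard 200</name>
            <issueDate>1964-06-06</issueDate>
        </chart>
        <artist>The Beatles</artist>
        <song>The Beatles' Second Album</song>
        <peak>1</peak>
        <weeksOn>55</weeksOn>
    </chartItem>
    <chartItem id='8807769' rank='2' exrank='0'>
        <chart id='3070264'>
            <name>The Billboard Hot 100</name>
            <issueDate>1964-06-06</issueDate>
        </chart>
        <artist>The Beatles</artist>
        <song>Love Me Do</song>
        <peak>1</peak>
        <weeksOn>14</weeksOn>
    </chartItem>
</searchResults>
"""


def parse_chart_items(xml_text):
    """Turn a searchResults document into a list of dicts, best rank first."""
    root = ET.fromstring(xml_text.strip())
    items = []
    for item in root.findall("chartItem"):
        items.append({
            "chart": item.findtext("chart/name"),
            "issue_date": item.findtext("chart/issueDate"),
            "artist": item.findtext("artist"),
            "title": item.findtext("song"),  # <song> also holds album titles
            "rank": int(item.get("rank")),
            "peak": int(item.findtext("peak")),
            "weeks_on": int(item.findtext("weeksOn")),
        })
    return sorted(items, key=lambda entry: entry["rank"])


items = parse_chart_items(SAMPLE)
for entry in items:
    print(f'#{entry["rank"]} on {entry["chart"]}: {entry["title"]}')
```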

You can restrict searches to various charts (Hot Country, Pop 100, Top Latin, etc.), and you can search by artist and/or song name over a range of dates.  (Unfortunately, but not too surprisingly, the data for the current month is not available in the searches.)
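A small helper can build these search URLs. This sketch uses only the parameters that appear in the example call earlier (artist, sdate, edate and api_key); the function name is made up, and the parameters for searching by song or restricting to one chart are left out because they are not shown here.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.billboard.com/apisvc/chart/v1/list"


def chart_search_url(artist, sdate, edate, api_key="your_key"):
    """Build a chart search URL for an artist over a date range (YYYY-MM-DD)."""
    params = {
        "artist": artist,
        "sdate": sdate,
        "edate": edate,
        "api_key": api_key,
    }
    # urlencode escapes spaces as '+', matching the example call.
    return BASE_URL + "?" + urlencode(params)


url = chart_search_url("The Beatles", "1964-06-01", "1964-06-08")
print(url)
```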

The terms of service seem pretty reasonable: you are allowed to make 1,500 API calls per day at up to 2 queries per second.  Commercial use seems to be allowed (but I’m not a lawyer, so you should check for yourself).  However, according to the terms, you are not allowed to store any of the Billboard data.  The services are well documented, support JSON as well as XML output, and query times are fast.
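Staying inside those limits takes only a little client-side bookkeeping. The following is a sketch of a throttle that enforces both limits, not anything Billboard provides. It treats a "day" as a rolling 24 hours from when the throttle was created, which is an assumption, since the terms as summarized here don't say when the daily count resets.

```python
import time


class BillboardThrottle:
    """Client-side guard for 2 queries per second and 1,500 calls per day."""

    def __init__(self, per_second=2, per_day=1500,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / per_second
        self.per_day = per_day
        self.clock = clock      # injectable so the class can be tested
        self.sleep = sleep
        self.day_start = clock()
        self.calls_today = 0
        self.last_call = None

    def wait(self):
        """Call before each request; sleeps or raises to respect the limits."""
        now = self.clock()
        if now - self.day_start >= 86400:  # assumed rolling 24-hour window
            self.day_start = now
            self.calls_today = 0
        if self.calls_today >= self.per_day:
            raise RuntimeError("daily API budget used up")
        if self.last_call is not None:
            delay = self.min_interval - (now - self.last_call)
            if delay > 0:
                self.sleep(delay)
                now = self.clock()
        self.last_call = now
        self.calls_today += 1


throttle = BillboardThrottle()
throttle.wait()  # the first call never sleeps; later calls are spaced 0.5s apart
```

Calling `throttle.wait()` immediately before every request keeps a batch job under both limits without any changes to the request code itself.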

I can think of all sorts of uses for this data – helping to create playlists for the 25-year high school reunion, tracking artist popularity over time, answering bar-room music questions like “What was the highest charting instrumental-only single?” or “Did Ringo ever have a hit?”  It is perfect data for the Music Alchemists that are trying to build an automatic hit predictor.

The Billboard chart API is an excellent addition to the world of music web services.   It goes straight into my Top Ten Music APIs chart – with a bullet.
