Posts Tagged shilling
In yesterday’s post about the Hot Songs of Summer 2013, I noted that some songs were attracting a very passionate fan base. In particular, the song Miss Movin’ On by Fifth Harmony was an extreme outlier, attracting more than twice as many plays per listener as any other song.
Based on this data I suggested that Fifth Harmony was going places – such high passion among their listeners was surely indicative of future success. But now I am not so sure. Shortly after I made that post I learned that our crack data team here at The Echo Nest were already on to some Fifth Harmony shenanigans. Yes, Fifth Harmony is getting lots of plays, but many of these plays are due to an orchestrated campaign. Fifth Harmony fans are encouraged to go to music streaming sites such as Spotify and Rdio and stream Miss Movin’ On (aka MMO) 24/7. Here are some examples:
There are a number of Twitter accounts that are prompting such MMO plays. The campaign seems to be working: 5H is moving up in the charts. Just take a look at the top songs on Rdio this week, where Miss Movin’ On is number two on the list:
But what effect is this campaign really having on Fifth Harmony? Perhaps Fifth Harmony’s position on the charts is a natural outcome of their appeal, and is not a result of a small number of fans that stream MMO 24/7 with their computers and iPhones on mute. Can we see the effect that The Harmonizers are having? And if so, how substantial is this effect? The answer lies in the data, so that’s where we will go.
Can we see the effect of the Harmonizers?
The first thing to do is to take a look at the listener play data for MMO and compare it to other songs to see if there are any tell-tale signs of a shilling campaign. To do this, I selected 9 other songs with a similar number of fans that appeal to a similar demographic as MMO. For each of these songs I ordered the listeners in descending play order (i.e. the first listener is the listener that has played the song the most) and plotted the number of plays per listener for the 10 songs.
As you can see, 9 out of 10 songs follow a similar pattern. The top listeners of a song have around a thousand plays. As we get deeper into the listener ranks, the number of plays per listener drops off at a very predictable rate. The one exception is Fifth Harmony’s Miss Movin’ On. The effect of the Harmonizers is clearly seen. The top plays are skewed to greatly inflate the total number of plays by two full orders of magnitude. We can also see that the number of listeners that are significantly skewing the data is relatively small. Beyond the top 200 most active listeners (less than 0.5% of the Fifth Harmony listeners in the sample), the listening pattern for MMO falls in line with the rest of the songs. It is pretty clear that the Harmonizers are really having an effect on the number of plays. It is also clear that we can automate the detection of such shilling by looking for such non-standard listening patterns.
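To make the idea of automated detection concrete, here is a minimal sketch. It is not the method our data team uses; the function names, the 0.5% head size, and the factor of 10 are all assumptions chosen for illustration. It builds the descending plays-per-listener curve described above, measures how far the heaviest listeners sit above the median listener, and flags songs whose ratio is far out of line with the other songs.

```python
from statistics import median


def rank_curve(plays_by_listener):
    """Play counts sorted in descending order (rank 1 = heaviest listener)."""
    return sorted(plays_by_listener.values(), reverse=True)


def head_ratio(curve, head_fraction=0.005):
    """Mean plays of the top `head_fraction` of listeners over the median plays.

    Assumes every listener in the curve has at least one play.
    """
    if not curve:
        raise ValueError("curve is empty")
    head_size = max(1, int(len(curve) * head_fraction))
    head_mean = sum(curve[:head_size]) / head_size
    return head_mean / median(curve)


def flag_outliers(curves_by_song, factor=10.0):
    """Songs whose head ratio exceeds `factor` times the typical head ratio.

    `factor` is an illustrative threshold, not a tuned value.
    """
    ratios = {song: head_ratio(curve) for song, curve in curves_by_song.items()}
    typical = median(ratios.values())
    return [song for song, ratio in ratios.items() if ratio > factor * typical]
```

Comparing each song against the median of its peers, rather than against a fixed number, mirrors the comparison in the plot: a song is suspicious only relative to similar songs.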
Update – a reader has asked that I include One Direction’s Best Song Ever on the plot. You can find it here.
How big of an impact do the Harmonizers have on the overall play count?
The Harmonizers are having a huge impact. 80% of all track plays of Miss Movin’ On are concentrated into just the top 1% of listeners. Compare that to the other 9 tracks in our sample:
Percentage of listeners that account for 80% of all plays
| Song | Listeners (%) |
|---|---|
| Fifth Harmony – Miss Movin’ On | 1.0 |
| Lorde – Royals | 14.0 |
| Karmin – Acapella | 16.0 |
| Anna Kendrick – Cups | 17.0 |
| Taylor Swift – 22 | 14.0 |
| Icona Pop – I love it | 15.0 |
| Birdy – Skinny Love | 25.0 |
| Lana Del Rey – Summertime Sadness | 15.0 |
| Christina Perri – A Thousand Years | 21.0 |
| Krewella – Alive | 17.0 |
A plot of this data makes the difference quite clear:
I estimate that at least 75% of all plays of Miss Movin’ On are overplays that are a direct result of the Harmonizer campaign.
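The concentration figure in the table can be computed directly from per-listener play counts. The sketch below shows one way to do it, assuming the input is simply a list of play counts for one song; the function name is mine.

```python
def listener_share_for_plays(play_counts, play_share=0.80):
    """Smallest percentage of listeners, heaviest first, whose plays add up
    to at least `play_share` of all plays of the song."""
    counts = sorted(play_counts, reverse=True)
    total = sum(counts)
    if total == 0:
        raise ValueError("song has no plays")
    running = 0
    for rank, plays in enumerate(counts, start=1):
        running += plays
        if running >= play_share * total:
            return 100.0 * rank / len(counts)
```

A song played evenly by all of its listeners scores close to 80 on this measure, while a song dominated by a handful of accounts scores close to zero.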
What effect does the Fifth Harmony campaign have on chart position?
It is pretty easy to back out the overplays by finding another song that has a similarly shaped plays vs. listener rank curve once we get past the first 1% of listeners (the ones that are overplaying the track). For instance, Karmin’s Acapella has a similar mid-tail and long-tail listener curve and a similar audience size, making it a good proxy. Its Summer Time rank was 378. Based on this proxy, MMO’s real rank should be dropped from 45 to around 375. This means that a few hundred committed fans were able to move a song up more than 300 positions on the chart.
The bottom line here is that an organized campaign for very little cost has harnessed the most passionate fans to substantially bolster the apparent popularity of an artist, making the artist appear to be about 4 times more popular than it really is.
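Here is a rough sketch of the backing-out step. I did the adjustment above by swapping in the proxy song's chart rank; the code below is a simplified variant that instead caps the top 1% of listeners at the proxy song's play counts for the same ranks. The function names and the capping rule are assumptions for illustration.

```python
def back_out_overplays(curve, proxy_curve, head_fraction=0.01):
    """Cap the heaviest listeners' play counts at the proxy song's counts.

    Both curves are play counts sorted in descending order. Only the top
    `head_fraction` of listeners is adjusted; the rest is left untouched.
    """
    head = max(1, int(len(curve) * head_fraction))
    if len(proxy_curve) < head:
        raise ValueError("proxy curve is shorter than the head being adjusted")
    capped = [min(plays, proxy)
              for plays, proxy in zip(curve[:head], proxy_curve[:head])]
    return capped + list(curve[head:])


def inflation_factor(curve, adjusted):
    """How many times larger the raw play total is than the adjusted total."""
    return sum(curve) / sum(adjusted)
```

The arithmetic behind the "4 times" figure is the same: if at least 75% of plays are overplays, at most 25% are organic, and 1 / 0.25 = 4.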
What does this all mean for music services?
Whenever there’s a high-stakes metric like chart position, some people will try to find a way to game the system to get their stuff to the top of the chart. Twenty years ago, the only way to game the charts was either by spending lots of money buying copies of your record to boost the sales figures, or by bribing radio DJs to play your songs to boost radio airplay. With today’s music subscription services, there’s a much easier way to game the system. Fans and shills simply need to play a song on autorepeat across a few hundred accounts to boost the chart position of a song. Fifth Harmony proves that if you have a small but committed fan base, you can radically boost your chart position for very little cost.
Obviously, a music service doesn’t like this. First, the music service has to pay for all those streams, even if no one is actually listening to them. Second, when a song gets to the top of a chart through shilling and promotion campaigns, it reduces the listening enjoyment for those who use the charts to find music. Instead of finding a new song that got to the top of the chart based solely (or at least mostly) on merit, they are listening to a song that is a product of a promotion machine. Finally, music services that rely on user play data to generate music recommendations via collaborative filtering have a significant problem trying to make sure that fake plays don’t improperly influence their recommendations.
So what can be done to limit the damage to music services? As we’ve seen, it is pretty easy to detect when a song is being overplayed via a campaign and these overplays can be removed. Perhaps even simpler though is to rely on metrics that are less easily gamed – such as the number of fans a song has instead of the total number of plays. For a music subscription service that has a credit card number associated with each user account, the number of fans a song has is a much harder metric to hack.
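The difference between the two metrics is easy to see in a small sketch. It assumes play data arrives as (user_id, song) pairs, one per play; that format and the function names are my own for illustration.

```python
from collections import Counter


def chart_by_plays(plays):
    """Songs ordered by total play count, most played first."""
    counts = Counter(song for _, song in plays)
    return [song for song, _ in counts.most_common()]


def chart_by_listeners(plays):
    """Songs ordered by number of distinct listeners, most listeners first.

    Each (user_id, song) pair is counted once, so repeat plays by the same
    account do not move the song up the chart.
    """
    unique_pairs = set(plays)
    counts = Counter(song for _, song in unique_pairs)
    return [song for song, _ in counts.most_common()]
```

One account playing a song a thousand times tops the first chart but counts as a single fan on the second.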
What does this say about Fifth Harmony fans?
I am always happy when I see people getting excited about music. The Fifth Harmony fans are really excited about Miss Movin’ On, the tour and the upcoming album. It’s great that the fans are so invested in the music that they want to help the band be successful. That’s what being a fan is all about. But I hope they’ll avoid trying to take their band to the top by a shortcut. As they say, it’s a long way to the top if you want to rock n’ roll. Let Fifth Harmony earn their position at the top of the charts; don’t give them a free ride.
And finally, a special message to music labels or promoters: If you are trying to game the music charts by enlisting hundreds of pre-teens and teens to continuously stream your one song: screw you.
Update – I’ve received **lots** of feedback from Harmonizers – thanks. A common theme among this feedback is that the fan activities and organization really are a grassroots movement, and there really is no input from the labels. Many took umbrage at my suspicions that the label was pulling the strings. I remain suspicious, but less so than before. My parting ‘screw you’ comment was in no way directed at the 5H fans; it was reserved for the mythical music label marketeer who I imagined was pulling the strings. I’m hoping to dig in a bit deeper to understand the machinery behind the 5H fan movement. Expect a follow-up article soon.
Last week, on the Hype Machine blog, Anthony indicated his increasing frustration at how easily charts could be manipulated – Anthony wanted a better way, one that was transparent, and gave more influence to the influential. Anthony’s solution was to create a Twitter chart that is based on the twittering activity of Hype Machine songs. In this new chart Twitterers with more followers have more influence than those with few.
A number of commenters on Anthony’s blog pointed out how easy it would be for a single very popular Twitter user to influence the charts. And that is exactly what Erick Schonfeld of TechCrunch did. Erick used the power of TechCrunch for evil.
With one tweet from the TechCrunch Twitter account (with its nearly 1 million-person reach) he was able to put Rick Astley’s Never Gonna Give You Up at the top of the Hype Machine Twitter chart. Erick writes “The Hype Machine’s formula is flawed. No single person should be able to affect the rankings so easily”.
It’s arguable whether or not this is a dishonest manipulation of the charts. TechCrunch really does have a reach of 1 million people – and so by tweeting Rick Astley they are potentially exposing all of those people to this song. However, in reality, people don’t read TechCrunch for music recommendations – TechCrunch is just not a music tastemaker (sorry Erick). A tweet by TechCrunch should count for much less than a tweet by indie music guide Pitchfork.
Update – Note that the spammers are now starting to recognize the twitterverse as a place that they can target. If you have $27 you can get the twittertrafficmachine to get you 20K followers in a month:
Anthony should adjust how he scores a tweet to include not only the reach of the tweet but also the music reputation of the source. It is not as easy to determine the music reputation of a source as its number of followers, but it is much more important. Some indicators that a tweet has real influence are whether people actually click on the link and listen to the song and whether the poster actually listens to music, especially new music, before it gets popular.
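As a purely hypothetical illustration of that kind of scoring (this is not the Hype Machine's formula, and every name and weight here is an assumption), a tweet's score could multiply a dampened reach term by a music reputation estimate and the observed click-through rate:

```python
import math


def tweet_score(followers, music_reputation, click_through_rate):
    """Hypothetical tweet score combining reach with music credibility.

    followers           -- follower count of the account that tweeted
    music_reputation    -- assumed estimate in [0, 1] of how much the
                           source is trusted for music
    click_through_rate  -- fraction in [0, 1] of readers who clicked
                           through to the song
    """
    if not 0.0 <= music_reputation <= 1.0:
        raise ValueError("music_reputation must be between 0 and 1")
    if not 0.0 <= click_through_rate <= 1.0:
        raise ValueError("click_through_rate must be between 0 and 1")
    # Log-dampen reach so one huge account cannot dominate on size alone.
    reach = math.log10(1 + followers)
    return reach * music_reputation * click_through_rate
```

Under this sketch, an account with a million followers but little music credibility scores well below a smaller account whose followers actually click through and listen.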
I suspect Anthony will be tweaking his scoring algorithms soon to make the charts better reflect what real music listeners are listening to, not just what popular people are listening to.
Update: Anthony has responded in the comments.
The very popular blog aggregator The Hype Machine has a ‘Popular Page’ that shows the tracks that have been most favorited in the last 3 days. This is a great way to find out what the music zeitgeist is. However, Anthony (Mr. Hype Machine) recently discovered that a number of highly favorited artists seemed to have reached the popular page by nefarious means. According to Anthony, it appears that a number of artists became popular when many presumably fake accounts, created from the same IP address in a very short period of time, all favorited a single artist in an apparent effort to get the artist to appear on the popular page. This type of hacking is not too surprising – whenever you have a chart or poll that relies on ‘the wisdom of crowds’ you are susceptible to the shill who will try to manipulate the chart in order to promote their interests. We see this in online polls, social news sites and popular music sites.
When Anthony became aware of how the Hype Machine was being manipulated, he and the rest of the Hype Machine team fought back, instituting a Captcha mechanism to prevent automated account creation, ignoring favoriting activity for new accounts, and keeping a much closer eye on new account activity.
But Anthony didn’t stop there; he went one step further. He named names. He posted on his blog a list of all the artists that, according to Anthony, have “attempted to manipulate the charts on the Hype Machine”. Anthony says he published the list to “let everyone make their own judgments about quality, integrity and marketing strategies:”. But really, I suspect that Anthony’s real motivation was to shame those that would try to enlist the Hype Machine to promote their band.
A commenter on that blog post who claims membership in one of the outed shilling bands protests that they absolutely did not create fake accounts and that they had been unfairly defamed (literally) by the Hype Machine. But Anthony responds with a list of 4 tracks by the band that had each been favorited from a single IP address by over 40 separate, newly created accounts. Anthony says “Given that this is a time-consuming activity that primarily benefits you, you can see how it appears likely that you or your team may have been involved”.
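The pattern Anthony describes, many newly created accounts favoriting the same track from one IP address, is straightforward to look for. The sketch below is not the Hype Machine's actual code; the record layout, the function name, and the 3-day definition of a new account are assumptions, and the threshold of 40 accounts simply echoes the figure Anthony cites.

```python
from collections import defaultdict


def suspicious_favorites(favorites, max_account_age_days=3, min_accounts=40):
    """Find (track, ip) pairs favorited by many newly created accounts.

    `favorites` is an iterable of (account_id, ip, track, account_age_days)
    tuples. Returns a dict mapping each suspicious (track, ip) pair to the
    number of distinct new accounts involved.
    """
    accounts = defaultdict(set)
    for account_id, ip, track, age_days in favorites:
        if age_days <= max_account_age_days:
            accounts[(track, ip)].add(account_id)
    return {key: len(ids) for key, ids in accounts.items()
            if len(ids) >= min_accounts}
```

Note that a check like this only shows that the activity happened. It says nothing about who was behind it.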
Should Anthony have outed these artists? Surely the excessive favoriting could have been an overzealous fan that decided to try out a new way to hype their favorite band (to put the ‘hype’ in Hype Machine, if you will), and the band is blameless. But from Anthony’s point of view it doesn’t really matter. Anthony is going to protect the integrity of the Hype Machine and he’s going to do it by pointing to any band that has benefited from ‘unnatural’ enthusiasm. Even if it means public humiliation for the blameless.
I suspect Anthony’s next problem will occur when some pranksters realize that they can get any band blacklisted at the Hype Machine by a bit of nefarious activity. By simply creating a set of sham accounts and favoriting tracks by the victim band from those sham accounts, the Hype Machine can be manipulated into blacklisting and humiliating the band. Is your ex-girlfriend’s new boyfriend in a band? Get your dorm floor to create 50 Hype Machine accounts, favorite his tracks and watch the fun as he gets outed and shamed as a shill.
The lesson here is that charts that show popularity are hard to get right – they can be easily manipulated for fun or for profit. Anthony should be prepared to fight an escalating war against those that want to manipulate his charts. And the more popular the Hype Machine becomes, the bigger the target it will be for the hackers and the shills.