Talk Radio – control Rdio with the new Web Speech API

Control your radio with your mouth

My weekend hack at the Tufts Hackathon was a playlisting app that you can control with speech. It is called Talk Radio. The hack uses the new Web Speech API, which started shipping this week with Chrome 25, and it seemed like it would be fun to give it a spin.

This app lets you control your music player with your words. Try saying something like:

Play music by Carly Rae Jepsen
Play music like Weezer
Play some brutal death metal
Play some christmas music
Play slow music by Beyoncé
Play fast music by Beyoncé
Play chill music in the style of smooth jazz
Play some screamo

Pro tip – the artist or genre should always be at the end of your utterance.
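To make the shape of those utterances concrete, here is a minimal sketch of how a recognized phrase could be split into a query type and a target. It is not the code behind Talk Radio: the names `MusicQuery` and `parseUtterance` and the two regular expressions are assumptions made up for illustration, and they only cover the example phrases listed above. Note how the artist or genre is always taken from the tail of the phrase, which is why the pro tip matters.

```typescript
// Hypothetical types and function names; the real app may parse speech differently.
type QueryKind = "artist" | "similar" | "style";

interface MusicQuery {
  kind: QueryKind;    // what sort of lookup the target needs
  target: string;     // the artist or genre at the end of the utterance
  modifier?: string;  // optional single word before "music", e.g. "slow" or "chill"
}

function parseUtterance(utterance: string): MusicQuery | null {
  // Normalize whitespace so the patterns below stay simple.
  const text = utterance.trim().replace(/\s+/g, " ");

  // "play [some] [modifier] music by|like|in the style of <target>"
  const withConnector =
    /^play\s+(?:some\s+)?(?:(\S+)\s+)?music\s+(by|like|in the style of)\s+(.+)$/i.exec(text);
  if (withConnector) {
    const connector = withConnector[2].toLowerCase();
    const kind: QueryKind =
      connector === "by" ? "artist" : connector === "like" ? "similar" : "style";
    const query: MusicQuery = { kind: kind, target: withConnector[3] };
    if (withConnector[1]) {
      query.modifier = withConnector[1].toLowerCase();
    }
    return query;
  }

  // "play [some] <style> [music]", e.g. "play some screamo"
  const bare = /^play\s+(?:some\s+)?(.+?)(?:\s+music)?$/i.exec(text);
  if (bare) {
    return { kind: "style", target: bare[1] };
  }

  // Anything else is not understood by this sketch.
  return null;
}
```

With this sketch, "Play slow music by Beyoncé" becomes an artist query for "Beyoncé" with the modifier "slow", and "Play some christmas music" becomes a style query for "christmas".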

The hack is an exploration of how well an off-the-shelf large-vocabulary speech recognizer works in the music domain. Music has lots of hard names like deadmau5, p!nk and !!!, and many domain-specific terms like ‘screamo’, ‘hip hop’ and ‘shoegaze’. I am actually quite surprised at how well this works. The Google speech recognizer does a good job of understanding most of the neologisms like ‘screamo’ and ‘shoegaze’, and does an excellent job of recognizing popular artist names like Jay-Z and Beyoncé. For unusual artist names, The Echo Nest artist search does a really good job of finding what you meant. So when the speech recognizer returns “play music by chick chick chick”, The Echo Nest artist search can turn “chick chick chick” into “!!!” with no problems. Similarly, the speech recognizer will return “dead mouse”, which The Echo Nest will resolve to ‘deadmau5’.

We can also field more general music queries. If a style query returns no results, it is re-submitted as a general artist-description query. This lets you find more esoteric music, such as “big hair bands”.
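The fallback described above can be sketched in a few lines. This is only an illustration of the control flow, not the app's actual code: `searchWithFallback`, `Search` and `PlaylistResult` are made-up names, and the two search functions are passed in as plain synchronous callbacks so the sketch does not pretend to know any real API signature. A real client would make asynchronous web requests instead.

```typescript
// Hypothetical names throughout; the callbacks stand in for real search requests.
type Search<T> = (query: string) => T[];

interface PlaylistResult<T> {
  source: "style" | "description"; // which kind of query produced the items
  items: T[];
}

function searchWithFallback<T>(
  query: string,
  byStyle: Search<T>,
  byDescription: Search<T>
): PlaylistResult<T> {
  // First attempt: treat the words as a known style or genre.
  const styled = byStyle(query);
  if (styled.length > 0) {
    return { source: "style", items: styled };
  }
  // No style results: re-submit the same words as a free-form description.
  return { source: "description", items: byDescription(query) };
}
```

The result records which path produced the items, so a caller can tell whether a phrase like “big hair bands” matched a known style or only the looser description search.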

Issues

You have to grant the app permission to access the microphone for every utterance. This can be alleviated in the near future after a few API issues are sorted out. Until then, the app is all Cancel or Allow. (And yes, it is incredibly annoying.) Update: this is all sorted now.

This hack was built at the Tufts Hackathon 2013. For me, it was half a hack day, with lots of time spent traveling in the snow and helping folks who had never used The Echo Nest APIs before. Still, it was fun to use the nifty new Web Speech API that just shipped this week in Chrome version 25.

This entry was posted on February 24, 2013, 1:04 pm and is filed under Music.

Music Machinery