Podcast Analytics is Tricky

Mar 07, 2017

tags: podcasting analytics

You have a podcast and you are looking for a sponsor. Pretty soon you will hear this question: “how many downloads per episode does your podcast have on average?”. Nobody will ever ask:

how many bytes of your audio did your listeners download?
how fast can I download your episodes?
are your audio files encoded at 320 kbits/second?

Why? Because the number of downloads is easy to measure. When someone downloads a file, an analytics platform increments a counter somewhere on a server. That’s it. There is no doubt about it.

Whether the downloaded file is listened to once, twice or never is another story.

When are your podcast audio files downloaded? It usually goes like this. A listener subscribes to your podcast using an application like iTunes. That application will periodically check for new episodes and download them when available. If you are self-hosting your podcast website and your audio files (and you know where to find the logs on your server) you’ll see something like this (simplified):

...
2017-03-02T13:45:30 - application1 has downloaded episode1.mp3
2017-03-02T13:45:30 - application7 has downloaded episode12.mp3
...

From log statements like these you can build stats for your podcast. That’s essentially what every hosting company does.
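As a rough sketch, here is how counting downloads per episode could look in Python. It assumes the simplified log format shown above (a real web server log looks different), and the names `LINE_RE` and `count_downloads` are made up for this example.

```python
import re
from collections import Counter

# Matches only the simplified log lines from this post, e.g.
# "2017-03-02T13:45:30 - application1 has downloaded episode1.mp3".
# Real access logs use a different format, so this pattern is an assumption.
LINE_RE = re.compile(r"^(\S+) - (\S+) has downloaded (\S+)$")


def count_downloads(lines):
    """Return a Counter mapping episode file name to number of downloads."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # skip "..." and anything else that does not fit the format
            _timestamp, _application, episode = match.groups()
            counts[episode] += 1
    return counts


log = [
    "...",
    "2017-03-02T13:45:30 - application1 has downloaded episode1.mp3",
    "2017-03-02T13:45:30 - application7 has downloaded episode12.mp3",
    "...",
]
downloads = count_downloads(log)
print(downloads)
```

Every matching line increments a counter for its episode, which is all a "number of downloads" stat really is.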

But …

Not everybody subscribes to your podcast using an application that downloads episodes. Some prefer to listen using a web browser. And a web browser doesn’t usually download a whole audio file before playing, but cleverly uses Progressive Download. As you press play, the audio player in the browser starts downloading the first chunks of a file and stores them locally. After enough chunks have been collected, the audio starts playing. While you play the first ten seconds, the next ten are downloaded, so that you don’t experience any interruption. And so on until you get to the end of the file.

Something very similar happens in an application like iTunes when you press play instead of downloading the episode first.

Now put yourself on the side of the server hosting audio files. You’ll see something like:

browser1 wants to download bytes 0-21 of the file episode1.mp3
browser2 wants to download bytes 0-32 of the file episode2.mp3
browser1 wants to download bytes 21-55 of the file episode1.mp3
browser1 wants to download bytes 55-71 of the file episode1.mp3
browser2 wants to download bytes 32-93 of the file episode2.mp3

And so on. Messy, huh? But you could stitch together all the small requests and say:

if all the chunks of a file have been requested by a player then I can count that as a download, or a full play of an episode
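The stitching idea can be sketched in a few lines of Python. This is only an illustration under some assumptions: byte ranges are treated as half-open `[start, end)` pairs to match the simplified log above (real HTTP Range requests use inclusive end offsets), the file sizes are invented, and the player name is taken as a reliable identifier, which in practice it is not.

```python
from collections import defaultdict


def is_fully_covered(ranges, file_size):
    """True if the half-open byte ranges [start, end) together cover [0, file_size)."""
    covered_up_to = 0
    for start, end in sorted(ranges):
        if start > covered_up_to:
            return False  # there is a gap nobody requested
        covered_up_to = max(covered_up_to, end)
    return covered_up_to >= file_size


def full_downloads(requests, file_sizes):
    """Return the set of (player, episode) pairs whose requests cover the whole file.

    requests: iterable of (player, episode, start, end) tuples.
    file_sizes: dict mapping episode name to its size in bytes.
    """
    ranges_by_key = defaultdict(list)
    for player, episode, start, end in requests:
        ranges_by_key[(player, episode)].append((start, end))
    return {
        key
        for key, ranges in ranges_by_key.items()
        if is_fully_covered(ranges, file_sizes[key[1]])
    }


# The requests from the simplified log above.
requests = [
    ("browser1", "episode1.mp3", 0, 21),
    ("browser2", "episode2.mp3", 0, 32),
    ("browser1", "episode1.mp3", 21, 55),
    ("browser1", "episode1.mp3", 55, 71),
    ("browser2", "episode2.mp3", 32, 93),
]
# Hypothetical file sizes, chosen only for this example.
file_sizes = {"episode1.mp3": 71, "episode2.mp3": 120}

complete = full_downloads(requests, file_sizes)
print(complete)
```

With these made-up sizes, browser1 has requested every byte of episode1.mp3 and counts as one download, while browser2 stopped at byte 93 of 120 and does not.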

A similar system is what OmnyStudio announced a few days ago.

They claim to have an algorithm that, given a pile of chunk requests generated by iTunes, can gain insights about listening habits, for example if/when listeners skip ads.

There are still many details to work out:

how do you identify the same player? Same IP can’t always be the answer.
how do you keep track of interruptions? E.g. user clicks pause and resumes the play tomorrow.
how do you deal with events like “user skipped from minute 10:21 to 14:45”?
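To give a feeling for the last question, here is a naive Python sketch that looks for jumps in the byte ranges requested by a single player. It is not OmnyStudio’s algorithm, just a guess at the simplest possible approach. It assumes the requests arrive in order, that the player has already been identified, and that the file has a constant bitrate so bytes can be converted to seconds. The name `find_skips` and the numbers are invented for the example.

```python
def find_skips(ordered_ranges, bytes_per_second):
    """Return (from_second, to_second) pairs where the player jumped ahead.

    ordered_ranges: half-open (start, end) byte ranges in the order they were requested.
    bytes_per_second: constant bitrate of the file in bytes (an assumption;
    variable bitrate files would need a real byte-to-time mapping).
    """
    skips = []
    furthest = 0  # highest byte offset requested so far
    for start, end in ordered_ranges:
        if start > furthest:
            # The bytes between furthest and start were never requested.
            skips.append((furthest / bytes_per_second, start / bytes_per_second))
        furthest = max(furthest, end)
    return skips


# 128 kbit/s audio is 16000 bytes per second.
BYTES_PER_SECOND = 128_000 // 8

ranges = [
    (0, 160_000),        # seconds 0-10
    (160_000, 320_000),  # seconds 10-20
    (800_000, 960_000),  # seconds 50-60, after a jump
]
skips = find_skips(ranges, BYTES_PER_SECOND)
print(skips)
```

Even in this toy version the result is only approximate: players buffer ahead of the playback position, so the moment the listener actually skipped is somewhere before the last requested byte.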

That said, what they did is not impossible (or BS, as many cried out). It’s just complicated. But it is a welcome initiative. Maybe other hosting companies will follow suit.

