Modelling and Predicting Viewership for Rooster Teeth Videos

I got a bit tired from studying the other day, so I decided to take a break and watch a Rooster Teeth video. As I was watching it, though, I looked at the viewership (something around 2 million), and it got me thinking about the statistical modelling of view counts on YouTube.

Basically, there are a few different classes of video on YouTube, I would say. There are viral videos like PSY’s Gangnam Style (which, as of me writing this, sits at 2,108,731,544 views). These videos are crazy to model because of all the various factors involved. I briefly looked at this post, but there seems to be some good analysis in the replies that I’ll probably look into a bit later.

Then there are your average videos that don’t really get many views ever. Maybe friends of friends watch them, etc.

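But then, you have your “dedicated audience” channels. This is the category that Rooster Teeth falls under. On these channels, if you plot a video’s total views against time since upload, you essentially get what looks kind of like a logarithmic curve: a big jump early on that flattens out as time goes by.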
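To make the "logarithmic curve" idea a bit more concrete, here is a minimal sketch of fitting views(t) = a * ln(1 + t) + b by least squares. The functional form, the function names, and the numbers are all assumptions for illustration: the data is synthetic and generated from made-up parameters, not real Rooster Teeth stats.

```python
import numpy as np

def fit_log_curve(days, views):
    """Least-squares fit of views = a * ln(1 + days) + b. Returns (a, b)."""
    x = np.log1p(np.asarray(days, dtype=float))
    y = np.asarray(views, dtype=float)
    # The model is linear in ln(1 + days), so a degree-1 polyfit is enough.
    a, b = np.polyfit(x, y, 1)
    return a, b

def predict_views(days, a, b):
    """Evaluate the fitted curve at the given day(s) since upload."""
    return a * np.log1p(np.asarray(days, dtype=float)) + b

# Synthetic example: made-up parameters, NOT real view counts.
days = np.arange(0, 30)
true_a, true_b = 400_000.0, 250_000.0
views = true_a * np.log1p(days) + true_b

a, b = fit_log_curve(days, views)
print(f"a = {a:.0f}, b = {b:.0f}")
print(f"predicted views on day 60: {float(predict_views(60, a, b)):.0f}")
```

With real data the fit would be noisy, and a pure log curve grows forever, so something that saturates might turn out to be a better model. This is only a starting point.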

So, funny thing. I wrote a script that would mine the view stats of RT videos every hour/day so that I could actually see what this curve looked like. That was really dumb of me though, because obviously YouTube (i.e. Google) stores this data already… In fact, there’s this nifty little button below videos that says “More”, and if you click on it there’s a button that says “Stats”, which gives you cool little graphs.
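For anyone curious what a polling script like that could look like, here is a rough sketch. It is not the script described above. It assumes you have an API key and uses the YouTube Data API v3 `videos.list` endpoint with `part=statistics`. The function names, the CSV layout, and the one-hour interval are all illustrative choices.

```python
import csv
import json
import time
import urllib.parse
import urllib.request

API_URL = "https://www.googleapis.com/youtube/v3/videos"

def fetch_stats(video_id, api_key):
    """Call videos.list for one video and return the parsed JSON response."""
    query = urllib.parse.urlencode(
        {"part": "statistics", "id": video_id, "key": api_key}
    )
    with urllib.request.urlopen(f"{API_URL}?{query}") as resp:
        return json.load(resp)

def parse_view_count(payload):
    """Extract viewCount as an int; None if the video or the count is missing."""
    items = payload.get("items", [])
    if not items:
        return None
    count = items[0].get("statistics", {}).get("viewCount")
    return int(count) if count is not None else None

def record_sample(path, video_id, views, timestamp=None):
    """Append one (unix_timestamp, video_id, views) row to a CSV file."""
    ts = int(time.time()) if timestamp is None else timestamp
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([ts, video_id, views])

def poll(video_ids, api_key, path, interval_seconds=3600):
    """Sample every video once per interval, forever (stop with Ctrl+C)."""
    while True:
        for vid in video_ids:
            views = parse_view_count(fetch_stats(vid, api_key))
            if views is not None:
                record_sample(path, vid, views)
        time.sleep(interval_seconds)
```

Calling `poll([...], api_key, "views.csv")` with your own video IDs and key would build up a time series that could be fed into a curve fit later. Keeping the parsing separate from the network call makes it easy to test without hitting the API.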

Anyways, I’m looking a bit more at what kind of models you can get from the RT videos (how many viewers you’ll get per video, the retention over playlists, etc.). Later on I want to try and extract features from the videos (not video features, but maybe keywords from the descriptions, like who’s in them) and see how those affect viewer retention across a playlist or whatever. I’m sure RT or other channels have already done this, but I still want to put my ML and Stats knowledge to use, and this is a fun little thing that I’ll probably expand on later.

I’m still mining data, so I’ll probably expand this post more when I’m not bogged down with interviews/schoolwork.

Written on October 17, 2014