Doing stats during an election is like talking in a crowded room. Me doing stats during an election is like whispering lines from the phonebook in a crowded room while everyone else talks through megaphones giving away Amazon gift card codes. I may not have banks of telethonners calling and polling choice constituencies and asking questions like “which party leader would you most like to meet in the smoking area of a club?” but one thing I do have is 47 hours of people tweeting about hating Nigel Farage et al.
What I’ve got: 47 hours of tweets mentioning any of the parties or party leaders with the words ‘hate’ or ‘love’.
What I’m going to do with it: Construct a way of quantifying the hate of each party. Find out each party’s hate factor, each leader’s hate factor and the difference between the two. Plot a graph of hate against time over the full 47 hours.
Everyone likes pizza. In fact there’s a whole bunch of things that everyone likes. Here’s a list of some of them: cheese, Coca-Cola, Nelly, ‘Cotton-Eye Joe’, the smell of freshly-cut grass. The problem with this is that as soon as everyone shares a mild opinion on something, it stops telling you anything about any one person.
Take this montage of video dating clips from the 80s: at about 2.28, every single one of the dudes says that they like ‘having fun’. You know as well as I do that as soon as you aren’t enjoying ‘having fun’, you aren’t having fun. Therefore by definition you have to like having fun, and the statement “I love having fun” is useless information when looking for someone to have fun with.
A much better way of getting down to someone’s true opinions is to focus on what they hate and what they love, because as soon as you declare that you hate/love something, you become open to a lot more hassle. You can get away with saying “I like Blur” as that doesn’t need to mean much more than “When I was 14, I cried at home while listening to ‘Tender’ after a girl rejected me”. However, when you’re coming out with “I love Blur” you have to answer questions like “how can you like a band which pushes out one good song per album” or “have you listened to any other band ever”. To openly love something, you gotta really love it.
So to get down to what’s really going on at this election, let’s look at who, what and when people are hating in the general election.
A hate metric
A ‘metric’ is a standard unit of measurement. We need to define a metric which will adequately get across how much people hate something. It’s no good just scanning twitter for an hour and counting all the tweets with ‘hate’ in them, as that would make the more popular people look like the most hated ones simply because more people are talking about them. A useful metric would be able to tell how much people hate me against how much people hate Sam Smith, if anyone had ever expressed an opinion about me on the internet (they haven’t, I’ve checked), and it should correctly tell you that people hate me a decent amount more.
Really, we should go as natural as possible, which is why I postulated this:
Ha ha, see! Doing a physics degree, if useful for nothing else, can give you the ability to crack shit jokes on the internet.
The actual metric I used was:
Or in words: we take all of the tweets which mention hate about a thing and divide them by the total number of tweets mentioning either hate or love about that same thing. So someone who was equally well loved and hated would get a rating of 0.5, anything higher than that is someone who’s hated, and anything less is someone who’s loved. I bolded that so that anyone who got bored as hell above and just wanted to scroll to the plots would realise they needed to read that.
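In code, that definition might look something like this. It’s only a minimal sketch: the function name `hate_metric` and the example counts are invented for illustration and aren’t taken from the dataset.

```python
def hate_metric(hate_count, love_count):
    """Fraction of the love/hate tweets about a thing that are hate tweets."""
    total = hate_count + love_count
    if total == 0:
        # Nobody said either word about this thing, so the ratio is undefined.
        return None
    return hate_count / total


# Illustrative counts only, not real data.
print(hate_metric(50, 50))  # equally loved and hated -> 0.5
print(hate_metric(0, 12))   # loved, never hated -> 0.0
```

Returning `None` for the no-tweets case is an assumption of this sketch; the point is just that the metric has no meaning without at least one love or hate tweet.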
Let’s get cracking. For just under two days, I collected every tweet that named a party or party leader (from the 7 parties which spoke in the leaders’ debate) and had the word “hate” or “love” in the text.
We like your party, we like, we like your party
This first one is a graph of the results for the hate metrics of each of the parties. The dotted line marks the value (0.5) where the number of hate tweets and love tweets is the same.
Broadly this looks about right. The parties you’d expect to be hated are above the hate line and the Green Party, who are essentially un-hateable, are significantly down in the love end. However, this doesn’t quite sit right: having the Lib Dems and UKIP on similar hate levels doesn’t really make sense. At least in my personal opinion, there should be plenty more animosity for UKIP than for the Lib Dems. Seems like UKIP just finds a lot of love online.
OK, next up is the party leaders:
I think this is pretty shocking. Isn’t it much more natural to say “I hate [politician’s name]” than to say “I love [politician’s name]”? Apparently not: with the exception of Ol’ Nigel, everyone is better loved than hated. Natalie Bennett gets our only perfect score in this analysis. Over the course of a whole two days, not one person on earth tweeted about hating the leader of a prominent UK party, with quite a few expressing their love. Is this the new face of politics? Greens are doing very well.
Here’s an interesting question though: how much are the leaders hated compared to their parties? In this next graph, I’ve subtracted the hate metric for each party from the one for their leader. This means that a negative value on this graph corresponds to a leader who is less hated than their party.
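The subtraction can be sketched like this. The party names and metric values are placeholders made up for the example, and `leader_minus_party` is a hypothetical helper, not anything from the original analysis.

```python
def leader_minus_party(leader_metric, party_metric):
    """Leader's hate metric minus their party's hate metric.

    Negative: the leader is less hated than the party.
    Positive: the leader is more hated than the party.
    """
    return leader_metric - party_metric


# Placeholder values, not real results.
made_up_metrics = {
    "Party A": {"party": 0.5, "leader": 0.25},
    "Party B": {"party": 0.25, "leader": 0.75},
}

for name, metrics in made_up_metrics.items():
    diff = leader_minus_party(metrics["leader"], metrics["party"])
    verdict = "less" if diff < 0 else "more"
    print(f"{name}: {diff:+.2f} (leader is {verdict} hated than the party)")
```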
Nigel’s not doing well and neither is Ed. Unfortunately I didn’t seem to catch the edge of the #milifandom phase, which is partially what I was looking for with this. I’d guess that the graph would have looked significantly different if I had. I tried to run several different reasons for Nigel and Ed’s failings by someone who knew more about politics than I did but they all ended up sounding like shit.
Haters gonna hate (at unpredictable times of day)
This week, since I spent more than an hour looking for data, it was possible to see some meaningful difference in how much people were tweeting at different times of day. I picked up all the tweets with love, hate and any party names, stuck them into half hour sections and here it is:
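Sticking tweets into half-hour sections could be done roughly as below. This is a sketch under assumptions: each tweet is taken to be a `(timestamp, label)` pair where the label is already `"hate"` or `"love"`, and the helper names and sample timestamps are invented.

```python
from collections import Counter
from datetime import datetime


def half_hour_bin(timestamp):
    """Round a datetime down to the start of its half-hour section."""
    minute = 0 if timestamp.minute < 30 else 30
    return timestamp.replace(minute=minute, second=0, microsecond=0)


def count_by_half_hour(tweets):
    """Count tweets per (half-hour section, label).

    `tweets` is an iterable of (datetime, label) pairs.
    """
    counts = Counter()
    for timestamp, label in tweets:
        counts[(half_hour_bin(timestamp), label)] += 1
    return counts


# Invented sample, not real tweets.
sample = [
    (datetime(2015, 4, 20, 11, 58, 3), "hate"),
    (datetime(2015, 4, 20, 12, 4, 41), "hate"),
    (datetime(2015, 4, 20, 12, 29, 59), "love"),
    (datetime(2015, 4, 20, 12, 30, 0), "love"),
]

for (section, label), n in sorted(count_by_half_hour(sample).items()):
    print(section.strftime("%H:%M"), label, n)
```

Dividing the hate count by the hate-plus-love count in each section would then give the hate metric for that half hour.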
First thing that’s cool (and something I’m not going to go into) is the dips between 00:00 and 12:00 every day. This is because we’re looking at tweets about UK politics, and it’s basically just people sleeping instead of spewing hate (or love) on the internet. I just think it’s quite cute how similar those dips are.
Anyway, the fact that this has large spikes shows something wrong with my metric: there are plenty of ways to express hate for something without the word ‘hate’, like “FUCK Nigel Farage” etc., but there are also ways of saying ‘love’ without meaning you love the party. You could write a tweet like “conservatives love fucking over the NHS” and it would count as a love and not a hate. For the most part, this is what produced my large spikes and not any gargantuan shift in popular opinion.
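To see the weakness concretely, here is a sketch of a keyword-only labeller run on those two example tweets. The function `label_tweet` is hypothetical, and treating a tweet with both words (or neither) as unlabelled is an assumption of this sketch rather than a description of how the data was actually handled.

```python
import re


def label_tweet(text):
    """Label a tweet 'hate' or 'love' purely by which keyword it contains.

    Returns None if the tweet contains neither word or both.
    """
    words = set(re.findall(r"[a-z']+", text.lower()))
    has_hate = "hate" in words
    has_love = "love" in words
    if has_hate == has_love:
        return None
    return "hate" if has_hate else "love"


# Clearly hateful, but no keyword, so it is never counted.
print(label_tweet("FUCK Nigel Farage"))  # None

# Clearly not affectionate, but it counts as a 'love'.
print(label_tweet("conservatives love fucking over the NHS"))  # love
```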
How this produces the spikes is best described by an example, take the gargantuan spike around 12.00 which I’ve kindly dashed. This shift upwards from the trend is just the impact of one single tweet:
Here’s the graph of them:
And here is the full tweet distribution without this one:
Well, shit, OK, it looks like there’s still a spike, but it’s taken a bit of the edge off. By being so closed-minded about how someone can express their hate or love, I’ve allowed individual tweets to make it look like something drastically bad has happened.
Here’s the graph against time decomposed into loves and hates:
This one has really got me going wild. Why are there so many weird correlations? All that bullshit I spouted earlier about that one tweet causing the spike is apparently only partially true: ‘love’ had a similar-sized peak (due to this tweet – nothing to do with politics) just half an hour earlier, and they both drop at the same time. Then immediately after that the two lines stop moving at similar levels and start doing mirror-opposite things to one another. Only thing left to do now is work out the evolution of my hate metric against time:
There we bloody go, that’s politics sorted. I promise next election I’ll be better prepared.