A student project at California’s Humboldt State University maps every use of “hate” words in geolocated tweets sent between June 2012 and April 2013. It’s an interesting study, but the results should be used with care. Three aspects of the data collection and processing make this approach problematic, but the study deals directly with only one.

First, was the tweet really negative? Phrases like “…queer theory says…” and “…I’m just an old cripple…” show two ways that a ‘hate’ word can turn up in a tweet that isn’t hateful. The study deals with this in a straightforward manner: the students read every tweet and applied a definitional rubric.

Second, is there any kind of processing bias? If you use raw numbers, big cities will dominate the map: Portland will generate more tweets, and more hate tweets, than Tillamook. To avoid this, the study expressed hate tweets as a percentage of all tweets from a given area. This throws them into another basin of attraction for errors: a small town with few tweeters will light up the map if it holds even one prolific hater. For example, The Dalles is a little one-Starbucks town in northern Oregon (population 13,000, or about two cruise ships). Portland is a major metropolis (750,000 people in the county). On the map below, The Dalles stands out like a beacon in the NW, while Portland doesn’t even warrant shading.

Is The Dalles really a hotbed of hatred?
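
The percentage problem above is easy to reproduce with a few lines of Python. This is only a sketch: the counts are invented for illustration and are not taken from the study, and the `shrunk_rate` function (with its `prior_weight` parameter) is one common way to damp small-sample rates, not something the students are known to have used.

```python
from dataclasses import dataclass


@dataclass
class Area:
    name: str
    total_tweets: int  # all geolocated tweets from the area
    hate_tweets: int   # tweets judged hateful


def raw_rate(area: Area) -> float:
    """Hate tweets as a fraction of all tweets from the area."""
    return area.hate_tweets / area.total_tweets


def shrunk_rate(area: Area, pooled_rate: float, prior_weight: float) -> float:
    """Pull the area's rate toward the pooled rate.

    prior_weight acts like that many extra tweets at the pooled rate, so
    areas with few tweets move a lot and areas with many barely move.
    """
    return (area.hate_tweets + pooled_rate * prior_weight) / (
        area.total_tweets + prior_weight
    )


# Invented counts for illustration only; not taken from the study.
small_town = Area("small town", total_tweets=200, hate_tweets=10)
big_city = Area("big city", total_tweets=100_000, hate_tweets=500)
areas = [small_town, big_city]

# Overall rate across every area, used as the value to shrink toward.
pooled_rate = sum(a.hate_tweets for a in areas) / sum(a.total_tweets for a in areas)

for a in areas:
    print(
        f"{a.name}: raw {raw_rate(a):.2%}, "
        f"shrunk {shrunk_rate(a, pooled_rate, 1000):.2%}"
    )
```

With these made-up numbers the small town’s raw rate is ten times the city’s, even though ten tweets could easily come from a single account. After shrinkage the gap falls to roughly two and a half times, and the city’s figure barely moves. The choice of `prior_weight=1000` is arbitrary here; a real analysis would have to justify it.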

Third, do haters tend to hide their geolocation more than normals do? This is a basic limitation of the data collection technique, and could only be compensated for by sampling the location of the non-geolocated tweets, an essentially impossible task. The best one might do is ask Twitter to run an equivalent study based on tweet IP address, though that might violate Twitter’s privacy policy, and in any event is fraught with its own problems: IP-based advertising regularly offers me the opportunity to meet lonely women in the wrong part of the state, the wrong state, or even the wrong region of the country (I’m not sure I’ve ever been to Louisiana).

Still, this is an imaginative use of data available from social media, and despite its flaws it’s a worthwhile project.



One Response to “Haters?”

  1. Kurt Kremer Says:

    I hate twaters, don’t you? (And yes, twat is a perfectly good onomatopoeic synonym for twits and a stupid word for genitalia and the implied association, even if it has an olde English origin. Same for the other slang synonyms for genitals and people with foolish or despicable behaviors.) We should also get stats for people who Hate in comments, as well as gush, dismiss, and pontificate. Personally, I think the latter are the worst, and I’ll tell you why…(read more)
