Friday, September 05, 2014

Men are More Harassed On The Net Than Women. So Cathy Young Tells Us.

She does so in a recent Daily Beast article with the title "Men Are Harassed More Than Women Online."

It's worth thinking about that title, even knowing that Young herself didn't pick it.  That's because the only evidence she offers for men being harassed MORE than women is a Demos study, which argues that famous men receive more Twitter abuse than famous women.  More about that study later.

The rest of Young's argument consists of anecdotes about individual men who have been harassed (but no anecdotes about individual women who have been harassed), the extent of the harassment they have suffered, examples of feminists harassing anti-feminists and so on, as well as the fact that a sizable minority of those Young regards as harassers are female, even though the majority are male.

She then tells us to ignore the 2006 study which found that chat room bots given female usernames received twenty-five times more threatening or sexually explicit messages than bots with male and neutral usernames.  And why should we ignore the study?  Because Young tells us that the Internet has changed since 2006.  But the study wasn't about something which could be affected by such change, given that it isolated a single question:  the impact of being taken for a woman rather than a man on the net, all other aspects being held constant.

Unless we assume that the current cross-section of Internet users is quite different from the 2006 version, with far less sexist behavior, it's difficult to see why that study wouldn't still matter.  It's not a decisive study, of course, but then neither are the studies Young prefers.

She refers to two studies.  The first one is a Pew Research Center study about Internet use with a focus on privacy and security.  That study has a question (p. 94) which relates to Internet harassment and stalking but does not define these terms to the respondents and does not distinguish stranger harassment from harassment by acquaintances (including people from the respondent's past) or even by advertisers.  Eleven percent of the men interviewed and thirteen percent of the women interviewed stated that they had experienced Internet harassment or stalking.

The study also asks (p. 98) whether the respondent agreed with the statement "Something happened online which led me to physical danger."  Five percent of the female respondents and three percent of the male respondents answered in the affirmative.  But it's hard to know what specific types of examples those answers might have reflected.  Anything that could lead a person to physical danger could qualify, not just harassment by strangers on the net.

Whatever our interpretation of that study, it doesn't demonstrate that men are harassed more on the net than women, right?

That assertion is based on the Demos study.  I tried to find the study at the link given here but was unsuccessful. Thus, what I have to say about the study is based on the summary information the people at Demos have provided and a couple of takes on the study found elsewhere.

The Demos study is on Twitter use, not on all online activities.  It consists of two million tweets sent to some selected group of male and female celebrities, musicians, politicians and journalists, specifically picked so that half of those tweets were to women and half to men.  A million each, right?

But what were the numbers of the male and female celebrities, politicians, musicians and journalists?  Who were they?*  I think both names and numbers matter, for a proper understanding of the study.  For example, if the samples of women and men do not reflect the same average level of fame the results can be biased by that.  Suppose that it's the most famous people in the field who get the most harassment.  If that group includes more men than women, you could get the Demos results without it meaning that famous men are more likely to be harassed than equally famous women.
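To make the fame point concrete, here is a small arithmetic sketch in Python.  Every number in it is invented for illustration and none comes from the Demos study.  It assumes that the share of abusive tweets depends only on how famous the recipient is, not on gender, and that the men's million tweets happen to be concentrated on the most famous recipients.

```python
# Hypothetical numbers only: none of these come from the Demos study.
# Assumption: the share of abusive tweets depends on fame, not on gender.
ABUSE_PERCENT = {"very famous": 5, "less famous": 1}

# Assumption: each group receives one million tweets, but the men's
# tweets are concentrated on the very famous recipients.
TWEETS = {
    "men": {"very famous": 600_000, "less famous": 400_000},
    "women": {"very famous": 200_000, "less famous": 800_000},
}

def abusive_share(group):
    """Return the percentage of a group's tweets that are abusive."""
    counts = TWEETS[group]
    abusive = sum(n * ABUSE_PERCENT[tier] // 100 for tier, n in counts.items())
    return 100 * abusive / sum(counts.values())

for group in TWEETS:
    print(group, abusive_share(group))
```

In this made-up case the men come out at 3.4 percent and the women at 1.8 percent, even though the rate at each level of fame is identical for both.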

You might now wonder how someone could read two million tweets and rank them for their offensiveness.  The answer is that nobody did that.  The study used a shortcut:  It combed the tweets for words from a list of words deemed offensive**.  If one or more of those words were contained in a tweet, then that tweet was counted as harassing.  That's actually a neat way of getting around the problem of coping with floods of data.

But it has its problems.***  For example, suppose I tweeted to someone:  "You bastard, you!  Well done, old buck!"   Or "We are fucked as long as these politicians are in power."

Those would be counted as tweets harassing the recipient in the Demos study.  And if I tweeted to someone "I just got called an old bull dyke.  Ever happen to you?"  that, too, would be counted as me harassing the recipient.

On the other hand, a tweet that is explicitly threatening and horrible would slip past the study as long as it didn't use any of the dirty words.
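The two kinds of error can be seen in a minimal Python sketch of a word-list filter.  This is only my guess at the general approach, not the code Demos used.  The word list is a tiny hypothetical one, the function name is made up, and the last example tweet is invented to stand in for a threat that uses no listed word.

```python
import re

# Hypothetical word list for illustration; not the list Demos used.
OFFENSIVE_WORDS = {"bastard", "fucked", "dyke"}

def is_flagged(tweet):
    """Flag a tweet if any of its words appears on the word list."""
    words = re.findall(r"[a-z']+", tweet.lower())
    return any(word in OFFENSIVE_WORDS for word in words)

examples = [
    "You bastard, you!  Well done, old buck!",
    "We are fucked as long as these politicians are in power.",
    "I just got called an old bull dyke.  Ever happen to you?",
    # Invented example of a threat that uses no listed word.
    "I know where you live, and you will regret this.",
]

for tweet in examples:
    print(is_flagged(tweet), tweet)
```

The first three tweets are flagged and the last one is not, which is exactly the wrong way round if the goal is to count harassment.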

Thus, strictly speaking, the Demos study is about the number of tweets celebrities, politicians, journalists and musicians receive which contain dirty words.  There's no doubt that some, perhaps many, of those are harassing tweets.  But the relationship is not one-to-one.  "Fucking brilliant!"  is not a harassing tweet, and there are subcultures on the net which use language of that sort.  Figuring out the relationship between tweets containing naughty words and harassing tweets would be a good project for someone.

Given all this, you have to interpret the Demos results with great caution.  Young seems to take them all at face value, which leads her to state that female journalists were sent more than twice as many naughty-word-tweets as male journalists:

"The only category in which women got more Twitter abuse than men was journalism: abusive messages accounted for more than 5 percent of the tweets sent to the female journalists and TV presenters in the study and fewer than 2 percent of the ones sent to the male journalists."

Perhaps.  It's also possible that many of the tweets were about online harassment or about sexist slurs the tweeter had heard and wished to discuss and so on.

Young then states that Piers Morgan, the most harassed celebrity in the study, is really a journalist and that, had he been included in that group:

"he would have single-handedly raised the proportion of abusive tweets to male journalists to almost 6 percent of the total"

That inadvertently introduces another problematic aspect of these types of studies:  They do not control for the overall language and attitude of the tweet recipient.  Take the case of Morgan:

"Nearly 10 percent of the tweets directed at Piers Morgan were derogatory in nature, according to The Drum, a marketing and media website.
"Of course, a scan of Morgan's tweets shows he dishes it out as well as he takes it. Many of his replies to followers are corrections to their grammar and contain British insults like 'wanker.'"

What are my conclusions about this whole kerfuffle, you might ask.  If you like to think of such matters, figure out how we could test for something like Internet harassment by gender.  Ideally we'd wish to compare men and women who are equally famous, who behave in exactly the same manner and who are active online in the same capacities.  How can we get close to that ideal study?

The sort of studies Demos has used are not without value.  But using dirty words is not the same thing as harassing someone, in particular among certain sub-cultures, and it's easy to harass someone without using naughty words.  An earlier Demos study of misogyny on Twitter (May 2014) used the word list approach.  But it also took a small sample which was more carefully analyzed.  Here are the results concerning the term "rape" in the tweets of that smaller sample:

"Over the time period, there were 49,669 unique users contributing to the ‘conversation’ data set. Of those users, men use the word ‘rape’ more than women, although it is not a significant difference.
"Based on a random sample of 381 user-profiles of people who tweeted as part of the non-media-related conversation about rape, we found that 4 per cent of users made some reference to gender-related activism, 2 per cent appeared to be overtly sexist, 9 per cent expressed some kind of maladjustment or anti-social sentiment, 8 per cent mentioned sports, 10 per cent mentioned politics in some way and 12 per cent mentioned music."

As things stand, our understanding of online interactions is in its infancy.  But what I'm pretty sure of is this:  The evidence Young presents cannot be used to simply conclude that men are harassed more often online than women.

Added later:

*I finally found the list of the people included (in the Excel file for the study results).  Each sample has about twice as many women as men, and the samples contain five men, except for one which contains six men.  I'm not capable of judging the relative fame of the persons the study covers, but it's possible that the study doesn't really address the possibility that the men are more famous than the women (or that few men are very famous) and that the more famous people might get more tweets of all types.  The small number of individuals included means that outliers are more likely to affect the results, too.

**It has been argued that they may also have used various filters on the initial selection of tweets to differentiate between positive and negative tweets, indirect quotes in tweets and so on.  But the link the press release provides is to a fairly different type of study, about European politics.  I cannot tell if the methodology section of that study (from p. 92 on) is meant to apply to this study, too.  Some of the filters used in that paper are about specific events, whereas this study was not, and in any case a different study cannot tell us how well or poorly the filters might have performed in the particular study about Twitter harassment of famous people.

But several people have now noted that the original word list of dirty words does not include the word "rape."  That omission (if the word list the press release gives us is the correct one) would make most of the study iffy on any gender differences in experiencing Twitter harassment.

Added even later:  It does look like the word "rape" was included in the list of dirty words for the study even though the link the study summary gave to us as the contents list does not include it.

I'm wondering why the reporting on the study is so unusual.  Why not have a proper paper with all the data included and the methodology explained?  The way things stand it's tough to wade through bits and pieces, to learn that a different study is where the methodology of this study can be found, to find the results given in a very short-hand Excel table and so on. And none of the bits and pieces tells us how the people in the samples were selected.  Because the number of individual celebrities is fairly small, their individual personalities are likely to matter quite a bit here.

***These problems are smaller if the study used filters which performed well when compared to human assessors.  But we are not given the information on that.  Even in that case the initial selection of words would limit the kinds of tweets which are selected, so that "polite threats" would be completely ignored.
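
For anyone wondering what comparing a filter to human assessors might look like, here is a rough Python sketch under my own assumptions.  The six verdicts are made up and the variable names are mine.  It computes precision (the share of flagged tweets a human also judged harassing) and recall (the share of tweets judged harassing that the filter caught).

```python
# Hypothetical data: six tweets, each with the filter's verdict and a
# human assessor's verdict. Nothing here comes from the Demos study.
flagged_by_filter = [True, True, True, False, False, False]
judged_harassing = [True, False, False, True, False, False]

pairs = list(zip(flagged_by_filter, judged_harassing))
true_pos = sum(f and h for f, h in pairs)       # flagged and harassing
false_pos = sum(f and not h for f, h in pairs)  # flagged but harmless
false_neg = sum(h and not f for f, h in pairs)  # harassing but missed

precision = true_pos / (true_pos + false_pos)
recall = true_pos / (true_pos + false_neg)
print(precision, recall)
```

With these invented verdicts only one flagged tweet in three is really harassing, and half of the harassing tweets are missed.  Numbers of this kind are what the study summary does not give us.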