Ben Bell, a data scientist at text-analytics start-up Idibon, decided to apply his company’s technology to the site to work out which subreddits have communities you would want to be a part of, and which you would be best avoiding.

Bell’s interest was sparked by a post asking Redditors to suggest their nominees for the most “toxic communities” on the site. Suggestions included the parenting subreddit – full of “sanctimommies” – and the community for the game League of Legends, which has “made professional players quit the game”.

He writes: “As I sifted through the thread, my data geek sensibilities tingled as I wondered: ‘Why must we rely upon opinion for such a question? Shouldn’t there be an objective way to measure toxicity?’

“With this in mind, I set out to scientifically measure toxicity and supportiveness in Reddit comments and communities. I then compared Reddit’s own evaluation of its subreddits to see where they were right, where they were wrong, and what they may have missed. While this post is specific to Reddit, our methodology here could be applied to offer an objective score of community health for any data set featuring user comments.”

Bell pulled out a sample of comments from every one of the top 250 subreddits, as well as any forum mentioned in the toxicity thread, and subjected them to a number of tests designed to look for toxicity, which he defined as a combination of ad hominem attacks and overt bigotry.

From there, he used a combination of sentiment analysis and human annotation to code each comment as toxic or non-toxic. The former involves applying Idibon’s technology to attempt to categorise comments as positive, negative or neutral in sentiment, which let him narrow down the work required from the human annotators by 96%, by only looking at those subreddits which had already been picked out as containing a lot of negative comments.

Sentiment analysis is a controversial technology: it allows researchers to automatically process reams of data, but it is criticised as an overly simplistic tool. In Bell’s tests, however, it proved its worth.

“Using the sentiment model, we selected for human annotation the 30 most positive posts, the 30 most negative posts, and another 40 random posts from each subreddit,” he said. “Each post was labeled as ‘supportive’, ‘toxic’, or ‘neutral’. We found that for comments that were randomly chosen, the vast majority were labeled ‘neutral’, which didn’t really provide much information for comparison, while the ones that were chosen by our sentiment model were far more likely to be labeled with the predicted sentiment than any other label.”

“Bigoted comments received overwhelming approval”

“There are often problems with sentiment analysis and accuracy, but probably a bigger issue is that it’s not always all that actionable.”

The initial finding was that, as expected, there’s a huge variation in the scale of bigotry on Reddit. Bell weighted the presence of bigoted comments by the approval they had been given by the other members of the community – the comment’s score, which is the net of upvotes minus downvotes.

That led to some interesting quirks: in a number of subreddits, the community is proactive enough at self-policing that the average score for a bigoted comment is negative. Those include /r/Jokes and /r/Libertarian, both fairly self-evident.

At the other end of the spectrum are those communities which seem to deliberately encourage bigotry. Top of the list is /r/TheRedPill, “a subreddit dedicated to proud male chauvinism”, as Bell puts it, “where bigoted comments received overwhelming approval from the community at large”.

When it comes to toxicity as a whole, there is some crossover with mere bigotry. The /r/TumblrinAction forum, devoted to mocking users of the Tumblr social network (which is particularly associated with its queer and female userbase), is ranked by Bell as both bigoted and toxic. Other usual suspects, such as the Atheism, Politics and News subreddits, are all ranked as fairly toxic. The problems with /r/Atheism led to that particular forum being removed from Reddit’s list of default subs.

ShitRedditSays, a subreddit which focuses on highlighting bad content around the rest of Reddit (frequently from a social justice viewpoint), comes close to the top. In part, that is because the toxicity is directed outwards, “at the Reddit community at large”, says Bell.
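The selection step Bell describes – using the sentiment model's scores to pick the 30 most positive, 30 most negative, and 40 random posts per subreddit for human annotation – could be sketched roughly as follows. This is a minimal illustration, not Idibon's actual code: the function and the `sentiment` field are assumed names for a model score in [-1, 1].

```python
import random

def select_for_annotation(posts, n_extreme=30, n_random=40, seed=0):
    """Pick posts from one subreddit for human annotation.

    `posts` is a list of dicts carrying a model-assigned `sentiment`
    score in [-1, 1] (a hypothetical schema, not Idibon's actual one).
    Returns (most_positive, most_negative, random_sample).
    """
    ranked = sorted(posts, key=lambda p: p["sentiment"])
    most_negative = ranked[:n_extreme]
    most_positive = ranked[-n_extreme:]
    # Draw the random comparison set from the remaining middle posts,
    # so the three groups do not overlap.
    remainder = ranked[n_extreme:len(ranked) - n_extreme]
    rng = random.Random(seed)
    random_sample = rng.sample(remainder, min(n_random, len(remainder)))
    return most_positive, most_negative, random_sample
```

Keeping the random set disjoint from the two extreme sets mirrors the comparison Bell reports: the random posts mostly came back labelled "neutral", while the model-selected extremes tended to match their predicted sentiment.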
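The bigotry weighting – scoring a community by the net approval (upvotes minus downvotes) its bigoted comments receive – might be computed along these lines. The field names are an assumed schema, not Bell's actual data; the sign convention is the one the article describes, where a negative average marks a self-policing community.

```python
def mean_bigoted_comment_score(comments):
    """Average net score (upvotes minus downvotes) of comments
    annotated as bigoted.

    Negative means the community downvotes bigotry (self-policing);
    positive means bigoted comments win approval. The `bigoted`,
    `ups` and `downs` fields are an assumed schema for illustration.
    """
    bigoted = [c for c in comments if c["bigoted"]]
    if not bigoted:
        return 0.0  # no annotated bigotry to score
    return sum(c["ups"] - c["downs"] for c in bigoted) / len(bigoted)
```

Under this convention, a subreddit like /r/Jokes in Bell's data would come out with a negative average, while one where bigoted comments are heavily upvoted would score strongly positive.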