The spread of abusive hate speech online can deepen political divisions, but researchers have now developed a way to detect it automatically and help combat harmful online content.
They’ve done so using a new Multi-Task Learning (MTL) model, a type of machine learning model that works across multiple datasets.
The technology detects hate speech on social platforms more accurately and consistently. It can also separate abusive content from hate speech, and identify particular topics of hate, including Islam, women, ethnicity and immigrants.
The United Nations defines hate speech as “any kind of communication in speech, writing or behaviour, that attacks or uses pejorative or discriminatory language concerning a person or a group based on who they are, including their religion, race, gender or other identity factor”.
Along with deepening political divisions, abusive hate speech online can marginalise vulnerable groups, weaken democracy and trigger real-world harms, including an increased risk of domestic terrorism.
“As social media becomes a significant part of our daily lives, automatic identification of hateful and abusive content is vital in combating the spread of harmful content and preventing its damaging effects,” said Associate Professor Marian-Andrei Rizoiu, Head of the Behavioural Data Science Lab at the University of Technology Sydney (UTS).
“Hate speech is not easily quantifiable as a concept. It lies on a continuum with offensive speech and other abusive content such as bullying and harassment.”
Associate Professor Rizoiu outlines the new model in the paper Generalizing Hate Speech Detection Using Multi-Task Learning: A Case Study of Political Public Figures, published in Computer Speech & Language, with co-author and UTS PhD candidate Lanqin Yuan.
How does the MTL model work?
A Multi-Task Learning model is able to perform multiple tasks at the same time and share information across datasets.
For detecting abusive hate speech online, the MTL model was trained on eight hate speech datasets from platforms like Twitter (now X), Reddit, Gab and the neo-Nazi forum Stormfront.
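To illustrate the idea, here is a minimal sketch of a multi-task classifier in PyTorch: a shared text encoder learns one common representation, while each dataset gets its own output head, so every training batch also refines the weights shared by all tasks. The architecture, dimensions and names below are illustrative assumptions, not the model from the paper.

```python
# A minimal sketch of multi-task hate speech classification in PyTorch.
# A shared encoder learns a common representation; each dataset gets its
# own classification head, so training on one dataset also updates the
# shared weights used by all the others. Everything here (the
# bag-of-embeddings encoder, sizes, names) is an illustrative assumption.
import torch
import torch.nn as nn

class MultiTaskHateClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_tasks, num_classes=2):
        super().__init__()
        # Shared layers: updated by gradients from every task.
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim)
        self.shared = nn.Sequential(nn.Linear(embed_dim, 128), nn.ReLU())
        # One output head per dataset/task (e.g. eight, one per corpus).
        self.heads = nn.ModuleList(
            [nn.Linear(128, num_classes) for _ in range(num_tasks)]
        )

    def forward(self, token_ids, offsets, task_id):
        features = self.shared(self.embedding(token_ids, offsets))
        return self.heads[task_id](features)

# Toy training step: batches from different datasets are routed to
# their own head, while the shared encoder sees all of them.
model = MultiTaskHateClassifier(vocab_size=10_000, embed_dim=64, num_tasks=8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

token_ids = torch.randint(0, 10_000, (20,))   # flattened token ids
offsets = torch.tensor([0, 7, 14])            # start index of each post
labels = torch.tensor([0, 1, 0])              # 0 = benign, 1 = hateful

for task_id in range(8):                      # one toy batch per dataset
    logits = model(token_ids, offsets, task_id)
    loss = loss_fn(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The key property is that gradients from every dataset flow back through the shared layers, which is what allows knowledge learned on one corpus to transfer to the others.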
Researchers then tested the MTL model on a unique dataset of 300,000 tweets from 15 American public figures, including former presidents, conservative politicians, far-right conspiracy theorists, media pundits, and left-leaning representatives perceived as very progressive.
The analysis revealed that 5,093 of the 5,299 abusive posts, roughly 96 per cent, were generated by right-leaning figures, and that these abusive and hate-filled tweets often featured misogyny and Islamophobia.
The work also identified several topics and targets of hate in the online discourse of public figures, with Muslims, women, and immigrants and refugees among the groups targeted.
“Future research aims to analyse the impact of the detected hateful tweets on public discourse and how figures interact with such hateful speech, whether by following existing discourse or starting new discussions,” the paper says.