Twitter tests safety mode feature to silence abuse

By Jane Wakefield
Technology reporter


Twitter is launching a feature that it hopes will help crack down on abuse and trolling, both of which have become huge issues for the platform.

Safety Mode will flag accounts using hateful remarks, or those bombarding people with uninvited comments, and block them for seven days.

The feature will work automatically once enabled, taking the burden off users to deal with unwelcome tweets.

It will initially be tested on a small group of users.

The feature can be turned on in settings. Once enabled, the system assesses both the tweet's content and the relationship between the tweet's author and the replier. Accounts that the user follows or frequently interacts with will not be autoblocked.
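
As a rough illustration only, the decision described above could be sketched as follows. Twitter has not published its implementation, so every name here (`User`, `looks_harmful`, `autoblock_until`, the interaction threshold) is a hypothetical stand-in, and the keyword check is merely a placeholder for whatever content assessment the real system performs.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

AUTOBLOCK_DURATION = timedelta(days=7)   # seven-day block, as reported
FREQUENT_INTERACTION_THRESHOLD = 5       # hypothetical cutoff, not from Twitter


@dataclass
class User:
    handle: str
    following: set = field(default_factory=set)
    # Hypothetical tally of past interactions, keyed by the other account's handle.
    interaction_counts: dict = field(default_factory=dict)


def looks_harmful(text: str) -> bool:
    """Placeholder content check; a real system would not rely on one keyword."""
    return "insult" in text.lower()


def autoblock_until(user: User, replier: str, reply_text: str,
                    now: datetime) -> Optional[datetime]:
    """Return when an autoblock would expire, or None if no block applies."""
    # Relationship check: followed accounts are never autoblocked.
    if replier in user.following:
        return None
    # Relationship check: frequent contacts are never autoblocked.
    if user.interaction_counts.get(replier, 0) >= FREQUENT_INTERACTION_THRESHOLD:
        return None
    # Content check: only replies judged harmful trigger a block.
    if not looks_harmful(reply_text):
        return None
    return now + AUTOBLOCK_DURATION
```

In this sketch the relationship checks run first, so a followed or frequently contacted account is exempt regardless of what it writes.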

Katy Minshall, head of Twitter UK Public Policy, said: "While we have made strides in giving people greater control over their safety experience on Twitter, there is always more to be done.

"We're introducing Safety Mode; a feature that allows you to automatically reduce disruptive interactions on Twitter, which in turn improves the health of the public conversation."

Like other social media platforms, Twitter relies on a combination of automated and human moderation.

While it has never formally said how many human moderators it uses, a 2020 report by New York business school NYU Stern suggested that it had about 1,500 to cope with the 199 million daily Twitter users worldwide.

A recent study on hate speech produced by Facts Against Hate on behalf of the Finnish government found that Twitter was "the worst of the tech giants" when it came to hate speech.

The answer, according to study author Dr Mari-Sanna Paukkeri, is to utilise artificial intelligence systems which have been trained by humans.

"There are so many different ways to say bad things, and it is rocket science to build tools that can spot these," she said.

Simply highlighting certain words or phrases, a technique many social networks rely on, was not sufficient, she added.

Alongside dealing with abuse on the platform, Twitter has become more determined to crack down on misinformation. In August it partnered with Reuters and the Associated Press to debunk misleading information and stop its spread.

It has previously introduced Birdwatch, a community-moderation system, which allowed volunteers to label tweets they found to be inaccurate.