A recent article listed the world’s most “evil” companies, and it seems most have one thing in common: They’re plagued by toxic content, some not of their own making.
Toxic content comes in many forms, and what one person sees as “over the line” another may deem acceptable. But before tech platforms can wrestle with thorny First Amendment issues, they need to solve a more practical one: finding a cost-effective way to identify that content in the first place.
To do so, tech companies have taken one of two approaches: Throw people at the problem, hoping to hire enough moderators to keep up with the millions or billions of new posts per day; or turn to algorithms, which are far from universally applicable or contextually intelligent.
So far, neither model has made an appreciable dent in the flood of toxic content. What they need, according to anti-toxicity startup L1ght — which works with social networks, games, and hosting providers to keep toxicity from affecting young users — is a technology that can handle context.
What’s Wrong With Human Moderation?
Human moderators used to be the gold standard, but like real gold, they come at a steep price.
“Human moderation has real human costs,” L1ght CEO Zohar Levkovitz explains. “It’s not just about the labor expenses.”
At Facebook, for example, 15,000 contracted moderators manually check flagged posts for violence, hate speech, and sexual content. Facebook has been tight-lipped about the financial costs of its moderation program.
A report by The Verge detailed the toll on Facebook's contracted moderators, who review large volumes of graphic content by hand. Micromanagement, low pay, and precarious job security compounded the strain of judging borderline posts.
While algorithmic moderation is more scalable, most applications have proven ineffective. “Consumer-oriented parental control apps and simplistic AI solutions that look for bad words haven’t worked in the real world,” Levkovitz notes. “We knew we needed a different approach in order to save kids at scale.”
Tech Makes Mistakes
Because traditional algorithmic approaches to moderation struggle to take content’s context into account, they’re prone to two types of statistical error.
Type I errors, called “false positives,” happen when a moderation algorithm flags and removes a post it shouldn’t have. Because companies that build moderation tools don’t want problematic content slipping through the cracks, they often build their models to err toward over-moderation.
In practice, unfortunately, these errors tend to limit legitimate political discourse. A second article by The Verge found that Google's Perspective API rated "Fake news" as 47% similar to comments labeled "toxic," while "Bad hombre" scored 55%. With a threshold in between, Perspective would let the first phrase slide but censor the second.
Type II errors, called "false negatives," happen when a post that should have been removed slips through. The team behind Microsoft's Artemis, a tool built to fight child abuse, worries about exactly this type of error. Artemis promises to identify "grooming" behaviors, which child predators use to gain their targets' trust.
But Artemis has its limits. First, it only works in English. Second, it can only scan text-based content, not photos, audio, or video. The program is "by no means a panacea," Microsoft admits.
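Both error types fall naturally out of the "simplistic AI solutions that look for bad words" Levkovitz describes. A minimal sketch (a hypothetical blocklist filter, not any vendor's actual system) shows how a context-blind keyword check produces each:

```python
# Hypothetical blocklist -- purely illustrative, not a real moderation list.
BAD_WORDS = {"kill", "die"}

def naive_flag(post: str) -> bool:
    """Flag a post if it contains any blocklisted word, ignoring context."""
    words = {w.strip(".,!?").lower() for w in post.split()}
    return bool(words & BAD_WORDS)

# Type I error (false positive): benign hyperbole gets flagged.
print(naive_flag("This exam is going to kill me!"))       # True

# Type II error (false negative): coded language without blocklisted
# words slips straight through.
print(naive_flag("Meet me off-platform, our little secret"))  # False
```

Tightening the blocklist reduces one error type only by inflating the other, which is why word-matching alone can't close the gap.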
To be sure, Artemis and Perspective are improvements when it comes to content moderation. But their limitations are real; fighting toxic content takes an “all of the above” approach. How can technology accomplish that?
Realizing the Context of Conversations
Human moderators aren’t scalable enough, and many moderation algorithms aren’t accurate enough. Where is the middle ground?
Levkovitz points out that L1ght’s technology analyzes the human qualities and context behind conversations. It’s built with input from internal behavioral scientists, data scientists, anthropologists, and more.
“L1ght’s algorithms are trained to think like kids and their potential attackers,” Levkovitz says. “By combining deep learning methodologies with human knowledge, we can spot nuance, slang, and secret meanings that other tools can’t. We can even predict when a conversation is about to take a wrong turn before it happens.”
In Fast Company, Levkovitz’s co-founder, Ron Porat, provides an example: Someone who writes, “Omg, I’m going to kill myself. Can’t do this anymore” online should be taken seriously. Say, however, that the author follows that statement with “I have an exam tomorrow, and I’m still procrastinating. I’m going to die.”
In context, it’s clear that the person is simply making a dramatic statement. Human moderators can make that inference, but many algorithmic ones can’t.
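A toy rule-based sketch can illustrate the inference Porat describes. This is purely hypothetical (L1ght uses deep learning, not hand-written rules like these); the phrase lists and cue words below are invented for illustration:

```python
# Invented phrase lists for illustration only.
SELF_HARM_PHRASES = ("kill myself", "going to die")
BENIGN_CONTEXT_CUES = ("exam", "homework", "procrastinating", "deadline")

def flag_in_context(messages: list) -> bool:
    """Flag a conversation only if a self-harm phrase appears
    without a nearby mundane-context cue."""
    text = " ".join(messages).lower()
    if not any(p in text for p in SELF_HARM_PHRASES):
        return False
    # Mundane context (exams, deadlines) suggests hyperbole, not crisis.
    return not any(cue in text for cue in BENIGN_CONTEXT_CUES)

# Isolated, the message reads as a crisis and is flagged.
print(flag_in_context(
    ["Omg, I'm going to kill myself. Can't do this anymore"]))  # True

# With the exam follow-up, the same words read as hyperbole.
print(flag_in_context([
    "Omg, I'm going to kill myself. Can't do this anymore",
    "I have an exam tomorrow, and I'm still procrastinating.",
]))  # False
```

Real systems need far more than keyword cues, of course; the point is only that the unit of analysis must be the conversation, not the single message.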
The rest of the challenge is proactivity. Platforms must prevent people from posting problematic content in the first place, which requires contextually intelligent algorithms.
“For reasons of scale, tech will need to be the first line of defense,” Levkovitz says. He predicts that people will continue to manage moderation operations due to inevitable edge cases, but new technologies could drastically reduce the manual work required.
L1ght may be close to solving the context problem. But platforms themselves will have to do the rest: Hire with diversity in mind, support human moderators emotionally, and develop rigorous review processes.
Developing intelligent moderation tools will be tough, to be sure. But looking back, we'll likely see context as the harder problem to crack.