18 October 2021

Facebook Uses Deceptive Math to Hide Its Hate Speech Problem


IN PUBLIC, FACEBOOK seems to claim that it removes more than 90 percent of hate speech on its platform, but in private internal communications the company says the figure is only an atrocious 3 to 5 percent. Facebook wants us to believe that almost all hate speech is taken down, when in reality almost all of it remains on the platform.

This obscene hypocrisy was revealed amid the numerous complaints, based on thousands of pages of leaked internal documents, which Facebook employee-turned-whistleblower Frances Haugen and her legal team filed with the SEC earlier this month. While public attention on these leaks has focused on Instagram’s impact on teen health (which is hardly the smoking gun it’s been touted as) and on the News Feed algorithm’s role in amplifying misinformation (hardly a revelation), Facebook’s utter failure to limit hate speech and the simple deceptive trick it’s consistently relied on to hide this failure are shocking. Together they expose just how much Facebook relies on AI for content moderation, just how ineffective that AI is, and how necessary it is to force Facebook to come clean.

In testimony to the US Senate in October 2020, Mark Zuckerberg pointed to the company’s transparency reports, which he said show that “we are proactively identifying, I think it’s about 94 percent of the hate speech we ended up taking down.” In testimony to the House a few months later, Zuckerberg similarly responded to questions about hate speech by citing a transparency report: “We also removed about 12 million pieces of content in Groups for violating our policies on hate speech, 87 percent of which we found proactively.” In nearly every quarterly transparency report, Facebook proclaims hate speech moderation percentages in the 80s and 90s like these. Yet a leaked document from March 2021 says, “We may action as little as 3-5% of hate … on Facebook.”

Was Facebook really caught in an egregious lie? Yes and no. Technically, both numbers are correct—they just measure different things. The measure that really matters is the one Facebook has been hiding. The measure Facebook has been reporting publicly is irrelevant. It’s a bit like if every time a police officer pulled you over and asked how fast you were going, you always responded by ignoring the question and instead bragged about your car’s gas mileage.

There are two ways that hate speech can be flagged for review and possible removal. Users can report it manually, or AI algorithms can try to detect it automatically. Algorithmic detection is important not just because it’s more efficient, but also because it can be done proactively, before any users flag the hate speech.

The 94 percent number that Facebook has publicly touted is the “proactive rate,” the number of hate speech items taken down that Facebook’s AI detected proactively, divided by the total number of hate speech items taken down. Facebook probably wants you to think this number conveys how much hate speech is taken down before it has an opportunity to cause harm—but all it really measures is how big a role algorithms play in hate-speech detection on the platform.

What matters to society is the amount of hate speech that is not removed from the platform. The best way to capture this is the number of hate-speech takedowns divided by the total number of hate speech instances. This “takedown rate” measures how much hate speech on Facebook is actually taken down—and it’s the number that Facebook tried to keep secret.
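To see how the two measures can diverge, here is a minimal sketch in Python. The function names and the counts are hypothetical, chosen only so the arithmetic lands near the figures quoted above; they are not Facebook's data or its internal definitions.

```python
def proactive_rate(ai_flagged_takedowns: int, total_takedowns: int) -> float:
    """Share of removed items that the AI flagged before any user report."""
    return ai_flagged_takedowns / total_takedowns


def takedown_rate(total_takedowns: int, total_hate_speech: int) -> float:
    """Share of all hate speech on the platform that was actually removed."""
    return total_takedowns / total_hate_speech


# Hypothetical counts, chosen only to illustrate the arithmetic.
total_hate_speech = 1_000     # every hate speech item posted
total_takedowns = 50          # items removed by any route
ai_flagged_takedowns = 47     # removed items the AI caught first

print(f"proactive rate: {proactive_rate(ai_flagged_takedowns, total_takedowns):.0%}")
print(f"takedown rate:  {takedown_rate(total_takedowns, total_hate_speech):.0%}")
```

With these made-up counts, the same platform reports a 94 percent proactive rate and a 5 percent takedown rate. The first ratio never looks at the hate speech that stayed up, which is why it can be high while the second is dismal.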

Thanks to Haugen, we finally know the takedown rate, and it is dismal. According to internal documents, more than 95 percent of hate speech shared on Facebook stays on Facebook. Zuckerberg boasted to Congress that Facebook took down 12 million pieces of hate speech in Groups, but based on the leaked estimate, we now know that around 250 million pieces of hate speech were likely left up. This is staggering, and it shows how little progress has been made since the early days of unregulated internet forums—despite the extensive investments Facebook has made in AI content moderation over the years.
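The back-of-envelope arithmetic behind an estimate like this can be sketched as follows. It rests on an assumption worth stating plainly: that the leaked 3 to 5 percent figure, which describes hate speech on Facebook generally, also applies to the 12 million Groups takedowns Zuckerberg cited. The helper name `implied_left_up` is ours, for illustration only.

```python
def implied_left_up(removed: int, takedown_rate: float) -> float:
    """Items left on the platform, given the count removed and the takedown rate."""
    total = removed / takedown_rate   # implied total amount of hate speech
    return total - removed            # what was never taken down


removed = 12_000_000  # Groups takedowns cited in the House testimony

# Assumption: the leaked 3-5% range applies to this same pool of content.
for rate in (0.03, 0.05):
    left = implied_left_up(removed, rate)
    print(f"takedown rate {rate:.0%}: about {left / 1e6:.0f} million left up")
```

Under that assumption, the range works out to roughly 228 million to 388 million pieces left up, so the figure of around 250 million sits near the more charitable end of the leaked estimate.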

Unfortunately, the complaint Haugen’s legal team filed to the SEC muddied the issue by prominently asserting in bold, “Facebook’s Records Confirm That Facebook’s Statements Were False.” This itself is false: Facebook did not technically lie or “misstate” the truth, as the complaint alleges—but it has repeatedly and unquestionably deceived the public about what a cesspool of hate speech its platform is, and how terrible the company is at reining it in.

Don’t be surprised to see Facebook’s defense team jump on the Haugen team’s sloppiness. But don’t be misled by any effort to discredit the whistleblower’s findings. The bottom line is that Facebook has known for years that it is failing miserably to control hate speech on its platform, and to hide this from investors and the public Facebook has peddled the meaningless proactive rate to distract us from the meaningful and closely guarded takedown rate.

Another measure that Facebook sometimes gloats about is the “prevalence” of hate speech. When asked for comment on this article, a Facebook spokesperson wrote in an emailed statement that “the prevalence of hate speech on Facebook is now 0.05 percent of content viewed and is down by almost 50 percent in the last three quarters.” Prevalence does give a sense of how much hate speech is on the platform, but it still paints a deceptively sanguine portrait. The distribution of hate speech is so uneven that a blunt percentage like this conceals the high prevalence of hate speech that occurs in specific communities and that many individual users experience. Moreover, seeing non-hateful content on Facebook doesn’t make the hateful content any less harmful, yet this is exactly what reliance on prevalence suggests.

As public attention moves from uncovering the ills of social media to finding ways to address them, there are two important takeaways here.

First, Zuckerberg has long repeated the assertion that improvements to AI will be the company’s key to dealing with harmful content. He said it in the wake of the 2016 election, after Russian misinformation campaigns ran wild on the platform. He said it in a 2017 Facebook Live video, while grilling meat in his backyard: “With AI especially, I’m really optimistic. People who are naysayers and kind of try to drum up these doomsday scenarios—I just, I don’t understand it.” It’s telling that Facebook’s CEO shares more granular detail on how he smokes brisket from a cow he butchered himself (set to 225 degrees for eight hours, flipped every two hours) than he does his company’s AI proficiency, but here’s a doomsday scenario he can understand: It’s 2021, and Facebook’s AI is still only catching a tiny fraction of the platform’s hate speech.

Unfortunately, there’s no silver bullet when it comes to online hate speech. Content moderation is an incredibly challenging problem, and we need to admit that AI is very far from the panacea it is frequently hawked as. But if there’s one point driven home more than anything else by Haugen and the whistleblowers who preceded her, it’s that we can’t just hope for honesty from the tech giants—we must find ways to legally mandate it. This brings us to the second takeaway:

A simple but helpful transparency regulation would be to require that all platforms publish their takedown rates for the different categories of harmful content (such as hate speech and misinformation). Takedown rates can surely be gamed, but this would still be a step in the right direction, and it would prevent the deceptive trick Facebook has been using for years. In the same way you and I need a credit score to get a loan, Facebook and other social media platforms should need a content moderation credit score—based on takedown rates, not proactive rates or other meaningless measures—to continue to do business.
