9 March 2021

The A.I. Industry Is Exploiting Gig Workers Around the World — Sometimes for Just $8 a Day

Dave Gershgorn

OneZero’s General Intelligence is a roundup of the most important artificial intelligence and facial recognition news of the week.

Modern artificial intelligence relies on algorithms processing millions of examples of images or text. A picture of a bird in an A.I. dataset would be manually tagged “bird” so that the algorithm can associate aspects of that image with the category “bird.”
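
To make that concrete, here is a minimal, purely illustrative Python sketch of what a hand-labeled image dataset might look like before it is handed to a training algorithm. The file names and labels are invented for this example and are not drawn from any real dataset.

# Purely illustrative: roughly how a hand-labeled image dataset is
# represented before training. File names and labels are made up.
labeled_examples = [
    {"image_path": "images/0001.jpg", "label": "bird"},
    {"image_path": "images/0002.jpg", "label": "bird"},
    {"image_path": "images/0003.jpg", "label": "car"},
]

# Human-assigned labels are mapped to integer class IDs that a model can predict.
classes = sorted({example["label"] for example in labeled_examples})
class_to_id = {name: i for i, name in enumerate(classes)}

for example in labeled_examples:
    example["class_id"] = class_to_id[example["label"]]

print(class_to_id)          # {'bird': 0, 'car': 1}
print(labeled_examples[0])  # the first tagged image with its numeric class

Every one of those label fields was typed, clicked, or confirmed by a person.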

The process of tagging this data by hand, at the scale of millions of examples, is time-consuming and mind-numbingly monotonous.

Much of this work is done outside the United States and other Western countries and exploits workers from around the world, according to a new paper from researchers at Princeton, Cornell, the University of Montreal, and the National Institute of Statistical Sciences.

Data-labeling companies like Sama (formerly Samasource), Mighty AI, and Scale AI use labor from sub-Saharan Africa and Southeast Asia, paying employees as little as $8 per day. Meanwhile, these companies earn tens of millions of dollars in revenue per year.

Take Amazon Mechanical Turk, an online gig working platform where anyone in the world can log on and perform simple tasks for a few cents each. Until 2019, Mechanical Turk required a U.S. bank account to get paid, meaning that anyone working for the platform without access to U.S. banking wouldn’t even be paid in legal currency. Instead they were compensated in Amazon gift cards.

One of the most impactful datasets in the history of artificial intelligence, ImageNet, relied on Mechanical Turk workers who were paid $2 per hour, according to the paper.

Furthermore, the data being tagged has been selected by developers and programmers in the United States or other Western countries, meaning the datasets often exclude global cultural context.

“Images of grooms are classified with lower accuracy when they come from Ethiopia and Pakistan, compared to images of grooms from the United States,” the paper says. “Many of these workers are contributing to AI systems that are likely to be biased against underrepresented populations in the locales they are deployed in, and may not be directly benefiting their local communities.”

A potential fix for this, the researchers write, is simply integrating these data labelers into the A.I. development process, rather than keeping them as gig workers making cents per image labeled. Workers would be paid equitably, and their insight and expertise would help address disparity in the data collection process, improving the accuracy of the product overall.

The paper points to Masakhane, an organization dedicated to the preservation of African languages through artificial intelligence, as an example of equitable A.I. development. Masakhane doesn’t create data for A.I. researchers, but instead fosters a community of people who label, research, and build algorithms for the African continent.

“We can likely connect you with annotators or translators but we do not support shallow engagement of Africans as only data generators or consumers,” the organization wrote on its website.

And if companies already include data labelers in the process, the paper says, these individuals should be given the opportunity to grow inside the company.

“We suggest viewing AI development as a path forward for economic development,” they write. “This development activity should not be focused on low-productivity activities, such as data-labeling, but instead on high-productivity activities like model development/deployment and research.”
