How Silicon Valley’s successes are fueled by an underclass of ‘ghost workers’

The invisible labor that makes our technology run

Graphic by Michele Doying / The Verge

“Ghost work” is anthropologist Mary L. Gray’s term for the invisible labor that powers our technology platforms. When Gray, a senior researcher at Microsoft Research, first arrived at the company, she learned that building artificial intelligence requires people to manage and clean up data to feed to the training algorithms. “I basically started asking the engineers and computer scientists around me, ‘Who are the people you pay to do this task work of labeling images and classification tasks and cleaning up databases?’” says Gray. Some people said they didn’t know. Others said they didn’t want to know and were concerned that if they looked too closely they might find unsavory working conditions.

So Gray decided to find out for herself. Who are the people, often invisible, who pick up the tasks necessary for these platforms to run? Why do they do this work, and why do they leave? What are their working conditions?

Gray ended up collaborating with fellow MSR senior researcher Siddharth Suri to write Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass (Houghton Mifflin Harcourt).

The Verge spoke to Gray about her research findings and what they mean for the future of employment.

This interview has been lightly edited for clarity.

Labeling data to feed to algorithms is one obvious example of ghost work. Content moderation is another. What are other examples?

Filling out surveys, captioning and translation work, any sort of transcription service. Doing web research, verifying location addresses, beta testing, user testing for user designs. Anything you can think of as knowledge work, like content creation, writing editorial, doing design. You name it. The list is endless. All of those are tasks that can be distributed online. It’s all of the things we’re used to seeing in the office, and this is what it looks like to dismantle that as a full-time job and turn it into projects for myriad people.

Am I right in thinking that basically every tech company relies or has relied on ghost work?

I would be hard-pressed to find any business that sells itself as AI that either didn’t deeply rely on ghost work to generate their basic product or isn’t very much reliant on it today. There are so many startups and businesses out there, anything that calls itself “business insights” or “intelligence and analytics.” That’s using crowdsourcing or collective intelligence, and that’s relying on ghost work. There is no way around the need for people to sift through the piles of what’s called unstructured data.

Sometimes, people think that as technology gets better, we won’t need this type of ghost work anymore. But you write that “the great paradox of AI is that the desire to eliminate human work generates new tasks for humans.” So clearly you don’t subscribe to that belief. Why not?

What might change are the specific tasks. Believing that AI will never need humans labeling data means believing that language will never change, style will never change. Service industries, especially, are so difficult to fully automate because being able to listen to somebody’s voice and register that person’s silent anger is such a human capacity. So there are cases when AI will, I argue, always fall short.

Engineers are always wonderfully optimistic about opportunities. As an anthropologist, I know how complicated it is to think cross-culturally about these questions. Even if we fairly reliably get to 100 percent of spoken English with a flat Midwestern accent, what about when you move into vernacular and slang and folks who will splice together languages and code switch? Anytime you see an autotranslation of a talk, you see the places where language breaks down, often around somebody’s name.

Those are the kinds of computational problems that are intractably hard for AI to capture because there’s not enough data consistently available to model what’s going to be the next utterance that somebody is saying using Spanglish. We’ve already effectively automated all of the easy things.

One interesting thing you mention is that we don’t have good labor statistics for how many people are doing ghost work. Why is that?

The biggest challenge is that the ways we count jobs are often in relationship to professional identities, or really clearly defined capacities or skills, and no one is oriented to a world of work that is project-based. We don’t have the language to describe an image tagger or a captionist. One of the findings in our research is that people have really different mental models. They may or may not identify as self-employed. They may or may not identify as a journalist if they write for a content farm, and that might change whether they decide to answer a survey question to help us measure this workforce. Let alone the fact that ghost work is distributed around the globe, and there is no global bureau of labor statistics.

A key question in this book is: who are the people doing ghost work? So, who are they? It sounds like they could be almost anyone.

When we got our initial set of surveys back from the four different platforms we studied, it was clear that there were as many women as men, though they worked different hours. People had college educations, but that wasn’t surprising because that maps onto knowledge work and information services broadly.

They are all of us. These are the folks who, for reasons of social capital, don’t have access to a network that was going to boost them into the full-time job. That’s the pattern I see sociologically or anthropologically. They’re first-generation college-going more often than not. This is a group of people who don’t have strong social ties to elites.

What are people’s motivations for this work?

There’s not one type of person doing this work and not one single motivation. There is a core group of people who are turning to this work, often because of other constraints on their time. People would say that they don’t have time to commute and were going to be commuting at least two hours for a comparably paid job, and that was going to cut into the amount of money they could make. That’s the calculus they’re making here. So they’re deciding to turn to this work and, effectively, once they’ve figured out how to make enough money on enough platforms, they cobble together the equivalent of a full-time salary to meet their needs. We call those folks “always on,” and they’re turning this into full-time work by the number of income streams. But this group of people is a small percentage: 10 to 15 percent, depending on the platform. This is what the research tells us about all these platforms. The core group of people is doing the bulk of the work.

Then there are the “regulars,” a deep trench of people who can step in at any time. The regulars are the ones that enable the “always on” people because if the “always on” steps out, there are enough people in that pile of regulars who are going to be able to step in. They’re often caregivers, and they had other motivations; they were pursuing another passion project or they were going back to education and taking courses, and this gave them a means to be able to finance that.

Lastly, there is the long tail of experimentalists, which is the name we gave the people who try one or two projects, figure out that this is not for them, and leave. The most important part of doing anthropological work is we could meet the people who left and figure out why. And it had to do with never hooking into a community of peers to help lower their costs, feeling like they don’t have enough support, and that this was too difficult to figure out. And it was exhausting cognitively.

A feature of this kind of market is that anyone can work for anyone else. What happens in that kind of environment?

For anybody who becomes a regular or “always on,” they’re invested and bring the same framework they have to any job. It’s an amazing amount of self-policing because workers are invested in making sure that work comes back to the pool. They want to make sure their peers are doing well because, if not, that could work against their interests in getting the next job.

Businesses should be equally invested in this accountability in the supply chain. If they’re relying on lowering their costs of investing and what they need most is somebody who’s ready and willing and able to jump in for a project, the exchange is to create some mechanism that ensures that anybody who is entering is refreshed and has the opportunity to keep up. Otherwise, it’s not sustainable as a labor market.

But companies aren’t doing that. They’re not creating the accountability or trust or culture that would help the ghost workers.

If you talk to any of these companies, most of them believe that we’re going to get this automated and think, “I just need these people for a little while.” That’s precisely our problem and that’s historically been our problem since the Industrial Age: treating badly people who do the contingent work that can’t quite be automated. We stop paying attention to these people and their work conditions, we start treating them as something that can be replaced eventually, and we don’t value the fact that they’re doing something that a mechanical process or computational process can’t do.

I hate the parallel to horsepower. This is not like replacing horses with automobiles. People are not performing a mechanical task. They’re extending something distinct about humans — their creativity and their interpretation.

What should we do to address this? What are the policy suggestions?

At the very least, it means valuing everybody’s contribution. The first step is being able to identify the people who have contributed. In Bangladesh, it made a huge difference in textiles when companies selling products had to tell us who was involved in making the shirt on my back. There should be a clear record thanking anybody who contributes labor to an output or service. The consumer should always be able to trace back the supply chain of people who have had a hand in helping them achieve their goals.

This is about regulating a form of employment that does not fit in full-time employment or fully in part-time employment or even clearly in self-employment. I believe that this is the moment to say the classification of employment no longer functions. Anybody who’s working age should have a baseline of provisions that are supplied by companies.

If companies want to happily use contract work because they need to constantly churn through new ideas and new aptitudes, the only way to make that a good thing for both sides of that enterprise is for people to be able to jump into that pool. And people do that when they have health care and other provisions. This is the business case for universal health care, for universal education as a public good. It’s going to benefit all enterprise.

I want to get across to people that, in a lot of ways, we’re describing work conditions. We’re not describing a particular type of work. We’re describing today’s conditions for project-based task-driven work. This can happen to everybody’s jobs, and I hate that that might be the motivation because we should have cared all along, as this has been happening to plenty of people. For me, the message of this book is: let’s make this not just manageable, but sustainable and enjoyable. Stop making our lives wrap around work, and start making work serve our lives.