ReadWriteBuilders is a series of interviews with developers, designers and other architects of the programmable future.
Companies like Google and Facebook are putting lots of time into figuring out how to solve the world’s problems through technology. Their efforts, though—like bringing the Internet to everyone—are both grandiose and a tad self-serving. And they’ll take years to bear fruit.
In the meantime, an interesting Silicon Valley nonprofit is tackling a similar problem—namely, how to encourage the growth of tech skills in developing nations—in a completely different way. Samasource, founded in 2008, works with companies like Google, Microsoft and Getty Images to provide jobs for people in the Third World. Such jobs, sometimes dubbed “microwork,” include data entry and processing, photo tagging and machine learning.
The workers receive training on software and programs created by Samasource. Once they’re familiar with technical tasks, they can begin earning money by doing online work for companies halfway around the globe. Samasource provides income and educational opportunities for marginalized workers in slums, refugee camps, and impoverished communities across Africa, Asia and the Caribbean.
I talked with Samasource founder Leila Janah about the challenges of building tech-centered businesses in countries without reliable electricity, how Samasource works and the role technology can play in development. What follows is a lightly edited transcript of our conversation.
Rural Africa, Meet The Internet
RW: What are some of the challenges you faced implementing technology in third-world countries?
LJ: The challenges are more nuanced than anything. We have a very first-world, or insular view of technology, especially where we are in Silicon Valley. I’ve been to parts of America like Mississippi where it’s as difficult to access high speed Internet as it is in Africa. And I think very few people who are here and whose cell phones work beautifully and who can always go online are aware of the very real challenges of just accessing the basic layers of technology in less wealthy locations.
That said, there’s also a stereotype about Africa—I heard this a lot from our funders—that people in sub-Saharan Africa need to focus on food and water, and tech is the last thing on their minds. Why are we bothering with cell phones when what they really need is rice?
I think that’s ridiculous. Humans everywhere have the same desire to connect and to contribute, and what I’ve seen is that young kids in refugee camps in Kenya are as likely to be proficient and heavy users of Facebook Zero when it was out, or SMS, as their peers in the U.S. The only thing that they don’t have maybe is a lot of data so they can’t send pictures back and forth.
You’d be surprised—there are young people in all kinds of developing markets who are more proficient and tech savvy than people here. There are two sides to that coin. The biggest challenge we see is often infrastructure. The price of getting online has gone down 90% in some regions in the last few years because it’s now fiber and it used to be all satellite.
So there is hope. But you have to realize that, in the case of Sierra Leone, the entire national budget of the country is under half a billion dollars, for six million people. That’s probably less than Facebook’s marketing budget, or Google’s marketing budget.
When you’re trying to run an entire nation of people for less than a Silicon Valley company spends on marketing, that gives you an idea about the real infrastructure challenges. Sierra Leone, like a lot of these countries, is not electrified. So if you want to get a computer running in rural Sierra Leone, you have to have a generator, and if you have a generator, you have to have diesel. And to get diesel to a place like that, you need to airlift it in, because there aren’t even good roads.
There are very real infrastructure challenges in some of these countries, especially when there isn’t tech space to support building that infrastructure. One of the things we need to do in that case is be very realistic about how much money it’s going to cost to spread the Internet everywhere. The first thing is we need to get electricity everywhere. Still over a billion people lack electricity, right? So how are we going to get Internet to everyone when they don’t have power to read? I think that those things need to be thought through and I think a lot of that is going to require more capital than we’re willing to put up for those initiatives.
What Big Companies Get Out Of Samasource
RW: How would a tech company like Google or Microsoft use Samasource? Does each company have its own API?
LJ: We have an API, and each company typically has their own way of integrating with our data, and there’s an engineering point of contact who is working with our engineering team, who is looking to plug their system into ours.
What Google and Microsoft do, and I can’t really tell you many specifics, but I can say in the general area of machine learning, we’re seeing a lot of demand for our services. Machine learning is one category where the quality of the data that you get is really important, and typically the machine learning teams are using images from videos to train machines to recognize certain things in videos.
One example is with car manufacturers. They’re putting video cameras and sensors in the bumpers of next generation cars. One of the things we want to train cars to do is recognize a pedestrian in front of the car so that it stops, and there are fewer auto fatalities.
That process of teaching a sensor or a camera how to recognize a human involves a lot of manual data processing. You basically have to feed the algorithm tens of thousands of images of people in different cases. When it’s dark, when it’s light, at this time, what their foot and hand [look like]. To be able to help a machine grasp what is a person. And that process involves Samasource workers tagging lots of images from those sensors, and providing the company with that data.
It also means identifying parts of a body, certain parts of the body in the image that is being captured by the camera. And that is really difficult to do on a platform like Mechanical Turk, because if you even get 10% of the data wrong, then your entire algorithm can fail. And when it’s something as important as a car that is going to prevent auto fatalities, it’s important to get the data right.
Companies like Google and Microsoft that are building the technology that allows for intelligent systems are very concerned with these tags about locations. That’s where we see a lot of growth.
For search companies, one of the things that we do is support indexing with human judgment.
So machines will sometimes index web pages. Google has algorithms to do that, but they need to check that it’s working—and this is a generic comment on all search companies—they’ll need some subset of their records manually reviewed. And it’s usually a very small percentage, but that helps them to tweak their algorithms and make sure they’re surfacing the right stuff.
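The manual review of a small subset of records described here can be sketched as a random spot-check. This is only an illustration, not how any search company or Samasource actually selects records: the function name `sample_for_review`, the 1% review fraction, and the integer record IDs are all assumptions made for the example.

```python
import random


def sample_for_review(record_ids, fraction, seed=None):
    """Pick a random subset of records to send to human reviewers.

    fraction is the share of records to review (0.01 means 1%).
    seed is optional and only makes the selection reproducible.
    """
    if not record_ids:
        return []
    rng = random.Random(seed)
    # Review at least one record, and never more than exist.
    k = min(len(record_ids), max(1, round(len(record_ids) * fraction)))
    return rng.sample(record_ids, k)


# Hypothetical data: 10,000 indexed records identified by integers.
record_ids = list(range(10_000))
review_batch = sample_for_review(record_ids, fraction=0.01, seed=42)
print(len(review_batch), "records selected for manual review")
```

Sampling without replacement keeps any record from being reviewed twice, and a fixed seed lets the same batch be re-drawn later if reviewers' judgments need to be audited.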
What Samasource workers do for search companies is support those indexing needs. We can also do things like, if you’re running ads against Web pages, you have to make sure the ads are relevant. If somebody’s Googling for a toy car, they’re not getting an ad for an actual car. We help those companies ensure that ads that are being displayed are relevant.
Scaling Across The Third World
RW: Can you describe your software?
LJ: I’m not an engineer. So my initial goal was not to build anything, and just to focus on the labor sourcing side of our model, which I felt I had expertise in.
Initially we actually used Mechanical Turk, Amazon’s [job-outsourcing] platform, and we used [freelancer marketplace] oDesk. I tried putting people directly on those platforms, and then worked with those companies to set up special arrangements for paying people in geographies that they weren’t in yet.
For managing a workforce, that was pretty unusual. That model worked a tiny bit initially, and then we started seeing all of these challenges. We started seeing that these platforms were not built for impact sourcing, which is what we do.
I used Basecamp for the first year of our operations; it worked for me until we hit about $500,000 in revenue. I literally had 120 of our projects on Basecamp and we would unitize the work manually. For example, we would get a giant file from the client, with PDF documents, and then we would have spreadsheets that we would load into Basecamp.
This was in the era before Etherpad was acquired by GoogleDocs, so it was very nascent. We were using Basecamp and Excel spreadsheets. I would write: “Ok, Center A, you need to do pages one through 800. And Center B, you have pages 801 to 1200.” We were managing things in that way, very grassroots. It worked and it was cheap.
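The manual splitting of a client file between centers can be sketched in a few lines. This is a minimal illustration rather than Samasource's actual tooling: the function name `split_pages` and the idea of passing each center's page count in a dictionary are assumptions; only the page ranges come from the example above.

```python
def split_pages(total_pages, shares):
    """Assign contiguous, 1-based page ranges to work centers.

    shares maps a center name to the number of pages it should take.
    The counts must add up to total_pages so no page is skipped.
    """
    if sum(shares.values()) != total_pages:
        raise ValueError("shares must add up to total_pages")
    assignments = {}
    start = 1
    for center, count in shares.items():
        end = start + count - 1
        assignments[center] = (start, end)
        start = end + 1  # next center begins on the following page
    return assignments


# Reproduces the split quoted in the interview.
assignments = split_pages(1200, {"Center A": 800, "Center B": 400})
for center, (first, last) in assignments.items():
    print(f"{center}: pages {first} through {last}")
```

Checking that the shares cover the whole file is the one safeguard a spreadsheet does not give for free, which is the kind of gap that lets things get lost as the number of projects grows.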
Then when we hit about $500K in revenue, things started getting lost. If Google gave us a contract, we would divide that manually and set up four different projects in Basecamp, each one of them in a different group of workers. In that group, some might speak Hindi as their first language, and some might speak Swahili. It was a total mess.
By that point we had identified all of the pain points, and we figured out what we needed to build. I could hire my first engineer around something that really solved a problem, rather than a hypothetical.
The first thing our technology did was provide basic project management tools, but customized for the kind of work we do. The second was quality assurance, so we added gating processes, which means we can determine at the outset whether a worker is qualified to do a task.
For example, in our e-commerce work with Walmart.com, we have training materials that we developed. We load them into our system and workers can read and digest those training materials on their own schedule, then they take a test to qualify into that project.
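The gating process, in which a worker must pass a project's test before receiving its tasks, could look roughly like the sketch below. Everything in it is hypothetical: the `Worker` class, the `is_qualified` and `eligible_workers` helpers, the 0 to 100 score scale, and the passing score of 80 are assumptions for illustration, not details of the SamaHub.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    # Maps a project name to the worker's qualification test score (0 to 100).
    test_scores: dict = field(default_factory=dict)


def is_qualified(worker, project, passing_score=80):
    """Return True only if the worker has taken and passed the project's test."""
    score = worker.test_scores.get(project)
    return score is not None and score >= passing_score


def eligible_workers(workers, project, passing_score=80):
    """List the names of workers who may be assigned tasks on the project."""
    return [w.name for w in workers if is_qualified(w, project, passing_score)]


workers = [
    Worker("worker_1", {"ecommerce": 92}),
    Worker("worker_2", {"ecommerce": 65}),
    Worker("worker_3"),  # has not taken the test yet
]
eligible = eligible_workers(workers, "ecommerce")
print(eligible)
```

Treating a missing score the same as a failing one means a worker who has not yet taken the test is never handed tasks by accident.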
What’s unique is that we have specific tests that we design around that project. And then we can see a worker’s entire employment history, so when they start we can see the results on their English test. And as they progress, we can see how they do with tasks, type A and B. And we’ve done 25 million tasks on our system, some of which are very large tasks. They’re not equivalent to what you’d see on Mechanical Turk; a lot of them take 30 minutes or more to complete.
It’s a very flexible platform that allows us to manage the entire process once we’ve secured the contract, loading the work into the system, giving it out to multiple workers, training them on specific types of tasks, and then monitoring them on an ongoing basis.
RW: Do you keep in touch with workers after they’re done working for Samasource?
LJ: We use our technology to measure impact. So we actually administer worker surveys at the beginning and throughout the worker’s history with Samasource. We test their skills, check on their income, communicate directly with them, and we are the first global NGO to use Facebook to do longitudinal surveys of our workers.
When they sign up on the SamaHub, they have to get a Facebook account. A lot of them already have an account if they have a mobile phone, but a lot of workers don’t have a phone. What we’ve found is that’s the most effective way to keep in touch with them over time, and find out what they’re doing years later. It is really important for our donors to understand the full impact of their dollar.
Bottom-Up vs. Top-Down
RW: Do you think plans from companies like Google or Facebook—their initiatives to bring Internet to all—could have legs?
LJ: Their efforts have been criticized in many ways, and I think it’s unfair. I’m an optimist; I think humans want the same things everywhere. I think people are fundamentally good and have good motivations. I think the people behind these initiatives at Google and Facebook, especially given that I know all of Facebook’s founders, and all of them I would say are, for their age and their level of wealth, exceptionally philanthropic. It’s really impressive.
You don’t see people in finance in New York doing similarly bold things with philanthropy, and I think each of them personally are committed to making the world a better place. I think that sometimes we get taken over by the hubris of the technology culture in Silicon Valley, and we think that tech is going to solve all the world’s problems.
Even with the best intentions and even with amazing tech, Google and Facebook have built things that change the world for people on the ground in refugee camps that can connect with relatives they haven’t seen in 15 years, which makes a real difference in their lives.
The Internet isn’t going to solve all problems. Even if it could, most of the people who need it don’t have it.
I do hope that these companies, as they search for ways to do more to save the world, understand more of these dynamics. When you have a planet where 1.5 billion humans live on less than $1.25 a day, they’re not going to have access to the Internet, even if you make your systems open and free.
You need to do more than make it free, you have to proactively fund getting it to them. Or use the philanthropic side of those companies—and I wish Facebook had a philanthropic arm. It’s a little disappointing that it doesn’t. But if Facebook had a foundation that was committed to using some portion of profits to fund access for people in developing countries, that would be a great step.
Or even here at home. Funding electricity so people could get online.
Images courtesy of Samasource