How Facebook’s Data Sharing Went From Feature to Bug

In 2007, a young Mark Zuckerberg stood on a stage in San Francisco and announced that Facebook was throwing open its doors.

No longer, he said, would Facebook be a closed-off software product like every other social network. Instead, it would become an open platform and invite outside developers to build apps and programs on top of it.

“We want to make Facebook into something of an operating system,” Mr. Zuckerberg told a reporter.

At the time, the announcement drew little notice outside the programming world. Developers quickly went to work making fun and quirky apps that plugged into Facebook — early hits included “Rendezbook,” a kind of proto-Tinder that allowed users to match with each other for “random flings,” and CampusRank, which allowed college students to nominate their peers for yearbook-type awards.

Later, popular games like FarmVille arrived, and apps like Tinder and Spotify began allowing their users to log in using their Facebook credentials. In some ways, it was a fair trade. Facebook got to weave itself more deeply into users’ internet habits, and the outside app developers got access to a big audience and valuable data about their users. In all, millions of apps have been created with Facebook’s open platform tools.

Through it all, Facebook’s users were mostly unfazed. Sure, these apps collected data about their lives. But they seemed convenient and harmless, and, really, what could go wrong?

Today, more than a decade later, the consequences of Facebook’s laissez-faire approach are becoming clear. Over the weekend, The New York Times reported that Cambridge Analytica, a British consulting firm, improperly acquired private data about roughly 50 million Facebook users, and used it to target voters on behalf of the Trump campaign during the 2016 presidential election.

What happened with Cambridge Analytica wasn’t technically a data breach, since this trove of personal information wasn’t stolen from Facebook’s servers. Rather, it was given away freely to the maker of a Facebook personality quiz app called “thisisyourdigitallife.”

That app, which was developed by a University of Cambridge professor, collected data about the 270,000 people who installed it, along with data about their Facebook friends, totaling 50 million people in all. The professor, Aleksandr Kogan, then gave the data he had harvested to Cambridge Analytica.

Technically, only this last step violated Facebook’s rules, which prohibit selling or giving away data collected by a third-party app. The rest was business as usual. Third-party apps collect vast amounts of detailed personal information about Facebook users every day, including their ages, location, pages they’ve liked and groups they belong to. Users can opt out of sharing specific pieces of information, but it’s unclear how many do.

This kind of broad data collection is not only allowed but encouraged by Facebook, which wants to keep developers happily building on top of its platform. Permissiveness is a feature, as they say, not a bug.

But in the wake of incidents like the data leak to Cambridge Analytica, some are questioning the costs of such loose policies on an influential platform with 2.2 billion registered users.

“It seems insane that you can make haphazard decisions about so many people’s data,” said Can Duruk, a technology consultant and software engineer. Facebook, he said, was “extremely lax with what kind of data they allowed people to get.”

In a Facebook post on Monday, Andrew Bosworth, a Facebook vice president, admitted that the company’s early thinking about its platform may have been a mistake.

“We thought that every app could be social,” Mr. Bosworth wrote. “Your calendar should have your events and your friends birthdays, your maps should know where your friends live, your address book should show their pictures. It was a reasonable vision but it didn’t materialize the way we had hoped.”

An early clue about the potential for misuse of Facebook’s third-party developer tools came in 2010 when my colleague Emily Steel, then at The Wall Street Journal, reported that an online tracking company, RapLeaf, was collecting and reselling data it had gathered from third-party Facebook apps to marketing firms and political consultants. In response, Facebook cut off RapLeaf’s data access and said it would “dramatically limit” the misuse of its users’ personal information by outside parties.

But preventing data-hungry developers from exploiting Facebook’s treasure trove of personal information remained challenging. In 2015, Facebook removed the ability of third-party developers to collect detailed information about the friends of users who had installed an app, citing privacy concerns. (Cambridge Analytica’s data trove, which included this type of information, was gathered in 2014, before the change.) Facebook has also taken away tools used by developers to create games and quizzes that barraged users with annoying notifications.

But the core functions of Facebook’s open platform tools are still intact. There are still many third-party apps like “thisisyourdigitallife” out there, vacuuming up intimate data about Facebook users. That data doesn’t disappear, and Facebook has no real recourse to stop it from falling into the wrong hands.

Not all open data access is used irresponsibly. Researchers and nongovernmental organizations have used Facebook’s third-party development tools to respond to natural disasters. And many of the functions that internet users depend on, such as the ability to import their digital address books into a new messaging app, are possible only because of application programming interfaces, or A.P.I.s, the tools that allow for third-party development.

“Everything we depend on uses A.P.I.s,” said Kin Lane, a software engineer who maintains a website called API Evangelist. “They’re in your home, in your business, in your car. It’s how these platforms innovate and do cool, interesting things.”

In Facebook’s case, permissive data policies were also good for business. Third-party developers built millions of apps on top of Facebook’s platform, giving Facebook users more reasons to spend time on the site and generating more ad revenue for the company. Restricting access to data would limit Facebook’s usefulness to developers and could drive them to build on a rival platform instead, making that rival’s products better.

In this context, it’s even less surprising that Dr. Kogan and Cambridge Analytica were able to use a silly personality quiz to collect information about millions of Americans. After all, why else would the quiz be there?