Whose Data Is It Anyway?

I’ve had many conversations, both professional and personal, with people about data and who owns it. From my more conventional friends I often get comments like, “It’s my data. Companies should give it to me or pay me for it.” That often unveils a deeper intent to get in on the monetization of data. My thoughts similarly stem from financial investment, but to the contrary, I respond, “You didn’t pay to build the systems that capture the data, nor do you pay the various costs of keeping those processes and people in place. So why should you think the data is yours?”

This morning, the NY Times ran an article by Natasha Singer on the need for open consumer data. She runs through a set of examples: cellular carriers holding location data, energy companies holding consumption data, and health clubs holding utilization/attendance data. She was unable to access any of it freely; none of it was shared openly. Singer bounces between various topics trying to illustrate the need for more data transparency from companies, but fails to make clear the simple points embedded in her article that will limit any effective data sharing in the near term. Specifically, three cultural aspects of how we view data must coincide:

  • Movement from anonymity to named users on the Internet
  • Change in privacy law to reflect digital rights management (at the data level, beyond intellectual property)
  • Creation of analytic marketplaces (think of this as “Analytics-as-a-Service”)

Changes in these three aspects of our culture will seed the one, broad change Singer and other open data advocates desire: a data-driven society.

First things first. Why these three aspects?

Let’s give the answer up front and unpack it from there. Risk. Most of the answer is reducible to risk. Only when our society becomes less risk averse regarding the use of consumer data, and more cognizant of the value derived from data sharing and decision making, will open data seem like a birthright.

I think three key areas, each being worked on today, move us toward the tenets of open data sharing.

Named users on the Internet. At first, this seems like a simple thing. We live in the era of Facebook, Foursquare, Glympse, and a slew of mechanisms that either facilitate or uncover people knowing who and where you are. Isn’t anonymity on the Internet a thing of the ’90s? Not entirely. According to Internet World Stats, 78% of the U.S. population is online. Approximately 50% of those users have a Facebook account. But how many Internet users have one account to express and expose all of their online identity? While people may use Facebook, Twitter, OpenID, or other authentication methods to share information, is that authentication persistent across all data sharing platforms? Simply stated, when you use the Internet to buy shoes from Zappos, comment on a CNN.com article, post Instagram photos to friends, and make online payments, are those actions tied to one user ID and profile? For many of us, that answer is no. People use multiple accounts to live their lives online. They may create an account just to post a passionate response to a blog post. Another account is their ‘professional profile’, and a third or fourth is used to conduct personal affairs.

The movement from wanting anonymity online to needing to be who you really are is a prerequisite for data sharing. Until people begin to live their lives online more in line with their physical selves, the risks associated with authenticating users remain with corporations. Courts will continue to side with the privacy rights of consumers until it can be shown that the user holds responsibility for managing access to data. That’s a liability.

This leads into privacy laws and digital rights management. Companies won’t give Singer “her” data because of two specific areas of risk. The legal system (and the court of public opinion) will hold companies liable for what they do with individual consumer data. Privacy laws and advocacy groups favor regulating companies for what data they collect, how they collect it, how it is shared, who has access to it, and much more. In essence, privacy laws incentivize behavior that limits sharing.

However, when society shifts the ownership of data from the entity collecting to the entity observed (from the firm to the individual), then privacy laws will have to reflect this sentiment. Privacy then becomes an issue of what data the individual wants to disclose and through what mechanisms that sharing is allowed. This shifts the entire risk posture from digital rights management to data rights management. Now the question becomes: how do I, the consumer, want to collect and share data about me?

This opens new opportunities to create marketplaces for the collection, analysis, and comparison of data about consumers. Now firms can create open data marketplaces in which companies compete for consumer data. Firms can add value for consumers based on the analytics they provide, comparison data to other firms, price incentives (rate reductions or premiums for consumers who fit various profiles), and the like. Except in these markets, rather than rates being enforced punitively, firms can build services around price models and let market forces and consumer choice respond.

And that’s the third aspect: analytic marketplaces. Yes, most of us are tired of yet another “AAS” (“as-a-service”). Software-as-a-service, platform-as-a-service, and all the other buzz terms in the tech industry trying to sell more products have obfuscated the very real commoditization of IT. Still, having Analytics-as-a-Service is a useful milestone. The ability to bring data that I have, in a format I can manage, and use or share it in a variety of analytical tools to get more information is a substantive goal. Analytic marketplaces are a sign that companies are willing to push data risk onto consumers and compete on the footing of the services they provide. No longer will they feel compelled to withhold data to gain brand loyalty; rather, they will see the value in sharing data and focusing on building better services around realistic consumer data.

As computing becomes more of a utility akin to electricity and water, the question is: what remains? What are the faucets and outlets? What are the mechanisms that will put the commodity to use? The infrastructure that provides ubiquitous access to data (networks, servers, mobile devices, sensors) is being laid today. The use of data changes based on the mode of the user. Electricity coming into my home supplies a number of uses. Lights, heating, computers, battery chargers, devices that are always on, and devices that store a charge are all run by a single flow into the home.

This took decades to create. Standards for the grid, transformers, and the ability to generate, distribute, and store power all came into play. Companies that developed their own power generation (anonymous users) had to become customers/contributors (named users) of a common resource. Laws and regulations changed to reflect the importance of a common electrical grid that now needed protections as a national asset. Personal perceptions changed from risk avoidance to risk acceptance as the grid opened new markets for commercial and consumer-oriented businesses.

So will be the case for data.

While we like to say we’re in the “Information Age”, it’s more accurate to say the “Information Technology Age.” We’re far more focused on the devices and networks around us than on the information and data that flow between them. Despite the lessons learned from the PC battles of the 1980s and ’90s, and the broad adoption of the Internet as a backbone, developers still code based on platforms (iOS, Android, ‘mobile’, web) rather than building for a standard (W3C, HTTP). [Why do we still build apps for one platform and port code to another?]

Information about our habits, preferences, needs, and social distances is useful if we learn how to make thoughtful decisions. This is the culmination of a much broader, deeper culture change. At best, we are laying the foundation for the early adopters. Yes, some people are very data-driven when it comes to buying cars, dishwashers, homes, and other consumer items. But generally we are poor users of data. We don’t think through data quality. We subvert data meaning with subjectivity, and we cling to sunk costs and other thinking traps that skew the data we have. We are slow to think and quick to respond.

Whose data is it? For now, it still belongs to those who created and invested in it, those whom the laws and policies are incentivizing. Data belongs to those who have it to analyze. In most cases, that is not us, the consumers. Can and should this change? Yes. I think people should want to own and have their own data. That means taking responsibility for it and the risks that come with that role. This lasting cultural change is still years away.