Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms

Kate Crawford Jason Schultz

10/01/2013

Type
journal-article
Region
Sector
Conceptual Framework
Category
Data Analysis, Big Data, Data Governance
Methodology
Conceptual Framework
Objective
Legitimacy, Privacy

Abstract

The rise of “big data” analytics in the private sector poses new challenges for privacy advocates. Unlike previous computational models that exploit personally identifiable information (PII) directly, such as behavioral targeting, big data has exploded the definition of PII to make many more sources of data personally identifiable. By analyzing primarily metadata, such as a set of predictive or aggregated findings without displaying or distributing the originating data, big data approaches often operate outside of current privacy protections (Rubinstein 2013; Tene and Polonetsky 2012), effectively marginalizing regulatory schema. Big data presents substantial privacy concerns – risks of bias or discrimination based on the inappropriate generation of personal data – a risk we call “predictive privacy harm.” Predictive analysis and categorization can pose a genuine threat to individuals, especially when it is performed without their knowledge or consent. While not necessarily a harm that falls within the conventional “invasion of privacy” boundaries, such harms still center on an individual’s relationship with data about her. Big data approaches need not rely on having a person’s PII directly: a combination of techniques from social network analysis, interpreting online behaviors and predictive modeling can create a detailed, intimate picture with a high degree of accuracy. Furthermore, harms can still result when such techniques are done poorly, rendering an inaccurate picture that nonetheless is used to impact on a person’s life and livelihood. In considering how to respond to evolving big data practices, we began by examining the existing rights that individuals have to see and review records pertaining to them in areas such as health and credit information. But it is clear that these existing systems are inadequate to meet current big data challenges. Fair Information Privacy Practices and other notice-and-choice regimes fail to protect against predictive privacy risks in part because individuals are rarely aware of how their individual data is being used to their detriment, what determinations are being made about them, and because at various points in big data processes, the relationship between predictive privacy harms and originating PII may be complicated by multiple technical processes and the involvement of third parties. Thus, past privacy regulations and rights are ill equipped to face current and future big data challenges. We propose a new approach to mitigating predictive privacy harms – that of a right to procedural data due process. In the Anglo-American legal tradition, procedural due process prohibits the government from depriving an individual’s rights to life, liberty, or property without affording her access to certain basic procedural components of the adjudication process – including the rights to review and contest the evidence at issue, the right to appeal any adverse decision, the right to know the allegations presented and be heard on the issues they raise. Procedural due process also serves as an enforcer of separation of powers, prohibiting those who write laws from also adjudicating them. While some current privacy regimes offer nominal due process-like mechanisms in relation to closely defined types of data, these rarely include all of the necessary components to guarantee fair outcomes and arguably do not apply to many kinds of big data systems (Terry 2012). A more rigorous framework is needed, particularly given the inherent analytical assumptions and methodological biases built into many big data systems (boyd and Crawford 2012). Building on previous thinking about due process for public administrative computer systems (Steinbock 2005; Citron 2010), we argue that individuals who are privately and often secretly “judged” by big data should have similar rights to those judged by the courts with respect to how their personal data has been used in such adjudications. Using procedural due process principles, we analogize a system of regulation that would provide such rights against private big data actors.