Last week, the European equivalent of the US Supreme Court issued a controversial decision. A Spanish citizen petitioned the court to require Google to remove from its search results information about him relating to a 1998 government-ordered auction held to recover debts he owed. The information was published, and remains accessible, on a Spanish newspaper’s website; the court concluded that the newspaper could retain the information as part of the “media”.
Google is not a newspaper, but the court concluded that Google does collect and process personal data and, for that reason, is to be classified as a “data controller” under the EU privacy and data protection directives. As such, data controllers have an obligation to remove data from their systems if the data is “inadequate, irrelevant, or no longer relevant, or excessive in relation to the purposes of the processing”. The court required Google to honor individuals’ requests for the removal of personal data meeting those standards.
As some have described this decision, the EU court confirmed an individual’s ‘right to be forgotten’. Commentators and pundits are flaming in all directions about the impact of this decision on privacy, freedom of speech, journalistic integrity, and the inevitability of ubiquitous surveillance and recordation of our lives.
But, in terms of digital trust, the question I care about is, “How will I trust that a data asset has been fully removed and made inaccessible?” The complexities of search engines, distributed databases, and the inter-dependencies of systems and data services create an enormous challenge. Even if a search service subject to the EU’s jurisdiction affirmatively confirms it has removed personal data as requested by a data subject, the data subject has no means of fully validating that compliance.
First, the data subject must be able to confirm the initial search service has complied. Second, the data subject has the burden of tracking down other search services that have linked to, or independently archived, various search results, links, or content. Not only must primary copies of data be accounted for, but also secondary copies, backup copies, data reformatted into other databases, and so on.
What must be done differently? The “break” that no one is discussing is that the data subject never had control over the manner in which their data could be used. In fact, the newspaper and Google were merely republishing public record content. So the “root cause” was that the public agency established no controls on the re-use of its public records, including re-use to sell newspapers or sidebar ads on search services.
What this case exposes is that every single acquisition of data involves a negotiation (or the absence of one). Privacy laws vest in individuals the right to control their personal data through consent mechanisms which are, simply, a contract exercise of offer and acceptance. Digital trust depends on the same negotiations and contract formations at every link in the chain.
Data sources, including virtually any public sector website, can establish suitable controls; commercial sources, such as journalists and search services, must similarly impose controls to better assure their right to use the information.
To me, this is neither a constraint on freedom of speech nor on journalistic investigation. Reporters still value, and respect, the need for facts to be independently confirmed by at least two sources. Now, a new standard is emerging—can we trust that we have a right to use and publish the information in a digital age? Once that right is explicit, and its scope (including the secondary linking to that information) is clearer, then that trust is possible.
But all of those negotiations must be engineered to be more automated. It is not that hard, really. We already tag data elements with explicit descriptors; we merely need to add a way to connect those descriptors to the rights of publication and use that are associated with them. Those are the rules that must connect to the data. Use of the data requires, as a predicate, assurances the related rules will be followed.
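To make the idea concrete, here is a minimal sketch of what connecting descriptors to rights of use might look like. Everything here is hypothetical: the `UsageRights` descriptor, the `use` function, and the purpose names are illustrations, not an existing standard or library.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class UsageRights:
    """Hypothetical rights descriptor attached to a data element."""
    may_republish: bool = False   # e.g., reprinting the record
    may_index: bool = False       # e.g., inclusion in search results
    may_monetize: bool = False    # e.g., ad-supported display
    source: str = ""              # who granted these rights

@dataclass
class DataElement:
    value: str
    rights: UsageRights = field(default_factory=UsageRights)

def use(element: DataElement, purpose: str) -> str:
    """Refuse a use unless the attached rights explicitly permit it."""
    allowed = {
        "republish": element.rights.may_republish,
        "index": element.rights.may_index,
        "monetize": element.rights.may_monetize,
    }
    if not allowed.get(purpose, False):
        raise PermissionError(f"no right to '{purpose}' this data element")
    return element.value

# A public record released with indexing permitted but monetization withheld:
record = DataElement(
    "1998 auction notice",
    UsageRights(may_index=True, source="public agency"),
)
print(use(record, "index"))   # permitted by the attached rights
```

The point of the sketch is the predicate: any consumer of the data must pass through a rights check before use, so a withheld or revoked right fails loudly instead of being silently ignored.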
The Spanish citizen never intended for the descriptive information about the auction of his assets to become content that helps sustain advertising revenue for news media or search services. So now we know the rules that should be in place; how long will we take to change the game?