Just as we are preparing for mandatory data breach notification to commence here in Australia, some interesting pieces of news have revealed that perhaps both corporations and government agencies have taken their eye off the ball when it comes to protecting personal information. A lot of energy and budget is spent on cyber-strategy this and digital-first that and open-data whatever, when perhaps governments and businesses need to get the basics right first.
An NSW auditor-general’s report found that two-thirds of NSW government agencies are failing to properly safeguard their data, because they do not monitor the activities or accounts of those with privileged access to it, and that one-third are not even limiting access to personal information to only those staff with a ‘need to know’.
Leaving aside the question of why the NSW Privacy Commissioner is not adequately resourced to undertake these audits, instead of the auditor-general having to look into data protection, this report highlights a disturbing lack of compliance with the Data Security principle, which is neither new (NSW privacy legislation turns 20 this year) nor rocket science.
Ignoring the privacy risks posed by staff misusing data is naïve; when I think of the more than 300 privacy cases against NSW public sector agencies over the past two decades, I cannot think of one that has involved a complaint arising from a disclosure to hackers, but countless have involved staff misusing the personal information to which they were given access.
But this systemic failure pales in comparison with the more recent revelation that the Department of the Prime Minister and Cabinet managed to lose classified Cabinet documents by selling a filing cabinet full of them. Seriously? SERIOUSLY??
This is the government which later this year will create a national shared electronic health record for all Australians unless we opt out, and introduce an opt-in national digital ID, and a national face-recognition system … but they can’t manage to secure paper records?
Of course privacy risks don’t only come from rogue employees who misuse data, or from misplaced documents. Privacy risks can also come from data that has been deliberately released to the public.
Do you remember back in 2016, when the MBS/PBS dataset, containing health information about 10% of the entire Australian population, was released as ‘de-identified’ open data, but it then turned out it wasn’t really de-identified, because the data about health service providers could be decrypted (in other words, doctors were identifiable)? And do you remember that, nonetheless, the Health Minister said that no patients’ information was at risk?
So, um, apparently that was not true. It turns out that the researchers from the University of Melbourne who first re-identified doctors were indeed able to also re-identify patients, including MPs and a high-profile footballer.
And just in the past few days, the Strava data debacle has illustrated a number of other problems with open data initiatives.
(If you missed it: Strava is a social network for people who like not only to use devices like Fitbits to track their movements, heart rate, calories burned and so on, but also to share and compare that data with fellow fitness fanatics. In November 2017 Strava released a data visualisation ‘heat map’ of one billion ‘activities’ by people using its app. In late January 2018 an Australian university student pointed out on Twitter that the heat map could be used to locate sensitive military sites. Uh-oh.)
First, the sheer power of geolocation data is incredible. It can show patterns of behaviour of both individuals and groups, and reveal sensitive locations. Not understanding the risks involved in your data before releasing it publicly is negligent.
Second, geolocation data can be used to find out more about identifiable or already-known individuals; removing identifiers from the data does not make it anonymous.
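To make that concrete, here is a minimal, purely illustrative sketch in Python (the data format and coordinates are invented, and have nothing to do with Strava’s actual systems): given a ‘de-identified’ GPS trace with no name attached, simply counting where the overnight points cluster is often enough to point to a single residential block – and from there, to a person.

```python
# A minimal sketch with hypothetical data: even with the user's name stripped
# out, a raw GPS trace tends to give away where they live, because the
# night-time points all cluster around one spot.
from collections import Counter
from datetime import datetime

# A 'de-identified' trace: (timestamp, latitude, longitude) - coordinates invented
trace = [
    ("2018-01-15T23:40:00", -33.8731, 151.2071),
    ("2018-01-16T06:05:00", -33.8733, 151.2069),
    ("2018-01-16T12:30:00", -33.8650, 151.2094),  # a lunchtime run in the city
    ("2018-01-16T23:55:00", -33.8732, 151.2070),
    ("2018-01-17T06:10:00", -33.8730, 151.2072),
]

def likely_home(points):
    """Return the most common overnight location, rounded to roughly a city block."""
    counts = Counter()
    for timestamp, lat, lon in points:
        hour = datetime.fromisoformat(timestamp).hour
        if hour >= 22 or hour < 7:  # keep only points logged late at night or early morning
            counts[(round(lat, 3), round(lon, 3))] += 1  # ~100 m grid cells
    return counts.most_common(1)[0] if counts else None

print(likely_home(trace))
# ((-33.873, 151.207), 4) - four of the five points converge on one 'home' block
```

Real analyses use far more data and far better statistics than this toy example, but the basic point holds: taking the name off a location trace does not stop the trace from pointing back to the person.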
Third, privacy harms (including physical harm) can be done to individuals even if they are not personally identifiable. (I have previously argued that our privacy laws currently fail to recognise this risk, by protecting only what is ‘identifiable’ data.)
Fourth, when individuals comprise a group, say personnel at an army base or worshippers at a mosque or clients of an abortion clinic, the risk posed by or to one becomes a risk for all.
Fifth, when data is combined from different sources, or taken out of context, or when information is inferred about you from your digital exhaust, the privacy issues move well beyond whether or not this particular app, or device, or type of data, poses a risk to the individual. The risks become almost impossible to foresee, let alone quantify.
Finally, and of importance well beyond geolocation data: the utter failure of the US model of privacy protection, which relies on ‘notice and consent’ instead of broader privacy principles placing limitations on collection, use or disclosure. Many commentators have been quick to judge the users of the Strava app, saying that military personnel, for example, should never have allowed themselves to be tracked. And sure, it is easy to judge. But first, look at how the app works. Privacy by Default it ain’t.
The Future of Privacy Forum’s review found that Strava’s request for access to location data – that little box that pops up on your phone when you first install the app, asking you to ‘Allow’ or ‘Don’t Allow’ – did not mention public sharing. Instead, it simply said “Allow Strava to access your location while using the app so you can track your activities” (emphasis added).
A Strava user herself has explained how she discovered that her workout routes were accessible to (and commented on by) strangers, even though she thought she had used the privacy settings in the app to prevent public sharing of her data. And a Princeton professor (who happens to be an expert in re-identification) asked: if he couldn’t work out whether Strava’s privacy settings actually obscured a user’s home address, or whether using a fake name would be enough to prevent cross-linking with other data, how is a more typical app user supposed to determine where their own level of comfort sits, and how to achieve it?
So before blaming the users, perhaps instead we should be asking why the company did not follow Privacy by Default design rules, why the privacy control settings are so complex, and why the initial permission request to users about their location data was so misleading.
This is not just a problem with Strava. The 2017 OAIC Community Attitudes Survey found that 32% of people ‘rarely or never’ read privacy policies and notifications before providing personal information. (And let’s face it: the other 68% of people probably lied.) Burying detail in complex privacy policies is a lawyer’s art form. Is it really reasonable to expect consumers to have read and understood them before they click ‘I accept’?
Then there are apps which track your location (or copy your address book, or turn on your microphone) without you being told at all: 16% of Android apps in one survey were found to give no notice about the data they were collecting from users. Add in the revelation that Google was tracking the location of users of Android phones, even when they switched off all their location settings, and you can see that consumer-blaming is utterly misplaced.
But even if Strava had done a better job of informing its users about how their data would be shared (with the company, with other users, and ultimately with the public), there remains a problem with the ‘notice and consent’ model of privacy protection. As academic Zeynep Tufekci has noted, ‘informed consent’ is a myth: “Given the complexity (of data privacy risks), companies cannot fully inform us, and thus we cannot fully consent.”
Putting the emphasis for privacy protection onto the consumer is unfair and absurd. As Tufekci argues in a concise and thoughtful piece for the New York Times:
“Data privacy is not like a consumer good, where you click ‘I accept’ and all is well. Data privacy is more like air quality or safe drinking water, a public good that cannot be effectively regulated by trusting in the wisdom of millions of individual choices. A more collective response is needed.”
The data is de-identified so there is nothing to worry about.
If you don’t like it, opt out.
If you’ve done nothing wrong, you’ve got nothing to hide.
It’s time to put those fallacies to rest. The US model of ‘notice and consent’ has failed. Privacy protection should not be up to the actions of the individual citizen or consumer. It’s the organisations which hold our data – governments and corporations – which must bear responsibility for doing us no harm.
They could start by minimising the collection of personal information, storing data securely, and limiting its use and disclosure to only directly related secondary purposes within the subject’s reasonable expectations.
It’s not sexy start-up-agile-cyber-digital-first whatever, but nor is it rocket science. It’s common sense and good manners. It’s Privacy 101.
Photograph (c) Anna Johnston