China’s Zhenhua Data trove: Why do we care and what can we do?

So what if a small Chinese tech company called Zhenhua Data holds personal information on at least 2.4 million [1] people—including government, corporate and media leaders and influencers—from countries across the globe? Isn’t this just open-source data available on the internet? And anyway, what can we do? This is just China doing its thing, isn’t it?

This response to these nasty revelations has a whiff of the passive and cynical. And it is wrong.

We shouldn’t, and don’t have to, let China off the hook or just shrug and say ‘that’s the internet’, particularly as Zhenhua is just one of a number of Chinese entities collecting and analysing masses of global data with the intention [2] of assisting the Chinese government in its political warfare and united front [3] work.

Compiling profiles of powerful individuals from data found on the internet, added to confidential data [4] that’s been procured—like bank account details, spending and browsing behaviour, and psychological health—helps a foreign intelligence agency exploit and manipulate people.

So, large datasets like this can boost conventional espionage operations significantly. They give levers to obtain cooperation willingly or unwillingly. When those levers include not just personal foibles and vulnerabilities, but information about a target’s children, relatives or friends, it starts to look uglier, even malign.

The data Zhenhua collected—or, rather, the partial dataset that was leaked to the media—is mostly ‘open source’, meaning it is not classified or otherwise protected but is floating about somewhere on the internet.

So, what’s the problem?

The problem is that the Chinese government and its corporate assistants will be using the insights from this giant data pool to obtain political, commercial, strategic and military advantage for the Chinese government and Chinese companies right now, including in ways that break other countries’ laws and rules. They will not be storing up the data like a squirrel storing nuts for the winter; they’ll be getting value from it daily—and adding to it as they do so.

The leaked Zhenhua dataset is just one of the large-scale datasets that the Chinese state is building and accessing—and it’s probably not the most impressive. ASPI analyst Samantha Hoffman’s Engineering global consent [5] report sets out the data collection and aggregation work of another Chinese technology firm, GTCOM. The company focuses on global big data, facial recognition and artificial intelligence, which it supplies for Chinese government use.

The insights from these datasets are valuable for powering foreign interference [6] by Chinese actors in other states and societies. They are the fuel for Chinese state entities’ covert and corrupting activities, and give these entities an advantage over others. Companies armed with these insights from government can use them to take advantage of competitors and business partners.

But the data is open source, so doesn’t that mean it’s okay?

No. The data may be ‘on the internet’ but it’s likely that laws on data access and sharing—such as the EU’s [7] General Data Protection Regulation and privacy principles like those [8] here in Australia, as well as commercial policies and terms—have been wilfully broken to assemble the datasets. Data obtained from the dark web is the most likely to have breached such laws and rules. It’s even more likely laws will be broken by Chinese state entities using the datasets.

And, even if the data is all out there on the internet, aggregating it is a whole different thing. Aggregating lets you see meaning and patterns that are invisible when the information is fragmented and scattered around the globe.

Cambridge Analytica did something similar, but for a narrower if still powerful purpose and with mainly US data. Once its practices were exposed, the company was shut down [9] and several of its leaders and key employees have suffered career reversals. We shouldn’t give China a free pass for something we find essential to act on in our own countries and jurisdictions.

The consequences of Google or Facebook having masses of data about you are serious and rightly a topic of public debate and potential further regulation. But they are quite different to the consequences of the Chinese Communist Party and its technology companies having huge amounts of personal information about you and your family, because the intent behind the use of these mass data holdings is different. It’s a case of values and interests, and the CCP has a lot of interest in using the data in ways many of us wouldn’t like.

Coercive economics [10] and pressure on governments and companies to say and do what Beijing wants is one clear use. Another is being able to influence political debates and even national decision-making.

Think about telling a key participant in a high-level government decision that if they don’t do what you want, you’ll reveal a distressing and compromising thing about one of their children to the world. Think about supplying, even selling, some of this data to political operators in a country you want to influence or divide and letting them do your dirty work. Think about enticing influential people to do what you want by paying off the large legal or medical bills you know they have because of the data you hold about them.

It matters that the Chinese government and its military and security services have access to this type of data, and it matters what can be done to make it harder for them to get it.

The only good news here is that there are things that Australia and others can do to help reduce the risk of this behaviour and raise the costs to Beijing for operating in this way.

The biggest is to see this as a call to action as the world looks at redesigning the internet [11]. This work is happening in obscure global bodies like the International Telecommunication Union [12], where Chinese corporate participants are pushing hard to set the direction of the future internet, no doubt with the full backing and clear direction of the Chinese government.

Australia and other open democracies need to pile in on the new internet standards and governance if we aren’t to get a redesigned internet that’s even easier to use covertly and corruptly for authoritarian purposes than the one we have now.

We also need to work with like-minded governments and international human rights organisations to call out this Chinese government-connected activity for what it is: covert and immoral, particularly when it involves personal health data and data on children, and when many of its uses are likely to be corrupting.

And we might also turn some of the open-source power of the internet back on Chinese government leaders. That could mean revealing, for example, details about the billion-dollar family fortunes [13] of figures like Xi Jinping, despite their being paid annual salaries of around US$22,000 [14], and getting this data into the hands and minds of as many of China’s 1.4 billion citizens as possible, consistently and repeatedly.

That’ll give these ultra-wealthy leadership elites the chance to explain how their wealth fits with socialism with Chinese characteristics [15] and with Xi’s anti-corruption campaign, which the party tells the world [16] sets a model globally for dealing with corruption.

Right now, luxury consumption [17] in China is hitting new heights, while some 600 million low-paid workers are either struggling to get paid or being forced back to their home provinces to get by. The nasty truth about the huge disparities in Chinese wealth distribution is high on Beijing’s list of concerns, as something to hide as much as to address.

With the Zhenhua revelations, we have another chance to take some more steps in getting Beijing to do what it professes to be its current policy—and reduce its egregious interference in other societies as part of shoring up the position of the CCP.

Lastly, we might recognise that our own governments continue to undervalue the power of open-source collection and data in their own agencies and budget plans, particularly when combined with unique classified data sets.

It’s time that changed. Our own governments should be investing at scale in open-source data, wrapped up in the practices and norms of open societies that are governed, as the ABC’s Bill Birtles said on landing back in Australia, ‘with genuine rule of law [18]’.

