Some of the leading figures in AI development have warned that the new technology’s appetite for data is creating serious moral and legal issues.
According to Warwick University’s Professor Irene Ng, one of the world’s leading authorities on data privacy, the activities of the big tech companies in developing AI are creating huge issues regarding the ‘ownership’ of personal data.
Speaking in an interview for the PassW0rd radio programme ‘Who Owns You’, Professor Ng, who is also the chief executive of Dataswift, a company championing the development of personal data stores called the Hub of All Things, says the rapidly emerging technology poses very real risks to privacy, and that the public’s awareness of how big tech uses data has to be raised.

“The problem starts when you collect data, when you store it and when you process it. That is the part where AI comes in, because it is doing the processing. Most people make all these assumptions about where data is collected and where it is stored, and that understanding is very simplistic because they think about it in a real-world way. People think that the data is obtained from one place, is then mixed up by a computer and is used to build something, a little like bricks and a house, but that is where that understanding departs from reality, because that is not what happens,” said Professor Ng, adding that we must be given control of our data so we know how it is being used.
The data that’s you
“If you take the last number of your passport, and you change it, digitally, you’re a completely different person. That’s your personal data. If you take the last number of your Fitbit steps today and change it, it doesn’t matter. Both of those are your personal data, so it becomes all about context.
“The thing about data is that it has a few dimensions. On its own, it can be very potent. It can also become harmless, inert, when combined with other data, while on the other hand inert personal data can become very potent. Data combined with other data can lose its meaning in some ways. This, then, is all about relevance, meaning and context. But when we don’t understand this about data, we confuse the medium and the message.”
According to Professor Ng, AI’s insatiable appetite for data means that large amounts of personal data could be used for purposes we are unaware of and have not given permission for, because the technology routinely makes multiple copies of data.
The fight for our lives
It’s an awareness of data ownership that, in an era of deep-fakes, will soon see people asserting ownership of their faces, fingerprints, voices and other biometrics. It was one of the issues involved in the recent Hollywood actors’ strike, when film extras demonstrated for the right to own their own images.

A point over essential copyright highlighted by a legal dispute between the New York Times and Microsoft that began at the start of the year. According to the newspaper, Microsoft and its partner, OpenAI, breached its copyright by making multiple copies of articles from the publisher’s website to train AI systems like ChatGPT.
The New York Times is not alone: several UK national newspapers are also rumoured to be taking the US tech giant to court for similar infringements.
A dispute the US East Coast media lawyer Kevin Casini says will revolve around an expensive argument about what a copy actually is.
To be or not to be
“It will, or may, come down to the definition of copy. What does it mean to actually make a copy? You and I have a very base-level understanding, a walking-around knowledge, of what a copy is, right? It’s a mimeograph, or it’s a reproduction. It’s essentially taking one thing and making another version,” said Casini, adding that the New York Times case was being viewed as the start of things to come, as more and more people begin to realise how their data is being used.
A point that US legal expert Colin Levy, author of ‘The Legal Tech Ecosystem’, says will eventually lead to people disputing with Facebook and other social media companies over the way they use our data.
“Facebook has long used our data and what we post for its own purposes, and the fact that only now are people objecting or paying more attention to it, I think, means that they weren’t paying attention in the beginning. If Facebook were trying to make money from us by charging us, then it would be clearer what its intentions were.”
Who owns you
“So, ultimately, there are a lot of unresolved issues with respect to the ownership of ourselves. But ultimately, we do own ourselves. It’s just a matter of how much we want to share ourselves with others, whether it’s other people, other tools and so on.”
It’s a debate about data ownership that has finally begun to draw in legislators such as UK Labour peer Lord Jim Knight.
“I’ve become interested in whether we can form data trusts that essentially allow us, as citizens, to aggregate our data, and to have the use of that data governed by a set of principles written into the trust, which trustees then have to abide by when that data is exploited by other people. Individually, our little bits of data aren’t worth that much. Collectively, they’re worth a fortune, and it’s about thinking more about how we leverage our collective power over these hugely powerful big tech companies.”