GDPR: AI in grave trouble

25th February 2023

How do you feel about the idea of training an AI using data taken from the dead?

If the answer is ‘pretty creeped out, actually’, then is there anything you could do about it?

There are a lot of technologies that can find a use for a catalogue of facial images, be that for training AIs to recognise patterns across them (expressions, characteristics, ethnicity, etc) or for using those images as seeds to generate new similar ‘synthetic’ images. But in the UK and EU there have always been significant bars that stand in the way of building up that kind of database.

Specifically, the GDPR stands in the way of building up a bank of facial images that have been collected ‘indirectly’ (i.e. not directly from willing data subjects) and then subjecting them to any kind of facial recognition treatment. Those images are personal data, of course, and the kind of data that one might generate from them using a facial recognition algorithm is biometric data (which attracts an even higher level of compliance burden than ‘vanilla’ personal data). We all know what happened to Clearview AI when it tried to do this kind of thing without obtaining individual consent.

So this article about PimEyes, which is alleged to have built its database using scraped images of now-deceased individuals, fascinated me.

It’s seldom relevant, but the GDPR (in the UK and EU) only affords rights to “natural persons” and in the UK at least our implementing legislation makes it clear that what we mean by that is a “living individual”.

So, at face value, there’s every reason to think that processing data relating to the dead, even at huge volume using technologies that they would have struggled to conceive of in their lifetimes, is fair game under GDPR. In just the same way that you can defame the dead without penalty (the poor guys just can’t catch a break).

The real rub, which is where privacy campaigners are bound to go with this kind of use-case, is that the data you generate using images of the dead is only valuable to you once you start selling its outputs to the living (the dead typically being a fairly low-spending demographic). Once you start using data compiled from the deceased in order to start identifying or profiling those still among us, then you go straight back into GDPR’s territory and need to think about how you intend to comply with its rules about identifying them.

PimEyes, Scarlett says, has scraped images of the dead to populate its database. By indexing their facial features, the site’s algorithms can help those images identify living people through their ancestral connections.

A Face Recognition Site Crawled the Web for Dead People’s Photos | WIRED