Apple letting the content-scanning genie out of the bottle

When Apple announced that they would be scanning iPhones for child sexual abuse material (CSAM), the push-back appears to have taken them by surprise. Since then, Apple has been engaging with experts and developing their proposals to mitigate risks that have been raised. In this post, I’ll discuss some of the issues with Apple’s CSAM detection system and what I’ve learned from their documentation and events I’ve participated in.

Technically Apple’s CSAM detection proposal is impressive, and I’m pleased to see Apple listening to the community to address issues raised. However, the system still creates risks that will be difficult to avoid. Governments are likely to ask to expand the system to types of content other than CSAM, regardless of what Apple would like to happen. When they do, there will be complex issues to deal with, both for Apple and the broader technology community. The proposals also risk causing people to self-censor, even when they are doing nothing wrong.

How Apple’s CSAM detection works

The iPhone or iPad scans each image for known CSAM just before uploading it to iCloud, Apple’s cloud data storage system. Images that are not going to be uploaded don’t get scanned. The comparison between images and the database is made in such a way that minor changes to CSAM, like resizing and cropping, will still trigger a match, but any image that wasn’t derived from a known item of CSAM should be very unlikely to match. The results of this matching process go into a clever cryptographic system designed to ensure that the user’s device doesn’t learn the contents of the CSAM database or which of their images (if any) match. If more than a threshold of about 30 images match, Apple will be able to verify whether the matching images are CSAM and, if so, report to the authorities. If the number of matching images is below the threshold, Apple learns nothing.
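The "learns nothing below the threshold" property can be illustrated with threshold secret sharing. The sketch below is a toy Shamir-style scheme, not Apple's protocol: the field modulus, the function names and the idea of one share per matching image are assumptions made purely for illustration.

```python
import secrets

# Toy field modulus (a Mersenne prime); an assumption for this sketch only.
PRIME = 2**127 - 1


def make_shares(secret, threshold, count):
    """Split `secret` into `count` shares; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, count + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover(shares):
    """Lagrange-interpolate the polynomial at x = 0 from the given shares."""
    result = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        result = (result + yi * num * pow(den, -1, PRIME)) % PRIME
    return result


# Hypothetical usage: one share is released per matching image.
key = 123456789
shares = make_shares(key, threshold=30, count=40)
print(recover(shares[:30]) == key)  # enough shares: the key is recovered
print(recover(shares[:29]) == key)  # one share short: the result is unrelated to the key
```

With fewer shares than the threshold, every possible key is equally consistent with what has been seen, which is the sense in which the server learns nothing until enough matches accumulate.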

Risk of scope creep

Now that Apple has built their system, a risk is that it could be extended to search for content other than CSAM by expanding the database used for matching. While some security properties of their system are ensured through cryptography, the restriction to CSAM is only a result of Apple’s policy on the content of the matching database. Apple has clearly stated that it would resist any expansion of this policy, but governments may force Apple to make changes. For example, in the UK, this could be through a Technical Capability Notice (under the Investigatory Powers Act) or powers proposed in the Online Safety Bill.

If a government legally compelled them to expand the matching database, Apple may have to choose between complying or leaving the market. So far, Apple has refused to say which of these choices they would take.

However, in response to concerns about scope creep, Apple has announced measures to make it harder for them to expand the matching database covertly. Two scenarios are dealt with. Firstly, how can we prevent Apple from sending a selected group of users a matching database different from the one that everyone else gets? Secondly, how can we prevent Apple from sending out a matching database that includes content other than CSAM?

Preventing selective update of matching database

The matching database is built into the operating system and is not updated other than through system upgrades. Apple distributes a single operating system image to all devices, so selectively distributing an image to a group of users would require significant changes to the update infrastructure. This in itself may be an obstacle to legal compulsion. For example, before a Technical Capability Notice is issued, the Secretary of State must consider “the likely cost of complying with the notice” and “whether the imposition of a notice is affordable and represents value for money”.

Suppose Apple does indeed send out an operating system image with a modified matching database to some users. In that case, this may be detectable by security researchers (if Apple would hold off on blocking them from their systems or suing them). In any case, if Apple sends you a custom-built operating system image, there are far worse things they could do than tweaking the matching database.

Preventing expansion of the matching database

Selective updating of the matching database doesn’t seem like a huge problem. It’s not as covert as law enforcement might hope, and the CSAM matching system doesn’t introduce significant new risks here. However, what if Apple were compelled to add new items to everyone’s matching database, such as leaked documents that embarrass a politician? The matching database is encrypted to prevent users from seeing the contents. Even if they could, the NeuralHash system that created the database is designed not to allow the original images to be recovered. Expanding the matching database would result in Apple being notified which of their users have these images, and Apple could be compelled to disclose this fact to the relevant authorities.

Here, the affordability and value-for-money argument that might have helped Apple resist the selective-update scenario now works against them. While the cost of building an image scanning system from scratch could prevent a Technical Capability Notice from being issued, now that Apple has built the system, the cost of adding a few entries to a matching database is low. Apple will also find it difficult to argue that such bulk scanning of everyone’s iCloud library is a disproportionate intrusion on privacy because they have publicly argued that no information is disclosed about users who don’t trigger a match.

Apple has tried to make such expansions to the matching database possible to detect. Specifically, they require that each entry in the database is present in at least two child-protection organisations’ lists of CSAM. This reduces the risk that a single organisation could insert a non-CSAM entry and reduces the impact of non-CSAM accidentally present on these lists. Apple will also allow auditors to verify that the combined list has been built correctly. They presumably won’t let just anyone do this because the auditor would have access to the unencrypted matching database. Users also can check whether the matching database is the same as the one audited, provided they are willing to trust the operating system to do what it claims to do.
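The two safeguards described above can be sketched in a few lines: keep only entries vouched for by at least two organisations, then publish a digest of the result that users can compare with the audited value. This is a simplified illustration, not Apple's implementation. The organisation names and entries are made up, and a flat SHA-256 over the sorted entries stands in for whatever hashing structure Apple actually uses.

```python
import hashlib
from collections import Counter


def build_database(org_lists, min_orgs=2):
    """Keep only entries that appear in at least `min_orgs` organisations' lists."""
    counts = Counter()
    for entries in org_lists.values():
        counts.update(set(entries))  # count each organisation at most once per entry
    return sorted(entry for entry, n in counts.items() if n >= min_orgs)


def database_digest(entries):
    """Order-independent SHA-256 digest over a collection of byte-string entries."""
    digest = hashlib.sha256()
    for entry in sorted(entries):
        digest.update(len(entry).to_bytes(4, "big"))  # length prefix avoids ambiguity
        digest.update(entry)
    return digest.hexdigest()


# Hypothetical lists from three organisations (placeholder entries).
org_lists = {
    "org_a": [b"hash-1", b"hash-2", b"hash-3"],
    "org_b": [b"hash-2", b"hash-3", b"hash-4"],
    "org_c": [b"hash-3", b"hash-5"],
}
database = build_database(org_lists)
audited_digest = database_digest(database)

# A device would recompute the digest of its installed database and compare.
print(database_digest(database) == audited_digest)
print(database_digest(database + [b"hash-9"]) == audited_digest)
```

The check is only as trustworthy as the code performing it, which is the caveat the paragraph above ends on: a user comparing digests still relies on the operating system to report the digest honestly.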

However, none of these measures will prevent Apple from overtly expanding the matching database to include additional content. Their system is technically capable of detecting any material, and governments may require Apple to do precisely that. Apple may resist attempts in court, but experience has shown that it will accept changes that reduce user privacy in the face of legal demands. Internet Service Providers (ISPs) in the UK also resisted expanding their CSAM blocking system to other types of content but lost in court and were required to include entries for pirate movies and knock-off designer watches. The court’s justification was that once the system is built, there’s no harm in adding a few more entries:

“As I have explained above, the ISPs already have the requisite technology at their disposal. Furthermore, much of the capital investment in that technology has been made for other reasons, in particular to enable the ISPs to implement the IWF blocking regime and/or parental controls. Still further, some of the ISPs’ running costs would also be incurred in any event for the same reasons. It can be seen from the figures I have set out in paragraphs 61-65 above that the marginal cost to each ISP of implementing a single further order is relatively small, even once one includes the ongoing cost of keeping it updated.”

Preventing expansion to on-device data

The CSAM detection system operates on the device but only applies to images that will soon be uploaded to iCloud. Like other technology providers, Apple could simply have waited until the images were uploaded and scanned them there. However, Apple instead designed an elaborate scheme to detect CSAM on iCloud without looking at the content on iCloud. This decision suggests Apple is considering encrypting the content on iCloud such that even they cannot access it, but wants to avoid accusations that they are facilitating the storage and distribution of CSAM. I can see the reasoning behind Apple’s claim that their on-device scanning is more privacy-preserving than the cloud-based approach, but there are other ways in which it is more problematic.

Firstly, there’s no feasible way to change a cloud-based CSAM scanning system to scan content not uploaded to the cloud, but the same cannot be said for Apple’s proposal. No matter what orders are served on Facebook, Google or Microsoft, their CSAM scanning won’t find images they don’t have access to. In contrast, for Apple’s on-device scanning approach, it would be a relatively minor change to expand the scanning system to include files that will not be uploaded to iCloud. This change might be sufficiently straightforward to meet the “reasonably practicable” test that allows a Technical Capability Notice to be served. Security experts might work out what is going on if they can get past the iOS anti-reverse engineering techniques and Apple’s legal teams, but that’s not guaranteed. In any case, some governments might not care about being found out.

Secondly, few people understand the details of Apple’s system. Most users will hear that iPhones will scan their files, matching them against a secret list provided by a US government-sponsored organisation, and will blank out before being told about tPSI-AD, root hashes and the like. Even if none of the scenarios I outlined above come into being, Apple’s new system is likely to result in self-censorship. When people learn that their most trusted device is spying on them, even law-abiding users will worry about whether they can write down or share controversial thoughts and ideas.

Letting the genie out of the bottle

Apple has addressed some of the privacy challenges in building an exceptional access system, giving law enforcement a restricted ability to learn about illicit content on people’s devices. Their system is an impressive technical achievement but has also let the genie out of the bottle. Some of the hardest challenges of exceptional access remain unsolved. The proposals don’t address how to deal with jurisdictional differences in law, compatibility with human rights, the risk of implementation flaws, or questions on how to establish trust in closed-source software. So far, Apple has dodged these issues by focusing on content almost universally considered abhorrent, but when the first government comes knocking to expand the system, these questions must be answered. Apple’s proposals have brought this day considerably closer, whether we are ready or not.

Photo by Brett Jordan from Pexels.
