Machine learning

A Privacy Framework for Research Using Social Media Data

Social media data enables researchers to understand current events and human behavior with unprecedented ease and scale. Yet, researchers often violate user privacy when they access, process, and store sensitive information contained within social media data.

Social media has proved largely beneficial for research. Studies measuring the spread of COVID-19 and predicting crime highlight the valuable insights that are not possible without the Big Data present on social media. Lurking within these datasets, though, are intimate personal stories related to sensitive topics such as sexual abuse, pregnancy loss, and gender transition— both anonymously and in identifying ways.

In our latest paper presented at the 2025 IEEE Symposium on Security and Privacy, “SoK: A Privacy Framework for Security Research Using Social Media Data,” we examine the tensions in social media research between pursuing better science enabled by social media data and protecting social media users’ privacy. We focus our efforts on security and privacy research as it can involve sensitive topics such as misinformation, harassment, and abuse as well as the need for security researchers to be held to a higher standard when revealing vulnerabilities that impact companies and users.

Methodology

Toward this end, we conducted a systematic literature review of security literature on social media. We collected over 10,000 papers from six different disciplines, Computer Security and Cryptography (CSC), Data Mining and Analysis (DMA), Human-Computer Interaction (HCI), Humanities, Literature & Arts, Communication, (HLAC), Social Sciences, Criminology (SSC), and Social Sciences, Forensic Science (SSFS). Our final dataset included 601 papers across 16 years after iterating through several screening rounds.

How do security researchers handle privacy of social media data?

Our most alarming finding is that only 35% of papers mention any considerations of data anonymization, availability, and storage. This means that security and privacy researchers are failing to report how they handle user privacy.

Continue reading A Privacy Framework for Research Using Social Media Data

Mitigating Deep Reinforcement Learning Backdoors

This post introduces the problem of backdoors embedded in deep reinforcement learning agents and discusses our proposed defence. For more technical details please see our paper and the project’s repo.

Deep Reinforcement Learning (DRL) has the potential to be a game-changer in process automation. From automating the decision-making in self-driving cars, to aiding medical diagnosis, to even advancing nuclear fusion plasma control efficiency. While the real-world applications for DRL are innumerable, the development process of DRL models is resource-intensive by nature and often exceeds the resource allocation limits of smaller entities, leading to a dependency on large organisations. This reliance introduces significant risks, including potential policy defects that can result in unsafe agent behaviour during certain phases of its operation.

Instances of unsafe agent behaviour can stem from backdoor attacks aimed at DRL agent policies. Backdoor attacks on AI agents involve intentional policy defects, designed to trigger unexpected agent behaviour deviations given specific environmental cues. An example of a standard backdoor can be seen in the top left corner Figure 1b, which appears in the form of a 3×3 grey pixel that unexpectedly appears every given interval, leading to deviations in DRL agent behaviour.

Figure 1a and 1b: GIFs of Atari Breakout episodes showing a clean DRL policy without a backdoor trigger (encapsulated in a red outline) and a backdoored policy of a DRL agent with a grey 3×3 pixel trigger in Figure 1a and 1b respectively.

The state-of-the-art solution against backdoor attacks currently proposes defence against standard backdoors. However, the solution requires extensive compute time to successfully sanitise the DRL agent from the poisoned policies ingrained within. Figure 1b below illustrates how the defence successfully filters the backdoor policy by creating a “safe subspace” to remove anomalous states in the environment and allow benign agent operations.

Figure 2a and 2b: GIFs of an Atari Breakout episode played by a poisoned DRL agent with a standard backdoor trigger added in the top left corner (encapsulated inside a red outline). 2a is an episode with no defence and 1b is an episode with the current state-of-the-art defence and sanitisation algorithm.

Continue reading Mitigating Deep Reinforcement Learning Backdoors

A well-executed exercise in snake oil evaluation

In the umpteenth chapter of UK governments battling encryption, Priti Patel in September 2021 launched the “Safety Tech Challenge”. It was to give five companies £85K each to develop “innovative technologies to keep children safe when using end-to-end encrypted messaging services”. Tasked with evaluating the outcomes was the REPHRAIN project, the consortium given £7M to address online harms. I had been part of the UKRI 2020 panel awarding this grant, and believed then and now that it concerns a politically laden and technically difficult task, that was handed to a group of eminently sensible scientists.¹ While the call had strongly invited teams to promise the impossible in order to placate the political goals, this team (and some other consortia too) wisely declined to do so, and remained realistic.

The evaluation results have now come back, and the REPHRAIN team have done a very decent job given that they had to evaluate five different brands of snake oil with their hands tied behind their backs. In doing so, they have made a valuable contribution to the development of trustworthy AI in the important application area of online (child) safety technology.

The Safety Tech Challenge

The Safety Tech Challenge was always intellectually dishonest. The essence of end-to-end encryption (E2EE) is that nothing² can be known about encrypted information by anyone other than the sender and receiver. Not whether the last bit is a 0, not whether the message is CSAM (child sexual abuse material).³ The final REPHRAIN report indeed states there is “no published research on computational tools that can prevent CSAM in E2EE”.

In terms of technologies, there really also is no such thing as “in the context of E2EE”: the messages are agnostic as to whether they are about to be encrypted (on the sender side) or have just been decrypted (on the receiving side), and nothing meaningful can be done⁴ in between; any technologies that can be developed are agnostic of when they get invoked.

Continue reading A well-executed exercise in snake oil evaluation

What is Synthetic Data? The Good, the Bad, and the Ugly

Sharing data can often enable compelling applications and analytics. However, more often than not, valuable datasets contain information of sensitive nature, and thus sharing them can endanger the privacy of users and organizations.

A possible alternative gaining momentum in the research community is to share synthetic data instead. The idea is to release artificially generated datasets that resemble the actual data — more precisely, having similar statistical properties.

So how do you generate synthetic data? What is that useful for? What are the benefits and the risks? What are the fundamental limitations and the open research questions that remain unanswered?

All right, let’s go!

How To Safely Release Data?

Before discussing synthetic data, let’s first consider the “alternatives.”

Anonymization: Theoretically, one could remove personally identifiable information before sharing it. However, in practice, anonymization fails to provide realistic privacy guarantees because a malevolent actor often has auxiliary information that allows them to re-identify anonymized data. For example, when Netflix de-identified movie rankings (as part of a challenge seeking better recommendation systems), Arvind Narayanan and Vitaly Shmatikov de-anonymized a large chunk by cross-referencing them with public information on IMDb.

Continue reading What is Synthetic Data? The Good, the Bad, and the Ugly

Exploring an Attack on Image Scaling Algorithms

In their 2019 publication ‘Seeing is Not Believing: Camouflage Attacks on Image Scaling Algorithms’, Xiao et al. demonstrated a fascinating and frightening exploit on a few commonly used and popular scaling algorithms. Through what Quiring et al. referred to as adversarial preprocessing, they created an attack image that closely resembles one image (decoy) but portrays a completely different image (payload) when scaled down. In their example (below), an image of sheep could scale down and suddenly show a wolf.

Two images are shown, the left shows the original attack image, which depicts a group of sheep. The right shows the scaled down attack image, which shows a grey wolf. — On the left, a group of sheep can be seen in a slightly stretched out photo (the decoy). When scaled down to the correct dimensions (right), the image shows a grey wolf (payload). This is an example of an attack image.

These attack images can be used in a number of scenarios, particularly in data poisoning of deep learning datasets and covert dissemination of information. Deep learning models require large datasets for training. A series of carefully crafted and planted attack images placed into public datasets can poison these models, for example, reducing the accuracy of object classification. Essentially all models are trained with images scaled down to a fixed size (e.g. 229 × 229) to reduce the computational load, so these attack images are highly likely to work if their dimensions are correctly configured. As these attack images hide their malicious payload in plain sight, they also evade detection. Xiao et al. described how an attack image could be crafted for a specific device (e.g. an iPhone XS) so that the iPhone XS browser renders the malicious image instead of the decoy image. This technique could be used to propagate payload, such as illegal advertisements, discreetly.

The natural stealthiness of this attack is a dangerous factor, but on top of that, it is also relatively easy to replicate. Xiao et al. published their own source code in a GitHub repository, with which anyone can run and create their own attack images. Additionally, the maths behind the method is also well described in the paper, allowing my group to replicate the attack for coursework assigned to us for UCL’s Computer Security II module, without referencing the paper authors’ source code. Our implementation of the attack is available at our GitHub repository. This coursework required us to select an attack detailed in a conference paper and replicate it. While working on the coursework, we discovered a relatively simple way to stop these attack images from working and even allow the original content to be viewed. This is shown in the series of images below.

Continue reading Exploring an Attack on Image Scaling Algorithms

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31