Mitigating Deep Reinforcement Learning Backdoors

This post introduces the problem of backdoors embedded in deep reinforcement learning agents and discusses our proposed defence. For more technical details please see our paper and the project’s repo.

Deep Reinforcement Learning (DRL) has the potential to be a game-changer in process automation, from automating decision-making in self-driving cars, to aiding medical diagnosis, to advancing the efficiency of nuclear fusion plasma control. While the real-world applications of DRL are innumerable, developing DRL models is resource-intensive by nature and often exceeds the resources of smaller entities, leading to a dependency on large organisations. This reliance introduces significant risks, including potential policy defects that can result in unsafe agent behaviour during certain phases of operation.

Instances of unsafe agent behaviour can stem from backdoor attacks aimed at DRL agent policies. Backdoor attacks on AI agents are intentional policy defects, designed to trigger unexpected deviations in agent behaviour given specific environmental cues. An example of a standard backdoor trigger can be seen in the top-left corner of Figure 1b: a 3×3 patch of grey pixels that appears at a given interval, causing the DRL agent’s behaviour to deviate.

Figures 1a and 1b: GIFs of Atari Breakout episodes. Figure 1a shows a clean DRL policy without a backdoor trigger, and Figure 1b a backdoored DRL policy with a grey 3×3 pixel trigger (encapsulated in a red outline).
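To make the trigger concrete, here is a minimal, hypothetical sketch (not the paper’s implementation) of how such a patch could be stamped onto an agent’s observation; the 84×84 frame shape, patch size, and grey value are illustrative assumptions:

```python
import numpy as np

def apply_trigger(obs: np.ndarray, size: int = 3, value: int = 128) -> np.ndarray:
    """Stamp a grey size x size patch onto the top-left corner of an observation."""
    poisoned = obs.copy()
    poisoned[:size, :size] = value  # the backdoor trigger
    return poisoned

frame = np.zeros((84, 84), dtype=np.uint8)  # a typical preprocessed Atari frame
print(apply_trigger(frame)[:4, :4])         # trigger visible in the corner
```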

The current state-of-the-art solution defends against standard backdoors. However, it requires extensive compute time to sanitise the DRL agent of the poisoned policy ingrained within. Figure 2b below illustrates how the defence successfully filters the backdoor policy by creating a “safe subspace” that removes anomalous states from the environment and allows benign agent operation.

Figures 2a and 2b: GIFs of an Atari Breakout episode played by a poisoned DRL agent with a standard backdoor trigger added in the top-left corner (encapsulated inside a red outline). Figure 2a shows an episode with no defence; Figure 2b shows an episode with the current state-of-the-art defence and sanitisation algorithm.
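One loose reading of the “safe subspace” idea is to estimate a low-dimensional subspace from clean observations and project incoming states onto it. The sketch below shows that interpretation using an SVD; the function names, the choice of k, and the toy data are our assumptions, not the authors’ code:

```python
import numpy as np

def fit_safe_subspace(clean_states: np.ndarray, k: int):
    """Estimate a k-dimensional 'safe' subspace from flattened clean states (rows)."""
    mean = clean_states.mean(axis=0)
    _, _, vt = np.linalg.svd(clean_states - mean, full_matrices=False)
    return mean, vt[:k]          # top-k right singular vectors span the subspace

def sanitise(state: np.ndarray, mean: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Project a (possibly triggered) state onto the safe subspace."""
    centred = state - mean
    return mean + basis.T @ (basis @ centred)

# Toy usage: 500 clean 84*84 observations, then sanitise a new state.
clean = np.random.rand(500, 84 * 84)
mean, basis = fit_safe_subspace(clean, k=50)
filtered = sanitise(np.random.rand(84 * 84), mean, basis)
```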

Continue reading Mitigating Deep Reinforcement Learning Backdoors

Exploring the Edges of Blockchain Technology: A Deep Dive into MEV and Oracles

In the dynamic world of blockchain technology, a Research Note from authors at the Financial Conduct Authority (FCA) and University College London (UCL) delves into the complex territories of Maximal Extractable Value (MEV) and blockchain oracles, presenting a nuanced study that underscores both the potential and the pitfalls of these advanced concepts in cryptoasset services and Decentralised Finance (DeFi).

At the heart of this exploration lies MEV, a concept that thrives on the transparency and ordering of transactions within blockchain networks. While it heralds a new era of value extraction, it also opens the door to strategies that may skirt the ethical line, raising concerns over market fairness and the integrity of transactional processes. From the beneficial arbitrage opportunities to the controversial practices of front-running and sandwich attacks, MEV presents a multifaceted phenomenon that demands a keen understanding of its implications.
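As a worked illustration of why transaction ordering matters, the toy model below sketches a sandwich attack against a constant-product (x·y = k) automated market maker; the pool reserves and trade sizes are hypothetical, and fees are ignored:

```python
def swap(pool_in: float, pool_out: float, amount_in: float) -> float:
    """Tokens received when swapping amount_in into an x*y=k pool (no fees)."""
    k = pool_in * pool_out
    return pool_out - k / (pool_in + amount_in)

eth, usdc = 1_000.0, 2_000_000.0   # hypothetical pool reserves

# 1. Attacker front-runs: buys ETH, pushing the price up.
atk_eth = swap(usdc, eth, 50_000.0)
usdc += 50_000.0; eth -= atk_eth

# 2. Victim's transaction executes at the now-worse price.
victim_eth = swap(usdc, eth, 100_000.0)
usdc += 100_000.0; eth -= victim_eth

# 3. Attacker back-runs: sells the ETH bought in step 1.
atk_usdc_out = swap(eth, usdc, atk_eth)
print(f"attacker profit: {atk_usdc_out - 50_000.0:,.2f} USDC")  # positive
```

The attacker profits purely from seeing the victim’s pending transaction and controlling where their own trades land around it.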

Equally significant are blockchain oracles, pivotal in bridging the on-chain and off-chain worlds. These conduits of information are indispensable for the seamless operation of smart contracts, yet they bear vulnerabilities that can be exploited, casting a shadow over their reliability. The study highlights the delicate balance that must be struck in the design and implementation of oracles, stressing the importance of robust data sources, decentralisation, and innovative solutions to mitigate risks.

Continue reading Exploring the Edges of Blockchain Technology: A Deep Dive into MEV and Oracles

A Regulatory “Trial and Error” Phase Shaping the Cryptocurrency Ecosystem

In general, the broad aim of regulation is to organise the relationship between persons and to protect their rights and interests in society. However, as we have discovered, this has not been the case within the cryptocurrency space.

It has been almost 15 years since Satoshi Nakamoto published the Bitcoin whitepaper. This seminal document, posted to a mailing list, catalysed the emergence of an entirely new, transnational field. Considering the substantial time that has passed and the widespread popularity of cryptocurrencies, one would expect regulators to have at least figured out ‘how’ and ‘what’ to regulate, right? It appears not. Rather than long-term, serious solutions, regulators offer a collection of outdated, reactive bandages that mostly portray the illusion of a solution, with not-so-hidden intentions in mind.

The cryptocurrency ecosystem has often been compared to the Wild West, and the comparison extends to its journey towards regulation: a journey full of ill-suited regulatory approaches that have resulted in a plethora of dilemmas. We explore this tension between the technology and regulation in our latest paper, “Shaping Cryptocurrency Gatekeepers with a Regulatory ‘Trial and Error’”, with a primary focus on the Financial Action Task Force’s recommendations and the EU’s 5th Anti-Money Laundering Directive.

Historically, it was only after the rising popularity of the Silk Road and the collapse of Mt. Gox, the most popular exchange at the time, that regulators realised they needed to take action. The advertised main objectives are curbing criminal activity and providing regulatory protection to consumers and users. However, until only recently, most steps taken by the regulators choosing to act mainly targeted money laundering and terrorist financing, with other limited initiatives here and there. Whilst this approach might have had some benefits, it was not 1. comprehensive, 2. global, 3. stable/constant, or 4. tailored to the specific risks and characteristics of cryptocurrencies. In other words, different regulators have been testing diverse approaches simultaneously, without engaging with one another and without properly acknowledging the true needs and risks of the ecosystem.

Continue reading A Regulatory “Trial and Error” Phase Shaping the Cryptocurrency Ecosystem

Rugpull reports in the DeFi jungle

A rising category of cryptocurrency scams called ‘rugpulls’ accounted for 37% of all cryptocurrency scam revenue in 2021. A rugpull is an exit scam in the DeFi (Decentralized Finance) ecosystem where developers abandon a project without fully delivering and run away with investors’ funds. Thodex, a Turkish centralized exchange, ran away with $2 billion of victims’ funds. In March 2022, the U.S. Department of Justice charged two defendants over a $1.1 million NFT rugpull scam called Frosties.

In our paper, to be presented next week at Financial Cryptography and Data Security 2023, we analyze an updated list of rugpulls from an online discussion forum, bitcointalk.org. This forum provides a platform to discuss anything crypto-related, which also attracts scammers advertising their projects. We observe that since 2020 the number of threads mentioning rugpulls has increased, while those mentioning exit scams have decreased; total mentions of either term are relatively stable over time. This suggests that users have shifted from the term ‘exit scam’ to ‘rugpull’ since the DeFi space emerged.
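A minimal sketch of this kind of keyword counting is shown below; it assumes a scraped CSV of threads with hypothetical ‘year’ and ‘title’ columns, and is not our actual pipeline:

```python
from collections import Counter
import csv
import re

# Count threads per year whose titles mention 'rugpull' or 'exit scam'.
rug, exit_scam = Counter(), Counter()
with open("threads.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        title = row["title"].lower()
        if re.search(r"rug ?pull", title):
            rug[row["year"]] += 1
        if "exit scam" in title:
            exit_scam[row["year"]] += 1

for year in sorted(set(rug) | set(exit_scam)):
    print(year, rug[year], exit_scam[year])
```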

Using keywords to search for threads discussing rugpulls, we found 101 rugpulls from six services, summarised in Table 1. Our dataset is available from the Harvard Dataverse as doi:10.7910/DVN/SMGMW8.

| Service type | Definition | Observations |
|---|---|---|
| Initial Coin Offerings (ICOs) | Raising money to create a new ERC20 token | 73 |
| Yield farms | Lending crypto assets to earn interest on the loan | 16 |
| Exchanges | Platforms for users to buy/sell cryptocurrency | 5 |
| Non-Fungible Tokens (NFTs) | Unique, non-interchangeable digital assets that can be bought and sold | 5 |
| Initial Dex Offerings (IDOs) | Similar to an ICO, but on a decentralized exchange | 1 |
| Cloud mining | Fractional shares of a mining operation | 1 |

Table 1: DeFi service types by quantity of observed rugpulls (N=101)

We find that Initial Coin Offerings (ICOs) form the majority of rugpulls, and most of them pulled the rug in less than six months. For example, the SquidGame Token, named after a famous TV show, rugpulled within days in 2021.

Continue reading Rugpull reports in the DeFi jungle

Return of a new version of Drinik Android malware targeting Indian Taxpayers

In October last year, analysts at Cyble published an article on the return of the Drinik malware, first spotted by CERT-In in 2016. Last month, during the tax-paying season, I (Sharad Agarwal), a Ph.D. student at University College London (UCL) researching SMS phishing, identified an updated version of the Drinik malware that impersonates the Income Tax Department of India and targets the victim’s UPI (Unified Payment Interface) payment apps.

The iAssist.apk malware was being spread from the URL hxxp://198[.]46[.]177[.]176/IT-R/?id={mobile number}, where the user is deceived into downloading a new version of the app impersonating the Income Tax Department of India. Together with Daniel Arp, I analyzed the malware sample to check for new functionality compared to previous versions. In the following, we give a brief overview of our findings.

Communication

Our analysis found that the malware communicates with the Command & Control (C&C) server hxxp://msr[.]servehttp[.]com, hosted on IP 107[.]174[.]45[.]116. It also silently drops another malicious APK file, “GAnalytics.apk”, hosted on the C&C server, onto the victim’s phone; this file has already been identified and flagged as malware on VirusTotal.

The previous campaign used a different IP address for its C&C communication. However, the hosting provider, “ColoCrossing”, is the same as in the previous campaign. This strongly indicates that the same Threat Actor is behind both campaigns and is abusing the same hosting provider again. As has already been reported for previous versions of this malware, the most recent version also records the screen of the mobile device and sends the recordings to the C&C server (see Figure 1).

Figure 1: Function to upload recorded videos to external C&C server.

Additionally, we found the phone numbers to which the criminals direct the SMSs sent through this malware (see Table 1). The malicious APK asks for the READ, WRITE, RECEIVE, and SEND SMS permissions during installation and does not work unless the user accepts them all (see Table 2).

| Indicator type | Indicators |
|---|---|
| MD5 | 02e0f25d4a715e970cb235f781c855de |
| SHA256 | 99422143d1c7c82af73f8fdfbf5a0ce4ff32f899014241be5616a804d2104ebf |
| C&C hostname | hxxp://msr[.]servehttp[.]com |
| C&C IP address | 107[.]174[.]45[.]116 |
| Dropped APK URL | hxxp://107[.]174[.]45[.]116/a/GAnalytics[.]apk |
| Dropped APK MD5 | 95adedcdcb650e476bfc1ad76ba09ca1 |
| Dropped APK SHA256 | 095fde0070e8c1a10342ab0c1edbed659456947a2d4ee9a412f1cd1ff50eb797 |
| UPI apps targeted | Paytm, PhonePe, and GooglePay |
| SMS sent to phone numbers | +91-7829-806-961 (Vodafone), +91-7414-984-964 (Airtel, Jaora, Madhya Pradesh), and +91-9686-590-728 (Airtel, Karnataka) |

Table 1: Indicators of Compromise (IoCs)
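To reproduce the basic triage steps (hashes and requested SMS permissions), a sketch along these lines would work; it assumes the androguard package is installed, and the file name refers to the sample above:

```python
import hashlib
from androguard.misc import AnalyzeAPK  # assumes androguard is installed

def triage(path: str) -> None:
    """Print hashes and requested SMS permissions for an APK sample."""
    data = open(path, "rb").read()
    print("MD5:   ", hashlib.md5(data).hexdigest())
    print("SHA256:", hashlib.sha256(data).hexdigest())

    apk, _, _ = AnalyzeAPK(path)
    sms = [p for p in apk.get_permissions() if "SMS" in p]
    print("SMS permissions requested:", sms)

triage("iAssist.apk")  # the sample analysed in this post
```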

Continue reading Return of a new version of Drinik Android malware targeting Indian Taxpayers

The Acropalypse vulnerability in Windows Snip and Sketch, lessons for developer-centered security

Acropalypse is a vulnerability first identified in the Google Pixel phone screenshot tool, where after cropping an image, the original would be recoverable. Since the part of the image cropped out might contain sensitive information, this was a serious security issue. The problem occurred because the Android API changed behaviour from truncating files by default to leaving existing content in place. Consequently, the beginning of the resulting image file contains the cropped content, but the end of the original file is still present. Image viewers ignore this data and open the file as usual, but with some clever analysis of the compression algorithm used, the original image can (partially) be recovered.
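The failure mode is easy to reproduce in miniature. The sketch below, written in Python rather than the Android or Windows APIs, overwrites a file without truncating it, leaving the original tail bytes in place; all file names and contents are illustrative:

```python
import os

# Create an "original" file holding sensitive content.
with open("shot.bin", "wb") as f:
    f.write(b"SENSITIVE" * 100)          # 900 bytes

# Overwrite with shorter content, but *without* truncating.
with open("shot.bin", "r+b") as f:       # existing content left in place
    f.write(b"cropped")                  # only the first 7 bytes change

print(os.path.getsize("shot.bin"))       # still 900: the old tail survives
with open("shot.bin", "rb") as f:
    print(b"SENSITIVE" in f.read())      # True: original data recoverable
```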

Shortly after the vulnerability was announced, someone noticed that the Windows default screenshot tool, Snip and Sketch, appeared to have the same problem, despite being an entirely unrelated application on a different operating system. I also found a similar problem back in 2004 relating to JPEG thumbnail images. When the same vulnerability keeps re-occurring, it suggests a systemic problem in how we build software, so I set out to understand more about the reasons for the vulnerability existing in Windows Snip and Sketch.

A flawed API

The first problem I found is that the modern Windows API for saving files has a very similar problem to the one on Android: existing files are not truncated by default. Arguably, the Windows vulnerability is worse because, unlike on Android, there is no option to truncate files. The Windows documentation is, at best, unclear on the need to truncate files and on what code is needed to achieve the desired result.

This wasn’t always the case. The old Win32 API for saving a file was (roughly) to show a file picker, get the filename the user selected, and then open the file. To open a file, the programmer must specify whether to overwrite the file or not, and example code usually does overwrite the file. However, the new “more secure” Universal Windows Platform (UWP) sandboxes the file picker in a separate process, allowing neat features like capability-based access control. It creates the file if needed and returns a handle which, if the selected file exists, will not overwrite the existing content.

However, from the documentation, a programmer would understandably assume that the file would be empty:

“The file name, extension, and location of this storageFile match those specified by the user, but the file has no content.”

Continue reading The Acropalypse vulnerability in Windows Snip and Sketch, lessons for developer-centered security

A well-executed exercise in snake oil evaluation

In the umpteenth chapter of UK governments battling encryption, Priti Patel launched the “Safety Tech Challenge” in September 2021. It was to give five companies £85K each to develop “innovative technologies to keep children safe when using end-to-end encrypted messaging services”. Tasked with evaluating the outcomes was the REPHRAIN project, the consortium given £7M to address online harms. I had been part of the UKRI 2020 panel awarding this grant, and believed then, as I do now, that it concerned a politically laden and technically difficult task handed to a group of eminently sensible scientists. While the call had strongly invited teams to promise the impossible in order to placate political goals, this team (and some other consortia too) wisely declined to do so and remained realistic.

The evaluation results have now come back, and the REPHRAIN team have done a very decent job given that they had to evaluate five different brands of snake oil with their hands tied behind their backs. In doing so, they have made a valuable contribution to the development of trustworthy AI in the important application area of online (child) safety technology.

The Safety Tech Challenge

The Safety Tech Challenge was always intellectually dishonest. The essence of end-to-end encryption (E2EE) is that nothing can be known about encrypted information by anyone other than the sender and receiver: not whether the last bit is a 0, not whether the message is CSAM (child sexual abuse material). The final REPHRAIN report indeed states there is “no published research on computational tools that can prevent CSAM in E2EE”.

In terms of technologies, there really is no such thing as “in the context of E2EE”: the messages are agnostic as to whether they are about to be encrypted (on the sender side) or have just been decrypted (on the receiving side), and nothing meaningful can be done in between; any technologies that can be developed are agnostic of when they get invoked.

Continue reading A well-executed exercise in snake oil evaluation

What is Synthetic Data? The Good, the Bad, and the Ugly

Sharing data can often enable compelling applications and analytics. However, more often than not, valuable datasets contain information of sensitive nature, and thus sharing them can endanger the privacy of users and organizations.

A possible alternative gaining momentum in the research community is to share synthetic data instead. The idea is to release artificially generated datasets that resemble the actual data — more precisely, having similar statistical properties.
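As a toy illustration of “similar statistical properties” (not a serious generator), one can fit the mean and covariance of the real records and sample from a multivariate Gaussian with those statistics; the “real” data here are random stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(loc=[40.0, 60.0], scale=[12.0, 5.0], size=(1000, 2))  # stand-in "real" data

# Fit first- and second-order statistics, then sample a synthetic dataset.
mu = real.mean(axis=0)
cov = np.cov(real, rowvar=False)
synthetic = rng.multivariate_normal(mu, cov, size=1000)

print(real.mean(axis=0).round(1), synthetic.mean(axis=0).round(1))  # close means
```

Real generators (e.g. GANs or differentially private mechanisms) aim at the same goal with far richer statistics, which is where both the benefits and the risks arise.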

So how do you generate synthetic data? What is it useful for? What are the benefits and the risks? What are the fundamental limitations and the open research questions that remain unanswered?

All right, let’s go!

How To Safely Release Data?

Before discussing synthetic data, let’s first consider the “alternatives.”

Anonymization: Theoretically, one could remove personally identifiable information from a dataset before sharing it. However, in practice, anonymization fails to provide realistic privacy guarantees because a malevolent actor often has auxiliary information that allows them to re-identify individuals in the anonymized data. For example, when Netflix released de-identified movie rankings as part of a challenge seeking better recommendation systems, Arvind Narayanan and Vitaly Shmatikov de-anonymized a large chunk of them by cross-referencing with public information on IMDb.

Continue reading What is Synthetic Data? The Good, the Bad, and the Ugly

“I am yet to meet a young person that has not experienced some form of abuse via tech”

Technology-facilitated abuse describes the misuse of digital systems, such as smartphones or other Internet-connected devices, to monitor, control, and harm individuals. In recent years, increasing attention has been given to this phenomenon in school settings and the criminal justice system. Yet awareness in the healthcare sector is lacking. To address this gap, Dr Isabel Straw and Dr Leonie Tanczer from University College London (UCL) have been leading a new research project that examines technology-facilitated abuse in medical settings.

Technology-facilitated forms of abuse are on the rise, with perpetrators adapting digital technologies such as smartphones and drones, trackers such as AirTags, and spyware tools including parental control software, to cause harm. The impact of technology-facilitated abuse on patients may not always be immediately obvious to healthcare professionals. For instance, smart, Internet-connected devices have been shown to be misused in domestic abuse cases to inflict physical harm: smart locks have been used to trap individuals inside their homes, smart thermostats have been used to inflict extremes of temperature on victims, and remotely controlled lighting and sound systems have been manipulated to cause psychological distress. COVID-19 catalyzed the proliferation of these technologies within our environment, with sales of smart devices increasing 30% year on year. Yet, while these tools are advertised for their proposed safety and convenience, they are also providing new avenues for violence, harassment, and abuse.

The impact of technology-facilitated abuse is especially notable on young people. In recent years, pediatric safeguarding guidelines have been amended in response to increasing rates of knife crime, gang violence, and drug trafficking in the UK. However, technology-facilitated abuse has evolved at a parallel rate and has not received the same level of attention. The impact of technology-facilitated abuse on children and teenagers may manifest as emotional distress, anxiety, and suicidal ideation. Koubel reports the exacerbation of mental health risks born from websites that encourage self-harm, eating disorders, and suicide. Furthermore, technology-facilitated dating abuse and sextortion are increasing amongst adolescent populations. With 10% of children being affected by sexual solicitation online, the problem is widespread and under-investigated. As reported by Stonard et al. in “They’ll Always Find a Way to Get to You”, digital devices are playing an increasing role in relationship abuse amongst young people.

Vulnerable individuals frequently perceive medical settings as a place of safety. Healthcare professionals thus have a role in providing both medical and psychosocial care to ensure their wellbeing. At present, existing clinical and patient-management protocols are outdated and do not address the emerging threat of technology-facilitated abuse. For clinicians to provide effective care to patients affected by technological elements of abuse and violence, clinical safeguarding protocols need a radical update if they are to assist professionals navigating high-risk scenarios.

Continue reading “I am yet to meet a young person that has not experienced some form of abuse via tech”

Vulnerability in Linux containers – investigation and mitigation

Operating system access controls, which constrain which programs can open which files, have existed for almost as long as computers themselves. Access controls are still widely used and are more flexible and efficient than cryptographically protecting files. Despite this long history, there continues to be innovation in access control, particularly now in containers, like Docker and Kubernetes and similar technologies offered by cloud providers. Here, rather than running lots of software on a single computer, a service is split into microservices running in containers. Each container is isolated from the others on the same computer, as if it had its own computer and operating system, and is prevented from reading files in other containers.

However, in reality there is only one operating system, and the container runtime’s role is to create the illusion that there are many. As part of its job, the runtime should also set up containers such that access control works inside each one, because not every program running inside a container should be able to access every file. Multiple containers can also be given access to the same directory, with access controls restricting what each container can do with its contents. If access controls don’t work properly, an attacker could read or modify files they should not be able to.

Unfortunately, there is such a vulnerability. The bad news is that it originates from an omission in the specification that underlies all the major container runtimes and so is present regardless of which container runtime you use (e.g. runc, crun, Kata Containers) and regardless of whether you use containers directly (e.g. through Docker or podman) or indirectly (e.g. through Kubernetes). The good news is that the vulnerability affects a feature of Linux access control permissions that is not widely used – negative group permissions. However, if your system does depend on this feature then the vulnerability could be serious. Read on for more details about the vulnerability, why it exists and what can be done to mitigate the problem.

Introduction to Linux permissions

In Linux there are user accounts, and each user is also a member of one or more groups. Each object (file, directory, device, etc.) has an associated owner and an associated group, along with a set of permissions for three classes: owner, group, and other. These permissions tell the operating system whether a user should be able to read from the object (r), write to the object (w), and execute the object (x). If a user is the owner of an object, the owner-class permissions are used; if the user is a member of the object’s group, the group-class permissions are used; otherwise, the other-class permissions are used.

For example, a file containing a company’s finance database could be owned by the Chief Financial Officer (CFO) with owner-class permissions “r+w”, its group set to “auditors” with group-class permissions “r” only, and its other-class permissions set to nothing. Then the CFO can freely read and write the database, all members of the auditors group can read it, and everyone else cannot access it at all.
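A short sketch of these permission classes, including the negative-group-permissions case the vulnerability concerns (the group granted less than other); the file name is illustrative, and actually assigning the “auditors” group would require appropriate privileges:

```python
import os
import stat

# CFO/auditors example: owner rw-, group r--, other --- (octal 640).
open("finance.db", "wb").close()
os.chmod("finance.db", stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP)
print(oct(stat.S_IMODE(os.stat("finance.db").st_mode)))  # 0o640

# Negative group permissions: owner rw-, group ---, other r-- (octal 604).
# Members of the file's group are *denied* what everyone else is allowed.
os.chmod("finance.db", 0o604)
print(oct(stat.S_IMODE(os.stat("finance.db").st_mode)))  # 0o604
```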

Continue reading Vulnerability in Linux containers – investigation and mitigation