The Acropalypse vulnerability in Windows Snip and Sketch, lessons for developer-centered security

Acropalypse is a vulnerability first identified in the Google Pixel phone screenshot tool, where after cropping an image, the original would be recoverable. Since the part of the image cropped out might contain sensitive information, this was a serious security issue. The problem occurred because the Android API changed behaviour from truncating files by default to leaving existing content in place. Consequently, the beginning of the resulting image file contains the cropped content, but the end of the original file is still present. Image viewers ignore this data and open the file as usual, but with some clever analysis of the compression algorithm used, the original image can (partially) be recovered.

Shortly after the vulnerability was announced, someone noticed that the Windows default screenshot tool, Snip and Sketch, appeared to have the same problem, despite being an entirely unrelated application on a different operating system. I also found a similar problem back in 2004 relating to JPEG thumbnail images. When the same vulnerability keeps re-occurring, it suggests a systemic problem in how we build software, so I set out to understand more about the reasons for the vulnerability existing in Windows Snip and Sketch.

A flawed API

The first problem I found is that the modern Windows API for saving files has a flaw very similar to the Android one: existing files are not truncated by default. Arguably the Windows situation is worse because, unlike Android, there is no option to truncate files. The Windows documentation is, at best, unclear on the need to truncate files and on what code is needed to achieve the desired result.

This wasn’t always the case. The old Win32 API for saving a file was (roughly) to show a file picker, get the filename the user selected, and then open the file. To open a file, the programmer must specify whether to overwrite the file or not, and example code usually does overwrite the file. However, the new “more secure” Universal Windows Platform (UWP) sandboxes the file picker in a separate process, allowing neat features like capability-based access control. It creates the file if needed and returns a handle which, if the selected file exists, will not overwrite the existing content.
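
To see what that older, explicit choice looked like, here is a minimal sketch in C of the classic Win32 save path (illustrative only, not Snip and Sketch's actual code): the dwCreationDisposition argument forces the programmer to decide what happens to an existing file, and CREATE_ALWAYS truncates it, whereas OPEN_ALWAYS would leave the old bytes in place, exactly the Acropalypse failure mode.

    /* Illustrative Win32 save routine: CREATE_ALWAYS truncates an existing
     * file, so no stale bytes remain. OPEN_ALWAYS or OPEN_EXISTING would
     * keep the old contents, and a shorter write would then leave the tail
     * of the previous image behind. */
    #include <windows.h>

    BOOL save_image(const wchar_t *path, const void *data, DWORD len)
    {
        HANDLE h = CreateFileW(path, GENERIC_WRITE, 0, NULL,
                               CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h == INVALID_HANDLE_VALUE)
            return FALSE;

        DWORD written = 0;
        BOOL ok = WriteFile(h, data, len, &written, NULL) && written == len;
        CloseHandle(h);
        return ok;
    }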

From the documentation, however, a programmer would understandably assume that the file returned by the UWP picker would be empty:

“The file name, extension, and location of this storageFile match those specified by the user, but the file has no content.”

Continue reading The Acropalypse vulnerability in Windows Snip and Sketch, lessons for developer-centered security

Vulnerability in Linux containers – investigation and mitigation

Operating system access controls, which constrain which programs can open which files, have existed for almost as long as computers themselves. Access controls are still widely used and are more flexible and efficient than cryptographically protecting files. Despite this long history, there continues to be innovation in access control, particularly now in containers, such as Docker and Kubernetes and the similar technologies offered by cloud providers. Here, rather than running lots of software on a single computer, the service is split up into microservices running in containers. Each container is isolated from the others on the same computer, as if it had its own computer and operating system, and is prevented from reading files in other containers.

However, in reality there is only one operating system, and the container runtime's role is to create the illusion that there is more than one. As part of its job, the runtime should also set up containers such that access control works inside each container, because not every program running inside a container should be able to access every file. Multiple containers can also be given access to the same directory, with access controls used to restrict what each container can do with the directory contents. If access controls don't work properly, an attacker could read or modify files they should not be able to.

Unfortunately, there is such a vulnerability. The bad news is that it originates from an omission in the specification that underlies all the major container runtimes and so is present regardless of which container runtime you use (e.g. runc, crun, Kata Containers) and regardless of whether you use containers directly (e.g. through Docker or podman) or indirectly (e.g. through Kubernetes). The good news is that the vulnerability affects a feature of Linux access control permissions that is not widely used – negative group permissions. However, if your system does depend on this feature then the vulnerability could be serious. Read on for more details about the vulnerability, why it exists and what can be done to mitigate the problem.

Introduction to Linux permissions

In Linux there are user accounts, and each user is also a member of one or more groups. Each object (files, directories, devices, etc.) has an associated owner and an associated group. The object also has a set of permissions for each of three classes: owner, group, and other. These permissions tell the operating system whether a user should be able to read from the object (r), write to the object (w) and execute the object (x). If a user is the owner of an object, the owner-class permissions are used; if the user is a member of the object's group, the group-class permissions are used; otherwise, the other-class permissions are used.

For example, a file containing a company's finance database could be owned by the Chief Financial Officer (CFO) and have owner-class permissions "r+w". It could have its group set to "auditors" with group-class permissions of only "r" and other-class permissions set to nothing. Then the CFO could freely read and write to the database, all members of the auditors group could read it, and everyone else could not access the database at all.
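
As a concrete sketch of how this could be set up (assuming the CFO's user ID and the auditors group ID have already been looked up, for example with getpwnam and getgrnam), the standard chown and chmod calls express exactly these permissions; the comments also show what a rarely used "negative" group permission looks like:

    /* Illustrative only: set up the finance database permissions described
     * above. cfo_uid and auditors_gid are assumed to have been looked up. */
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int set_finance_permissions(uid_t cfo_uid, gid_t auditors_gid)
    {
        /* Owner (CFO): read+write; group (auditors): read; others: nothing. */
        if (chown("finance.db", cfo_uid, auditors_gid) != 0)
            return -1;
        if (chmod("finance.db", 0640) != 0)
            return -1;

        /* A "negative" group permission inverts the usual pattern: mode 0604
         * gives the group fewer rights than everyone else, so members of the
         * group are specifically denied while other users can still read the
         * file. This rarely used feature is the one the container
         * vulnerability affects. */
        return 0;
    }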

Continue reading Vulnerability in Linux containers – investigation and mitigation

Apple letting the content-scanning genie out of the bottle

When Apple announced that they would be scanning iPhones for child sexual abuse material (CSAM), the push-back appears to have taken them by surprise. Since then, Apple has been engaging with experts and developing their proposals to mitigate risks that have been raised. In this post, I’ll discuss some of the issues with Apple’s CSAM detection system and what I’ve learned from their documentation and events I’ve participated in.

Technically Apple’s CSAM detection proposal is impressive, and I’m pleased to see Apple listening to the community to address issues raised. However, the system still creates risks that will be difficult to avoid. Governments are likely to ask to expand the system to types of content other than CSAM, regardless of what Apple would like to happen. When they do, there will be complex issues to deal with, both for Apple and the broader technology community. The proposals also risk causing people to self-censor, even when they are doing nothing wrong.

How Apple’s CSAM detection works

The iPhone or iPad scans images for known CSAM just before they are uploaded to Apple's cloud data storage system – iCloud. Images that are not going to be uploaded don't get scanned. The comparison between images and the database is made in such a way that minor changes to CSAM, like resizing and cropping, will trigger a match, but any image that wasn't derived from a known item of CSAM should be very unlikely to match. The results of this matching process go into a clever cryptographic system designed to ensure that the user's device doesn't learn the contents of the CSAM database or which of their images (if any) match. If more than a threshold of about 30 images match, Apple will be able to verify whether the matching images are CSAM and, if so, report to the authorities. If the number of matching images is less than the threshold, Apple learns nothing.
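
As a rough illustration of the flow just described, and only of the flow, the sketch below counts matches against a hash database and applies the roughly-30-image threshold. The function names are hypothetical stand-ins, and crucially the real system performs the matching and the threshold check inside a cryptographic protocol, so neither the device nor Apple sees the intermediate results that appear openly here.

    /* Toy sketch of the matching-and-threshold flow only. The hash function
     * and database lookup are stand-ins, not Apple's API; in the real system
     * this logic runs inside a cryptographic protocol so that nothing is
     * revealed below the threshold. */
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define MATCH_THRESHOLD 30  /* approximate threshold described by Apple */

    static uint64_t perceptual_hash(const uint8_t *image, size_t len)
    {
        (void)image; (void)len;
        return 0;               /* stub: stands in for the perceptual hash */
    }

    static bool database_contains(uint64_t hash)
    {
        (void)hash;
        return false;           /* stub: the real lookup is done blindly */
    }

    /* Count how many of the images queued for upload match the database. */
    static size_t count_matches(const uint8_t *const *images,
                                const size_t *lens, size_t n)
    {
        size_t matches = 0;
        for (size_t i = 0; i < n; i++)
            if (database_contains(perceptual_hash(images[i], lens[i])))
                matches++;
        return matches;
    }

    /* Only above the threshold can any matching images be inspected at all. */
    static bool over_threshold(size_t matches)
    {
        return matches >= MATCH_THRESHOLD;
    }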

Risk of scope creep

Now that Apple has built their system, a risk is that it could be extended to search for content other than CSAM by expanding the database used for matching. While some security properties of their system are ensured through cryptography, the restriction to CSAM is only a result of Apple’s policy on the content of the matching database. Apple has clearly stated that it would resist any expansion of this policy, but governments may force Apple to make changes. For example, in the UK, this could be through a Technical Capability Notice (under the Investigatory Powers Act) or powers proposed in the Online Safety Bill.

If a government legally compelled them to expand the matching database, Apple may have to choose between complying or leaving the market. So far, Apple has refused to say which of these choices they would take.

Continue reading Apple letting the content-scanning genie out of the bottle

How Accidental Data Breaches can be Facilitated by Windows 10 and macOS Mojave

Inadequate user interface designs in Windows 10 and macOS Mojave can cause accidental data breaches through inconsistent language, insecure default options, and unclear or incomprehensible information. Users could accidentally leak sensitive personal data. Data controllers in companies might be unknowingly non-compliant with the GDPR’s legal obligations for data erasure.

At the upcoming Annual Privacy Forum 2019 in Rome, I will be presenting the results of a recent study conducted with my colleague Mark Warner, exploring the inadequate design of user interfaces (UI) as a contributing factor in accidental data breaches from USB memory sticks. The paper titled “Fight to be Forgotten: Exploring the Efficacy of Data Erasure in Popular Operating Systems” will be published in the conference proceedings at a later date but the accepted version is available now.

Privacy and security risks from decommissioned memory chips

The process of decommissioning memory chips (e.g. USB sticks, hard drives, and memory cards) can create risks for data protection. Researchers have repeatedly found sensitive data on devices they acquired from second-hand markets. Sometimes this data was from the previous owners, at other times from third parties. In some cases, highly sensitive data from vulnerable people was found; for example, Jones et al. found videos of children at a high school in the UK on a second-hand USB stick.

Data found this way had frequently been deleted but not erased, creating the risk that any tech-savvy future owner could access it using legally available, free-to-download software (e.g., FTK Imager Lite 3.4.3.3). Findings from these studies also indicate that the previous owners intended to erase these files and prevent future access by unauthorised individuals, but failed to do so sufficiently. Moreover, these risks likely extend from the second-hand market to recycled memory chips – a practice encouraged under Directive 2012/19/EU on ‘waste electrical and electronic equipment’.
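
To make the distinction concrete, the sketch below shows, purely as an illustration in C, why deleting is not erasing: removing a file only drops its directory entry, so an erase has to overwrite the contents first. Even this is not a firm guarantee on flash media, where wear levelling may leave old copies of the data behind.

    /* Illustrative only: "erase" a file by overwriting its contents with
     * zeros before removing it. remove() on its own only drops the directory
     * entry; the old blocks stay on the medium until they happen to be
     * reused. On flash media, wear levelling means even an overwrite is not
     * a firm guarantee. */
    #include <stdio.h>

    int erase_then_delete(const char *path)
    {
        FILE *f = fopen(path, "r+b");   /* open for update, do not truncate */
        if (!f)
            return -1;

        if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return -1; }
        long size = ftell(f);
        rewind(f);

        char zeros[4096] = {0};
        for (long remaining = size; remaining > 0; ) {
            size_t chunk = remaining > (long)sizeof zeros
                               ? sizeof zeros : (size_t)remaining;
            if (fwrite(zeros, 1, chunk, f) != chunk) { fclose(f); return -1; }
            remaining -= (long)chunk;
        }
        if (fclose(f) != 0)
            return -1;

        return remove(path);            /* only now drop the directory entry */
    }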

The implications for data security and data protection are substantial. End-users and companies alike could accidentally cause breaches of sensitive personal data of themselves or their customers. The protection of personal data is enshrined in Article 8 of the Charter of Fundamental Rights of the European Union, and the General Data Protection Regulation (GDPR) lays down rules and regulations for the protection of this fundamental right. For example, data processors could find themselves inadvertently in violation of Article 17 GDPR, the Right to Erasure (‘right to be forgotten’), despite their best intentions, if they fail to erase a customer’s personal data – independent of whether that data was breached or not.

Seemingly minor design choices, the potential for major implications

The indication that people might fail to properly erase files from storage, despite their apparent intention to do so, is a strong sign of system failure. We have known for more than twenty years that unintentional failure of users at a task is often caused by the way in which [these] mechanisms are implemented, and by users’ lack of knowledge. In our case, these mechanisms are – for most users – the UI of Windows and macOS. When investigating these mechanisms, we found seemingly minor design choices that might facilitate unintentional data breaches. A few examples are shown below and are expanded upon in the full publication of our work.

Continue reading How Accidental Data Breaches can be Facilitated by Windows 10 and macOS Mojave

JavaCard: The execution environment you didn’t know you were using

This is the story of the most popular execution environment, its shortcomings, and how open source and hacking saved the day.

According to recent revelations, the MINIX operating system is embedded in the Management Engine of all Intel CPUs released after 2015. A side-effect of this is that MINIX became known as the most widespread operating system in the world almost overnight. However, in the last decade another tiny OS has silently pushed itself into even more devices around the world while remaining unknown to most: JavaCard.

Your SIM Card, Credit Cards, Loyalty Cards are all most likely JavaCards.

With more than six billion JavaCards sold last year, and approximately 20 billion estimated to have been purchased in total, JavaCard is the winner no one knows about. The execution environment was designed in 1996 for devices with limited memory and processing capabilities and was the first smartcard platform to give developers the ability to execute the same applet on cards produced by different manufacturers. This was a breakthrough for the industry that established JavaCard as the default platform for applications in need of a secure, tamper-proof element.

*While JavaCard is technically not an OS in the standard sense (smart cards do have their own proprietary OSes), in practice it provides very similar functionality to modern embedded OSes. For instance, unless granted by the vendor, app developers are strictly limited within the JavaCard runtime environment; this distinguishes JavaCard from, e.g., the classic Java VM, where developers can also execute native applications in addition to those executed in the JVM.

A malfunctioning operating system

An operating system is much more than the sum of its source code: it’s also the ecosystem built around it, including the specifications, support, and most importantly the application developers and the user community.

For JavaCard, the specification part is handled formally by Oracle and the Java Card Forum, who make periodic releases of the platform’s virtual machine (JCVM) specification, runtime library (JCRL) and application programming interface (JC API). All of these contribute towards homogeneity between cards from different manufacturers, aiming to ensure applet interoperability and a minimum level of support for basic cryptographic algorithms – at least in theory. In practice, our research has shown that no product on the market implements the JC API completely, and different cards support different sets of features. This severely hinders the interoperability of applications and constrains developers to the limited subset of JavaCard features supported by most manufacturers. Developers who choose to use all the features provided by a specific card will, with high likelihood, sacrifice the interoperability of their applets. Furthermore, some of those specifications are also inadvertently limiting the scope of the platform. For instance, the API specification that lists all the calls potentially available to smart card applications ended up acting as an evolutionary bottleneck. This is due to:

  1. Approximately three-year-long API revision cycles that severely delay support for newer cryptographic algorithms (the last API revision was released in 2015).
  2. More complex cryptographic operations (especially asymmetric cryptography) requiring design and production of dedicated hardware accelerators to actually support newly added cryptographic algorithms.
  3. The business model is geared towards large corporations. Hence, newer cards are available only to those buying in large quantities, while smaller development houses and researchers are forced to work with five-year-old, or even older, cards.

Continue reading JavaCard: The execution environment you didn’t know you were using

MAMADROID: Detecting Android Malware by Building Markov Chains of Behavioral Models

Now making up 85% of mobile devices, Android smartphones have become profitable targets for cybercriminals, allowing them to bypass two-factor authentication or steal sensitive information such as credit card details or login credentials.

Smartphones have limited battery and memory available; therefore, the defences that can be deployed on them have limitations. For these reasons, malware detection is usually performed in a centralised fashion by Android market operators. As previous work and even recent news have shown, however, even the Google Play Store is not able to detect all malicious apps; to make things even worse, there are countries in which the Google Play Store is blocked. This forces users to resort to third-party markets, which usually perform less careful malware checks.

Previous malware detection studies focused on models based on permissions or on specific API calls. While the former is prone to false positives, the latter needs constant retraining, because apps as well as the Android framework itself are constantly changing.

Our intuition is that, while malicious and benign apps may make the same API calls during their execution, the reasons why those calls are made may differ, resulting in the calls occurring in a different order. For this reason, we decided to rely on sequences of calls that, as explained later, we abstract to a higher level for performance, feasibility, and robustness reasons. To implement this idea we created MaMaDroid, a system for Android malware detection.

MaMaDroid

MaMaDroid combines four different phases:

  • Call graph extraction: starting from the apk file of an app, we extract the call graph of the analysed sample.
  • Sequence extraction: from the call graph, we extract the different potential paths as sequences of API calls and abstract all those calls to higher levels.
  • Markov Chain modelling: each sample’s sequence of abstracted calls is modelled as transitions among the states of a Markov chain (a small sketch of this step is shown below).
  • Classification: using the transition probabilities between states of the chain as the feature set, we apply machine learning to detect malicious apps.
[Figure: Four phases of MaMaDroid]
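
As a small sketch of the Markov chain modelling step (with placeholder state names and sizes, not MaMaDroid's actual abstraction), the transition counts from one app's sequence of abstracted calls can be normalised into the probabilities that form the feature vector:

    /* Sketch of the Markov-chain step: given one app's sequence of abstracted
     * API calls (each already mapped to a state index), count the transitions
     * and normalise each row into probabilities. The flattened matrix is the
     * feature vector handed to the classifier. NUM_STATES is a placeholder. */
    #include <stddef.h>

    #define NUM_STATES 11   /* e.g. number of abstracted package families */

    void transition_probabilities(const int *seq, size_t len,
                                  double probs[NUM_STATES][NUM_STATES])
    {
        double counts[NUM_STATES][NUM_STATES] = {{0}};

        /* Count how often state i is immediately followed by state j. */
        for (size_t k = 0; k + 1 < len; k++)
            counts[seq[k]][seq[k + 1]] += 1.0;

        /* Normalise each row so outgoing probabilities from a state sum to 1. */
        for (int i = 0; i < NUM_STATES; i++) {
            double row_total = 0.0;
            for (int j = 0; j < NUM_STATES; j++)
                row_total += counts[i][j];
            for (int j = 0; j < NUM_STATES; j++)
                probs[i][j] = row_total > 0.0 ? counts[i][j] / row_total : 0.0;
        }
    }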

Call graph extraction

MaMaDroid is a system based only on static analysis. To analyse the app, we use off-the-shelf tools, such as Soot and FlowDroid, for the first step of the system.

Sequence Extraction

Taking the call graph as input, we extract the sequences of functions potentially called by the program and, by identifying the set of entry nodes, enumerate all the possible paths and output them as sequences of API calls.

[Figure: Example call graph in which we can observe 3 different potential paths, or sequences, starting from the root node]

Continue reading MAMADROID: Detecting Android Malware by Building Markov Chains of Behavioral Models

Double-Fetch Situations Turn into Double-Fetch Vulnerabilities: A Study of Double Fetches in the Linux Kernel

We held our annual ACE-CSR event on 2 November 2016. This post and its title are drawn from the second talk, by Dr Jens Krinke, a senior lecturer in the Centre for Research on Evolution, Search and Testing and the Software Systems Engineering group at UCL, who outlined work he conducted with Pengfei Wang that found flaws in the Linux kernel.

The work Krinke presented, completed over the last year, revolves around double-fetch problems, which got their name in 2008 in a study that found exploitable vulnerabilities in the Windows kernel. In an updated study in 2013, Mateusz “j00ru” Jurczyk and Gynvael Coldwind examined the Windows kernel and found 89 potential issues, 36 of which were highly exploitable. Most of their paper focused on how to exploit these bugs; today, if you find such a vulnerability, you can download an exploit from GitHub.

Many of these vulnerabilities have now been fixed, but Jurczyk and Coldwind’s analysis was extremely slow, because they essentially simulated Windows over and over again, with the result that it took them 15 hours just to get to the boot screen. Because the analysis was dynamic, they were only able to cover a subset of the drivers present in the system, and they said outright that they knew there were other vulnerabilities present. Krinke was sure there were vulnerabilities of this type in Linux, but no one had audited the Linux kernel in detail.

Krinke and Wang wanted to audit Linux, looking at the source code directly and including all the drivers – which was essential, because, as it turned out, roughly 80% of the problems they found were in the drivers.

Double-fetch problems are memory access conflicts that arise in the interface between the operating system kernel and an application when data is changed between the time it’s checked and the time it’s actually used. It works this way: every process a user starts has its own virtual memory space that is isolated from other user memory spaces; only the kernel can see all memory spaces. However, there must be interaction, and this is done through system calls. So, a user selects some data and asks the system to operate on it. The data remains in the user space but the system copies it to the kernel and checks the data is valid before using it. The upshot is that the kernel may fetch the data twice: once for checking and once for use. A malicious application that changes the data between those two fetches – after it has been checked but before it is used – can cause system crashes.
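
The pattern is easiest to see in code. The sketch below is written in the style of kernel code but is purely illustrative: copy_from_user is stubbed so the example is self-contained, and the message structure is hypothetical. The bug is that the length is checked from the first copy but used from the second.

    /* Illustrative check-then-use double fetch. In the kernel,
     * copy_from_user copies memory from the calling process and returns the
     * number of bytes it failed to copy; here it is a stub. */
    #include <stddef.h>
    #include <string.h>

    static unsigned long copy_from_user(void *to, const void *from,
                                        unsigned long n)
    {
        memcpy(to, from, n);   /* stub standing in for the kernel helper */
        return 0;
    }

    struct message {           /* hypothetical layout shared with user space */
        unsigned long len;     /* number of payload bytes that follow */
        char payload[];
    };

    #define MAX_LEN 4096
    static char kernel_buf[MAX_LEN];

    int handle_message(const struct message *user_msg)
    {
        unsigned long len;

        /* First fetch: read the length field and CHECK it. */
        if (copy_from_user(&len, &user_msg->len, sizeof(len)))
            return -1;
        if (len > MAX_LEN)
            return -1;

        /* Second fetch: read the header again and USE the length from this
         * second copy. Another thread in a malicious process can change
         * user_msg->len between the two copies, so the value used below was
         * never checked: the double-fetch bug. The fix is to keep using the
         * 'len' saved from the first fetch. */
        struct message hdr;
        if (copy_from_user(&hdr, user_msg, sizeof(hdr)))
            return -1;
        if (copy_from_user(kernel_buf, user_msg->payload, hdr.len))
            return -1;
        return 0;
    }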

If an error like this arises outside of the operating system, it causes software errors that affect only that application; when it happens in the operating system, an attacker can take over the complete system.

Analysing this process is extremely hard, especially because exactly what’s occurring is unknown beforehand. Krinke and Wang instead created a two-step analysis. First, a pattern-based double-fetch analysis based on Coccinelle examined the source code to find occurrences; these were then manually analysed to determine how the data was used and what the consequences were. The results were divided into three categories: size checking, type selection, and shallow copy. In the 40,000 source code files comprising Linux, they found 90 such situations, of which 57 were in drivers and five were true bugs. All five have been confirmed, and three are exploitable vulnerabilities. In checking back, Krinke and Wang found that some of these bugs are at least ten years old.

Krinke and Wang also applied their analysis to Android (a variant of Linux) and FreeBSD. The latter, which is more security-oriented in design, had no problems. Android had only three bugs, but they also found a problem that had been fixed in Linux (but not in Android). After studying these three operating systems with Coccinelle, they can say that, once these bugs have been fixed, all three are likely to be free of double-fetch vulnerabilities. As that bug was in the file system, Android apps with native binary code that interacts with the phone at the file-system level could also exploit it.