A Reflection on the Waves Of Malice: Malicious File Distribution on the Web (part 2)

The first part of this article introduced the malicious file download dataset and the delivery network structure. This final part explores the types of files delivered, discusses how the network varies over time, and concludes with challenges for the research community.

The Great Divide: A PUP Ecosystem and a Malware Ecosystem

We found a notable divide in the delivery of PUP and malware. First, there is much more PUP than malware in the wild: we found PUP-to-malware ratios of 5:1 by number of SHA-2s, and 17:2 by number of raw downloads. Second, we found that mixed delivery mechanisms of PUP and malware are not uncommon (e.g., see our Opencandy case study in the paper). Third, the highly connected Giant Component is predominantly a PUP Ecosystem (8:1 PUP-to-malware by number of SHA-2s), while the many “islands” of download activity outside of this component are predominantly a Malware Ecosystem (1.78:1 malware-to-PUP by number of SHA-2s).

Comparing the structures of the two ecosystems, we found that the PUP Ecosystem uses more IP addresses and autonomous systems (ASes) per domain and per dropper than the Malware Ecosystem, possibly indicating higher CDN usage or the use of evasive fast-flux techniques to change IP addresses (though, given earlier results, the former is more likely). On the other hand, the Malware Ecosystem delivered fewer SHA-2s per domain than the PUP Ecosystem, while the overall numbers of raw downloads remained the same, which could again indicate a disparity in CDN usage between the two ecosystems (CDNs typically deliver a wide range of content). At the same time, fewer suspicious SHA-2s per domain could also be attributable to evasive techniques (e.g., malicious sites delivering a few types of files before changing domain) or to distributors in this ecosystem dealing with fewer clients and running smaller operations.

We tried to estimate the number of PPIs in the wild by defining a PPI service as a network-only component (or group of components aggregated by e2LD) that delivered more than one type of malware or PUP family. Using this heuristic, we estimated a lower bound of 394 PPIs operating on that day, 215 of which were in the PUP Ecosystem. In terms of proportions, we found that the largest individual PPIs in the PUP and Malware Ecosystems involved about 99% and 24%, respectively, of all e2LDs and IPs in their ecosystems.
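As a rough illustration of this heuristic, the sketch below (in Python, with hypothetical field and family names) counts components that delivered more than one distinct family from labelled download events; the real analysis additionally aggregates network-only components by e2LD and runs over the full dataset.

```python
from collections import defaultdict

# Hypothetical input: one record per download event observed in the 24-hour
# window, labelled with the network component that served it and the
# malware/PUP family of the delivered SHA-2.
events = [
    {"component": "c1", "family": "family_a", "ecosystem": "pup"},
    {"component": "c1", "family": "family_b", "ecosystem": "pup"},
    {"component": "c2", "family": "family_c", "ecosystem": "malware"},
]

families_per_component = defaultdict(set)
ecosystem_of_component = {}
for e in events:
    families_per_component[e["component"]].add(e["family"])
    ecosystem_of_component[e["component"]] = e["ecosystem"]

# Heuristic lower bound: a component counts as a PPI service if it delivered
# more than one distinct family on the day.
ppi_components = {c for c, fams in families_per_component.items() if len(fams) > 1}
pup_ppis = {c for c in ppi_components if ecosystem_of_component[c] == "pup"}

print(f"estimated PPIs: {len(ppi_components)} (of which in the PUP Ecosystem: {len(pup_ppis)})")
```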

Given the number of possible explanations for these structural differences between the ecosystems, and the high degree of potential PPI usage in the wild (especially within the PUP Ecosystem), this is clearly an area in which further research is required.

Keeping Track of the Waves

The final part of the study involved tracking these infrastructures and their activities over time. First, we generated tracking signatures of the network-only (server-side) and file-only (client-side) delivery infrastructures. In essence, this involved tracking the root and trunk nodes in a component, which typically had the highest node degrees and were thus more likely to be stable, as opposed to the leaf nodes, which were more likely to be ephemeral.
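A heavily simplified sketch of that idea, assuming each component is held as a networkx graph and using an arbitrary degree threshold (the signature generation in the paper is more involved than this):

```python
import networkx as nx

def tracking_signature(component: nx.Graph, min_degree: int = 3) -> frozenset:
    """Illustrative signature: keep the stable, high-degree root/trunk nodes
    of a delivery component and drop the ephemeral, low-degree leaves.
    min_degree is an arbitrary threshold chosen for this sketch."""
    return frozenset(n for n, d in component.degree() if d >= min_degree)

# Toy component: one dropper SHA-2 fanning out to several URLs and an IP.
g = nx.Graph()
g.add_edges_from(("dropper_sha2", leaf) for leaf in ("url1", "url2", "url3", "ip1"))
print(tracking_signature(g))  # frozenset({'dropper_sha2'})
```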

Continue reading A Reflection on the Waves Of Malice: Malicious File Distribution on the Web (part 2)

A Reflection on the Waves Of Malice: Malicious File Distribution on the Web (part 1)

The French cybercrime unit, C3N, along with the FBI and Avast, recently took down the Retadup botnet that infected more than 850,000 computers, mostly in South America. Though this takedown operation was successful, the botnet was created as early as 2016, with the operators reportedly making millions of euros since. It is clear that large-scale analysis, monitoring, and detection of malicious downloads and botnet activity, even as far back as 2016, is still highly relevant today in the ongoing battle against increasingly sophisticated cybercriminals.

Malware delivery has undergone an impressive evolution since its inception in the 1980s, moving from being an amateur endeavor to a well-oiled criminal business. Delivery methods have evolved from the human-centric transfer of physical media (e.g., floppy disks), sending of malicious emails, and social engineering, to the automated delivery mechanisms of drive-by downloads (malicious code execution on websites and web advertisements), packaged exploit kits (software packages that fingerprint user browsers for specific exploits to maximise the coverage of potential victims), and pay-per-install (PPI) schemes (botnets that are rented out to other cybercriminals).

Furthermore, in recent times, researchers have uncovered the parallel economy of potentially unwanted programs (PUP), which shares many traits with the malware ecosystem (such as delivery through social engineering and PPI networks), while being primarily controlled by different actors. However, despite including types such as adware and spyware, PUP has generally been regarded as an annoyance rather than a direct threat to security.

Using download metadata from millions of users worldwide in 2015/16, we (Colin C. Ife, Yun Shen, Steven J. Murdoch, Gianluca Stringhini) carried out a comprehensive measurement study over the short term (a 24-hour period), the medium term (daily, over the course of a month), and the long term (weekly, over the course of a year) to characterise the structure of this complex malicious file delivery ecosystem on the Web and how it evolves over time. This work answers some key questions while posing new ones, and it highlights significant issues that continue to hinder security research on unwanted software activity.

An Overview

There were three main research questions that influenced this study, which we will traverse in the following sections of this post:

    1. What does the malicious file delivery ecosystem look like?
    2. How do the networks that deliver only malware, only PUP, or both compare in structure?
    3. How do these file delivery infrastructures and their activities change over time?

For full technical details, you can refer to our paper – Waves of Malice: A Longitudinal Measurement of the Malicious File Delivery Ecosystem on the Web – published in the proceedings of, and presented at, the ACM AsiaCCS 2019 conference.

The Data

The dataset was provided (and pre-sanitized) by Symantec and consisted of 129 million download events generated by 12 million users. Each download event contained information such as the timestamp, the SHA-2s of the downloaded file and its parent file, the filename, the file size (in bytes), the referrer URL, the host URLs (landing pages after redirection) of the download and its parent file, and the IP address hosting the download.
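For concreteness, a download event can be pictured roughly as the record below; the field names are illustrative, not the ones used in the Symantec telemetry.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DownloadEvent:
    """Rough shape of one download event; field names are illustrative,
    not the ones used in the Symantec telemetry."""
    timestamp: int               # time of the download (epoch seconds)
    file_sha2: str               # SHA-2 of the downloaded file
    parent_sha2: Optional[str]   # SHA-2 of the file that initiated the download
    filename: str
    size_bytes: int
    referrer_url: Optional[str]
    file_host_url: str           # landing page of the download, after redirection
    parent_host_url: Optional[str]
    server_ip: str               # IP address hosting the download
```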

Continue reading A Reflection on the Waves Of Malice: Malicious File Distribution on the Web (part 1)

New threat models in the face of British intelligence and the Five Eyes’ new end-to-end encryption interception strategy

As more and more services and messaging applications implement end-to-end encryption, law enforcement organisations and intelligence agencies have become increasingly concerned about the prospect of “going dark”. This is when law enforcement has the legal right to access a communication (i.e. through a warrant) but doesn’t have the technical capability to do so, because the communication may be end-to-end encrypted.

Earlier proposals from politicians have taken the approach of outright banning end-to-end encryption, which was met with fierce criticism by experts and the tech industry. The intelligence community has been slightly more nuanced, promoting protocols that allow for key escrow, where messages would also be encrypted under an additional key (e.g. one controlled by the government). Such protocols have been promoted by intelligence agencies as recently as 2016 and as early as the 1990s, but were also met with fierce criticism.
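The textbook construction behind key escrow is simple: the message is encrypted once under a content key, and that key is then wrapped for the recipient and, additionally, for the escrow holder. Below is a minimal Python sketch using the cryptography library with hypothetical keys; it illustrates the general idea, not any specific government proposal.

```python
from os import urandom
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def wrap_key(content_key: bytes, public_key) -> bytes:
    """Wrap (encrypt) the symmetric content key under an RSA public key."""
    return public_key.encrypt(
        content_key,
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None),
    )

# Hypothetical keys for the recipient and for the escrow holder.
recipient = rsa.generate_private_key(public_exponent=65537, key_size=2048)
escrow = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# Encrypt the message once under a fresh content key...
content_key, nonce = urandom(32), urandom(12)
ciphertext = AESGCM(content_key).encrypt(nonce, b"hello", None)

# ...then wrap that key for the recipient *and* under the escrow key, so that
# either party can later decrypt. This extra decryption path is precisely
# what critics of key escrow object to.
wrapped_for_recipient = wrap_key(content_key, recipient.public_key())
wrapped_for_escrow = wrap_key(content_key, escrow.public_key())
```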

More recently, there has been a new set of legislation in the UK, statements from the Five Eyes and proposals from intelligence officials that propose a “different” way of defeating end-to-end encryption, one that is akin to key escrow but is enabled on a “per-warrant” basis rather than by default. Let’s look at how this may affect threat models in applications that use end-to-end encryption in the future.

Legislation

On the 31st of August 2018, the governments of the United States, the United Kingdom, Canada, Australia and New Zealand (collectively known as the “Five Eyes”) released a “Statement of Principles on Access to Evidence and Encryption”, where they outlined their position on encryption.

In the statement, it says:

Privacy laws must prevent arbitrary or unlawful interference, but privacy is not absolute. It is an established principle that appropriate government authorities should be able to seek access to otherwise private information when a court or independent authority has authorized such access based on established legal standards.

The statement goes on to set out that technology companies have a mutual responsibility with government authorities to enable this process. At the end of the statement, it describes how technology companies should provide government authorities access to private information:

The Governments of the Five Eyes encourage information and communications technology service providers to voluntarily establish lawful access solutions to their products and services that they create or operate in our countries. Governments should not favor a particular technology; instead, providers may create customized solutions, tailored to their individual system architectures that are capable of meeting lawful access requirements. Such solutions can be a constructive approach to current challenges.

Should governments continue to encounter impediments to lawful access to information necessary to aid the protection of the citizens of our countries, we may pursue technological, enforcement, legislative or other measures to achieve lawful access solutions.

Their position effectively boils down to this: technology companies must provide a technical means of fulfilling court warrants that require them to hand over the private data of certain individuals, but the implementation for doing so is left open to the technology company.

Continue reading New threat models in the face of British intelligence and the Five Eyes’ new end-to-end encryption interception strategy

UCL runs a digital security training event aimed at domestic abuse support services

In late November, UCL’s “Gender and IoT” (G-IoT) research team ran a “CryptoParty” (digital security training event) followed by a panel discussion which brought together frontline workers, support organisations, and policy and tech representatives to discuss the risks emerging technologies pose in the context of domestic violence and abuse. The event coincided with the International Day for the Elimination of Violence against Women, which takes place annually on the 25th of November.

Technologies such as smartphones or platforms such as social media websites and apps are increasingly used as tools for harassment and stalking. Adding to the existing challenges and complexities are evolving “smart”, Internet-connected devices that are progressively populating public and private spaces. These systems, due to their functionalities, create further opportunities to monitor, control, and coerce individuals. The G-IoT project is studying the implications of IoT-facilitated “tech abuse” for victims and survivors of domestic violence and abuse.

CryptoParty

The evening represented an opportunity for frontline workers and support organisations to upskill in digital security. Attendees had the chance to learn about various topics including phone, communication, Internet browser and data security. They were trained by a group of so-called “crypto angels”, meaning volunteers who provide technical guidance and support. Many of the trainers are affiliated with the global “CryptoParty” movement and the CryptoParty London specifically, as well as Privacy International, and the National Cyber Security Centre.

G-IoT’s lead researcher, Dr Leonie Tanczer, highlighted the importance of this event in light of the socio-technical research that the team pursued so far: “Since January 2018, we worked closely with the statutory and voluntary support sector. We identified various shortcomings in the delivery of tech abuse provisions, including practice-oriented, policy, and technical limitations. We set up the CryptoParty to bring together different communities to holistically tackle tech abuse and increase the technical security awareness of the support sector.”

Continue reading UCL runs a digital security training event aimed at domestic abuse support services

When Convenience Creates Risk: Taking a Deeper Look at Security Code AutoFill on iOS 12 and macOS Mojave

A flaw in Apple’s Security Code AutoFill feature can affect a wide range of services, from online banking to instant messaging.

In June 2018, we reported a problem in the iOS 12 beta. In the previous post, we discussed the associated risks the problem creates for transaction authentication technology used in online banking and elsewhere. We described the underlying issue and noted that the risk would carry over to macOS Mojave. Since our initial reports, Apple has modified the Security Code AutoFill feature, but the problem is not yet solved.

In this blog post, we publish the results of our extended analysis and demonstrate that the changes made by Apple mitigated one symptom of the problem, but did not address the cause. Security Code AutoFill could leave Apple users in a vulnerable position after upgrading to iOS 12 and macOS Mojave, exposing them to risks beyond the scope of our initial reports.

We describe four example attacks that are intended to demonstrate the risks stemming from the flawed Security Code AutoFill, but intentionally omit the detail necessary to execute them against live systems. Note that supporting screenshots and videos in this article may identify companies whose services we’ve used to test our attacks. We do not imply that those companies’ systems would be affected any more or any less than their competitors’.

Flaws in Security Code AutoFill

The Security Code AutoFill feature extracts short security codes (e.g., a one-time password or OTP) from an incoming SMS and allows the user to autofill that code into a web form, webpage, or app when authenticating. This feature is meant to provide convenience, as the user no longer needs to memorize and re-enter a code in order to authenticate. However, this convenience could create risks for the user.
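Apple does not document exactly how the code is parsed out of the message, but the gist can be illustrated with a crude, purely hypothetical heuristic: pick the first short digit group out of the SMS, with no knowledge of which service sent it or what the code actually authorises.

```python
import re
from typing import Optional

def extract_security_code(sms_text: str) -> Optional[str]:
    """Purely illustrative heuristic: take the first 4-8 digit group in an
    incoming SMS. Apple's actual parsing logic is not public; the point is
    that the code is extracted without regard to what it authorises."""
    match = re.search(r"\b(\d{4,8})\b", sms_text)
    return match.group(1) if match else None

# The code is offered for autofill, but the surrounding context in the SMS
# (e.g. the amount and payee of a transaction) is not shown to the user.
print(extract_security_code("Your transaction code is 834201 for £500 to account 1234."))
```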

Continue reading When Convenience Creates Risk: Taking a Deeper Look at Security Code AutoFill on iOS 12 and macOS Mojave

Stronger Password, Longer Lifetime: Studying UCL’s password policy

In October 2016, UCL’s Information Services Division (ISD) implemented a new password policy to encourage users to choose stronger passwords. The policy links password lifetime (the time before the password expires) to password strength: The stronger the password, the longer the lifetime.

We (Ingolf Becker, Simon Parkin and M. Angela Sasse) decided to collaborate with the Information Services Division to study the effect of this policy change, and the results were published at USENIX Security this week. We find that users appreciate the choice and respond to the policy by choosing stronger passwords when changing passwords. Even after 16 months the mean password lifetime at UCL continues to increase, yet stronger passwords also lead to more password resets.

The new policy

In the new policy, passwords with Shannon Information Entropy of 50 bits receive a lifetime of 100 days, and passwords with 120 bits receive a lifetime of 350 days:

Password expiry by entropy

Additionally, the new policy penalises the lifetime of passwords containing words from a large dictionary.
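The two anchor points above (50 bits for 100 days, 120 bits for 350 days) come from the policy; the sketch below fills the gap with a simple linear interpolation, which is an assumption made for illustration rather than UCL’s exact formula, and it ignores the dictionary penalty.

```python
def lifetime_days(entropy_bits: float) -> int:
    """Sketch of the entropy-to-lifetime mapping. The anchor points
    (50 bits -> 100 days, 120 bits -> 350 days) come from the policy; the
    linear interpolation and clamping are assumptions for illustration, and
    the dictionary-word penalty is ignored."""
    if entropy_bits <= 50:
        return 100
    if entropy_bits >= 120:
        return 350
    return round(100 + (entropy_bits - 50) * (350 - 100) / (120 - 50))

print(lifetime_days(85))  # -> 225 days under the assumed linear mapping
```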

Users play the game

We analysed the password lifetime – what we will refer to from here on in as the ‘password strength’ – of all password change and reset events of all pseudonymised users at UCL. The following figure shows the mean password expiration of all users over time, smoothed by 31-day moving averages:

Password expiration over time for all users and new users.

A small drop in password strength was observed between November ’16 and February ’17, as users moved onto and became accustomed to the new system; the kinds of passwords they were used to choosing no longer earned them as many days as before (hence the drop). After February ’17, the mean strength increases from 145 days to 170 days in 12 months – an increase of 6.9 bits of entropy. This suggests that users have gradually adapted to the new password policy, eventually making use of the new ability to increase password lifetime by lengthening and strengthening their passwords.
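The smoothing mentioned above is a plain 31-day rolling mean over the daily means; a minimal pandas sketch, with made-up column names, looks like this:

```python
import pandas as pd

# Assumed layout: one row per pseudonymised user per password-change day, with
# the lifetime (in days) granted to the new password. Column names are made up.
events = pd.DataFrame({
    "date": pd.to_datetime(["2016-11-01", "2016-11-02", "2016-11-03"]),
    "lifetime_days": [145, 150, 148],
})

daily_mean = events.groupby("date")["lifetime_days"].mean()
smoothed = daily_mean.rolling(window=31, center=True, min_periods=1).mean()
print(smoothed)
```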

Continue reading Stronger Password, Longer Lifetime: Studying UCL’s password policy

What can infosec learn from strategic theory?

Antonio Roque, of MIT Lincoln Labs, has published some provocative papers to arXiv over the last year. These include one on cybersecurity meta-methodology and one on making predictions in cybersecurity. These papers ask some good questions. The one I want to focus on in this short space is what cybersecurity can learn from Carl von Clausewitz’s treatise On War.

This might seem a bit odd to modern computer scientists, but I think it’s a plausible question. Cybersecurity is about winning conflicts, at least sometimes. And as I and others have written, one of the interesting challenges about generating knowledge with a science of security is the fact we have active adversaries. As Roque tells us, generating knowledge in the face of adversaries is also one of the things On War is about.

One important question for me is whether Clausewitz interestingly presaged our current problems (and has since been overtaken), or if On War makes contributions to thinking about cybersecurity that are new and comparable to those from the fields of economics, mathematics, philosophy of science, etc. After a close reading of these papers, my stance is: I have more questions that need answers.

Continue reading What can infosec learn from strategic theory?

Attack papers are case studies

We should treat attack papers like case studies: when we read them, review them, use them for evidence, and learn from them. This claim is not derogatory. Case studies are useful. But like anything, to be useful case studies need to be done and used appropriately.

Let’s be clear what I mean by attack paper: any paper that reports how to attack some system; any paper that includes details of an exploit, discloses a vulnerability, or demonstrates a proof-of-concept for breaching the security of a system. The efail paper that Steven discussed recently is an example. Security conferences are full of these; the ratio of attack papers to total papers varies per conference. USENIX Security tends to contain a fair few.

Let’s be clear what I mean by case study. I mean a scientific report that details a specific occurrence of interest as observed by the author. Case studies can be active, and include interviews or other questioning. They can be solely passive observation. Case studies can follow just one case in isolation, or might follow a series of related cases in similar ways for comparison. Case studies usually do not involve a planned intervention by the observer, otherwise we start to call them experiments. But they may track changes as the result of interventions outside the observer’s control.

What might change if we think about attack papers as case studies? We can apply our scientific experience from other disciplines. I’ve argued before that security is a science. We need to adapt scientific techniques, and other sciences might learn from what we do in security. But we need to be in a dialogue there. Calling attack papers what they are opens up this dialogue in several ways.

Continue reading Attack papers are case studies

Scanning the Internet for Liveness

Internet-wide scanning (or probing) has emerged as a key measurement technique to study a diverse set of the Internet’s properties, including address space utilization, host reachability, topology, service availability, vulnerabilities, and service discrimination. But despite its widespread use and critical importance for Internet measurement, we still lack a clear understanding of IP liveness—whether a target IP address responds to a probe packet. What type of probe packets should we send if we, for example, want to maximize the responding host population? What type of responses can we expect and which factors determine such responses? What degree of consistency can we expect when probing the same host with different probe packets?

In our recent paper Scanning the Internet for Liveness, we presented a systematic analysis of liveness and how it manifests in active scanning campaigns. We developed a taxonomy of liveness, which we used to design a method for performing concurrent IPv4 scans using ICMP, five TCP-based, and two UDP-based protocols, capturing all responses to our probes. Our key findings are:

  • Responsive host populations are highly sensitive to the choice of probe. While ICMP discovers the highest number of raw IPs, our TCP and UDP measurements exclusively contribute a fifth of the total population of responsive hosts.
  • Collecting ICMP Error messages for TCP and UDP scans increases the responsive population by more than 13%, and provides new opportunities to interpret scan results.
  • At the transport layer, our concurrent measurements reveal that the majority of hosts exhibit inconsistent behaviour when probed on different ports, and that capturing negative responses significantly improves scanning completeness.
  • Our concurrent scans allow us to identify nearly 2M tarpits (IP addresses that masquerade as live hosts), which would bias measurements that do not take them into account.
  • Our study of cross-protocol liveness shows that responsiveness for some protocols is correlated, suggesting that the same seed set of responsive IP addresses can be potentially used to bootstrap multiple highly-correlated target populations to reduce scan traffic.
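To give a flavour of concurrent, multi-protocol probing, here is a heavily simplified Python sketch that sends only TCP connection attempts to a few ports and classifies the responses; the actual study also used ICMP and UDP probes and captured ICMP error messages, and the released code is the authoritative artifact.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

PORTS = [80, 443, 22, 25]  # a small, arbitrary subset of probe targets

def probe_tcp(ip: str, port: int, timeout: float = 2.0) -> str:
    """Classify the response to a single TCP connection attempt."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return "open"        # positive response: the host is live on this port
    except ConnectionRefusedError:
        return "refused"         # negative response, but still evidence of liveness
    except (socket.timeout, OSError):
        return "silent"          # no response at all

def probe_host(ip: str) -> dict:
    """Probe all ports concurrently, mirroring the idea of concurrent scans."""
    with ThreadPoolExecutor(max_workers=len(PORTS)) as pool:
        results = pool.map(lambda p: probe_tcp(ip, p), PORTS)
    return dict(zip(PORTS, results))

# Only probe hosts you are authorised to scan; 192.0.2.1 is a TEST-NET placeholder.
print(probe_host("192.0.2.1"))
```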

This work recently appeared in the April 2018 issue of ACM SIGCOMM Computer Communication Review (CCR), and was conducted in collaboration with Philipp Richter (MIT), Mobin Javed (LUMS Pakistan, ICSI Berkeley), Srikanth Sundaresan (Princeton University), Zakir Durumeric (Stanford University), Steven J. Murdoch (University College London), Richard Mortier (University of Cambridge) and Vern Paxson (UC Berkeley, ICSI Berkeley). Overall, this study yields practical insights and methodological improvements for the design and the execution of active Internet measurement studies. We released the code and data of this work as open source to allow for reproducibility of the results, and to enable further research.

Continue reading Scanning the Internet for Liveness

Tampering with OpenPGP digitally signed messages by exploiting multi-part messages

The EFAIL vulnerability in the OpenPGP and S/MIME secure email systems, publicly disclosed yesterday, allows an eavesdropper to obtain the contents of encrypted messages. There’s been a lot of finger-pointing as to which particular bit of software is to blame, but that’s mostly irrelevant to the people who need secure email. The end result is that users of encrypted email, who wanted formatting better than what a mechanical typewriter could offer, were likely at risk.

One of the methods to exploit EFAIL relied on the section of the email standard that allows messages to be in multiple parts (e.g. the body of the message and one or more attachments) – known as MIME (Multipurpose Internet Mail Extensions). The authors of the EFAIL paper used the interaction between MIME and the encryption standard (OpenPGP or S/MIME as appropriate) to cause the email client to leak the decrypted contents of a message.

However, not only can MIME be used to compromise the secrecy of messages, but it can also be used to tamper with digitally-signed messages in a way that would be difficult if not impossible for the average person to detect. I doubt I was the first person to discover this, and I reported it as a bug 5 years ago, but it still seems possible to exploit and I haven’t found a proper description, so this blog post summarises the issue.

The problem arises because it is possible to have a multi-part email where some parts are signed and some are not. Email clients could have adopted the fail-safe option of considering such a mixed message to be malformed and therefore treated as unsigned or as having an invalid signature. There’s also the fail-open option where the message is considered signed and both the signed and unsigned parts are displayed. The email clients I looked at (Enigmail with Mozilla Thunderbird, and GPGTools with Apple Mail) opt for a variant of the fail-open approach and thus allow emails to be tampered with while keeping their status as being digitally signed.
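To show the structure at issue (and nothing more; there is no real signature or exploit here), the sketch below uses Python’s email library to build a multipart message in which a signed part sits alongside an unsigned, attacker-controlled part: the kind of message a fail-open client may still present as digitally signed.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.application import MIMEApplication

# A structure-only stand-in for a signed message: the "signature" below is a
# placeholder, not a real OpenPGP signature.
signed_part = MIMEMultipart(
    "signed", protocol="application/pgp-signature", micalg="pgp-sha256")
signed_part.attach(MIMEText("Please pay the attached invoice.", "plain"))
signed_part.attach(MIMEApplication(b"-----BEGIN PGP SIGNATURE-----...",
                                   "pgp-signature", name="signature.asc"))

# Someone who intercepts the message can wrap it in a new multipart container
# and prepend their own, unsigned part; a fail-open client may still render
# the combined message as "signed".
tampered = MIMEMultipart("mixed")
tampered.attach(MIMEText("Note: the bank details below have changed.", "plain"))
tampered.attach(signed_part)

print(tampered.as_string()[:400])
```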

Continue reading Tampering with OpenPGP digitally signed messages by exploiting multi-part messages