Measuring mobility without violating privacy – a case study of the London Underground

In the run-up to this year’s Privacy Enhancing Technologies Symposium (PETS 2019), I noticed some decidedly non-privacy-enhancing behaviour. Transport for London (TfL) announced they will be tracking the wifi MAC addresses of devices being carried on London Underground stations. Before storing a MAC address it will be hashed with a key, but since this key will remain unchanged for an extended period (2 years), it will be possible to track the movements of an individual over this period through this pseudonymous ID. These traces are likely sufficient to link records back to an individual, given some knowledge of that person’s distinctive travel patterns. Also, for as long as the key is retained, it would be trivial for TfL (or someone who stole the key) to convert someone’s MAC address into its pseudonymised form and indisputably learn that person’s movements.

TfL argues that under the General Data Protection Regulation (GDPR), they don’t need the consent of individuals they monitor because they are acting in the public interest. Indeed, others have pointed out the value to society of knowing how people typically move through underground stations. But the GDPR also requires that organisations minimise the amount of personal data they collect. Could the same goal be achieved if TfL irreversibly anonymised wifi MAC addresses rather than just pseudonymising them? For example, they could truncate the hashed MAC address so that many devices all share the same truncated anonymous ID. How would this affect the calculation of statistics of movement patterns within underground stations? I posed these questions in a presentation at the PETS 2019 rump session, and in this article, I’ll explain how a set of algorithms designed to violate people’s privacy can be applied to collect wifi mobility information while protecting passenger privacy.
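
To make the difference concrete, here is a minimal sketch in Python (not TfL’s actual pipeline; the key, MAC address, and truncation length are illustrative assumptions) contrasting a keyed pseudonymous ID with a truncated anonymous ID:

    import hashlib
    import hmac

    SECRET_KEY = b"example-key"  # assumption: stands in for TfL's long-lived hashing key

    def pseudonymous_id(mac: str) -> str:
        # Keyed hash of the MAC address: effectively unique per device, so the same
        # device can be followed for as long as the key stays the same.
        return hmac.new(SECRET_KEY, mac.encode(), hashlib.sha256).hexdigest()

    def truncated_id(mac: str, bits: int = 16) -> str:
        # Keep only the first `bits` bits of the keyed hash, so many devices share
        # the same ID and a single ID no longer singles out one passenger.
        digest = hmac.new(SECRET_KEY, mac.encode(), hashlib.sha256).digest()
        value = int.from_bytes(digest, "big") >> (len(digest) * 8 - bits)
        return format(value, "04x")

    mac = "aa:bb:cc:dd:ee:ff"
    print(pseudonymous_id(mac))  # traceable pseudonym
    print(truncated_id(mac))     # coarse ID shared by roughly 1 in 2**16 devices

With a 16-bit truncation, roughly one in 65,536 devices maps to each ID, so aggregate counts per platform remain usable while any single ID no longer picks out an individual passenger.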

It’s important to emphasise that TfL’s goal is not to track past Underground customers but to predict the behaviour of future passengers. Inferring past behaviours from the traces of wifi records may be one means to this end, but it is not the end in itself, and TfL creates legal risk for itself by holding this data. The inferences from this approach aren’t even going to be correct: wifi users are unlikely to be typical passengers and behaviour will change over time. TfL’s hope is that the inferred profiles will be useful enough to inform business decisions. Privacy-preserving measurement techniques should be judged by the business value of the passenger models they create, not by how accurately they can follow individual passengers around underground stations in the past. As the saying goes, “all models are wrong, but some are useful”.

Simulating privacy-preserving mobility measurement

To explore this space, I built a simple simulation of Euston Station inspired by one of the TfL case studies. In my simulation, there are two platforms (A and B) and six types of passengers. Some travel from platform A to B; some from B to A; others enter and leave the station at the same platform (A or B). Passengers travelling between platforms can take either the fast route (taking 2 minutes on average) or the slow route (taking 4 minutes on average). Passengers enter the station at a Poisson arrival rate averaging one per second. The probabilities that each new passenger is of a particular type are shown in the figure below. The goal of the simulation is to infer the number of passengers of each type from observations of wifi measurements taken at platforms A and B.
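
The sketch below (Python, with illustrative type probabilities and timings rather than the values in the figure) shows the shape of such a simulation: passengers arrive as a Poisson process, are assigned a type, and are given an exit time based on their route.

    import random

    # Passenger types: (entry platform, exit platform, route). The probabilities
    # below are illustrative placeholders, not the values from the figure.
    TYPES = {
        ("A", "B", "fast"): 0.25,
        ("A", "B", "slow"): 0.15,
        ("B", "A", "fast"): 0.25,
        ("B", "A", "slow"): 0.15,
        ("A", "A", None): 0.10,  # enter and leave at platform A
        ("B", "B", None): 0.10,  # enter and leave at platform B
    }
    MEAN_TRAVEL = {"fast": 120.0, "slow": 240.0}  # seconds between platforms

    def simulate(duration_s: float = 3600.0, arrival_rate: float = 1.0):
        """Return (type, entry_time, exit_time) tuples for one simulated hour."""
        t, passengers = 0.0, []
        while t < duration_s:
            t += random.expovariate(arrival_rate)  # Poisson arrivals, ~1 per second
            ptype = random.choices(list(TYPES), weights=list(TYPES.values()))[0]
            route = ptype[2]
            dwell = random.expovariate(1 / MEAN_TRAVEL[route]) if route else 60.0
            passengers.append((ptype, t, t + dwell))
        return passengers

    counts = {}
    for ptype, _, _ in simulate():
        counts[ptype] = counts.get(ptype, 0) + 1
    print(counts)  # the ground truth the wifi observations would try to recover

The simulated wifi observations at platforms A and B are then the only information available for inferring these per-type counts.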

Continue reading Measuring mobility without violating privacy – a case study of the London Underground

A Reflection on the Waves Of Malice: Malicious File Distribution on the Web (part 2)

The first part of this article introduced the malicious file download dataset and the delivery network structure. This final part explores the types of files delivered, discusses how the network varies over time, and concludes with challenges for the research community.

The Great Divide: A PUP Ecosystem and a Malware Ecosystem

We found a notable divide in the delivery of PUP and malware. First, there is much more PUP than malware in the wild: we found PUP-to-malware ratios of 5:1 by number of SHA-2s, and 17:2 by number of raw downloads. Second, we found that mixed delivery mechanisms of PUP and malware are not uncommon (e.g., see our Opencandy case study in the paper). Third, the highly connected Giant Component is predominantly a PUP Ecosystem (8:1 PUP-to-malware by number of SHA-2s), while the many “islands” of download activity outside of this component are predominantly a Malware Ecosystem (1.78:1 malware-to-PUP by number of SHA-2s).

Comparing the structures of the two ecosystems, we found that the PUP Ecosystem leverages a higher degree of IP address and autonomous system (AS) usage per domain and per dropper than the Malware Ecosystem, possibly indicating higher CDN usage or the use of evasive fast-flux techniques to change IP addresses (though, given earlier results, the former is the more likely). On the other hand, the Malware Ecosystem delivered fewer SHA-2s per domain than the PUP Ecosystem, while the overall numbers of raw downloads remained the same, which could again indicate a disparity in the use of CDNs between the two ecosystems (i.e., CDNs typically deliver a wide range of content). At the same time, fewer suspicious SHA-2s being delivered per domain could also be attributable to evasive techniques (e.g., malicious sites delivering a few types of files before changing domain) or to distributors in this ecosystem dealing with fewer clients and smaller operations.

We tried to estimate the number of PPIs in the wild by defining a PPI service as a network-only component (or group of components aggregated by e2LD) that delivered more than one type of malware or PUP family. Using this heuristic, we estimated a lower bound of 394 PPIs operating on that day, 215 of which were in the PUP Ecosystem. In terms of proportions, we found that the largest individual PPIs in the PUP and Malware Ecosystems involved about 99% and 24% of all e2LDs and IPs in their ecosystems, respectively.
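
As a rough illustration of this heuristic (a Python sketch with a made-up input schema, not the paper’s actual pipeline), one could count the distinct families delivered by each component and flag those that deliver more than one:

    from collections import defaultdict

    def estimate_ppis(deliveries):
        """Apply the heuristic: a component delivering more than one distinct
        malware/PUP family is counted as a candidate PPI service.

        `deliveries` is a list of (component_id, family) pairs, with components
        already aggregated by e2LD; the schema here is illustrative only."""
        families_per_component = defaultdict(set)
        for component, family in deliveries:
            families_per_component[component].add(family)
        return [c for c, fams in families_per_component.items() if len(fams) > 1]

    example = [
        ("component-1", "opencandy"), ("component-1", "installcore"),
        ("component-2", "zbot"),
        ("component-3", "dealply"), ("component-3", "zbot"),
    ]
    print(estimate_ppis(example))  # ['component-1', 'component-3']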

With there being a number of possible explanations for these structural differences between ecosystems, and such a high degree of potential PPI usage in the wild (especially within the PUP Ecosystem), this is clearly an area in which further research is required.

Keeping Track of the Waves

The final part of the study involved tracking these infrastructures and their activities over time. Firstly, we generated tracking signatures of the network-only (server-side) and file-only (client-side) delivery infrastructures. In essence, this involved tracking the root and trunk nodes in a component, which typically had the highest node degrees and thus were more likely to be stable, as opposed to the leaf nodes, which were more likely to be ephemeral.
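
The sketch below (Python with networkx, on a toy graph rather than the paper’s data or exact method) illustrates the general idea: take the highest-degree nodes of a delivery component as its relatively stable signature, which can then be matched against later snapshots.

    import networkx as nx

    def component_signature(g, top_n=3):
        """Signature of a delivery component: its highest-degree (root/trunk) nodes,
        which are more likely to persist across snapshots than ephemeral leaf nodes."""
        ranked = sorted(g.degree, key=lambda node_deg: node_deg[1], reverse=True)
        return frozenset(node for node, _ in ranked[:top_n])

    # Toy component: a dropper SHA-2 and a domain fanning out to payloads and an IP.
    g = nx.Graph()
    g.add_edges_from([
        ("dropper-sha2", "evil.example"),
        ("dropper-sha2", "198.51.100.7"),
        ("evil.example", "payload-sha2-1"),
        ("evil.example", "payload-sha2-2"),
    ])
    print(component_signature(g))  # the stable core to track from day to day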

Continue reading A Reflection on the Waves Of Malice: Malicious File Distribution on the Web (part 2)

A Reflection on the Waves Of Malice: Malicious File Distribution on the Web (part 1)

The French cybercrime unit, C3N, along with the FBI and Avast, recently took down the Retadup botnet that infected more than 850,000 computers, mostly in South America. Though this takedown operation was successful, the botnet was created as early as 2016, with the operators reportedly making millions of euros since. It is clear that large-scale analysis, monitoring, and detection of malicious downloads and botnet activity, even as far back as 2016, is still highly relevant today in the ongoing battle against increasingly sophisticated cybercriminals.

Malware delivery has undergone an impressive evolution since its inception in the 1980s, moving from being an amateur endeavor to a well-oiled criminal business. Delivery methods have evolved from the human-centric transfer of physical media (e.g., floppy disks), sending of malicious emails, and social engineering, to the automated delivery mechanisms of drive-by downloads (malicious code execution on websites and web advertisements), packaged exploit kits (software packages that fingerprint user browsers for specific exploits to maximise the coverage of potential victims), and pay-per-install (PPI) schemes (botnets that are rented out to other cybercriminals).

Furthermore, in recent times, researchers have uncovered the parallel economy of potentially unwanted programs (PUP), which share many traits with the malware ecosystem (such as their delivery through social engineering and PPI networks), while being primarily controlled by different actors. However, with the exception of some types of PUP, such as adware and spyware, PUP has generally been regarded as an annoyance rather than a direct threat to security.

Using the download metadata of millions of users worldwide from 2015/16, we (Colin C. Ife, Yun Shen, Steven J. Murdoch, Gianluca Stringhini) carried out a comprehensive measurement study over the short term (a 24-hour period), the medium term (daily, over the course of a month), and the long term (weekly, over the course of a year) to characterise the structure of this complex malicious file delivery ecosystem on the Web and how it evolves over time. This work answers some key questions while posing some more, and it exemplifies some significant issues that continue to hinder security research on unwanted software activity.

An Overview

There were three main research questions that influenced this study, which we will traverse in the following sections of this post:

    1. What does the malicious file delivery ecosystem look like?
    2. How do the networks that deliver only malware, only PUP, or both compare in structure?
    3. How do these file delivery infrastructures and their activities change over time?

For full technical details, you can refer to our paper – Waves of Malice: A Longitudinal Measurement of the Malicious File Delivery Ecosystem on the Web – published in the proceedings of and presented at the ACM AsiaCCS 2019 conference.

The Data

The dataset was provided (and pre-sanitised) by Symantec and consisted of 129 million download events generated by 12 million users. Each download event contained information such as the timestamp, the SHA-2s of the downloaded file and its parent file, the filename, the size (in bytes), the referrer URL, the host URLs (landing pages after redirection) of the download and parent file, and the IP address hosting the download.
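
For illustration, a single download event can be thought of as a record along the following lines (the field names here are ours, for exposition, and not Symantec’s actual schema):

    from dataclasses import dataclass

    @dataclass
    class DownloadEvent:
        """One download event from the telemetry (illustrative field names)."""
        timestamp: int        # time of the download
        file_sha2: str        # SHA-2 of the downloaded file
        parent_sha2: str      # SHA-2 of the parent file that triggered the download
        filename: str
        size_bytes: int
        referrer_url: str     # referrer of the download
        host_url: str         # landing page of the download after redirection
        parent_host_url: str  # landing page of the parent file
        server_ip: str        # IP address hosting the download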

Continue reading A Reflection on the Waves Of Malice: Malicious File Distribution on the Web (part 1)

Beyond Regulators’ Concerns, Facebook’s Libra Cryptocurrency Faces another Big Challenge: The Risk of Fraud

Facebook has attracted attention through the announcement of their blockchain-based payment network, Libra. This won’t be the first payment system Facebook has launched, but what makes Facebook’s Libra distinctive is that rather than transferring euros or dollars, the network is designed for a new cryptocurrency, also called Libra. This currency is backed by a reserve of nationally-issued currencies, and so Facebook hopes it will avoid the high volatility of cryptocurrencies like Bitcoin. As a result, Libra won’t be attractive to currency speculators, but Facebook hopes it will, therefore, be useful for its stated goal – to be a “simple global currency and financial infrastructure that empowers billions of people.”

Reducing currency volatility is only one step towards meeting this goal of scaling cryptocurrencies to billions of users. The Libra blockchain design addresses how the network can maintain the high throughput and low transaction fees needed to compete with existing payment networks like Visa or MasterCard. However, a question that is equally important but as yet unanswered is how Facebook will develop a secure authentication and fraud prevention system that can scale to billions of users while maintaining good usability and low cost.

Facebook designed the Libra network, but in contrast to traditional payment networks, the Libra network is open. Anyone can send transactions through the network, and anyone can write programs (known as “smart contracts”) that control how, and under what conditions, funds can move between Libra accounts. To comply with anti-money-laundering regulations, Know Your Customer (KYC) checks will be performed, but only when Libra enters or leaves the network through exchanges. Transactions moving funds within the network should be accepted if they meet the criteria set out in the applicable smart contract, regardless of who sent them.

The Libra network isn’t even restricted to transactions transferring the Libra currency. Facebook has explicitly designed the Libra blockchain to make it easy for anyone to implement their own currency and benefit from the same technical facilities that Facebook designed for its currency. Other blockchains have tried this. For example, Ethereum has spawned hundreds of special-purpose currencies. But programming a smart contract to implement a new currency is difficult, and errors can be costly. The programming language for smart contracts within the Libra network is designed to help developers avoid some of the most common mistakes.

Facebook’s Libra and Securing the Calibra Wallet

There’s more to setting up an effective currency than just the technology: regulatory compliance, a network of exchanges, and monetary policy are essential. Facebook, through setting up the Libra Association, is focusing its efforts here solely on the Libra currency. The widespread expectation is, therefore, that at least initially the Libra cryptocurrency will be the dominant use of the network, and most users will send and receive funds through the Calibra wallet smartphone app, developed by a Facebook subsidiary. From the perspective of the vast majority of the world, the Calibra wallet will be synonymous with Facebook’s Libra, and so damage to trust in Calibra will damage the reputation of Libra as a whole.

Continue reading Beyond Regulators’ Concerns, Facebook’s Libra Cryptocurrency Faces another Big Challenge: The Risk of Fraud

Tracing transactions across cryptocurrency ledgers

The Bitcoin whitepaper identifies the risks of revealing the owners of addresses. It states that “if the owner of a key is revealed, linking could reveal other transactions that belonged to the same owner.” Five years later, we have seen many projects which look at de-anonymising entities in Bitcoin. Such projects use techniques such as address tagging and clustering to tie many addresses to one entity, making it easier to analyse the movement of funds. However, this is not limited to Bitcoin: similar analysis is possible on alternative cryptocurrencies such as Zcash and Monero. Thus tracing transactions on-chain is a known and studied problem.

But we have recently seen a shift towards entities performing cross-currency trades. For example, the WannaCry hackers laundered over $142,000 worth of Bitcoin from ransoms across cryptocurrencies. The issue here is that cross-chain transactions appear to be indistinguishable from native transactions on-chain. For example, to trade Bitcoin for Monero, one would send bitcoin to the exchange, and in return, the exchange sends the user some coins in Monero. These two transactions occur on separate chains and do not appear to be connected, so the actual swap is effectively obscured. This level of obscurity can be used to hide the original flow of coins, giving users an additional form of anonymity.

Thus it is important to ask whether we can analyse such transactions, and if so, how and to what extent. In our paper being presented today at the USENIX Security Symposium, we (Haaroon Yousaf, George Kappos, and Sarah Meiklejohn) answer these questions.

Our Research

In summary, we scraped and linked over 1.3 million transactions across different blockchains from the service ShapeShift. In doing so, we found over 100,000 cases where users converted coins to another currency and then moved right back to the original one, identified that a Bitcoin address associated with CoinPayments.net is a very popular destination for users to shift to, and saw that scammers preferred shifting their Ethereum to Bitcoin and Monero.

We collected and analysed 13 months of transaction data across eight different blockchains to identify how users interacted with this service. In doing so, we developed new heuristics and identified various patterns of cross-currency trades.
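
To give a flavour of what such a heuristic can look like (a toy Python sketch under simplifying assumptions, not the heuristics developed in the paper), one might link an on-chain deposit to a ShapeShift trade when the currency and amount agree and the timing is plausible:

    def link_deposits_to_trades(deposits, trades, window_s=1800, tolerance=1e-6):
        """Toy linking heuristic. `deposits` are on-chain transactions, e.g.
        {"txid": "...", "currency": "BTC", "amount": 0.5, "time": 1000};
        `trades` are ShapeShift records, e.g.
        {"in_currency": "BTC", "in_amount": 0.5, "out_currency": "XMR", "time": 1200}.
        A deposit matches a trade if the currency and amount agree (within a
        tolerance) and the deposit precedes the trade by at most `window_s` seconds."""
        links = []
        for d in deposits:
            for t in trades:
                if (d["currency"] == t["in_currency"]
                        and abs(d["amount"] - t["in_amount"]) < tolerance
                        and 0 <= t["time"] - d["time"] <= window_s):
                    links.append((d["txid"], t))
        return links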

What is ShapeShift? 

ShapeShift is a lightweight, non-custodial service that allows users to trade coins directly from one currency to another (a cross-currency shift). The service acts as the counterparty that facilitates the entire trade, allowing users to essentially swap their coins with its own supply. ShapeShift and Changelly are examples of such services.

Continue reading Tracing transactions across cryptocurrency ledgers

Next version of Android might introduce new security risks for online banking, 2FA, and more

Google is preparing new functionality for Android that will allow apps to retrieve and auto-fill security codes from SMS. Last year Apple introduced a similar feature to iOS and macOS, for which we discovered security risks for online banking, two-factor authentication, and other services. Will Google come up with a better design? In this post, we analyse what we know about this feature so far. 


The latest developer beta of Google Play Services (18.7.13 beta) contains code fragments that show a new Android permission to automatically retrieve verification codes from text messages. This feature has not yet been fully implemented, but the available code allows for some analysis and early evaluation for possible security risks, akin to similar risks we demonstrated in 2018 for the Security Code AutoFill feature in iOS and macOS.

Background

It seems that Google is updating the “Autofill Framework”, introduced with Android 8.0 in 2017, to include the new functionality. Previously, this framework’s sole purpose was to support the autofill functionality of password managers in Android apps and websites. The code fragments of this new feature reveal the names and descriptions of the associated system setting and corresponding runtime permission requests, shown below.

The likely UI of the new setting in Android to enable/disable SMS Code Auto-fill.
The likely UI of the new runtime permission request in Android to deny or allow an application’s access to the SMS Code Auto-fill feature.

Continue reading Next version of Android might introduce new security risks for online banking, 2FA, and more

Confirmation of Payee is coming, but will it protect bank customers from fraud?

The Payment Systems Regulator (PSR) has just announced that the UK’s six largest banks must check whether the name of the recipient of a transfer matches what the sender thinks. This new feature should help address a security loophole in online payments: the name of the recipient of transfers is ignored, contrary to expectations and unlike cheques. This improved security should make some fraud more difficult, but banks must be prevented from exploiting the change to unfairly shift liability for the remaining crime onto the victims.

The PSR’s target is for checks to be fully implemented by March 2020, somewhat later than their initial promise to Parliament of September 2018 and subsequent target of July 2019. The new proposal, known as Confirmation of Payee, also only covers the six largest banking groups, but this should cover 90% of transfers. Its goal is to defend against criminals who trick victims into transferring funds under the false pretence that the money is going to the victim’s new account, whereas it is really going to the criminal. The losses from such fraud, known as push payment scams, are often life-changing, resulting in misery for the victims.

Checks on the recipient name will make this particular scam harder, so while unlikely to prevent all types of push payment scams, they will hopefully force criminals to adopt strategies that are easier to prevent. The risk that consumer representatives and regulators will need to watch out for is that these new security measures could result in victims being unfairly held liable. This scenario is, unfortunately, likely because the voluntary consumer protection code for push payment scams excuses the bank from liability if they show the customer a Confirmation of Payee warning.

Warning fatigue and misaligned incentives

In my response to the consultation over this consumer protection code, I raised the issue of “warning fatigue” – that customers will be shown many irrelevant warnings while they do online banking, and this reduces the likelihood that customers will notice important ones. Even Confirmation of Payee warnings will frequently be wrong, such as if the recipient’s bank account is under a different name to what the sender expects. If the two names are very dissimilar, the sender won’t be given more details, but if the name entered is close to the name in bank records, the sender will be told the correct name and asked to compare.
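
As a toy illustration of this three-way outcome (a Python sketch, not the matching algorithm banks will actually use; the thresholds are assumptions), a Confirmation of Payee check might look like this:

    from difflib import SequenceMatcher

    def check_payee(entered_name, name_on_record):
        """Toy Confirmation of Payee check: strong match, close match (reveal the
        recorded name so the sender can compare), or no match (reveal nothing)."""
        similarity = SequenceMatcher(None, entered_name.lower(),
                                     name_on_record.lower()).ratio()
        if similarity > 0.95:
            return "match"
        if similarity > 0.7:
            return f"close match - the account is held under '{name_on_record}'"
        return "no match - no further details given"

    print(check_payee("J Smith", "John Smith"))        # close match, name revealed
    print(check_payee("J Smith", "Acme Widgets Ltd"))  # no match, nothing revealed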

Continue reading Confirmation of Payee is coming, but will it protect bank customers from fraud?

UCL’s Centre for Doctoral Training in Cybersecurity

It has become increasingly apparent that the world’s cybersecurity challenges will not be resolved by specialists working in isolation.

Indeed, it has become clear that the challenges that arise from the integration of emerging technologies into existing social, commercial, legal and political systems will not be resolved by specialists working in isolation. Rather, these complex problems require the efforts of people who can cross disciplinary boundaries, communicate beyond their own fields, and comprehend the context in which others operate. Computer science, information security, encryption, criminology, psychology, international relations, public policy, philosophy of science, legal studies, and economics combine to form the ecosystem within which cybersecurity problems and solutions are found, but training people to think and work across these boundaries has proven difficult.

UCL is delighted to have been awarded funding by the UK’s Engineering and Physical Sciences Research Council (EPSRC) to establish a Centre for Doctoral Training (CDT) in Cybersecurity that will help to establish a cadre of leaders in security with the breadth of perspective and depth of skills required to handle the complex challenges in security faced by our society. The CDT is led by Prof Madeline Carr (Co-Director; UCL Science, Technology, and Public Policy), Prof Shane Johnson (Co-Director; UCL Security and Crime Science), and Prof David Pym (Director; UCL Programming Principles, Logic, and Verification (PPLV) and Information Security).

The CDT is an exciting collaboration that brings together research teams in three of UCL’s departments – Computer Science, Security and Crime Science, and Science, Technology, Engineering, and Public Policy – in order to increase the capacity of the UK to respond to future information and cybersecurity challenges. Through an interdisciplinary approach, the CDT will train cohorts of highly skilled experts drawn from across the spectrum of the engineering and social sciences, able to become the next generation of UK leaders in industry and government, public policy, and scientific research. The CDT will equip them with a broad understanding of all sub-fields of cybersecurity, as well as specialized knowledge and transferable skills to be able to operate professionally in business, academic, and policy circles.

The CDT will admit candidates with a strong background in STEM (CS, Mathematics, Engineering, Physics) or Social Sciences (Psychology, Sociology, International Relations, Public Policy, Crime Science, Economics, and Management), either recent graduates or mid-career. Each will be trained in research and innovation skills across the multidisciplinary facets of cybersecurity (computing, crime science, management, and public policy), and will then specialise within a discipline, gaining industrial experience through joint industrial projects and internships.

For more information, including directions for applications, please visit the cybersecurity CDT website.

Hiring Research Assistants and PhD students

We’re happy to announce that we have several open positions!

Privacy & machine learning

Emiliano De Cristofaro has at least one post-doc position in privacy and machine learning. The researcher will work with him and others in UCL’s InfoSec group. For a sample of our recent work in the field, please see Emiliano’s publications on this topic.

Please email jobs@emilianodc.com with questions or apply directly before 25 July 2019.

Note that we would be keen to hear both from PhD students looking for part-time research work and from people looking for longer-term full-time post-doctoral positions.

Web measurements

Multiple positions are available in the context of a project based at the Alan Turing Institute on cyberbullying and cyberhate, led by Emiliano De Cristofaro and Gareth Tyson. The project will primarily focus on measurements research, i.e., gathering and analysing various types of social datasets.

For a sample of our recent work in this space, please see Emiliano’s publications on this topic.

Again, we would be keen to hear both from PhD students looking for part-time research work and from people looking for longer-term full-time post-doctoral positions.

Please email edecristofaro@turing.ac.uk if you have questions.

PhD positions with Philipp Jovanovic

Members of the InfoSec group are always looking for talented PhD students to join their team. If you would like to investigate opportunities, please do check their website for details of their research interests and contact instructions. We are particularly happy to announce that Philipp Jovanovic will join our group as an Associate Professor starting in January 2020, and he is inviting applications for PhD students.

Philipp’s research interests broadly cover applied cryptography, privacy, and decentralised systems. His current work focuses on building scalable, privacy-preserving, decentralised protocols (such as ByzCoin, RandHound, OmniLedger, or Calypso). He has also worked on a wide variety of other security-related topics in the past, including design and analysis of symmetric cryptographic primitives, side-channel attacks and countermeasures, and the security analysis of systems deployed in the real world such as the Transport Layer Security (TLS) protocol or the Open Smart Grid Protocol (OSGP).

For an overview of his work, please visit Philipp’s website.

If you’re interested in working with Philipp as a PhD student, please email philipp@jovanovic.io.

New CDT in cybersecurity

We have several PhD positions funded through the new Centre for Doctoral Training in Cybersecurity (CDT). Please see the article about the CDT for more details and instructions to apply.

Will dispute resolution be Libra’s Achilles’ heel?

Facebook’s new cryptocurrency, Libra, has the ambitious goal of being the “financial infrastructure that empowers billions of people”. This aspiration will only be achievable if the user experience (UX) of Libra and associated technologies is competitive with existing payment channels. Now, Facebook has an excellent track record of building high-quality websites and mobile applications, but good UX goes further than just having an aesthetically pleasing and fast user interface. We can already see aspects of Libra’s design that will have consequences for the experience of its users making payments.

For example, the basket of assets that underlies the Libra currency should ensure that its value is not too volatile in terms of the currencies represented within the reserve, so easing international payments. However, Libra’s value will fluctuate against every other currency, creating a challenge for domestic payments. People won’t be paid their salary in Libra any time soon, nor will rents be denominated in Libra. If the public is expected to hold significant value in Libra, fluctuations in the currency markets could make the difference between someone being able to pay their rent or not – certainly an unwelcome user experience.

Whether the public will consider the advantages of Libra worth the exposure to the foibles of market fluctuations is an open question, but in this post, I’m mostly going to discuss the consequences of another design decision baked into Libra: that transactions are irrevocable. Once a transaction is accepted by the validator network, the user may proceed “knowing that the transaction can never be changed or reversed”. This is a common design decision within cryptocurrencies because it ensures that companies, governments and regulators should be unable to revoke payments they dislike. When coupled with anonymity or decentralisation, to prevent blacklisted transactions from being blocked beforehand, irrevocability creates a censorship-resistant payment system.

Mitigating the cost of irrevocable transactions

Libra isn’t decentralised, nor is it anonymous, so it is unlikely to be particularly resistant to censorship over matters where there is an international consensus. Irrevocability does, however, make fraud easier because once stolen funds are gone, they cannot be reinstated, even if the fraud is identified. Other cryptocurrencies share Libra’s irrevocability (at least in theory), but they are designed for technically sophisticated users, and their risk of theft can be balanced against the potentially substantial gains (and losses) that can be made from volatile cryptocurrencies. While irrevocability is common within cryptocurrencies, it is not within the broader payments industry. Exposing billions of people to the risk of their Libra holdings being stolen, without the potential for recourse, isn’t good UX. I’ve argued that irrevocable transactions protect the interests of financial institutions over those of the public, and are the wrong default for payments. Eventually, public pressure and regulatory intervention forced UK banks to revoke fraudulent transactions and to take on the risk when they are unable to do so, rather than passing it on to the victims. The same argument applies to Libra, and if fraud becomes common, Libra will face the same pressures as UK banks did.

Continue reading Will dispute resolution be Libra’s Achilles’ heel?