How to do Zero-Knowledge from Discrete-Logs in under 7kB

We recently had our annual conference for the Academic Centres of Excellence in the UK and I am proud that Jonathan Bootle from UCL won the PhD student elevator pitch competition. I’ll now hand over the rest of the blog post to Jonathan to explain the communication reduction technique for zero-knowledge arguments he presented.

— Jens Groth

In zero-knowledge arguments, a verifier wants to check that a prover possesses secret knowledge satisfying some stated conditions. The prover wants to convince the verifier without disclosing her secrets. Security is based on cryptographic assumptions, like the discrete logarithm assumption. In 1997, Cramer and Damgård produced a discrete log argument requiring the prover to send N pieces of data to the verifier, for a statement of size N. Groth improved this to √N in 2009 and for five years, this was thought to be the best possible. That is, until cryptographers at UCL, including myself, discovered an incredible trick, which shattered the previous record. So how does it work?


The first step is to compile the statement into the correct format for our protocol. Existing compilers take practical statements, such as verifying the correct execution of a C program, and convert them into a type of circuit. We then convert this circuit into a vector equation for the verifier to check. A circuit of size N produces a vector of size roughly N. So far, so good.


We also need use a commitment scheme, which works like a magician’s envelope. By committing to a hidden value, it can be fixed, and revealed later.


In previous works in the discrete logarithm setting, the prover commits to the vector using a single commitment.


Later the prover reveals the whole vector. This gives high communication costs, and was a major bottleneck.


We discovered a rare and extraordinary property of the commitment scheme which allowed us to cut a vector in half and compress the two halves together. Applying the same trick repeatedly produces a single value which is easy to send, and drastically reduces the communication costs of the protocol.


We begin with a vector of roughly N elements. If we repeatedly cut the vector in half, it takes log(N) cuts to produce a vector containing only a single element. Each time that the prover cuts a vector in half, they must make new commitments to enable the verifier to check the shorter vectors produced. Overall, roughly log(N) new commitments are produced, so the prover must send about log(N) values to the verifier. This is a dramatic improvement from √N. To put this in perspective, doubling the size of the statement only adds a constant amount of data for the prover to send.

Of course, it is possible to base security on other cryptographic assumptions, but other efficient protocols often use very strong and untested assumptions which have yet to receive a high level of scrutiny. By contrast, the discrete logarithm assumption has been used and studied for over three decades.

Is this just a theoretical advance? Far from it. Our protocol has a working Python implementation, which has been extensively benchmarked. By combining our code with the compilers that I mentioned earlier, we verified the correct execution of C programs. By sending under 7kB of data each time, we can verify matrix multiplications, simulations of the motion of gas molecules, and SHA-1 hashes. It’s nice to see theory that compiles and works.


The full paper is by Jonathan Bootle, Andrea Cerulli, Pyrros Chaidos, Jens Groth and Christophe Petit and was published at EUROCRYPT 2016.


Yes, we have no receipts

Internet voting is a hard problem: there are many ways to fail, and the cost of failing is high. Furthermore, the security requirements appear to be self-contradicting at times. Verifiability requires that a voter, let’s say Alice, must have sufficient knowledge about her ballot to detect if it was tampered with, while Privacy requires that Alice’s vote remains secret, even if Alice herself is bribed or forced into revealing everything she knows.

Can we build a system that allows Alice to verify that her ballot remains unchanged, but also allows her to “forget” who she voted for?

In joint work with Véronique Cortier, Georg Fuchsbauer and David Galindo we present a solution to this problem, in the form of BeleniosRF. We use rerandomizable encryption to change how ballots are encrypted without changing their contents. Digital signatures and zero knowledge techniques ensure ballot integrity, and prevent any tampering during the rerandomization.

Verifiability and privacy

Verifiability means that there must be some way for voters to monitor the election such that they can detect foul play from other voters, or even officials. This is often split into two notions: individual verifiability, dealing with a voter monitoring the way her vote was processed (i.e. that it was not changed or simply thrown away), and universal verifiability, covering requirements for the entirety of the election (e.g. “every ballot must be valid”).

Privacy, in its simplest form, keeps the contents of a vote secret from malicious observers, or even authorities and other compromised voters. In plain terms, we don’t want hackers to decrypt votes, and we don’t allow authorities to decrypt votes when they’re not supposed to; current systems anonymize votes by shuffling or summing them before they are decrypted.

Unfortunately, the above discussion fails to cover one potential adversary: the voter herself. An adversarial voter might be fully corrupt, as in the case of a vote seller who does not value her vote apart from any potential financial gain, or simply be placed in a coercive environment where she is “encouraged” to vote the “right” way. A good voting system should aim to hinder vote sellers and protect coerced users. This might not be possible in all cases: if a voter is monitored 24/7 or simply prevented from voting, the problem may well be unsolvable.

Continue reading Yes, we have no receipts


QUUX: a QUIC un-multiplexing of the Tor relay transport

Latency is a key factor in the usability of web browsing. This has added relevance in the context of anonymity systems such as Tor, because the anonymity property is strengthened by having a larger user-base.

Decreasing the latency of typical web requests in Tor could encourage a wider user base, making it more viable for typical users who value their privacy and less conspicuous for the people who most need it. With this in mind for my MSc Information Security project at UCL, supervised by Dr Steven J. Murdoch, I looked at the transport subsystem used by the Tor network, hoping to improve its performance.

After a literature review of the area (several alternative transport designs have been proposed in the past), I started to doubt my initial mental model for an alternative design.

Data flow in Freedom
Data flow in Freedom (Murdoch, 2011)
Data flow in Tor
Data flow in Tor (Murdoch, 2011)

These diagrams show an end-to-end design (Freedom) and hop-by-hop design (Tor) respectively. In the end-to-end design, encrypted IP packets are transported between relays using UDP, with endpoints ensuring reliable delivery of packets. In the hop-by-hop design, TCP data is transported between relays, with relays ensuring reliable delivery of data.

The end-to-end Freedom approach seems elegant, with relays becoming somewhat closer to packet routers, however it also leads to longer TCP round-trip times (RTT) for web browser HTTP connections. Other things being equal, a longer TCP RTT will result in a slower transfer. Additional issues include difficulty in ensuring fairness of utilisation (requiring an approach outlined by Viecco), and potentially greater vulnerability to latency-based attack.

Therefore I opted to follow the hop-by-hop transport approach Tor currently takes. Tor multiplexes cells for different circuits over a single TCP connection between relay-pairs, and as a result a lost packet for one circuit could hold up all circuits that share the same connection (head-of-line blocking). A long-lived TCP connection is beneficial for converging on an optimal congestion window size, but the approach suffers from head-of-line blocking and doesn’t compete effectively with other TCP connections using the same link.

To remedy these issues, I made a branch of Tor which used a QUIC connection in place of the long-lived TCP connection. Because a QUIC connection carries multiple TCP-like streams, it doesn’t suffer from head-of-line blocking. The streams also compete for utilisation at the same level as TCP connections, allowing them to more effectively use either the link capacity or the relay-configured bandwidth limit.

Download time for a 320KiB file
Download time for a 320KiB file

Initial results from the experiments are promising, as shown above. There’s still a way to go before such a design could make it into the Tor network. This branch shows the viability of the approach for performance, but significant engineering work still lies ahead to create a robust and secure implementation that would be suitable for deployment. There will also likely be further research to more accurately quantify the performance benefits of QUIC for Tor. Further details can be found in my MSc thesis.


Battery Status readout as a privacy risk

Privacy risks and threats arise even in seemingly innocuous mechanisms. It is a fairly regular issue.

Over a year ago, I was researching the risk of the W3C Battery Status API. The mechanism allows a web site to read the battery level of a device (smartphone, laptop, etc.). One of the positive use cases may be, for example, stopping the execution of intensive operations if the battery is running low.

Our privacy analysis of Battery Status API revealed interesting results.

Privacy analysis of Battery API

The battery status provides the following information:

  • the current level of battery (format: 0.0–1.0, for empty and full battery, respectively)
  • time to a full discharge of battery (in seconds)
  • time to a full charge of battery, if connected to a charger (in seconds)

These items are updated whenever a new value is supplied by the operating system

It turns out that privacy risks may surface even in this kind of – seemingly innocuous – data and access mechanisms.

Frequency of changes

The frequency of changes in the reported readouts from Battery Status API potentially allow the monitoring of users’ computer use habits; for example, potentially enabling analyzing of how frequently the user’s device is under heavy use. This could lead to behavioral analysis.

Additionally, identical installations of computer deployments in standard environments (e.g. at schools, work offices, etc.) are often are behind a NAT. In simple terms, NAT allows a number of users to browse the Internet with an – externally seen – single IP address. The ability of observing any differences between otherwise identical computer installations – potentially allows particular users to be identified (and targeted?).

Battery readouts as identifiers

The information provided by the Battery Status API is not always subject to rapid changes. In other words, this information may be static for a period of time; this in turn may give rise to a short-lived identifier. The situation gets especially interesting when we consider a scenario of users sometimes clearing standard web identifiers (such as cookies). In such a case, a web script could potentially analyse identifiers provided by Battery Status API, and this information then could possibly even lead to re-creation of other identifiers. A simple sketch follows.

Continue reading Battery Status readout as a privacy risk


Steven Murdoch – Privacy and Financial Security

Probably not too many academic researchers can say this: some of Steven Murdoch’s research leads have arrived in unmarked envelopes. Murdoch, who has moved to UCL from the University of Cambridge, works primarily in the areas of privacy and financial security, including a rare specialty you might call “crypto for the masses”. It’s the financial security aspect that produces the plain, brown envelopes and also what may be his most satisfying work, “Trying to help individuals when they’re having trouble with huge organisations”.

Murdoch’s work has a twist: “Usability is a security requirement,” he says. As a result, besides writing research papers and appearing as an expert witness, his past includes a successful start-up. Cronto, which developed a usable authentication device, was acquired by VASCO, a market leader in authentication and is now used by banks such as Commerzbank and Rabobank.

Developing the Cronto product was, he says, an iterative process that relied on real-world testing: “In research into privacy, if you build unusable system two things will go wrong,” he says. “One, people won’t use it, so there’s a smaller crowd to hide in.” This issue affects anonymising technologies such as Mixmaster and Mixminion. “In theory they have better security than Tor but no one is using them.” And two, he says, “People make mistakes.” A non-expert user of PGP, for example, can’t always accurately identify which parts of the message are signed and which aren’t.

The start-up experience taught Murdoch how difficult it is to get an idea from research prototype to product, not least because what works in a small case study may not when deployed at scale. “Selling privacy remains difficult,” he says, noting that Cronto had an easier time than some of its forerunners since the business model called for sales to large institutions. The biggest challenge, he says, was not consumer acceptance but making a convincing case that the predicted threats would materialise and that a small company could deliver an acceptable solution.

Continue reading Steven Murdoch – Privacy and Financial Security


Moving towards security and privacy experiments for the real world

Jono and I recently presented our joint paper with Simon and Angela at the Learning from Authoritative Security Experiment Results (LASER) Workshop in San Jose, CA, USA. The workshop was co-located with the IEEE Security and Privacy Symposium. LASER has a different focus each year; in 2016, presented papers explored new approaches to computer security experiments that are repeatable and can be shared across communities.

Through our LASER paper, “Towards robust experimental design for user studies in security and privacy”, we wanted to advance the quest for better experiment design and execution. We proposed the following five principles for conducting robust experiments into usable security and privacy:

  1. Give participants a primary task
  2. Ensure participants experience realistic risk
  3. Avoid priming the participants
  4. Perform experiments double-blind whenever possible
  5. Define these elements precisely: threat model; security; privacy and usability

Understanding users and their interaction with security is a blind spot for many security practitioners and designers. Learning from prior studies within and outside our research group, we have defined principles for conducting robust experiments into usable security and privacy. These principles are informed by efforts in other fields such as biology, qualitative research methods, and medicine, where four overarching experiment-design factors guided our principles:

Internal validity – The experiment is of “suitable scope to achieve the reported results” and is not “susceptible to systematic error”.

External validity – The result of the experiment “is not solely an artifact of the laboratory setting”.

Containment  – There are no “confounds” in the results, and no experimental “effects are a threat to safety” of the participants, the environment, or society generally.

Transparency – “There are no explanatory gaps in the experimental mechanism” and the explanatory “diagram for the experimental mechanism is complete”, in that it covers all relevant entities and activities.

Continue reading Moving towards security and privacy experiments for the real world


Do you know what you’re paying for? How contactless cards are still vulnerable to relay attack

Contactless card payments are fast and convenient, but convenience comes at a price: they are vulnerable to fraud. Some of these vulnerabilities are unique to contactless payment cards, and others are shared with the Chip and PIN cards – those that must be plugged into a card reader – upon which they’re based. Both are vulnerable to what’s called a relay attack. The risk for contactless cards, however, is far higher because no PIN number is required to complete the transaction. Consequently, the card payments industry has been working on ways to solve this problem.

The relay attack is also known as the “chess grandmaster attack”, by analogy to the ruse in which someone who doesn’t know how to play chess can beat an expert: the player simultaneously challenges two grandmasters to an online game of chess, and uses the moves chosen by the first grandmaster in the game against the second grandmaster, and vice versa. By relaying the opponents’ moves between the games, the player appears to be a formidable opponent to both grandmasters, and will win (or at least force a draw) in one match.

Similarly, in a relay attack the fraudster’s fake card doesn’t know how to respond properly to the payment terminal because, unlike a genuine card, it doesn’t contain the cryptographic key known only to the card and the bank that verifies the card is genuine. But like the fake chess grandmaster, the fraudster can relay the communication of the genuine card in place of the fake card.

For example, the victim’s card (Alice, in the diagram below) would be in a fake or hacked card payment terminal (Bob) and the criminal would use the fake card (Carol) to attempt a purchase in a genuine terminal (Dave). The bank would challenge the fake card to prove its identity, this challenge is then relayed to the genuine card in the hacked terminal, and the genuine card’s response is relayed back on behalf of the fake card to the bank for verification. The end result is that the terminal used for the real purchase sees the fake card as genuine, and the victim later finds an unexpected and expensive purchase on their statement.

A rigged payment terminal capable of performing the relay attack can be made from off-the-shelf components
The relay attack, where the cards and terminals can be at any distance from each other

Demonstrating the grandmaster attack

I first demonstrated that this vulnerability was real with my colleague Saar Drimer at Cambridge, showing on television how the attack could work in Britain in 2007 and in the Netherlands in 2009.

In our scenario, the victim put their card in a fake terminal thinking they were buying a coffee when in fact their card details were relayed by a radio link to another shop, where the criminal used a fake card to buy something far more expensive. The fake terminal showed the victim only the price of a cup of coffee, but when the bank statement arrives later the victim has an unpleasant surprise.

At the time, the banking industry agreed that the vulnerability was real, but argued that as it was difficult to carry out in practice it was not a serious risk. It’s true that, to avoid suspicion, the fraudulent purchase must take place within a few tens of seconds of the victim putting their card into the fake terminal. But this restriction only applies to the Chip and PIN contact cards available at the time. The same vulnerability applies to today’s contactless cards, only now the fraudster need only be physically near the victim at the time – contactless cards can communicate at a distance, even while the card is in the victim’s pocket or bag.

Continue reading Do you know what you’re paying for? How contactless cards are still vulnerable to relay attack


Analyzing privacy aspects of the W3C Vibration API

When making web standards, multiple scenarios possibly affecting privacy are considered. This includes even extreme ones; and this is a good thing. It’s best to predict the creative use and abuse of web features, before they are exploited.

Vibration API

The mechanism allowing websites to utilize a device’s vibration motor is called the Vibration API. The mechanism allows a device to be vibrated in particular patterns. The argument to the vibration() function is a list called a pattern. The list’s odd indices cause a vibration for a specific length of time, and even values are the still periods. For example, a web designer can make the device to vibrate for a specific duration, say 50 ms and follow that with a still period of 100 ms using the following call:


In certain circumstances this can create several interesting potential privacy risks. Let’s look at the Vibration API from a privacy point of view. I will consider a number of scenarios on various technical levels.

Toy de-anonymisation scenario

One potential risk is the identification of a particular person in real life. Imagine several people in the same room placing their devices on a table. At some point, one person’s device vibrates in specific patterns. This individual might then become marked to a potential observer.

How could such a script be delivered? One possibility is though web advertising infrastructures. These offer capabilities of targeting individuals with a considerable accuracy (with respect to their location).

Continue reading Analyzing privacy aspects of the W3C Vibration API


Microsoft Ireland: winning the battle for privacy but losing the war

On Thursday, Microsoft won an important federal appeals court case against the US government. The case centres on a warrant issued in December 2013, requiring Microsoft to disclose emails and other records for a particular email address which was related to a narcotics investigation. It transpired that these emails were stored in a Microsoft datacenter in Ireland, but the US government argued that, since Microsoft is a US company and can easily copy the data into the US, a US warrant would suffice. Microsoft argued that the proper way for the US government to obtain the data is through the Mutual Legal Assistance Treaty (MLAT) between the US and Ireland, where an Irish court would decide, according to Irish law, whether the data should be handed over to US authorities. Part of the US government’s objection to this approach was that the MLAT process is sometimes very slow, although though the Irish government has committed to consider any such request “expeditiously”.

The appeal court decision is an important victory for Microsoft (following two lower courts ruling against them) because they sell their european datacenters as giving their european customers confidence that their data will be subject to the more stringent european privacy laws. Microsoft’s case was understandably supported by other technology companies in the same position, as well as civil liberties organisations such as the Electronic Frontier Foundation in the US and the Open Rights Group in the UK. However, I have mixed opinions about the outcome: while probably the right decision in this case, the wider consequences could be detrimental to privacy.

Both sides of the case wanted to set a precedent (if not legally, at least in practice). The US government wanted US law to apply to data held by US companies, wherever in the world the data resides. Microsoft wanted the location of the data to imply which legal regime applied, and so their customers could be confident that their own country’s laws will be respected, provided Microsoft have a datacenter in their own country (or at least one with compatible laws). My concern is that this ruling will give false assurance to customers of US companies, because in other circumstances a different decision could quite easily be taken.

We know about this case because Microsoft chose to challenge it in court, and were able to do so. This is the first time Microsoft has challenged a US warrant for data stored in their Irish datacenter despite it being in operation for three years prior to the case. Had the email address been associated with a more serious crime, or the demand for emails accompanied by a gagging order, it may not have been challenged. Microsoft and other technology companies may still choose to accept, or may even be forced to accept, the applicability of future US warrants to data they control, regardless of the court decision last week. One extreme approach to compel this approach would be for the US to jail employees until their demands are complied with.

For this reason, I have argued that control over data is more important than where data resides. If a company does not have the technical capability to comply with an order, it is easier for them to defend their case, and so protects both the company’s customers and staff. Microsoft have taken precisely this approach for their new German datacenters, which will be operated by staff in Germany working for a German “data trustee” (Deutsche Telekom). In contrast to their Irish datacenter, Microsoft staff will be unable to access customer data, except with the permission of and oversight from the data trustee.

While the data trustee model resists information being obtained through improper legal means, a malicious employee could still break rules for personal gain, or the systems designed to process legal requests could be hacked into. With modern security techniques it is possible to do better. End-to-end encryption for instant messaging is one such example, because (if designed properly) the communications provider does not have access to messages they carry. A more sophisticated approach is “distributed consensus”, where a decision is only taken if a majority of participants agree. The consensus process is automated and enforced through cryptography, ensuring that rules are respected even if some participants are malicious. Critical decisions in the Tor network and in Bitcoin are taken this way. More generally, there is a growing recognition that purely legal or procedural mechanisms are insufficient to protect privacy. This is one of the common threads present in much of the research presented at the Privacy Enhancing Technologies Symposium, being held this week in Darmstadt: recognising that there will always be imperfections in software, people and procedures and showing that nevertheless individual’s privacy can still be protected.


Cybersecurity: Supporting a Resilient and Trustworthy System for the UK

Yesterday, the Royal Society published their report on cybersecurity policy, practice and research – Progress and Research in Cybersecurity: Supporting a Resilient and Trustworthy System for the UK. The report includes 10 recommendations for government, industry, universities and research funders, covering the topics of trust, resilience, research and translation. This major report was written based on evidence gathered from an open call, as well as meetings with key stakeholders, guided by a steering committee which included UCL members M. Angela Sasse and Steven Murdoch. Here, we summarise what we think are the most important signposts for cybersecurity research and practice.

The report points out that, as online technology and services touches nearly everyone’s lives, the role of cybersecurity is to support a resilient digital economy and society in the UK. Previously, the government focus was very much on national security – but it is just as important that we are able to secure our personal data, financial assets and homes, and that our decisions as consumers and citizens are not manipulated or subverted. The report rightly states that the national authority for cybersecurity needs to be transparent, expert and have a clear and widely-understood remit. The creation of the National Cyber Security Center (NCSC) may be a first step towards this, but the report also points out that currently, it is to be under control of GCHQ – and this is bound to be a problem given the lack of trust they have from parts of industry and civil society, as a result of their role in subverting the development of security standards in order to make surveillance easier.

The report furthermore recommends that the government preserves the robustness of encryption, including end-to-end encryption and promotes its widespread use. Encryption and other computer security measures provides the foundation that allows individuals to trust organisations and attempts to weaken these measures in order to facilitate surveillance will create security risks and reduce robustness. Whether weaknesses are created by requiring fragile encryption algorithms or mandating exceptional access, these attempts increase the risk of unauthorised parties gaining access to sensitive computer systems.

The report also rightly says that companies need to take more responsibility for cyber security: to be a trustworthy business partner or service provider, they need to be competent, and have the correct motivation. “Dumping” the risks associated with online transactions on customers or business partners who don’t have skills and resources to deal with them, and hiding this in complex terms and conditions, is not trustworthy behaviour. Making companies take liability for the security failures will likely play a part in improving trustworthiness, but needs to be done carefully. Important open source software such as OpenSSL is developed by a handful of people in their spare time. When something goes wrong (such as Heartbleed), multi-billion dollar companies who built their business around open source software without contributing or even properly evaluating the risk, should not be able to assign liability to the volunteer developers. Companies should also be transparent and be required to disclose vulnerabilities and breaches. The report calls for such disclosures to be made to a central body, but we would go further and recommend that they be disclosed to the customers exposed to risks as a result of the security failures.

In order to improve and demonstrate competence in cybersecurity, we need evidence-based guidance on state-of-the-art cybersecurity principles, standards and practices. These go further than just following widely used industry practice, or following craft knowledge based on expert opinion, but should be an an ambitious set of criteria which have been demonstrated to make a pronounced improvement in security. A significant effort is required to transform what is currently a set of common practices (the term “best practice” is a misnomer) through empirical tests and measurements into a set of practices and tools that we know to be effective and efficient under real-world conditions (this is the mission of The Research Institute in Science of Cyber Security (RISCS), which has just started a new 5 year phase). The report in particular calls for research on ways to quantify the security offered by anonymization algorithms and anonymous communication techniques, as these perform an critical role in supporting privacy by design.

The report calls for more research, and new means to assess and support research. Cybersecurity is an international field, and research funders should seek for peer-review to be performed by the best expertise available internationally and to remove barriers to international and multidisciplinary research. However, supporting multidisciplinary research should not be at the expense of addressing the many hard technical problems which remain. The report also identifies the benefits of challenge-led funding, where a research programme is led by a world-leading expert with substantial freedom in how research funds are distributed. For this model to work it is critical to create the right environment for recruiting international experts to both lead and participate in such challenges, which as fellow steering-group member Ross Anderson has pointed out, the vote to leave the EU has seriously harmed. Finally, the report calls for improvements to the research commercialisation process, including that universities priorities getting research out into the real world over trying to extract as much money as possible, and that new investment sources are developed to fill in the gaps left by traditional venture capital, such as for software developed for the public good.