Data is increasingly collected and shared, with potential benefits for both individuals and society as a whole, but people cannot always be confident that their data will be shared and used appropriately. Decisions made with the help of sensitive data can greatly affect lives, so there is a need for ways to hold data processors accountable. This requires not only ways to audit these data processors, but also ways to verify that the reported results of an audit are accurate, while protecting the privacy of individuals whose data is involved.
Not only is it important that requests are auditable, the requested data can also be used as evidence in legal proceedings. In this case, it is necessary to ensure the integrity of the data or to rely on representatives of data providers and expert witnesses, the latter being more expensive and requiring trust in third parties.
In the healthcare case, individuals usually consent for their GP or any medical professional they interact with to have access to relevant medical records, but may have concerns about the way their information is then used or shared. The NHS regularly shares data with researchers or companies like DeepMind, sometimes in ways that may reduce the trust levels of individuals, despite the potential benefits to healthcare.
The EFAIL vulnerability in the OpenPGP and S/MIME secure email systems, publicly disclosed yesterday, allows an eavesdropper to obtain the contents of encrypted messages. There’s been a lot of finger-pointing as to which particular bit of software is to blame, but that’s mostly irrelevant to the people who need secure email. The end result is that users of encrypted email, who wanted formatting better than what a mechanical typewriter could offer, were likely at risk.
One of the methods to exploit EFAIL relied on the section of the email standard that allows messages to be in multiple parts (e.g. the body of the message and one or more attachments) – known as MIME (Multipurpose Internet Mail Extensions). The authors of the EFAIL paper used the interaction between MIME and the encryption standard (OpenPGP or S/MIME as appropriate) to cause the email client to leak the decrypted contents of a message.
However, not only can MIME be used to compromise the secrecy of messages, but it can also be used to tamper with digitally-signed messages in a way that would be difficult if not impossible for the average person to detect. I doubt I was the first person to discover this, and I reported it as a bug 5 years ago, but it still seems possible to exploit and I haven’t found a proper description, so this blog post summarises the issue.
The problem arises because it is possible to have a multi-part email where some parts are signed and some are not. Email clients could have adopted the fail-safe option of considering such a mixed message to be malformed and therefore treated as unsigned or as having an invalid signature. There’s also the fail-open option where the message is considered signed and both the signed and unsigned parts are displayed. The email clients I looked at (Enigmail with Mozilla Thunderbird, and GPGTools with Apple Mail) opt for a variant of the fail-open approach and thus allow emails to be tampered with while keeping their status as being digitally signed.
I recently came across an interesting white paper published by PwC, “A false sense of security? Cyber-security in the Middle East”. This paper is interesting for a number of reasons. Most obviously, I guess, it’s about an area of the world that’s a bit different from that of my immediate experience in the West and which faces many well-reported challenges. Indeed, it seems, as reported in the PwC paper, that companies and governments in the region suffer from more cyberattacks, resulting in bigger financial losses, than anywhere else in the world.
The paper confirms that many of the problems faced by companies and governments in the Middle East are, as of course one would expect, exactly those faced by their Western counterparts – too often, the cybersecurity industry responds to incidents in a fire-fighting style, rolling out patches in rushed knee-jerk reactions to imminent threats.
The way to counteract these problems is, of course, to train cybersecurity professionals who will be capable of making appropriate strategic and tactical investments in security and able to respond to respond better to attacks. All well and good, but there is global skills deficit in the cybersecurity industry and it seems that this problem is particularly acute in the Middle East; and it seems to be a notable contributory factor to the problems experienced in the region. The problem needs some long-term thinking: in the average user, we need to encourage good security behaviours, which are learned over many years; in the security profession, we need to ensure that there is sufficient upcoming talent to fill our growing needs over the next century.
Exploring this topic a bit, I came across a company called SiConsult, a security services provider (with which I have no personal connection), with offices in the Middle East. They are taking an initiative, which provides students (or, indeed, anyone I think) with an interesting opportunity. They have been thinking about cryptography in the post-quantum world, and how to develop solutions and relevant expertise in the long term.
All public key cryptography as we currently know it may be rendered insecure by the deployment of quantum computers. Your Internet connection to the bank, the keys protecting your Dropbox, and your secure messaging applications will all be compromised. But a quantum computer that can run Shor’s algorithm, which means large numbers can be factorized in polynomial time, is still maybe ten years away (or five, or twenty, or … ). So why should we care now? Well, the consequences of losing the protection of good public-key crypto would be very serious and, consequently, NIST (the US’s National Institute of Standards and Technology) is running a process to standardize quantum-resistant algorithms. The first round of submissions has just closed, but we will have to wait until 2025 for draft standards, which could be too late for some use cases.
As a result of the process timeline, companies and academics are likely to search for their own solutions long before NIST standardizes theirs. SiConsult, the company I mentioned, is inviting students (or anyone else) develop a quantum-safe application messaging application, for a small prize – the Post Quantum Innovation Challenge. What is interesting is that the company’s motivation here is not purely financial – they are not looking to retain ownership of any designs or applications that may be submitted to the competition – but instead they are looking to spark interest in post-quantum cryptography, search for new cybersecurity talent, and encourage cybersecurity education, especially in the Middle East.
Initiatives like the Post Quantum Innovation Challenge are needed to energise those that may be considering a career in cyber security, to make sure that the talent pipeline is flowing well for years to come. Importantly, the barrier for entry to PQIC is relatively low: anyone with an interest in security should consider entering. Perhaps it will go a little of the way towards a solution to both the quantum and education long-term problems.
Selective disclosure credentialsallow the issuance of a credential to a user, and the subsequent unlinkable revelation (or ‘showing’) of some of the attributes it encodes to a verifier for the purposes of authentication, authorisation or to implement electronic cash. While a number of schemes have been proposed, these have limitations, particularly when it comes to issuing fully functional selective disclosure credentials without sacrificing desirable distributed trust assumptions. Some entrust a single issuer with the credential signature key, allowing a malicious issuer to forge any credential or electronic coin. Other schemes do not provide the necessary re-randomisation or blind issuing properties necessary to implement modern selective disclosure credentials. No existing scheme provides all of threshold distributed issuance, private attributes, re-randomisation, and unlinkable multi-show selective disclosure.
We address these challenges in our new work Coconut – a novel scheme that supports distributed threshold issuance, public and private attributes, re-randomization, and multiple unlinkable selective attribute revelations. Coconut allows a subset of decentralised mutually distrustful authorities to jointly issue credentials, on public or private attributes. These credentials cannot be forged by users, or any small subset of potentially corrupt authorities. Credentials can be re-randomised before selected attributes being shown to a verifier, protecting privacy even in the case all authorities and verifiers collude.
Applications to Smart Contracts
The lack of full-featured selective disclosure credentials impacts platforms that support ‘smart contracts’, such as Ethereum, Hyperledger and Chainspace. They all share the limitation that verifiable smart contracts may only perform operations recorded on a public blockchain. Moreover, the security models of these systems generally assume that integrity should hold in the presence of a threshold number of dishonest or faulty nodes (Byzantine fault tolerance). It is desirable for similar assumptions to hold for multiple credential issuers (threshold aggregability). Issuing credentials through smart contracts would be very useful. A smart contract could conditionally issue user credentials depending on the state of the blockchain, or attest some claim about a user operating through the contract—such as their identity, attributes, or even the balance of their wallet.
As Coconut is based on a threshold issuance signature scheme, that allows partial claims to be aggregated into a single credential, it allows collections of authorities in charge of maintaining a blockchain, or a side chain based on a federated peg, to jointly issue selective disclosure credentials.
Coconut is a fully featured selective disclosure credential system, supporting threshold credential issuance of public and private attributes, re-randomisation of credentials to support multiple unlikable revelations, and the ability to selectively disclose a subset of attributes. It is embedded into a smart contract library, that can be called from other contracts to issue credentials. The Coconut architecture is illustrated below. Any Coconut user may send a Coconut request command to a set of Coconut signing authorities; this command specifies a set of public or encrypted private attributes to be certified into the credential (1). Then, each authority answers with an issue command delivering a partial credentials (2). Any user can collect a threshold number of shares, aggregate them to form a consolidated credential, and re-randomise it (3). The use of the credential for authentication is however restricted to a user who knows the private attributes embedded in the credential—such as a private key. The user who owns the credentials can then execute the show protocol to selectively disclose attributes or statements about them (4). The showing protocol is publicly verifiable, and may be publicly recorded.
We use Coconut to implement a generic smart contract library for Chainspace and one for Ethereum, performing public and private attribute issuing, aggregation, randomisation and selective disclosure. We evaluate their performance, and cost within those platforms. In addition, we design three applications using the Coconut contract library: a coin tumbler providing payment anonymity, a privacy preserving electronic petitions, and a proxy distribution system for a censorship resistance system. We implement and evaluate the first two former ones on the Chainspace platform, and provide a security and performance evaluation. We have released the Coconut white-paper, and the code is available as an open-source project on Github.
Coconut uses short and computationally efficient credentials, and efficient revelation of selected attributes and verification protocols. Each partial credentials and the consolidated credential is composed of exactly two group elements. The size of the credential remains constant, and the attribute showing and verification are O(1) in terms of both cryptographic computations and communication of cryptographic material – irrespective of the number of attributes or authorities/issuers. Our evaluation of the Coconut primitives shows very promising results. Verification takes about 10ms, while signing an attribute is 15 times faster. The latency is about 600 ms when the client aggregates partial credentials from 10 authorities distributed across the world.
Existing selective credential disclosure schemes do not provide the full set of desired properties needed to issue fully functional selective disclosure credentials without sacrificing desirable distributed trust assumptions. To fill this gap, we presented Coconut which enables selective disclosure credentials – an important privacy enhancing technology – to be embedded into modern transparent computation platforms. The paper includes an overview of the Coconut system, and the cryptographic primitives underlying Coconut; an implementation and evaluation of Coconut as a smart contract library in Chainspace and Ethereum, a sharded and a permissionless blockchain respectively; and three diverse and important application to anonymous payments, petitions and censorship resistance.
A Hardware Trojan (HT) is a malicious modification of the circuitry of an integrated circuit.
A malicious chip can make a device malfunction in several ways. It has been rumored that a hardware trojan implanted in a Syrian air-defense radar caused it to stop operating during an airstrike, thus instantly minimizing the country’s situational awareness and threat response capabilities. In other settings, hardware trojans may leak encryption keys or other secrets, or even generate weak keys that can be easily recovered by the adversary.
Judging from the abundance of governmental, industrial and academic projects concerned with the prevention and the detection of hardware trojans, there is a consensus regarding the severity of the threat and it’s not taken lightly. DARPA has launched the “Integrity and Reliability of Integrated Circuits“ program aiming to develop techniques for the detection of malicious circuitry. The Intelligence Advanced Research Projects Activity funded a project aiming to redesign the fabrication of integrated circuits, while various other initiatives are currently undergoing (e.g., the COST Action project on “Trustworthy Manufacturing and Utilization of Secure Devices” and the DoD Trusted Foundry program). In addition to these, there are numerous other industrial and academic projects proposing new trojan detection techniques every year, only to be circumvented by follow up work.
But do Hardware Trojans exist?
Ironically, until now there have been no cases where malicious circuitry was detected in military-grade or even commercial chips. With nothing more than rumors to hint about hardware trojans (in places other than academic lab benches), one cannot but question their existence. In other words, is HT design and insertion too complex to be practical, or do our detection tools fail to detect the malicious circuitry embedded in the chips around us?
It could be that both are true: hardware trojans do not exist (yet) as malicious actors are focusing on other aspects of the hardware that are easier to compromise. In all cases where trojans were discovered, the erroneous behavior was traced to the chip’s firmware and not its circuitry. Interestingly, in the vast majority of those incidents the security flaws were attributed to honest fabrication mistakes (e.g., manufacturer failing to disable a testing interface).
Intentional vs. Unintentional Errors
It is safe to always assume that an IC will fail in the worst possible way, at the worst possible time (see Syrian airdefense incident). This “crash n’ burn” approach is common in critical systems (e.g., airplanes, satellites, dams), where any divergence from normal operation will result in an irrecoverable failure of the whole system.
To mitigate the risk, critical system designers employ redundancy techniques to eliminate single points of failure and thus make their setups resilient to faults. A common example are triple-redundant systems used in autopilots. Those systems employ three identical chips sourced from disjoint supply chains and replicate all the navigation computations across them. This allows the system to both tolerate a misbehaving chip and detect its presence.
It is noteworthy, that those systems do not consider the cause of the chip malfunction, and simply assume that they fail in the worst possible way. Following from this, a malicious chip is not significantly different from a defective one. After all, any adversary sophisticated enough to design and insert a hardware trojan is capable of making it indistinguishable from honest manufacturing errors. Similarly, from an operational perspective it makes little sense to distinguish between trojans in the circuitry and trojans in the firmware as the risk they pose for the system is identical.
Distributing Trust for Resilience
Given that it is impossible to achieve 100% detection rates of hardware trojans and errors, it is important that our devices maintain their security properties even in their presence. Our work introduces a new high-level device architecture that is resilient to both. In its core, it uses a redundancy-based architecture and secret-sharing protocols to distribute all secrets and computations among multiple chips. Hence, unless all chips are compromised by the same adversary, the security of the system remains intact. A key point is that those chips should originate from disjoint supply chains. This is to minimize the risk of the same adversary compromising more than one chips. To evaluate its practicality in real-life applications, we built a Hardware Security Module (HSM) that performs standard cryptographic operations (e.g., key generation, decryption, signing) at a very high rate. HSMs are commonly used in operations where security is critical, and an increased transaction throughput is needed (e.g., banking, certification Authorities). A demonstration is shown in the video above, and further details are on our website. Finally, our work can be easily combined with all existing detection and prevention techniques to further decrease the likelihood of compromises.
Since the introduction of Bitcoin in 2008, blockchains have gone from a niche cryptographic novelty to a household name. Ethereum expanded the applicability of such technologies, beyond managing monetary value, to general computing with smart contracts. However, we have so far only scratched the surface of what can be done with such “Distributed Ledgers”.
The EU Horizon 2020 DECODE project aims to expand those technologies to support local economy initiatives, direct democracy, and decentralization of services, such as social networking, sharing economy, and discursive and participatory platforms. Today, these tend to be highly centralized in their architecture.
There is a fundamental contradiction between how modern services harness the work and resources of millions of users, and how they are technically implemented. The promise of the sharing economy is to coordinate people who want to provide resources with people who want to use them, for instance spare rooms in the case of Airbnb; rides in the case of Uber; spare couches of in the case of couchsurfing; and social interactions in the case of Facebook.
These services appear to be provided in a peer-to-peer, and disintermediated fashion. And, to some extent, they are less mediated at the application level thanks to their online nature. However, the technical underpinnings of those services are based on the extreme opposite design philosophy: all users technically mediate their interactions through a very centralized service, hosted on private data centres. The big internet service companies leverage their centralized position to extract value out of user or providers of services – becoming de facto monopolies in many case.
When it comes to privacy and security properties, those centralized services force users to trust them absolutely, and offer little on the way of transparency to even allow users to monitor the service practices to ground that trust. A recent example illustrating this problem was Uber, the ride sharing service, providing a different view to drivers and riders about the fare that was being paid for a ride – forcing drivers to compare what they receive with what riders pay to ensure they were getting a fair deal. Since Uber, like many other services, operate in a non-transparent manner, its functioning depends on users absolute to ensure fairness.
Even more worryingly, the opaque algorithms being used to promote and propagate posts have been associated with creating a filter bubble effect, influencing elections, and dark adverts, only visible to particular users, are able to flout standards of fair political advertising. It is a fact of the 21st century that a key facet of the discursive process of democracy will take place on online social platforms. However, their centralized, opaque and advertising-driven form is incompatible with their function as a tool for democracy.
Finally, the revelations of Edward Snowden relating to mass surveillance, also illustrate how the technical centralization of services erodes privacy at an unprecedented scale. The NSA PRISM program coerced internet services to provide access to data on their services under a FISA warrant, not protecting the civil liberties of non-American persons. At the same time, the UPSTREAM program collected bulk information between data centres making all economic, social and political activities taking place on those services transparent to US authorities. While users struggle to understand how those services operate, governments (often foreign) have total visibility. This is a complete inversion of the principles of liberal democracy, where usually we would expect citizens to have their privacy protected, while those in position of authority and power are expected to be accountable.
The problems of accountability, transparency and privacy are social, but are also based on the fundamental centralized architecture underpinning those services. To address them, the DECODE project brings together technical, legal, social experts from academia, alongside partners from local government and industry. Together they are tasked to develop architectures that are compatible with the social values of transparency, user and community control, and privacy.
The role of UCL Computer Science, as a partner, is to provide technical options into two key technical areas: (1) the scalability of secure decentralized distributed ledgers that can support millions or billions of users while providing high-integrity and transparency to operations; (2) mechanisms for protecting user privacy despite the decentralized and transparent infrastructure. The latter may seem like an oxymoron: how can transparency and privacy be reconciled? However, thanks to advances in modern cryptography, it is possible to ensure that operations were correctly performed on a ledger, without divulging private user data – a family of techniques known as zero-knowledge.
I am particularly proud of the UCL team we have put together that is associated with this project, and strengthens considerably our existing expertise in distributed ledgers.
I will be leading and coordinating the work. I have a long standing interest, and track record, in privacy enhancing technologies and peer-to-peer computing, as well as scalable distributed ledgers – such as the RSCoin currency proposal. Shehar Bano, an expert on systems and networking, has joined us as a post-doctoral researcher after completing her thesis at Cambridge. Alberto Sonnino will be doing his thesis on distributed ledgers and privacy, as well as hardware and IoT applications related to ledgers, after completing his MSc in Information Security at UCL last year. Mustafa Al-Bassam, is also associated with the project and works on high-integrity and scalable ledger technologies, after completing his degree at Kings College London – he is funded by the Turing Institute to work on such technologies. Those join our wider team of UCL CS faculty, with research interests in distributed ledgers, including Sarah Meiklejohn, Nicolas Courtois and Tomaso Aste and their respective teams.
In January 2009, Bitcoin was released into the world by its pseudonymous founder, Satoshi Nakamoto. In the ensuing years, this cryptocurrency and its underlying technology, called the blockchain, have gone on a rollercoaster ride that few could have predicted at the time of its deployment. It’s been praised by governments around the world, and people have predicted that “the blockchain” will one day be like “the Internet.” It’s been banned by governments around the world, and people have declared it “adrift” and “dead.”
After years in which discussions focused entirely on Bitcoin, people began to realize the more abstract potential of the blockchain, and “next-generation” platforms such as Ethereum, Steem, and Zcash were launched. More established companies also realized the value in the more abstract properties of the blockchain — resilience, integrity, etc. — and repurposed it for their particular industries to create an even wider class of technologies called distributed ledgers, and to form industrial consortia such as R3 and Hyperledger. These more general distributed ledgers can look, to varying degrees, quite unlike blockchains, and have a somewhat clearer (or at least different) path to adoption given their association with established partners in industry.
Amidst many unknowns, what is increasingly clear is that, even if they might not end up quite like “the Internet,” distributed ledgers — in one form or another — are here to stay. Nevertheless, a long path remains from where we are now to widespread adoption and there are many important decisions to be made that will affect the security and usability of any final product. In what follows, we present the top ten obstacles along this path, and highlight in some cases both the problem and what we as a community can do (and have been doing) to address them. By necessity, many interesting aspects of distributed ledgers, both in terms of problems and solutions, have been omitted, and the focus is largely technical in nature.
10. Usability: why use distributed ledgers?
The problem, in short. What do end users actually want from distributed ledgers, if anything? In other words, distributed ledgers are being discussed as the solution to problems in many industries, but what is it that the full public verifiability (or accountability, immutability, etc.) of distributed ledgers really maps to in terms of what end users want?
9. Governance: who makes the rules?
The problem, in short. The beauty of distributed ledgers is that no one entity gets to control the decisions made by the network; in Bitcoin, e.g., coins are generated or transferred from one party to another only if a majority of the peers in the network agree on the validity of this action. While this process becomes threatened if any one peer becomes too powerful, there is a larger question looming over the operation of these decentralized networks: who gets to decide which actions are valid in the first place? The truth is that all these networks operate according to a defined set of rules, and that “who makes the rules matters at least as much as who enforces them.”
In this process of making the rules, even the most decentralized networks turn out to be heavily centralized, as recent issues in cryptocurrency governance demonstrate. These increasingly common collapses threaten to harm the value of these cryptocurrencies, and reveal the issues associated with ad-hoc forms of governance. Thus, the problem is not just that we don’t know how to govern these technologies, but that — somewhat ironically — we need more transparency around how these structures operate and who is responsible for which aspects of governance.
8. Meaningful comparisons: which is better?
The problem. Bitcoin was the first cryptocurrency to be based on the architecture we now refer to as the blockchain, but it certainly isn’t the last; there are now thousands of alternative cryptocurrencies out there, each with its own unique selling point. Ethereum offers a more expressive scripting language and maintains state, Litecoin allows for faster block creation than Bitcoin, and each new ICO (Initial Coin Offering) promises a shiny feature of its own. Looking beyond blockchains, there are numerous proposals for cryptocurrencies based on consensus protocols other than proof-of-work and proposals in non-currency-related settings, such as Certificate Transparency, R3 Corda, and Hyperledger Fabric, that still fit under the broad umbrella of distributed ledgers.
Recently I worked on some research with colleagues at Boston University (Manuel Egele, William Koch) and University College London (Gianluca Stringhini) into defeating ransomware. The fruit of our labor, PayBreak published this year in ACM ASIACCS, is a novel proactive system against ransomware. It happens to work against the new global ransomware threat, WannaCry. WannaCry is infecting more than 230,000 computers in 150 countries demanding ransom payments in exchange for access to precious files. This attack has been cited as being unprecidented, and the largest to date. Luckily, our research works against it.
PayBreak works by storing all the cryptographic material used during a ransomware attack. Modern ransomware uses what’s called a “hybrid cryptosystem”, meaning each ransomed file is encrypted using a different key, and each of those keys are then encrypted using another private key held by the ransomware authors. When ransomware attacks, PayBreak records the cryptographic keys used to encrypt each file, and securely stores them. When recovery is necessary, the victim retrieves the ransom keys, and iteratively decrypts each file.
Defeating WannaCry Ransomware
At this point, I think I’ve reverse engineered and researched something like 30 ransomware families, from over 1000 samples. Wannacry isn’t really much different than every other ransomware family. Those include other infamous families like Locky, CryptoWall, CryptoLocker, and TeslaLocker.
They all pretty much work the same way, including Wannacry. Actually, this comic sums up the ransom process the best I’ve seen. Every successful family today encrypts each file for ransom with a new unique “session” key, and encrypts each session key with a “private” ransom key. Those session keys are generated on the host machine. This is where PayBreak shims the generation, and usage of those keys, and saves them. Meaning, the encryption of those session keys by the ransomware’s private key is pointless, and defeated.
The PayBreak system doesn’t rely on any specific algorithm, or cryptographic library to be used by ransomware. Actually, Wannacry implemented, or atleast, statically compiled its own AES-128-CBC function. PayBreak can be configured to hook arbitrary functions, including that custom AES function, and record the parameters, such as the key, passed to it. However, a simpler approach in this case was to hook the Windows secure pseudorandom number generator function, CryptGenRandom, which the ransomware (and most others) use to create new session keys per file, and save the output of the function calls.
Recovering files is simply testing each of the recorded session keys with the encrypted files, until a successful decryption. Decrypting my file system of ~1000 files took 94 minutes.
We held our annual ACE-CSR event in November 2016. The last talk was my inaugural lecture to full professor. I did not write the summary below myself, hence the use of third person, not because I now consider myself royalty! 🙂
In introducing Jens Groth, professor of cryptology, George Danezis, head of the information security group, commented that Groth’s work provided the only viable solutions to many of the hard privacy problems he himself was tackling. To most qualified engineers, he said, the concept of zero-knowledge proofs seems impossible: the idea is to show the properties of a secret without revealing them. A zero-knowledge proof could, for example, verify the result of a computation on some data without revealing the data itself. Most engineers believe that you must choose between integrity and confidentiality; Groth has proved this is not true. In addition, Danezis praised Groth’s work as highly creative, characterised by great mathematical depth and subtlety, and admired Groth’ willingness to speak his mind fearlessly even to government funders. Angela Sasse, head of the department, called Groth’s work “security tools we’re going to need for future generations”, and noted that simultaneously with these other accomplishments Groth helped put in place the foundation for the group as it is today.
Groth, jokingly opted to structure his talk around papers he’s had rejected to illustrate how hard it can be to publish innovative research. The concept of zero-knowledge proofs originated with a 1985 paper by Shafi Goldwasser, Silvio Micali, and Charles Rackoff. Zero-knowledge proofs have three characteristics: completeness (the prover can convince the verifier that the statement is true); soundness (the claiming prover cannot convince the verifier when the statement is false); and secrecy (no information other than the truth of the statement is leaked, even when the prover is interacting with a verifier who cheats). Groth illustrated the latter idea with a simple card trick: he asked an audience member to choose a card and then say whether the card was a heart or not. If the respondent shows all the cards that are not hearts, counting these proves that the selected card must be a heart without revealing what it is.
Zero-knowledge proofs can be extended to think about more complicated statements. Groth listed some examples:
Assert that a logical formula has an assignment to the variables that makes it true
Verify that a graph is Hamiltonian – that is, there is a path that touches each vertex exactly once
That a set of inputs into a Boolean circuit will produce an output of 1
Any statement of the general form U belongs to some NP-language L
Groth could see many possible applications for these proofs: signatures, encryption, electronic cash, electronic auctions, internet voting, multiparty computation, and verifiable cloud computing. His overall career has focused on building versatile and efficient proofs with the goal of moving them from being expensive and slow to being just a fraction of the cost of the task that’s being executed so that people would stop thinking about the cost and just toss them in as a standard part of any transaction.
The introduction of ZCash, and subsequentarticles explaining the technology, brought (slightly more) mainstream attention to a situation that people interested in blockchain technology have been aware of for quite a while: Bitcoin doesn’t provide users with a lot of anonymity.
Bitcoin instead provides a property referred to as pseudonymity (derived from pseudonym, meaning ‘false name’). Spending and receiving bitcoin can be done without transacting parties ever learning each other’s off-chain (real world) identities. Users have a private key with which to spend their funds, a public key so that they can receive funds, and an address where the funds are stored, on the blockchain. However, even if addresses are frequently changed, very revealing analysis can be performed on the information stored in the blockchain, as all transactions are stored in the clear.
For example, imagine if you worked at a small company and you were all paid in bitcoin – based on your other purchases, your colleagues could have a very easy time linking your on-chain addresses to you. This is more revealing than using a bank account, and we want cryptocurrencies to offer better properties than bank accounts in every way. Higher speed, lower cost and greater privacy of transfers are all essential.
ZCash is a whole different ball game. Instead of sending transactions in the clear, with the transaction value and sender and recipient address stored on the blockchain for all to see, ZCash transactions instead produce a zero-knowledge proof that the sender owns an amount of ZEC greater than or equal to the amount that they are trying to spend.
Put more simply, you can submit a proof that you have formed a transaction properly, rather than submitting the actual transaction to be stored forever on the blockchain. These proofs take around 40 seconds to generate, and the current supply of ZCash is a very limited 7977 ZEC. So I’m going to take you through some privacy enhancing methods that work on top of Ethereum, a cryptocurrency with approximately 15 second blocktimes and a current market cap of nearly $1 billion. If we do everything right, our anonymous transactions might even be mined before the ZCash proof has even finished generating.
If a lot of the words above meant nothing to you, here’s a blockchain/Bitcoin/Ethereum/smart contract primer. If you already have your blockchain basics down, feel free to skip to part 2.
Satoshi Nakamoto introduced the world to the proof-of-work blockchain, through the release of the bitcoin whitepaper in 2008, allowing users and interested parties to consider for the first time a trustless system, with which it is possible to securely transfer money to untrusted and unknown recipients. Since its launch, the success of bitcoin has motivated the creation of many other cryptocurrencies, both those built upon Bitcoin’s underlying structure, and those built entirely independently.
Cryptocurrencies are most simply described as ‘blockchains’ with a corresponding token or coin, with which you can create transactions that are then verified and stored in a block on the underlying blockchain. Put even more simply, they look a little like this:
Transactions in bitcoin are generally simple transfers of tokens from one account to another, and are stored in a block in the clear, with transaction value, sender, and recipient all available for any curious individual to view, analyse, or otherwise trace.
The blockchain has a consensus algorithm. This is essential for construction of one agreed-upon set of transactions, rather than multiple never-converging views of who currently owns which bitcoins.
The most common consensus algorithms in play at the moment are ‘proof of work’ algorithms, which require a computationally hard ‘puzzle’ to be solved by miners (a collection of competing participants) in order to make the cost of subverting or reversing transactions prohibitively expensive. Transactions are verified and mined as follows:
To really understand how the privacy enhancing protocol conceived and implemented during my Master’s thesis works, we need to go a little deeper into the inner workings of a blockchain platform called Ethereum.
Ethereum: Programmable Money
In response to the limitations of Bitcoin’s restricted scripting language (you can pretty much only transfer money from A to B with Bitcoin, for security reasons), the Ethereum platform was created, offering an (almost) Turing-complete distributed virtual machine atop the Ethereum blockchain, along with a currency called Ether. The increased scripting ability of the system enables developers to create ‘smart contracts’ on the blockchain, programs with rich functionality and the ability to operate on the blockchain state. The blockchain state records current ownership of money and of the local, persistent storage offered by Ethereum. Smart contracts are limited only by the amount of gas they consume. Gas is a sub-currency of the Ethereum system, existing to impose a limit on the amount of computational time an individual contract can use.
Ethereum accounts take one of two forms – either they are ‘externally owned’ accounts, controlled with a private key (like all accounts in the bitcoin system), or ‘contracts’, controlled by the code that resides in the specific address in question. Contracts have immutable code stored at the contract address, and additional storage which can be read from and written to by the contract. An Ethereum transaction contains the destination address, optional data, the gas limit, the sequence number and signature authorising the transaction. If the destination address corresponds to a contract, the contract code is then executed, subject to the gas limit as specified in the transaction, which is used to allow a certain number of computational steps before halting.
The ability to form smart contracts suggests some quite specific methods of addressing the lack of privacy and corresponding potential lack of fungibility of coins in the Ethereum system. Although cryptocurrencies provide some privacy with the absence of identity related checks required to buy, mine, or spend coins, the full transaction history is public, enabling any motivated individual to track and link users’ purchases. This concept heavily decreases the fungibility of cryptocurrencies, allowing very revealing taint analysis of coins to be performed, and leading to suggestions of blacklisting coins which were once flagged as stolen.
With smart contracts, we can do magical things, starting in part 2