Blockchains and Why We Need Privacy

The introduction of ZCash, and subsequent articles explaining the technology, brought (slightly more) mainstream attention to a situation that people interested in blockchain technology have been aware of for quite a while: Bitcoin doesn’t provide users with a lot of anonymity.

Bitcoin instead provides a property referred to as pseudonymity (derived from pseudonym, meaning ‘false name’). Spending and receiving bitcoin can be done without transacting parties ever learning each other’s off-chain (real world) identities. Users have a private key with which to spend their funds, a public key so that they can receive funds, and an address where the funds are stored, on the blockchain. However, even if addresses are frequently changed, very revealing analysis can be performed on the information stored in the blockchain, as all transactions are stored in the clear.

For example, imagine if you worked at a small company and you were all paid in bitcoin – based on your other purchases, your colleagues could have a very easy time linking your on-chain addresses to you. This is more revealing than using a bank account, and we want cryptocurrencies to offer better properties than bank accounts in every way. Higher speed, lower cost and greater privacy of transfers are all essential.

ZCash is a whole different ball game. Instead of sending transactions in the clear, with the transaction value and sender and recipient address stored on the blockchain for all to see, ZCash transactions instead produce a zero-knowledge proof that the sender owns an amount of ZEC greater than or equal to the amount that they are trying to spend.

Put more simply, you can submit a proof that you have formed a transaction properly, rather than submitting the actual transaction to be stored forever on the blockchain. These proofs take around 40 seconds to generate, and the current supply of ZCash is a very limited 7977 ZEC. So I’m going to take you through some privacy enhancing methods that work on top of Ethereum, a cryptocurrency with approximately 15 second blocktimes and a current market cap of nearly $1 billion. If we do everything right, our anonymous transactions might even be mined before the ZCash proof has even finished generating.

Statistics from Zchain
Currently 229/(7748+229)=2.87% of ZEC is stored in ‘shielded’ accounts, the rest (97.13%) is in a state with Bitcoin-like privacy properties. Stats and graph: Zchain.

Blockchains

If a lot of the words above meant nothing to you, here’s a blockchain/Bitcoin/Ethereum/smart contract primer. If you already have your blockchain basics down, feel free to skip to part 2.

Bitcoin

Satoshi Nakamoto introduced the world to the proof-of-work blockchain, through the release of the bitcoin whitepaper in 2008, allowing users and interested parties to consider for the first time a trustless system, with which it is possible to securely transfer money to untrusted and unknown recipients. Since its launch, the success of bitcoin has motivated the creation of many other cryptocurrencies, both those built upon Bitcoin’s underlying structure, and those built entirely independently.

Cryptocurrencies are most simply described as ‘blockchains’ with a corresponding token or coin, with which you can create transactions that are then verified and stored in a block on the underlying blockchain. Put even more simply, they look a little like this:

detailedblockchain

Transactions in bitcoin are generally simple transfers of tokens from one account to another, and are stored in a block in the clear, with transaction value, sender, and recipient all available for any curious individual to view, analyse, or otherwise trace.

The blockchain has a consensus algorithm. This is essential for construction of one agreed-upon set of transactions, rather than multiple never-converging views of who currently owns which bitcoins.

diverge

The most common consensus algorithms in play at the moment are ‘proof of work’ algorithms, which require a computationally hard ‘puzzle’ to be solved by miners (a collection of competing participants) in order to make the cost of subverting or reversing transactions prohibitively expensive. Transactions are verified and mined as follows:

animine

To really understand how the privacy enhancing protocol conceived and implemented during my Master’s thesis works, we need to go a little deeper into the inner workings of a blockchain platform called Ethereum.

Ethereum: Programmable Money

In response to the limitations of Bitcoin’s restricted scripting language (you can pretty much only transfer money from A to B with Bitcoin, for security reasons), the Ethereum platform was created, offering an (almost) Turing-complete distributed virtual machine atop the Ethereum blockchain, along with a currency called Ether. The increased scripting ability of the system enables developers to create ‘smart contracts’ on the blockchain, programs with rich functionality and the ability to operate on the blockchain state. The blockchain state records current ownership of money and of the local, persistent storage offered by Ethereum. Smart contracts are limited only by the amount of gas they consume. Gas is a sub-currency of the Ethereum system, existing to impose a limit on the amount of computational time an individual contract can use.

The arrows along the top show how to produce each piece of data from the previous. The labels on the bottom arrows are 'known hard problems', which cannot feasibly be solved with today's computing power.
The arrows along the top show how to produce each piece of data from the previous. The labels on the bottom arrows are ‘known hard problems’, which cannot feasibly be solved with today’s computing power.

Ethereum accounts take one of two forms – either they are ‘externally owned’ accounts, controlled with a private key (like all accounts in the bitcoin system), or ‘contracts’, controlled by the code that resides in the specific address in question. Contracts have immutable code stored at the contract address, and additional storage which can be read from and written to by the contract. An Ethereum transaction contains the destination address, optional data, the gas limit, the sequence number and signature authorising the transaction. If the destination address corresponds to a contract, the contract code is then executed, subject to the gas limit as specified in the transaction, which is used to allow a certain number of computational steps before halting.

The ability to form smart contracts suggests some quite specific methods of addressing the lack of privacy and corresponding potential lack of fungibility of coins in the Ethereum system. Although cryptocurrencies provide some privacy with the absence of identity related checks required to buy, mine, or spend coins, the full transaction history is public, enabling any motivated individual to track and link users’ purchases. This concept heavily decreases the fungibility of cryptocurrencies, allowing very revealing taint analysis of coins to be performed, and leading to suggestions of blacklisting coins which were once flagged as stolen.

With smart contracts, we can do magical things, starting in part 2

Microsoft Ireland: winning the battle for privacy but losing the war

On Thursday, Microsoft won an important federal appeals court case against the US government. The case centres on a warrant issued in December 2013, requiring Microsoft to disclose emails and other records for a particular msn.com email address which was related to a narcotics investigation. It transpired that these emails were stored in a Microsoft datacenter in Ireland, but the US government argued that, since Microsoft is a US company and can easily copy the data into the US, a US warrant would suffice. Microsoft argued that the proper way for the US government to obtain the data is through the Mutual Legal Assistance Treaty (MLAT) between the US and Ireland, where an Irish court would decide, according to Irish law, whether the data should be handed over to US authorities. Part of the US government’s objection to this approach was that the MLAT process is sometimes very slow, although though the Irish government has committed to consider any such request “expeditiously”.

The appeal court decision is an important victory for Microsoft (following two lower courts ruling against them) because they sell their european datacenters as giving their european customers confidence that their data will be subject to the more stringent european privacy laws. Microsoft’s case was understandably supported by other technology companies in the same position, as well as civil liberties organisations such as the Electronic Frontier Foundation in the US and the Open Rights Group in the UK. However, I have mixed opinions about the outcome: while probably the right decision in this case, the wider consequences could be detrimental to privacy.

Both sides of the case wanted to set a precedent (if not legally, at least in practice). The US government wanted US law to apply to data held by US companies, wherever in the world the data resides. Microsoft wanted the location of the data to imply which legal regime applied, and so their customers could be confident that their own country’s laws will be respected, provided Microsoft have a datacenter in their own country (or at least one with compatible laws). My concern is that this ruling will give false assurance to customers of US companies, because in other circumstances a different decision could quite easily be taken.

We know about this case because Microsoft chose to challenge it in court, and were able to do so. This is the first time Microsoft has challenged a US warrant for data stored in their Irish datacenter despite it being in operation for three years prior to the case. Had the email address been associated with a more serious crime, or the demand for emails accompanied by a gagging order, it may not have been challenged. Microsoft and other technology companies may still choose to accept, or may even be forced to accept, the applicability of future US warrants to data they control, regardless of the court decision last week. One extreme approach to compel this approach would be for the US to jail employees until their demands are complied with.

For this reason, I have argued that control over data is more important than where data resides. If a company does not have the technical capability to comply with an order, it is easier for them to defend their case, and so protects both the company’s customers and staff. Microsoft have taken precisely this approach for their new German datacenters, which will be operated by staff in Germany working for a German “data trustee” (Deutsche Telekom). In contrast to their Irish datacenter, Microsoft staff will be unable to access customer data, except with the permission of and oversight from the data trustee.

While the data trustee model resists information being obtained through improper legal means, a malicious employee could still break rules for personal gain, or the systems designed to process legal requests could be hacked into. With modern security techniques it is possible to do better. End-to-end encryption for instant messaging is one such example, because (if designed properly) the communications provider does not have access to messages they carry. A more sophisticated approach is “distributed consensus”, where a decision is only taken if a majority of participants agree. The consensus process is automated and enforced through cryptography, ensuring that rules are respected even if some participants are malicious. Critical decisions in the Tor network and in Bitcoin are taken this way. More generally, there is a growing recognition that purely legal or procedural mechanisms are insufficient to protect privacy. This is one of the common threads present in much of the research presented at the Privacy Enhancing Technologies Symposium, being held this week in Darmstadt: recognising that there will always be imperfections in software, people and procedures and showing that nevertheless individual’s privacy can still be protected.

Smart contracts beyond the age of innocence

Why have Bitcoin, with its distributed consistent ledger, and now Ethereum with its support for fully fledged “smart contracts,” captured the imagination of so many people, both within and beyond the tech industry? The promise to replace obscure stores of information and arcane contract rules – with their inefficient, ambiguous, and primitive human interpretations – with publicly visible decentralized ledgers reflects the growing technological zeitgeist in their guarantee that all participants would know and be able to foresee the consequences of both their own actions and the actions of all others. The precise specification of contracts as code, with clauses automatically executed depending on certain sets of events and permissible user actions, represents for some a true state of utopia.

Regardless of one’s views on the potential for distributed ledgers, one of the most notable innovations that smart contracts have enabled thus far is the idea of a DAO (Decentralized Autonomous Organization), which is a specific type of investment contract, by which members individually contribute value that then gets collectively invested under some governance model.  In truly transparent fashion, the details of this governance model, including who can vote and how many votes are required for a successful proposal, are all encoded in a smart contract that is published (and thus globally visible) on the distributed ledger.

Today, this vision met a serious stumbling block: a “bug” in the contract of the first majorly successful DAO (which broke records by raising 11 million ether, the equivalent of 150 million USD, in its first two weeks of operation) allowed third parties to start draining its funds, and to eventually make off with 4% of all ether. The immediate response of the Ethereum and DAO community was to suspend activity – seemingly an anathema for a ledger designed to provide high resiliency and availability – and propose two potential solutions: a “soft-fork” that would impose additional rules on miners in order to exclude all future transactions that try to use the stolen ether, or, more drastically (and running directly contrary to the immutability of the ledger),  a “hard-fork” that would roll back the transactions in which the attack took place, in addition to the many legitimate transactions that took place concurrently.  Interestingly, a variant of the bug that enabled the hack was known to and dismissed by the creators of the DAO (and the wider Ethereum community).

While some may be surprised by this series of events, Maurice Wilkes, designer of the EDSAC, one of the first computers, reflected that “[…] the realization came over me with full force that a good part of the remainder of my life was going to be spent in finding errors in my own programs.” It is not the case that because a program is precisely defined it is easy to foresee what it will do once executed on its own under the control of users.  In fact, Rice’s theorem explicitly states that it is not possible in general to show that the result of programs, and thus smart contracts, will have any specific non-trivial property.

This forms the basis on which modern verification techniques operate: they try to define subsets of programs for which it is possible to prove some properties (e.g., through typing), or attempt to prove properties in a post-hoc way (e.g., through verification), but under the understanding that they may fail in general.  There is thus no scientific basis on which one can assert generally that smart contracts can easily provide clarity into and foresight of their consequences.

The unfolding story of the DAO and its consequences for the Ethereum community offers two interesting insights. First, as a sign that the field is maturing, there is an explicit call for understanding the computational space of safe contracts, and contracts with foreseeable consequences. Second, it suggests the need for smart contracts protecting significant assets to include external, possibly social, mechanisms in order to unlock significant value transfers. The willingness of exchanges to suspend trading and of the Ethereum developers to suggest a hard-fork is a last-resort example of such a social mechanism. Thus, politics – the discipline of collective management – reasserts itself as having primacy over human affairs.

Come work with us!

I’m very pleased to announce that — along with George Danezis and Tomaso Aste, head of our Financial Computing group — I’ve been awarded a grant to continue our work on distributed ledgers (aka “blockchain-like things”) for the next three years.

Our group has already done a lot of research in this space, including George’s and my recent paper on centrally banked cryptocurrencies (at NDSS 2016) and Jens’ paper (along with Markulf Kohlweiss, a frequent UCL collaborator) on efficient ring signatures and applications to Zerocoin-style cryptocurrencies (at Eurocrypt 2015).  It’s great to have this opportunity to further investigate the challenges in this space and develop our vision for the future of these technologies, so big thanks to the EPSRC!

Anyway, the point of this post is to advertise, as part of this grant, three positions for postdoctoral researchers.  We are also seeking collaboration with any industrial partners investigating the potential usage of distributed ledgers, and in particular ones looking at the application of these ledgers across the following settings (or with a whole new setting in mind!):

  • Identity management. How can identities be stored, shared, and issued in a way that preserves privacy, prevents theft and fraud, and allows for informal forms of identity in places where no formal ones exist?
  • Supply chain transparency. How can supply chain information be stored in a way that proves integrity, preserves the privacy of individual actors, and can be presented to the end customer in a productive way?
  • Financial settlement. How can banking information be stored in a way that allows banks to easily perform gross settlement, reduces the burden on a central bank, and enables auditability of the proper functioning of the system?
  • Administration of benefits. How can benefits be administered to and used by disadvantaged populations in a way that preserves privacy, provides useful visibility into their spending, and protects against potential abuses of the system?

We expect the postdoctoral researchers to work with us and with each other on the many exciting problems in this space, which are spread across cryptography, computer and network security, behavioural economics, distributed systems, usable security, human-computer interaction, and software engineering (just to name a few!).  I encourage anyone interested to reach out to me (Sarah) to discuss this further, whether or not they’ve already done research on the particular topic of distributed ledgers.

That’s all for now, but please get in touch with me if you have any questions, and in the years to come I hope to invite many people to come work with us in London and to announce the various outcomes of this exciting project!

Bitcoin workshop at Financial Crypto 2016

On 26 February 2016 the 3rd workshop on Bitcoin and Blockchain Research in association with Financial Cryptography 2016 took place in Barbados. This workshop aims to bring together researchers interested in cryptocurrencies to present their latest work and discuss together the future of Bitcoin. The program chairs were Sarah Meiklejohn from University College London and Jeremy Clark from Concordia University. The themes addressed during the workshop included blockchain architecture, anonymity, and proof of work alternatives. This event was also a great way for researchers with similar interests to network and share their ideas.

The workshop consisted of 2 keynotes and 4 plenary sessions: Bitcoin network analysis, Enhancing Bitcoin, Ethereum, and Blockchain Architecture.

Nathaniel Popper kicked off the day with a keynote presentation. Nathaniel is a journalist from the New York Times and author of the book ‘Digital Gold: The Untold story of Bitcoin’. He went on to speak about the history of Bitcoin covering Silk Road, Mt Gox, as well as the role of governments.

Then the first session, about Bitcoin network analysis, included two talks. The first one, Stressing Out: Bitcoin Stress Testing, by Khaled Baqer et al., was about DoS attack on Bitcoin, and was presented by Ross Anderson due to visa issues. The second one was Why buy when you can rent? Bribery attacks on Bitcoin-style consensus, by Joseph Bonneau on bribery attacks and cloud mining.

The next session, Enhancing Bitcoin, started with a talk by Ethan Heilman, Blindly Signed Contracts: Anonymous On-Blockchain and Off-Blockchain Bitcoin Transactions, on how to enhance Bitcoin anonymity. Then Mathieu Turuani gave a talk on Automated Verification of Electrum wallet, followed by Aggelos Kiayias on Proof of Proof of Work. Today many light-weight clients use SPV verification instead of full verification. Is it possible to have an even lighter verification? They introduce a modification of the Bitcoin blockchain protocol with sublinear complexity in the length of the chain.

Continue reading Bitcoin workshop at Financial Crypto 2016

Our contributions to the UK Distributed Ledger Technology report

The UK Government Office for Science, has published its report on “Distributed ledger technology: beyond block chain” to which UCL’s Sarah Meiklejohn, Angela Sasse and myself (George Danezis) contributed parts of the security and privacy material. The review, looks largely at economic, innovation and social aspects of these technologies. Our part discusses potential threats to ledgers, as well as opportunities to build robust security systems using ledgers (Certificate Transparency & CONIKS), and overcome privacy challenges, including a mention of the z.cash technology.

You can listen to the podcast interview Sarah gave on the report’s use cases, recommendations, but also more broadly future research directions for distributed ledgers, such as better privacy protection.

In terms of recommendation, I personally welcome the call for the Government Digital Services, and other innovation bodies to building capacity around distributed ledger technologies. The call for more research for efficient and secure ledgers (and the specific mention of cryptography research) is also a good idea, and an obvious need. When it comes to the specific security and privacy recommendation, it simply calls for standards to be established and followed. Sadly this is mildly vague: a standards based approach to designing secure and privacy-friendly systems has not led to major successes. Instead openness in the design, a clear focus on key end-to-end security properties, and the involvement of a wide community of experts might be more productive (and less susceptible to subversion).

The report is well timed: our paper on “Centrally Banked Crypto-Currencies” will be presented in February at a leading security conference, NDSS 2016, by Sarah Meiklejohn, largely inspired by the research agenda published by the Bank of England. It provides some answers to the problems of scalability and eco-friendliness of current proof-of-work based ledger design.

Sarah Meiklejohn – Security and Cryptography

Sarah Meiklejohn As a child, Sarah Meiklejohn thought she might become a linguist, largely because she was so strongly interested in the work being done to decode the ancient Greek writing systems Linear A and Linear B.

“I loved all that stuff,” she says. “And then I started doing mathematics.” At that point, with the help of Simon Singh’s The Code Book, she realised the attraction was codebreaking rather than human languages themselves. Simultaneously, security and privacy were increasingly in the spotlight.

“I’m a very private person, and so privacy is near and dear to my heart,” she says. “It’s an important right that a lot of people don’t seem interested in exercising, but it’s still a right. Even if no one voted we would still agree that it was important for people to be able to vote.”

It was during her undergraduate years at Brown, which included a fifth-year Masters degree, that she made the transition from mathematics to cryptography and began studying computer science. She went on to do her PhD at the University of California at San Diego. Her appointment at UCL, which is shared between the Department of Computer Science and the Department of Crime Science, is her first job.

Probably her best-known work is A Fistful of Bitcoins: Characterizing Payments Among Men with No Names (PDF), written with Marjori Pomarole, Grant Jordan, Kirill Levchenko, Damon McCoy, Geoffrey M. Voelker, and Stefan Savage and presented at USENIX 2013, which studied the question of how much anonymity bitcoin really provides.

“The main thing I was trying to focus on in that paper is what bitcoin is used for,” she says. The work began with buying some bitcoin (in 2012, at about £3 each), and performing some transactions with them over a period of months. Using the data collected this way allowed her to uncover some “ground truth” data.

“We developed these clustering techniques to get down to single users and owners.” The result was that they could identify which addresses belonged to which exchanges and enabled them to get a view of what was going on in the network. “So we could say this many bitcoins passed through this exchange per month, or how many were going to underground services like Silk Road.”

Continue reading Sarah Meiklejohn – Security and Cryptography