Chainspace: A Sharded Smart Contracts Platform

Thanks to their resilience, integrity, and transparency properties, blockchains have gained much traction recently, with applications ranging from banking and energy sector to legal contracts and healthcare. Blockchains initially received attention as Bitcoin’s underlying technology. But for all its success as a popular cryptocurrency, Bitcoin suffers from scalability issues: with a current block size of 1MB and 10 minute inter-block interval, its throughput is capped at about 7 transactions per second, and a client that creates a transaction has to wait for about 10 minutes to confirm that it has been added to the blockchain. This is several orders of magnitude slower that what mainstream payment processing companies like Visa currently offer: transactions are confirmed within a few seconds, and have ahigh throughput of 2,000 transactions per second on average, peaking up to 56,000 transactions per second. A reparametrization of Bitcoin can somewhat assuage these issues, increasing throughput to to 27 transactions per second and 12 second latency. Smart contract platforms, such as Ethereum inherit those scalability limitations. More significant improvements, however, call for a fundamental redesign of the blockchain paradigm.

This week we published a pre-print of our new Chainspace system—a distributed ledger platform for high-integrity and transparent processing of transactions within a decentralized system. Chainspace uses smart contracts to offer extensibility, rather than catering to specific applications such as Bitcoin for a currency, or certificate transparency for certificate verification. Unlike Ethereum, Chainspace’s sharded architecture allows for a ledger linearly scalable since only the nodes concerned with the transaction have to process it. Our modest testbed of 60 cores achieves 350 transactions per second. In comparison, Bitcoin achieves a peak rate of less than 7 transactions per second for over 6k full nodes, and Ethereum currently processes 4 transactions per second (of a theoretical maximum of 25). Moreover, Chainspace is agnostic to the smart contract language, or identity infrastructure, and supports privacy features through modern zero-knowledge techniques. We have released the Chainspace whitepaper, and the code is available as an open-source project on GitHub.

System Overview

The figure above illustrates the system design of Chainspace. Chainspace is comprised of a network of infrastructure nodes that manage valid objects and ensure that only valid transactions on those objects are committed.  Let’s look at the data model of Chainspace first. An object represents a unit of data in the Chainspace system (e.g., a bank account), and is in one of the following three states: active (can be used by a transaction), locked (is being processed by an existing transaction), or inactive (was used by a previous transaction).  Objects also have a type that determines the unique identifier of the smart contract that defines them. Smart contract procedures can operate on active objects only, while inactive objects are retained just for the purposes of audit. Chainspace allows composition of smart contracts from different authors to provide ecosystem features. Each smart contract is associated with a checker to enable private processing of transactions on infrastructure nodes since checkers do not take any secret local parameters. Checkers are pure functions (i.e., deterministic, and have no side-effects) that return a boolean value.

Now, a valid transaction accepts active input objects along with other ancillary information, and generates output objects (e.g., transfers money to another bank account). To achieve high transaction throughput and low latency, Chainspace organizes nodes into shards that manage the state of objects, keep track of their validity, and record transactions aborted or committed. We implemented this using Sharded Byzantine Atomic Commit (S-BAC)—a protocol that composes existing Byzantine Fault Tolerant (BFT) agreement and atomic commit primitives in a novel way. Here is how the protocol works:

  • Intra-shard agreement. Within each shard, all honest nodes ensure that they consistently agree on accepting or rejecting a transaction.
  • Inter-shard agreement. Across shards, nodes must ensure that transactions are committed if all shards are willing to commit the transaction, and rejected (or aborted) if any shards decide to abort the transaction.

Consensus on committing (or aborting) transactions takes place in parallel across different shards. A nice property of S-BAC’s atomic commit protocol is that the entire shard—rather than a third party—acts as a coordinator. This is in contrast to other sharding-based systems with cryptocurrency application like OmniLedger or RSCoin where an untrusted client acts as the coordinator, and is incentivized to act honestly. Such incentives do not hold for a generalized platform like Chainspace where objects may have shared ownership.

Continue reading Chainspace: A Sharded Smart Contracts Platform

The end of the billion-user Password:Impossible

XKCD: “Password Strength”

This week, the Wall Street Journal published an article by Robert McMillan containing an apology from Bill Burr, a man whose name is unknown to most but whose work has caused daily frustration and wasted time for probably hundreds of millions of people for nearly 15 years. Burr is the author of the 2003 Special Publication 800-63. Appendix A from the US National Institute of Standards and Technology: eight pages that advised security administrators to require complex passwords including special characters, capital letters, and numbers, and dictate that they should be frequently changed.

“Much of what I did I now regret,” Burr told the Journal. In June, when NIST issued a completely rewritten document, it largely followed the same lines as the NCSCs password guidance, published in 2015 and based on prior research and collaboration with the UK Research Institute in Science of Cyber Security (RISCS), led from UCL by Professor Angela Sasse. Yet even in 2003 there was evidence that Burr’s approach was the wrong one: in 1999, Sasse did the first work pointing out the user-unfriendliness of standard password policies in the paper Users Are Not the Enemy, written with Anne Adams.

How much did that error cost in lost productivity and user frustration? Why did it take the security industry and research community 15 years to listen to users and admit that the password policies they were pushing were not only wrong but actively harmful, inflicting pain on millions of users and costing organisations huge sums in lost productivity and administration? How many other badly designed security measures are still out there, the cyber equivalent of traffic congestion and causing the same scale of damage?

For decades, every password breach has led to the same response, which Einstein would readily have recognised as insanity: ridiculing users for using weak passwords, creating policies that were even more difficult to follow, and calling users “stupid” for devising coping strategies to manage the burden. As Sasse, Brostoff, and Weirich wrote in 2001 in their paper Transforming the ‘Weakest Link’, “…simply blaming users will not lead to more effective security systems”. In his 2009 paper So Long, and No Thanks for the Externalities, Cormac Herley (Microsoft Research) pointed out that it’s often quite rational for users to reject security advice that ignores the indirect costs of the effort required to implement it: “It makes little sense to burden all users with a daily task to spare 0.01% of them a modest annual pain,” he wrote.

When GCHQ introduced the new password guidance, NCSC head Ciaran Martin noted the cognitive impossibility of following older policies, which he compared to trying to memorise a new 600-digit number every month. Part of the basis for Martin’s comments is found in more of Herley’s research. In Password Portfolios and the Finite-Effort User, Herley, Dinei Florencio, and Paul C. van Oorschot found that the cognitive load of managing 100 passwords while following the standard advice to use a unique random string for every password is equivalent to memorising 1,361 places of pi or the ordering of 17 packs of cards – a cognitive impossibility. “No one does this”, Herley said in presenting his research at a RISCS meeting in 2014.

The first of the three questions we started with may be the easiest to answer. Sasse’s research has found that in numerous organisations each staff member may spend as much as 30 minutes a day on entering, creating, and recovering passwords, all of it lost productivity. The US company Imprivata claims its system can save clinicians up to 45 minutes per day just in authentication; in that use case, the wasted time represents not just lost profit but potentially lost lives.

Add the cost of disruption. In a 2014 NIST diary study, Sasse, with Michelle Steves, Dana Chisnell, Kat Krol, Mary Theofanos, and Hannah Wald, found that up to 40% of the time leading up to the “friction point” – that is, the interruption for authentication – is spent redoing the primary task before users can find their place and resume work. The study’s participants recorded on average 23 authentication events over the 24-hour period covered by the study, and in interviews they indicated their frustration with the number, frequency, and cognitive load of these tasks, which the study’s authors dubbed “authentication fatigue”. Dana Chisnell has summarised this study in a video clip.

The NIST study identified a more subtle, hidden opportunity cost of this disruption: staff reorganise their primary tasks to minimise exposure to authentication, typically by batching the tasks that require it. This is a similar strategy to deciding to confine dealing with phone calls to certain times of day, and it has similar consequences. While it optimises that particular staff member’s time, it delays any dependent business process that is designed in the expectation of a continuous flow from primary tasks. Batching delays result not only in extra costs, but may lose customers, since slow responses may cause them to go elsewhere. In addition, staff reported not pursuing ideas for improvement or innovation because they couldn’t face the necessary discussions with security staff.

Unworkable security induces staff to circumvent it and make errors – which in turn lead to breaches, which have their own financial and reputational costs. Less obvious is the cost of lost staff goodwill for organisations that rely on free overtime – such as US government departments and agencies. The NIST study showed that this goodwill is dropping: staff log in less frequently from home, and some had even returned their agency-approved laptops and were refusing to log in from home or while travelling.

It could all have been so different as the web grew up over the last 20 years or so, because the problems and costs of password policies are not new or newly discovered. Sasse’s original 1999 research study was not requested by security administrators but by BT’s accountants, who balked when the help desk costs of password problems were tripling every year with no end in sight. Yet security people have continued to insist that users must adapt to their requirements instead of the other way around, even when the basis for their ideas is shown to be long out of date. For example, in a 2006 blog posting Purdue University professor Gene Spafford explained that the “best practice” (which he calls “infosec folk wisdom”) of regular password changes came from non-networked military mainframes in the 1970s – a far cry from today’s conditions.

Herley lists numerous other security technologies that are as much of a plague as old-style password practices: certificate error warnings, all of which are false positives; security warnings generally; and ambiguous and non-actionable advice, such as advising users not to click on “suspicious” links or attachments or “never” reusing passwords across accounts.

All of these are either not actionable, or just too difficult to put into practice, and the struggle to eliminate them has yet to bear fruit. Must this same story continue for another 20 years?

 

This article also appears on the Research Institute in Science of Cyber Security (RISCS) blog.

Creating scalable distributed ledgers for DECODE

Since the introduction of Bitcoin in 2008, blockchains have gone from a niche cryptographic novelty to a household name. Ethereum expanded the applicability of such technologies, beyond managing monetary value, to general computing with smart contracts. However, we have so far only scratched the surface of what can be done with such “Distributed Ledgers”.

The EU Horizon 2020 DECODE project aims to expand those technologies to support local economy initiatives, direct democracy, and decentralization of services, such as social networking, sharing economy, and discursive and participatory platforms. Today, these tend to be highly centralized in their architecture.

There is a fundamental contradiction between how modern services harness the work and resources of millions of users, and how they are technically implemented. The promise of the sharing economy is to coordinate people who want to provide resources with people who want to use them, for instance spare rooms in the case of Airbnb; rides in the case of Uber; spare couches of in the case of couchsurfing; and social interactions in the case of Facebook.

These services appear to be provided in a peer-to-peer, and disintermediated fashion. And, to some extent, they are less mediated at the application level thanks to their online nature. However, the technical underpinnings of those services are based on the extreme opposite design philosophy: all users technically mediate their interactions through a very centralized service, hosted on private data centres. The big internet service companies leverage their centralized position to extract value out of user or providers of services – becoming de facto monopolies in many case.

When it comes to privacy and security properties, those centralized services force users to trust them absolutely, and offer little on the way of transparency to even allow users to monitor the service practices to ground that trust. A recent example illustrating this problem was Uber, the ride sharing service, providing a different view to drivers and riders about the fare that was being paid for a ride – forcing drivers to compare what they receive with what riders pay to ensure they were getting a fair deal. Since Uber, like many other services, operate in a non-transparent manner, its functioning depends on users absolute to ensure fairness.

The lack of user control and transparency of modern online services goes beyond monetary and economic concerns. Recently, the Guardian has published the guidelines used by Facebook to moderate abusive or illegal user postings. While, moderation has a necessary social function, the exact boundaries of what constitutes abuse came into question: some forms of harms to children or holocaust denial were ignored, while material of artistic or political value has been suppressed.

Even more worryingly, the opaque algorithms being used to promote and propagate posts have been associated with creating a filter bubble effect, influencing elections, and dark adverts, only visible to particular users, are able to flout standards of fair political advertising. It is a fact of the 21st century that a key facet of the discursive process of democracy will take place on online social platforms. However, their centralized, opaque and advertising-driven form is incompatible with their function as a tool for democracy.

Finally, the revelations of Edward Snowden relating to mass surveillance, also illustrate how the technical centralization of services erodes privacy at an unprecedented scale. The NSA PRISM program coerced internet services to provide access to data on their services under a FISA warrant, not protecting the civil liberties of non-American persons. At the same time, the UPSTREAM program collected bulk information between data centres making all economic, social and political activities taking place on those services transparent to US authorities. While users struggle to understand how those services operate, governments (often foreign) have total visibility. This is a complete inversion of the principles of liberal democracy, where usually we would expect citizens to have their privacy protected, while those in position of authority and power are expected to be accountable.

The problems of accountability, transparency and privacy are social, but are also based on the fundamental centralized architecture underpinning those services. To address them, the DECODE project brings together technical, legal, social experts from academia, alongside partners from local government and industry. Together they are tasked to develop architectures that are compatible with the social values of transparency, user and community control, and privacy.

The role of UCL Computer Science, as a partner, is to provide technical options into two key technical areas: (1) the scalability of secure decentralized distributed ledgers that can support millions or billions of users while providing high-integrity and transparency to operations; (2) mechanisms for protecting user privacy despite the decentralized and transparent infrastructure. The latter may seem like an oxymoron: how can transparency and privacy be reconciled? However, thanks to advances in modern cryptography, it is possible to ensure that operations were correctly performed on a ledger, without divulging private user data – a family of techniques known as zero-knowledge.

I am particularly proud of the UCL team we have put together that is associated with this project, and strengthens considerably our existing expertise in distributed ledgers.

I will be leading and coordinating the work. I have a long standing interest, and track record, in privacy enhancing technologies and peer-to-peer computing, as well as scalable distributed ledgers – such as the RSCoin currency proposal. Shehar Bano, an expert on systems and networking, has joined us as a post-doctoral researcher after completing her thesis at Cambridge. Alberto Sonnino will be doing his thesis on distributed ledgers and privacy, as well as hardware and IoT applications related to ledgers, after completing his MSc in Information Security at UCL last year. Mustafa Al-Bassam, is also associated with the project and works on high-integrity and scalable ledger technologies, after completing his degree at Kings College London – he is funded by the Turing Institute to work on such technologies. Those join our wider team of UCL CS faculty, with research interests in distributed ledgers, including Sarah Meiklejohn, Nicolas Courtois and Tomaso Aste and their respective teams.

 

This post also appears on the DECODE project blog.