Battery Status readout as a privacy risk

Privacy risks and threats arise even in seemingly innocuous mechanisms. It is a fairly regular issue.

Over a year ago, I was researching the risk of the W3C Battery Status API. The mechanism allows a web site to read the battery level of a device (smartphone, laptop, etc.). One of the positive use cases may be, for example, stopping the execution of intensive operations if the battery is running low.

Our privacy analysis of Battery Status API revealed interesting results.

Privacy analysis of Battery API

The battery status provides the following information:

  • the current level of battery (format: 0.0–1.0, for empty and full battery, respectively)
  • time to a full discharge of battery (in seconds)
  • time to a full charge of battery, if connected to a charger (in seconds)

These items are updated whenever a new value is supplied by the operating system

It turns out that privacy risks may surface even in this kind of – seemingly innocuous – data and access mechanisms.

Frequency of changes

The frequency of changes in the reported readouts from Battery Status API potentially allow the monitoring of users’ computer use habits; for example, potentially enabling analyzing of how frequently the user’s device is under heavy use. This could lead to behavioral analysis.

Additionally, identical installations of computer deployments in standard environments (e.g. at schools, work offices, etc.) are often are behind a NAT. In simple terms, NAT allows a number of users to browse the Internet with an – externally seen – single IP address. The ability of observing any differences between otherwise identical computer installations – potentially allows particular users to be identified (and targeted?).

Battery readouts as identifiers

The information provided by the Battery Status API is not always subject to rapid changes. In other words, this information may be static for a period of time; this in turn may give rise to a short-lived identifier. The situation gets especially interesting when we consider a scenario of users sometimes clearing standard web identifiers (such as cookies). In such a case, a web script could potentially analyse identifiers provided by Battery Status API, and this information then could possibly even lead to re-creation of other identifiers. A simple sketch follows.

Continue reading Battery Status readout as a privacy risk

Steven Murdoch – Privacy and Financial Security

Probably not too many academic researchers can say this: some of Steven Murdoch’s research leads have arrived in unmarked envelopes. Murdoch, who has moved to UCL from the University of Cambridge, works primarily in the areas of privacy and financial security, including a rare specialty you might call “crypto for the masses”. It’s the financial security aspect that produces the plain, brown envelopes and also what may be his most satisfying work, “Trying to help individuals when they’re having trouble with huge organisations”.

Murdoch’s work has a twist: “Usability is a security requirement,” he says. As a result, besides writing research papers and appearing as an expert witness, his past includes a successful start-up. Cronto, which developed a usable authentication device, was acquired by VASCO, a market leader in authentication and is now used by banks such as Commerzbank and Rabobank.

Developing the Cronto product was, he says, an iterative process that relied on real-world testing: “In research into privacy, if you build unusable system two things will go wrong,” he says. “One, people won’t use it, so there’s a smaller crowd to hide in.” This issue affects anonymising technologies such as Mixmaster and Mixminion. “In theory they have better security than Tor but no one is using them.” And two, he says, “People make mistakes.” A non-expert user of PGP, for example, can’t always accurately identify which parts of the message are signed and which aren’t.

The start-up experience taught Murdoch how difficult it is to get an idea from research prototype to product, not least because what works in a small case study may not when deployed at scale. “Selling privacy remains difficult,” he says, noting that Cronto had an easier time than some of its forerunners since the business model called for sales to large institutions. The biggest challenge, he says, was not consumer acceptance but making a convincing case that the predicted threats would materialise and that a small company could deliver an acceptable solution.

Continue reading Steven Murdoch – Privacy and Financial Security

Moving towards security and privacy experiments for the real world

Jono and I recently presented our joint paper with Simon and Angela at the Learning from Authoritative Security Experiment Results (LASER) Workshop in San Jose, CA, USA. The workshop was co-located with the IEEE Security and Privacy Symposium. LASER has a different focus each year; in 2016, presented papers explored new approaches to computer security experiments that are repeatable and can be shared across communities.

Through our LASER paper, “Towards robust experimental design for user studies in security and privacy”, we wanted to advance the quest for better experiment design and execution. We proposed the following five principles for conducting robust experiments into usable security and privacy:

  1. Give participants a primary task
  2. Ensure participants experience realistic risk
  3. Avoid priming the participants
  4. Perform experiments double-blind whenever possible
  5. Define these elements precisely: threat model; security; privacy and usability

Understanding users and their interaction with security is a blind spot for many security practitioners and designers. Learning from prior studies within and outside our research group, we have defined principles for conducting robust experiments into usable security and privacy. These principles are informed by efforts in other fields such as biology, qualitative research methods, and medicine, where four overarching experiment-design factors guided our principles:

Internal validity – The experiment is of “suitable scope to achieve the reported results” and is not “susceptible to systematic error”.

External validity – The result of the experiment “is not solely an artifact of the laboratory setting”.

Containment  – There are no “confounds” in the results, and no experimental “effects are a threat to safety” of the participants, the environment, or society generally.

Transparency – “There are no explanatory gaps in the experimental mechanism” and the explanatory “diagram for the experimental mechanism is complete”, in that it covers all relevant entities and activities.

Continue reading Moving towards security and privacy experiments for the real world

Analyzing privacy aspects of the W3C Vibration API

When making web standards, multiple scenarios possibly affecting privacy are considered. This includes even extreme ones; and this is a good thing. It’s best to predict the creative use and abuse of web features, before they are exploited.

Vibration API

The mechanism allowing websites to utilize a device’s vibration motor is called the Vibration API. The mechanism allows a device to be vibrated in particular patterns. The argument to the vibration() function is a list called a pattern. The list’s odd indices cause a vibration for a specific length of time, and even values are the still periods. For example, a web designer can make the device to vibrate for a specific duration, say 50 ms and follow that with a still period of 100 ms using the following call:

navigator.vibration([50,100])

In certain circumstances this can create several interesting potential privacy risks. Let’s look at the Vibration API from a privacy point of view. I will consider a number of scenarios on various technical levels.

Toy de-anonymisation scenario

One potential risk is the identification of a particular person in real life. Imagine several people in the same room placing their devices on a table. At some point, one person’s device vibrates in specific patterns. This individual might then become marked to a potential observer.

How could such a script be delivered? One possibility is though web advertising infrastructures. These offer capabilities of targeting individuals with a considerable accuracy (with respect to their location).

Continue reading Analyzing privacy aspects of the W3C Vibration API

Microsoft Ireland: winning the battle for privacy but losing the war

On Thursday, Microsoft won an important federal appeals court case against the US government. The case centres on a warrant issued in December 2013, requiring Microsoft to disclose emails and other records for a particular msn.com email address which was related to a narcotics investigation. It transpired that these emails were stored in a Microsoft datacenter in Ireland, but the US government argued that, since Microsoft is a US company and can easily copy the data into the US, a US warrant would suffice. Microsoft argued that the proper way for the US government to obtain the data is through the Mutual Legal Assistance Treaty (MLAT) between the US and Ireland, where an Irish court would decide, according to Irish law, whether the data should be handed over to US authorities. Part of the US government’s objection to this approach was that the MLAT process is sometimes very slow, although though the Irish government has committed to consider any such request “expeditiously”.

The appeal court decision is an important victory for Microsoft (following two lower courts ruling against them) because they sell their european datacenters as giving their european customers confidence that their data will be subject to the more stringent european privacy laws. Microsoft’s case was understandably supported by other technology companies in the same position, as well as civil liberties organisations such as the Electronic Frontier Foundation in the US and the Open Rights Group in the UK. However, I have mixed opinions about the outcome: while probably the right decision in this case, the wider consequences could be detrimental to privacy.

Both sides of the case wanted to set a precedent (if not legally, at least in practice). The US government wanted US law to apply to data held by US companies, wherever in the world the data resides. Microsoft wanted the location of the data to imply which legal regime applied, and so their customers could be confident that their own country’s laws will be respected, provided Microsoft have a datacenter in their own country (or at least one with compatible laws). My concern is that this ruling will give false assurance to customers of US companies, because in other circumstances a different decision could quite easily be taken.

We know about this case because Microsoft chose to challenge it in court, and were able to do so. This is the first time Microsoft has challenged a US warrant for data stored in their Irish datacenter despite it being in operation for three years prior to the case. Had the email address been associated with a more serious crime, or the demand for emails accompanied by a gagging order, it may not have been challenged. Microsoft and other technology companies may still choose to accept, or may even be forced to accept, the applicability of future US warrants to data they control, regardless of the court decision last week. One extreme approach to compel this approach would be for the US to jail employees until their demands are complied with.

For this reason, I have argued that control over data is more important than where data resides. If a company does not have the technical capability to comply with an order, it is easier for them to defend their case, and so protects both the company’s customers and staff. Microsoft have taken precisely this approach for their new German datacenters, which will be operated by staff in Germany working for a German “data trustee” (Deutsche Telekom). In contrast to their Irish datacenter, Microsoft staff will be unable to access customer data, except with the permission of and oversight from the data trustee.

While the data trustee model resists information being obtained through improper legal means, a malicious employee could still break rules for personal gain, or the systems designed to process legal requests could be hacked into. With modern security techniques it is possible to do better. End-to-end encryption for instant messaging is one such example, because (if designed properly) the communications provider does not have access to messages they carry. A more sophisticated approach is “distributed consensus”, where a decision is only taken if a majority of participants agree. The consensus process is automated and enforced through cryptography, ensuring that rules are respected even if some participants are malicious. Critical decisions in the Tor network and in Bitcoin are taken this way. More generally, there is a growing recognition that purely legal or procedural mechanisms are insufficient to protect privacy. This is one of the common threads present in much of the research presented at the Privacy Enhancing Technologies Symposium, being held this week in Darmstadt: recognising that there will always be imperfections in software, people and procedures and showing that nevertheless individual’s privacy can still be protected.

Exceptional access provisions in the Investigatory Powers Bill

The Investigatory Powers Bill, being debated in Parliament this week, proposes the first wide-scale update in 15 years to the surveillance powers of the UK law-enforcement and intelligence agencies.

The Bill has several goals: to consolidate some existing surveillance powers currently either scattered throughout other legislation or not even publicly disclosed, to create a wide range of new surveillance powers, and to change the process of authorisation and oversight surrounding the use of surveillance powers. The Bill is complex and, at 245 pages long, makes scrutiny challenging.

The Bill has had its first and second readings in the House of Commons, and has been examined by relevant committees in the Commons. The Bill will now be debated in the ‘report stage’, where MPs will have the chance to propose amendments following committee scrutiny. After this it will progress to a third reading, and then to the House of Lords for further debate, followed by final agreement by both Houses.

So far, four committee reports have been published examining the draft Bill, from the Intelligence and Security Committee of Parliament, the joint House of Lords/House of Commons committee specifically set up to examine the draft Bill, the House of Commons Science and Technology committee (to which I served as technical advisor) and the Joint Committee on Human Rights.

These committees were faced with a difficult task of meeting an accelerated timetable for the Bill, with the government aiming to have it become law by the end of 2016. The reason for the haste is that the Bill would re-instate and extend the ability of the government to compel companies to collect data about their users, even without there being any suspicion of wrongdoing, known as “data retention”. This power was previously set out in the EU Data Retention Directive, but in 2014 the European Court of Justice found it be unlawful.

Emergency legislation passed to temporarily permit the government to continue their activities will expire in December 2016 (but may be repealed earlier if an appeal to the European Court of Justice succeeds).

The four committees which examined the Bill together made 130 recommendations but since the draft was published, the government only slightly changed the Bill, and only a few minor amendments were accepted by the Public Bills committee.

Many questions remain about whether the powers granted by the Bill are justifiable and subject to adequate oversight, but where insights from computer security research are particularly relevant is on the powers to grant law enforcement the ability to bypass normal security mechanisms, sometimes termed “exceptional access”.

Continue reading Exceptional access provisions in the Investigatory Powers Bill

Come work with us!

I’m very pleased to announce that — along with George Danezis and Tomaso Aste, head of our Financial Computing group — I’ve been awarded a grant to continue our work on distributed ledgers (aka “blockchain-like things”) for the next three years.

Our group has already done a lot of research in this space, including George’s and my recent paper on centrally banked cryptocurrencies (at NDSS 2016) and Jens’ paper (along with Markulf Kohlweiss, a frequent UCL collaborator) on efficient ring signatures and applications to Zerocoin-style cryptocurrencies (at Eurocrypt 2015).  It’s great to have this opportunity to further investigate the challenges in this space and develop our vision for the future of these technologies, so big thanks to the EPSRC!

Anyway, the point of this post is to advertise, as part of this grant, three positions for postdoctoral researchers.  We are also seeking collaboration with any industrial partners investigating the potential usage of distributed ledgers, and in particular ones looking at the application of these ledgers across the following settings (or with a whole new setting in mind!):

  • Identity management. How can identities be stored, shared, and issued in a way that preserves privacy, prevents theft and fraud, and allows for informal forms of identity in places where no formal ones exist?
  • Supply chain transparency. How can supply chain information be stored in a way that proves integrity, preserves the privacy of individual actors, and can be presented to the end customer in a productive way?
  • Financial settlement. How can banking information be stored in a way that allows banks to easily perform gross settlement, reduces the burden on a central bank, and enables auditability of the proper functioning of the system?
  • Administration of benefits. How can benefits be administered to and used by disadvantaged populations in a way that preserves privacy, provides useful visibility into their spending, and protects against potential abuses of the system?

We expect the postdoctoral researchers to work with us and with each other on the many exciting problems in this space, which are spread across cryptography, computer and network security, behavioural economics, distributed systems, usable security, human-computer interaction, and software engineering (just to name a few!).  I encourage anyone interested to reach out to me (Sarah) to discuss this further, whether or not they’ve already done research on the particular topic of distributed ledgers.

That’s all for now, but please get in touch with me if you have any questions, and in the years to come I hope to invite many people to come work with us in London and to announce the various outcomes of this exciting project!

Privately gathering statistics and training simple models

Last week, Luca Melis has presented our NDSS16 paper “Efficient Private Statistics with Succinct Sketches“, where we show how to privately and efficiently aggregate data from many sources and/or large streams, and then use the aggregate to extract useful statistics and train simple machine learning models.

Our work is motivated by a few “real-world” problems:

  • Media broadcasting providers like the BBC (with which we collaborate) routinely collect data from their users about videos they have watched (e.g., on BBC’s iPlayer) in order to provide users with personalized suggestions for other videos, based on recommender systems like Item k-Nearest Neighbor (ItemKNN)
  • Urban and transport planning committees, such as London’s mass transport operators, need to gather statistics about people’s movements and commutes, e.g., to improve transportation services and predict near-future trends and anomalies on a short notice.
  • Network infrastructures like the Tor network need to gather traffic statistics, like the number of, and traffic generated by, Tor hidden services, in order to tune design decisions as well as convince their founders the infrastructure is used for the intended purposes.

While different in their application, these examples exhibit a common feature: the need for providers to aggregate large amounts of sensitive information from large numbers of data sources, in order to produce aggregate statistics and possibly train machine learning models.

Prior work has proposed a few cryptographic tools for privacy-enhanced computation that could be use for private collection of statistics. For instance, by relying on homomorphic encryption and/or secret sharing, an untrusted aggregator can receive encrypted readings from users and only decrypt their sum. However, these require users to perform a number of cryptographic operations, and transmit a number of ciphertexts, linear in the size of their inputs, which makes it impractical for the scenarios discussed above, whereby inputs to be aggregated are quite large. For instance, if we use ItemKNN for the recommendations, we would need to aggregate values for “co-views” (i.e., videos that have been watched by the same user) of hundreds of videos at the time – thus, each user would have to encrypt and transfer hundreds of thousands of values at the time.

Scaling private aggregation

We tackle the problem from two points of view: an “algorithmic” one and a “system” one. That is, we have worked both on the design of the necessary cryptographic and data structure tools, as well as on making it easy for application developers to easily support these tools in web and mobile applications.

Our intuition is that, in many scenarios, it might be enough to collect estimates of statistics and trade off an upper-bounded error with significant efficiency gains. For instance, the accuracy of a recommender system might not be really affected if the statistics we need to train the model are approximated with a small error.

Continue reading Privately gathering statistics and training simple models

“Do you see what I see?” ask Tor users, as a large number of websites reject them but accept non-Tor users

If you use an anonymity network such as Tor on a regular basis, you are probably familiar with various annoyances in your web browsing experience, ranging from pages saying “Access denied” to having to solve CAPTCHAs before continuing. Interestingly, these hurdles disappear if the same website is accessed without Tor. The growing trend of websites extending this kind of “differential treatment” to anonymous users undermines Tor’s overall utility, and adds a new dimension to the traditional threats to Tor (attacks on user privacy, or governments blocking access to Tor). There is plenty of anecdotal evidence about Tor users experiencing difficulties in browsing the web, for example the user-reported catalog of services blocking Tor. However, we don’t have sufficient detail about the problem to answer deeper questions like: how prevalent is differential treatment of Tor on the web; are there any centralized players with Tor-unfriendly policies that have a magnified effect on the browsing experience of Tor users; can we identify patterns in where these Tor-unfriendly websites are hosted (or located), and so forth.

Today we present our paper on this topic: “Do You See What I See? Differential Treatment of Anonymous Users” at the Network and Distributed System Security Symposium (NDSS). Together with researchers from the University of Cambridge, University College London, University of California, Berkeley and International Computer Science Institute (Berkeley), we conducted comprehensive network measurements to shed light on websites that block Tor. At the network layer, we scanned the entire IPv4 address space on port 80 from Tor exit nodes. At the application layer, we fetch the homepage from the most popular 1,000 websites (according to Alexa) from all Tor exit nodes. We compare these measurements with a baseline from non-Tor control measurements, and uncover significant evidence of Tor blocking. We estimate that at least 1.3 million IP addresses that would otherwise allow a TCP handshake on port 80 block the handshake if it originates from a Tor exit node. We also show that at least 3.67% of the most popular 1,000 websites block Tor users at the application layer.

Continue reading “Do you see what I see?” ask Tor users, as a large number of websites reject them but accept non-Tor users

Insecure by design: protocols for encrypted phone calls

The MIKEY-SAKKE protocol is being promoted by the UK government as a better way to secure phone calls. The reality is that MIKEY-SAKKE is designed to offer minimal security while allowing undetectable mass surveillance, through the introduction a backdoor based around mandatory key-escrow. This weakness has implications which go further than just the security of phone calls.

The current state of security for phone calls leaves a lot to be desired. Land-line calls are almost entirely unencrypted, and cellphone calls are also unencrypted except for the radio link between the handset and the phone network. While the latest cryptography standards for cellphones (3G and 4G) are reasonably strong it is possible to force a phone to fall back to older standards with easy-to-break cryptography, if any. The vast majority of phones will not reveal to their user whether such an attack is under way.

The only reason that eavesdropping on land-line calls is not commonplace is that getting access to the closed phone networks is not as easy compared to the more open Internet, and cellphone cryptography designers relied on the equipment necessary to intercept the radio link being only affordable by well-funded government intelligence agencies, and not by criminals or for corporate espionage. That might have been true in the past but it certainly no longer the case with the necessary equipment now available for $1,500. Governments, companies and individuals are increasingly looking for better security.

A second driver for better phone call encryption is the convergence of Internet and phone networks. The LTE (Long-Term Evolution) 4G cellphone standard – under development by the 3rd Generation Partnership Project (3GPP) – carries voice calls over IP packets, and desktop phones in companies are increasingly carrying voice over IP (VoIP) too. Because voice calls may travel over the Internet, whatever security was offered by the closed phone networks is gone and so other security mechanisms are needed.

Like Internet data encryption, voice encryption can broadly be categorised as either link encryption, where each intermediary may encrypt data before passing it onto the next, or end-to-end encryption, where communications are encrypted such that only the legitimate end-points can have access to the unencrypted communication. End-to-end encryption is preferable for security because it avoids intermediaries being able to eavesdrop on communications and gives the end-points assurance that communications will indeed be encrypted all the way to their other communication partner.

Current cellphone encryption standards are link encryption: the phone encrypts calls between it and the phone network using cryptographic keys stored on the Subscriber Identity Module (SIM). Within the phone network, encryption may also be present but the network provider still has access to unencrypted data, so even ignoring the vulnerability to fall-back attacks on the radio link, the network providers and their suppliers are weak points that are tempting for attackers to compromise. Recent examples of such attacks include the compromise of the phone networks of Vodafone in Greece (2004) and Belgacom in Belgium (2012), and the SIM card supplier Gemalto in France (2010). The identity of the Vodafone Greece hacker remains unknown (though the NSA is suspected) but the attacks against Belgacom and Gemalto were carried out by the UK signals intelligence agency – GCHQ – and only publicly revealed from the Snowden leaks, so it is quite possible there are others attacks which remain hidden.

Email is typically only secured by link encryption, if at all, with HTTPS encrypting access to most webmail and Transport Layer Security (TLS) sometimes encrypting other communication protocols that carry email (SMTP, IMAP and POP). Again, the fact that intermediaries have access to plaintext creates a vulnerability, as demonstrated by the 2009 hack of Google’s Gmail likely originating from China. End-to-end email encryption is possible using the OpenPGP or S/MIME protocols but their use is not common, primarily due to their poor usability, which in turn is at least partially a result of having to stay compatible with older insecure email standards.

In contrast, instant messaging applications had more opportunity to start with a clean-slate (because there is no expectation of compatibility among different networks) and so this is where much innovation in terms of end-to-end security has taken place. Secure voice communication however has had less attention than instant messaging so in the remainder of the article we shall examine what should be expected of a secure voice communication system, and in particular see how one of the latest and up-coming protocols, MIKEY-SAKKE, which comes with UK government backing, meets these criteria.

MIKEY-SAKKE and Secure Chorus

MIKEY-SAKKE is the security protocol behind the Secure Chorus voice (and also video) encryption standard, commissioned and designed by GCHQ through their information security arm, CESG. GCHQ have announced that they will only certify voice encryption products through their Commercial Product Assurance (CPA) security evaluation scheme if the product implements MIKEY-SAKKE and Secure Chorus. As a result, MIKEY-SAKKE has a monopoly over the vast majority of classified UK government voice communication and so companies developing secure voice communication systems must implement it in order to gain access to this market. GCHQ can also set requirements of what products are used in the public sector and as well as for companies operating critical national infrastructure.

UK government standards are also influential in guiding purchase decisions outside of government and we are already seeing MIKEY-SAKKE marketed commercially as “government-grade security” and capitalising on their approval for use in the UK government. For this reason, and also because GCHQ have provided implementers a free open source library to make it easier and cheaper to deploy Secure Chorus, we can expect wide use MIKEY-SAKKE in industry and possibly among the public. It is therefore important to consider whether MIKEY-SAKKE is appropriate for wide-scale use. For the reasons outlined in the remainder of this article, the answer is no – MIKEY-SAKKE is designed to offer minimal security while allowing undetectable mass surveillance though key-escrow, not to provide effective security.

Continue reading Insecure by design: protocols for encrypted phone calls