Measuring mobility without violating privacy – a case study of the London Underground

In the run-up to this year’s Privacy Enhancing Technologies Symposium (PETS 2019), I noticed some decidedly non-privacy-enhancing behaviour. Transport for London (TfL) announced they will be tracking the wifi MAC addresses of devices being carried on London Underground stations. Before storing a MAC address it will be hashed with a key, but since this key will remain unchanged for an extended period (2 years), it will be possible to track the movements of an individual over this period through this pseudonymous ID. These traces are likely enough to link records back to the individual with some knowledge of that person’s distinctive travel plans. Also, for as long as the key is retained it would be trivial for TfL (or someone who stole the key) to convert the someone’s MAC address into its pseudonymised form and indisputably learn that that person’s movements.

TfL argues that under the General Data Protection Regulations (GDPR), they don’t need the consent of individuals they monitor because they are acting in the public interest. Indeed, others have pointed out the value to society of knowing how people typically move through underground stations. But the GDPR also requires that organisations minimise the amount of personal data they collect. Could the same goal be achieved if TfL irreversibly anonymised wifi MAC addresses rather than just pseudonymising them? For example, they could truncate the hashed MAC address so that many devices all have the same truncated anonymous ID. How would this affect the calculation of statistics of movement patterns within underground stations? I posed these questions in a presentation at the PETS 2019 rump session, and in this article, I’ll explain why a set of algorithms designed to violate people’s privacy can be applied to collect wifi mobility information while protecting passenger privacy.

It’s important to emphasise that TfL’s goal is not to track past Underground customers but to predict the behaviour of future passengers. Inferring past behaviours from the traces of wifi records may be one means to this end, but it is not the end in itself, and TfL creates legal risk for itself by holding this data. The inferences from this approach aren’t even going to be correct: wifi users are unlikely to be typical passengers and behaviour will change over time. TfL’s hope is the inferred profiles will be useful enough to inform business decisions. Privacy-preserving measurement techniques should be judged by the business value of the passenger models they create, not against how accurate they are at following individual passengers around underground stations in the past. As the saying goes, “all models are wrong, but some are useful”.

Simulating privacy-preserving mobility measurement

To explore this space, I built a simple simulation of Euston Station inspired by one of the TfL case studies. In my simulation, there are two platforms (A and B) and six types of passengers. Some travel from platform A to B; some from B to A; others enter and leave the station at one platform (A or B). Of the passengers that travel between platforms, they can take either the fast route (taking 2 minutes on average) or the slow route (taking 4 minutes on average). Passengers enter the station at a Poisson arrival rate averaging one per second. The probabilities that each new passenger is of a particular type are shown in the figure below. The goal of the simulation is to infer the number of passengers of each type from observations of wifi measurements taken at platforms A and B.

Continue reading Measuring mobility without violating privacy – a case study of the London Underground

Beyond Regulators’ Concerns, Facebook’s Libra Cryptocurrency Faces another Big Challenge: The Risk of Fraud

Facebook has attracted attention through the announcement of their blockchain-based payment network, Libra. This won’t be the first payment system Facebook has launched, but what makes Facebook’s Libra distinctive is that rather than transferring Euros or dollars, the network is designed for a new cryptocurrency, also called Libra. This currency is backed by a reserve of nationally-issued currencies, and so Facebook hopes it will avoid the high volatility of cryptocurrencies like Bitcoin. As a result, Libra won’t be attractive to currency speculators, but Facebook hopes that it will, therefore, be useful for its stated goal – to be a “simple global currency and financial infrastructure that empowers billions of people.”

Reducing currency volatility is only one step towards meeting this goal of scaling cryptocurrencies to billions of users. The Libra blockchain design addresses how the network can maintain the high throughput and low transaction fees needed to compete with existing payment networks like Visa or MasterCard. However, a question that is equally important but as yet unanswered is how Facebook will develop a secure authentication and fraud prevention system that can scale to billions of users while maintaining good usability and low cost.

Facebook designed the Libra network, but in contrast to traditional payment networks, the Libra network is open. Anyone can send transactions through the network, and anyone can write programs (known as “smart contracts”) that control how, and under what conditions, funds can move between Libra accounts. To comply with anti-money-laundering regulations, Know Your Customer (KYC) checks will be performed, but only when Libra enters or leaves the network through exchanges. Transactions moving funds within the network should be accepted if they meet the criteria set out in the applicable smart contract, regardless of who sent them.

The Libra network isn’t even restricted to transactions transferring the Libra currency. Facebook has explicitly designed the Libra blockchain to make it easy for anyone to implement their own currency and benefit from the same technical facilities that Facebook designed for its currency. Other blockchains have tried this. For example, Ethereum has spawned hundreds of special-purpose currencies. But programming a smart contract to implement a new currency is difficult, and errors can be costly. The programming language for smart contracts within the Libra network is designed to help developers avoid some of the most common mistakes.

Facebook’s Libra and Securing the Calibra Wallet

There’s more to setting up an effective currency than just the technology: regulatory compliance, a network of exchanges, and monetary policy are essential. Facebook, through setting up the Libra Association, is focusing its efforts here solely on the Libra currency. The widespread expectation is, therefore, at least initially, the Libra cryptocurrency will be the dominant usage of the network, and most users will send and receive funds through the Calibra wallet smartphone app, developed by a Facebook subsidiary. From the perspective of the vast majority of the world, the Calibra wallet will be synonymous with Facebook’s Libra, and so damage to trust in Calibra will damage the reputation of Libra as a whole.

Continue reading Beyond Regulators’ Concerns, Facebook’s Libra Cryptocurrency Faces another Big Challenge: The Risk of Fraud

Digital Exclusion and Fraud – the Dark Side of Payments Authentication

Today, the Which? consumer rights organisation released the results from its study of how people are excluded from financial services as a result of banks changing their rules to mandate that customers use new technology. The research particularly focuses on banks now requiring that customers register a mobile phone number and be able to receive security codes in SMS messages while doing online banking or shopping. Not only does this change result in digital exclusion – customers without mobile phones or good network coverage will struggle to make payments – but as I discuss in this post, it’s also bad for security.

SMS-based security codes are being introduced to help banks meet their September 2019 deadline to comply with the Strong Customer Authentication requirements of the EU Payment Services Directive 2. These rules state that before making a payment from a customer’s account, the bank must independently verify that the customer really intended to make this payment. UK banks almost universally have decided to meet their obligation by sending a security code in an SMS message to the customer’s mobile phone and asking the customer to type this code into their web browser.

The problem that Which? identified is that some customers don’t have mobile phones, some that do have mobile phones don’t trust their bank with the number, and even those who are willing to share their mobile phone number with the bank might not have network coverage when they need to make a payment. A survey of Which? members found that nearly 1 in 5 said they would struggle to receive the security code they need to perform online banking transactions or online card payments. Remote locations have poorer network coverage than average and it is these areas that are likely to be disproportionately affected by the ongoing bank branch closure programmes.

Outsourcing security

The aspect of this scenario that I’m particularly interested in is why banks chose SMS messages as a security technology in the first place, rather than say sending out dedicated authentication devices to their customers or making a smartphone app. SMS has the advantage that customers don’t need to install an app or have the inconvenience of having to carry around an extra authentication device. The bank also saves the cost of setting up new infrastructure, other than hooking up their payment systems to the phone network. However, SMS has disadvantages – not only does it exclude customers in areas of poor network coverage, but it also effectively outsources security from the bank to the phone networks.

Continue reading Digital Exclusion and Fraud – the Dark Side of Payments Authentication

TESSERACT’s evaluation framework and its use of MaMaDroid

In this blog post, we will describe and comment on TESSERACT, a system introduced in a paper to appear at USENIX Security 2019, and previously published as a pre-print. TESSERACT is a publicly available framework for the evaluation and comparison of systems based on statistical classifiers, with a particular focus on Android malware classification. The authors used DREBIN and our MaMaDroid paper as examples of this evaluation. Their choice is because these are two of the most important state-of-the-art papers, tackling the challenge from different angles, using different models, and different machine learning algorithms. Moreover, DREBIN has already been reproduced by researchers even though the code is not available anymore; MaMaDroid’s code is publicly available (the parsed data and the list of samples are available under request). I am one of MaMaDroid’s authors, and I am particularly interested in projects like TESSERACT. Therefore, I will go through this interesting framework and attempt to clarify a few misinterpretations made by the authors about MaMaDroid.

The need for evaluation frameworks

The information security community and, in particular, the systems part of it, feels that papers are often rejected based on questionable decisions or, on the other hand, that papers should be more rigorous, trying to respect certain important characteristics. Researchers from Dutch universities published a survey of papers published to top venues in 2010 and 2015 where they evaluated if these works were presenting “crimes” affecting completeness, relevancy, soundness, and reproducibility of the work. They have shown how the newest publications present more flaws. Even though the authors included their works in the analyzed ones and did not word the paper as a wall of shame by pointing the finger against specific articles, the paper has been seen as an attack to the community rather than an incitement to produce more complete papers. To the best of my knowledge, unfortunately, the paper has not yet been accepted for publication. TESSERACT is another example of researchers’ effort in trying to make the community work more rigorous: most system papers present accuracies that are close to 100% in all the tests done; however, when some of them have been tested on different datasets, their accuracy was worse than a coin toss.

These two works are part of a trend that I personally find important for our community, to allow works that are following other ones on the chronological aspects to be evaluated in a more fair way. I explain with a personal example: I recall when my supervisor told me that at the beginning he was not optimistic about MaMaDroid being accepted at the first attempt (NDSS 2017) because most of the previous literature shows results always over 98% accuracy and that gap of a few percentage points can be enough for some reviewers to reject. When we asked an opinion of a colleague about the paper, before we submitted it for peer-review, this was his comment on the ML part: “I actually think the ML part is super solid, and I’ve never seen a paper with so many experiments on this topic.” We can see completely different reactions over the same specific part of the work.

TESSERACT

The goal of this post is to show TESSERACT’s potential while pointing out the small misinterpretations of MaMaDroid present in the current version of the paper. The authors contacted us to let us read the paper and see whether there has been any misinterpretation. I had a constructive meeting with the authors where we also had the opportunity to exchange opinions on the work. Following the TESSERACT description, there will be a section related to MaMaDroid’s misinterpretations in the paper. The authors told me that the newest versions would be updated according to what we discussed.

Continue reading TESSERACT’s evaluation framework and its use of MaMaDroid

Security code AutoFill: is this new iOS feature a security risk for online banking?

A new feature for iPhones in iOS 12 – Security Code AutoFill – is supposed to improve the usability of Two Factor Authentication but could place users at risk of falling victim to online banking fraud.

Two Factor Authentication (2FA), which is often referred to as Two Step Verification, is an essential element for many security systems, especially those online and accessed remotely. In most cases, it provides extended security by checking if the user has access to a device. In SMS-based 2FA, for example, a user registers their phone number with an online service. When this service sees a login attempt for the corresponding user account, it sends a One Time Password (OTP), e.g. four to six digits, to the registered phone number. The legitimate user then receives this code and is able to quote it during the login process, but an impersonator won’t.

In a recent development by Apple, announced at its developer conference WWDC18, they are set to automate this last step to improve user experience with 2FA with a new feature that is set to be introduced to iOS in version 12. The Security Code AutoFill feature, currently available to developers in a beta version, will allow the mobile device to scan incoming SMS messages for such codes and suggest them at the top of the default keyboard.

Description of new iOS 12 Security Code AutoFill feature (source: Apple)

Currently, these SMS codes rely on the user actively switching apps and memorising the code, which can take a couple of seconds. Some users deploy alternative try strategies such as memorising the code from the preview banner and hastily typing it down. Apple’s new iOS feature will require only a single tap from the user. This will make the login process faster and less error prone, a significant improvement to the usability of 2FA. It could also translate into an increased uptake of 2FA among iPhone users.

Example of Security Code AutoFill feature in operation on iPhone (source: Apple)

If users synchronise SMS with their MacBook or iMac, the existing Text Message Forwarding feature will push codes from their iPhone and enable Security Code AutoFill in Safari.

Example of Security Code AutoFill feature synchronised with macOS Mojave (source: Apple)

Reducing friction in user interaction to improve technology uptake for new users, and increase the usability and satisfaction for existing users, is not a new concept. It has not only been discussed in academia at length but is also a common goal within industry, e.g. in banking. This is evident in how the financial and payment industry has encouraged contactless (Near Field Communication – NFC) payments, which makes transactions below a certain threshold much quicker than traditional Chip and PIN payments.

Continue reading Security code AutoFill: is this new iOS feature a security risk for online banking?

MAMADROID: Detecting Android Malware by Building Markov Chains of Behavioral Models

Now making up 85% of mobile devices, Android smartphones have become profitable targets for cybercriminals, allowing them to bypass two factor authentication or steal sensitive information such as credit cards details or login credentials.

Smartphones have limited battery and memory available, therefore, the defences that can be deployed on them have limitations. For these reasons, malware detection is usually performed in a centralised fashion by Android market operators. As previous work and even recent news have shown, however, even Google Play Store is not able to detect all malicious apps; to make things even worse, there are countries in which Google Play Store is blocked. This forces users to resort to third party markets, which are usually performing less careful malware checks.

Previous malware detection studies focused on models based on permissions or on specific API calls. While the first method is prone to false positives, the latter needs constant retraining, because apps as well as the Android framework itself are constantly changing.

Our intuition is that, while malicious and benign apps may call the same API calls during their execution, the reason why those calls are made may be different, resulting in them being called in a different order. For this reason, we decided to rely on sequences of calls that, as explained later, we abstract to higher level for performance, feasibility, and robustness reasons. To implement this idea we created MaMaDroid, a system for Android malware detection.

MaMaDroid

MaMaDroid is built by combining four different phases:

  • Call graph extraction: starting from the apk file of an app, we extract the call graph of the analysed sample.
  • Sequence extraction: from the call graph, we extract the different potential paths as sequences of API calls and abstract all those calls to higher levels.
  • Markov Chain modelling: all the samples got their sequences of abstracted calls, and these sequences can be modelled as transitions among states of a Markov Chain.
  • Classification: Given the probabilities of transition between states of the chains as features set, we apply machine learning to detect malicious apps.
Four phases of MaMaDroid

Call graph extraction

MaMaDroid is a system based only on static analysis. To analyse the app, we use off-the-shelf tools, such as Soot and FlowDroid for the first step of the system.

Sequence Extraction

Taking the call graph as input, we extract the sequences of functions potentially called by the program and, by identifying the set of entry nodes, enumerate all the possible paths and output them as sequences of API calls.

Example call graph in which we can observe 3 different potential paths, or sequences, starting from the root node

Continue reading MAMADROID: Detecting Android Malware by Building Markov Chains of Behavioral Models

Battery Status Not Included: Assessing Privacy in W3C Web Standards

Designing standards with privacy in mind should be a standard in itself. Historically this was not always the case and the idea of designing systems with privacy is relatively new – it dates from the beginning of this decade. One of the milestones is accepting this view on the IETF level, dating 2013.

This note and the referenced research work focusses on designing standards with privacy in mind. The World Wide Web Consortium (W3C) is one of the most important standardization bodies. W3C is standardizing something everyone – billions of people – use every day for entertainment or work, the web and how web browsers work. It employs a focused, lengthy, but very open and consensus-driven environment in order to make sure that community voice is always heard during the drafting of new specifications. The actual inner workings of W3C are surprisingly complex. One interesting observation is that the W3C Process Document document does not actually mention privacy reviews at all. Although in practice this kind of reviews are always the case.

However, as the web along with its complexity and web browser features constantly grow, and as browsers and the web are expected to be designed in an ever-rapid manner, considering privacy becomes a challenge. But this is what is needed in order to obtain a good specification and well-thought browser features, with good threat models, and well designed privacy strategies.

My recent work, together with the Princeton team, Arvind Narayanan and Steven Englehardt, is providing insight into the standardization process at W3C, specifically the privacy areas of standardization. We point out a number of issues and room for improvements. We also provide recommendations as well as broad evidence of a wide misuse and abuse (in tracking and fingerprinting) of a browser mechanism called Battery Status API.

Battery Status API is a browser feature that was meant to allow websites access the information concerning the battery state of a user device. This seemingly innocuous mechanism initially had no identified privacy concerns. However, following my previous research work in collaboration with Gunes Acar this view has changed (Leaking Battery and a later note).

The work I authored in 2015 identifies a number or privacy risks of the API. Specifically – the issue of potential exploitation of the API to tracking, as well as the potential possibility of recovering the raw value of the battery capacity – something that was not foreseen to be accessed by web sites. Ultimately, this work has led to a fix in Firefox browser and an update to W3C specification. But the matter was not over yet.

Continue reading Battery Status Not Included: Assessing Privacy in W3C Web Standards

On the security and privacy of the ultrasound tracking ecosystem

In April 2016, the US Federal Trade Commission (FTC) sent warning letters to 12 Google Play app developers. The letters were addressed to those who incorporated the SilverPush framework in their apps, and reminded developers who used tracking software to explicitly inform their users (as seen in Section 5 of the Federal Trade Commission Act). The incident was covered by popular press and privacy concerns were raised. Shortly after, SilverPush claimed no active partnerships in the US and the buzz subsided.

Unfortunately, as the incident was resolved relatively fast, very few technical details of the technology were made public. To fill in this gap and understand the potential security implications, we conducted an in-depth study of the SilverPush framework and all the associated technologies.

The development of the framework was motivated by a fast-increasing interest of the marketing industry in products performing high-accuracy user tracking, and their derivative monetization schemes. This resulted in a high demand for cross-device tracking techniques with increased accuracy and reduced prerequisites.

The SilverPush framework fulfilled both of these requirements, as it provided a novel way to track users between their devices (e.g., TV, smartphone), without any user actions (e.g., login to a single platform from all their devices). To achieve that, the framework realized a previously unseen cross-device tracking technique (i.e., ultrasound cross-device tracking, uXDT) that offered high tracking accuracy, and came with various desirable features (e.g., easy to deploy, imperceptible by users). What differentiated that framework from existing ones was the use of high-frequency, inaudible ultrasonic beacons (uBeacons) as a medium/channel for identifier transmission between the user’s devices. This is also offered a major advantage to uXDT, against other competing technologies, as uBeacons can be emitted by most commercial speakers and captured by most commercial microphones. This eliminates the need for specialized emission and/or capturing equipment.

Aspects of a little-known ecosystem

The low deployment cost of the technology fueled the growth of a whole ecosystem of frameworks and applications that use uBeacons for various purposes, such as proximity marketing, audience analytics, and device pairing. The ecosystem is built around the near-ultrasonic transmission channel, and enables marketers to profile users.

Unfortunately, users are often given limited information on the ecosystem’s inner workings. This lack of transparency has been the target of great criticism from the users, the security community and the regulators. Moreover, our security analysis revealed a false assumption in the uBeacon threat model that can be exploited by state-level adversaries to launch complex attacks, including one that de-anonymizes the users of anonymity networks (e.g., Tor).

On top of these, a more fundamental shortcoming of the ecosystem is the violation of the least privilege principle, as a consequence of the access to the device’s microphone. More specifically, any app that wants to employ ultrasound-based mechanisms needs to gain full access to the device’s microphone, as there is currently no way to gain access only to the ultrasound spectrum. This clearly violates the least privilege principle, as the app has now access to all audible frequencies and allows a potentially malicious developer to request access to the microphone for ultrasound-pairing purposes, and then use it to eavesdrop the user’s discussions. This also results in any ultrasound-enabled apps to risk being perceived as “potentially malicious” by the users.

Mitigation

To address these shortcomings, we developed a set of countermeasures aiming to provide protection to the users in the short and medium term. The first one is an extension for the Google Chrome browser, which filters out all ultrasounds from the audio output of the websites loaded by the user. The extension actively prevents web pages from emitting inaudible sounds, and thus completely thwarts any unsolicited ultrasound-tracking attempts. Furthermore, we developed a patch for the Android permission system that allows finer-grained control over the audio channel, and forces applications to declare their intention to capture sound from the inaudible spectrum. This will properly separate the permissions for listening to audible sound and sound in the high-frequency spectrum, and will enable the end users to selectively filter the ultrasound frequencies out of the signal acquired by the smartphone microphone.

More importantly, we argue that the ultrasound ecosystem can be made secure only with the standardization of the ultrasound beacon format. During this process, the threat model will be revised and the necessary security features for uBeacons will be specified. Once this process is completed, APIs for handling uBeacons can be implemented in all major operating systems. Such an API would provide methods for uBeacon discovery, processing, generation and emission, similar to those found in the Bluetooth Low Energy APIs. Thereafter, all ultrasound-enabled apps will need access only to this API, and not to the device’s microphone. Thus, solving the problem of over-privileging that exposed the user’s sensitive data to third-party apps.

Discussion

Our work provides an early warning on the risks looming in the ultrasound ecosystem, and lays the foundations for the secure use of this set of technologies. However, it also raises several questions regarding the security of the audio channel. For instance, in a recent incident a journalist accidentally injected commands to several amazon echo devices, which then allegedly tried to order products online. This underlines the need for security features in the audio channel. Unfortunately, due to the variety of use cases, a universal solution that could be applied to the lower communication layers seems unlikely. Instead, solutions must be sought in the higher communications layers (e.g., application layers), and should be the outcome of careful threat modeling.

Privacy analysis of the W3C Proximity Sensor specification

Mobile developers are familiar with proximity sensors. These provide information about the distance of an object from the device providing access to the sensor (i.e. smartphone, tablet, laptop, or another Web of Things device). This object is usually the user’s head or hand.

Proximity sensors provide binary values such as far (from the object) or near (respectively), or a more verbose readout, in centimeters.

There are many useful applications for proximity. For example, an application can turn off the screen if the user holds the device close to his/her head (face detection). It is also handy for avoiding the execution of undesired actions, possibly arising from scratching the head with the phone’s screen.

The Proximity Sensor API specification is being standardized by W3C. Every web site will be able to access this information. Let’s focus on the privacy engineering point of view.

Proximity is the distance between an object and device. The latest version of W3C Proximity Sensors API takes advantage of a soon-to-be standard Generic Sensors API. The sensor provides the proximity distance in centimeters.

The use of Proximity Sensors is simple; an example displaying the current distance in centimeters using the modern syntax of Generic Sensors API is as follows:

let sensor = new ProximitySensor();

sensor.start();

sensor.onchange = function(){console.log(event.reading.distance)};  

This syntax might later allow requesting various proximity sensors (if the device has more than one), including those based on the sensor’s position (such as “rear” or “front”), but the current implementations generally still use the legacy version from the previous edition of the spec:

window.addEventListener('deviceproximity', function(event) {  
console.log(event.value)  
});

From the perspective of privacy considerations, there is currently no significant difference between those two API versions.

Privacy analysis of Proximity Sensor API

Proximity sensor readout is not providing much data, so it may appear there are no privacy implications, but there are good reasons for performing such analysis. Let’s list two:

Firstly, designing new standards and systems with privacy in mind — privacy by design — is a required practice and a good idea. Secondly, in some circumstances even potentially insignificant mechanisms can still bring consequences from privacy point of view. For example, multiple identifiers might be helpful in deanonymizing users. Non-obvious data leaks can surface, as well.

So let’s discuss some of the possible issues.

The average distance from the device to the user’s face (i.e. object) could be used to differentiate and discriminate between users. While the severity is hard to establish, future changes (i.e. influencing with other external data) cannot be ruled out.

If proximity patterns would be individually-attributable, this would offer a possibility to enhance user profiling based on the analysis of device use patterns. For example, the following could be obtained and analyzed:

  • Frequency of the user’s patterns of device use (i.e. waving a hand in front of the proximity sensor)
  • Frequency of the user’s zooming in and out (i.e. device close to the user’s head)
  • Patterns of use. Does the way the user hold the device vary during the day? How?
  • What are the mechanics of holding the device close to the user’s head? Can the distance vary?

Some users may use the mobile operating system’s zoom capability to increase the font or images, but others might casually prefer to hold the device closer to their eyes. Such behavioral differences can also be of note.

Proximity sensors can provide the following data: the current proximity distance and max, the maximum distance readout supported by the device. In case the value of max would differ between various implementations (e.g. among browsers, devices), it could form an identifier. And the two (max, distance) combined could also be used as short-lived identifiers. Moreover, it is not so difficult to imagine a situation where those identifiers are actually not so short-lived.

Recommendations

First of all, is there a need to provide a verbose proximity readout at all? For example, is providing readouts of proximity (distance) value up to 150 cm necessary?

In general, a device (browser, a Web of Things device) should be capable of informing the user when a web site accesses proximity information. The user should also be able to inspect which web sites – and how frequently – accessed the API.

Finally, the sensors should provide an adequately verbose readout of the distance.

Proximity Sensors should also be subject to permissions.

Demonstration

At the moment, the implementations of proximity sensors are limited. But a demonstration is running on my SensorsPrivacy research project (tested against Firefox Mobile on Android). In my case (Nexus 5), Firefox offered data in centimeters, but the verbosity was limited to 5 centimeters on my device. Again, this is a matter of hardware and software. In principle, the accuracy could be much higher and one can imagine an accuracy a sensor providing more accurate readouts (even up to 1 meter).

The screen dump below shows how the example demonstrates the use of proximity data (and whether it works on a browser). The site changes its background color if an object is placed near the proximity sensor of a device.

The readout from the demonstration shows a relative time between events (first column), and the proximity distance (second column).

398  5  
1011 0  
404  5  
607  0  

The sensor on my device is reporting two values of distance: 0 (near) and 5 (far), in centimeters. In general, the granularity is subject to the following: standard, implementation and hardware.

And this suggests something, of course. So let’s complement the privacy analysis from the previous section to incorporate a timing analysis.

Even such a limited readout can help performing behavioural analysis, simply by profiling based on time series. As we can see, the sensor readout is quite sensitive in respect to time and can measure with sub-200 ms granularity. The demonstration proof of principle is also showing the minimum relative detected time between events. Feel free to test the performance of your fingers and/or hardware!

Summary

When designing a standard project or an implementation, paying attention to details is imperative. This includes consideration of even potential risks. This is especially the case if the software or systems will be used by millions or hundreds of millions of users, i.e. you are aiming for success.

Finally, standards and specifications are ideal for issuing guidance and good practices

 

This post originally appeared on Security, Privacy & Tech Inquiries, the blog of Lukasz Olejnik. An accompanying demonstration is available on SensorsPrivacy (a project studing privacy of web sensors).

Battery Status readout as a privacy risk

Privacy risks and threats arise even in seemingly innocuous mechanisms. It is a fairly regular issue.

Over a year ago, I was researching the risk of the W3C Battery Status API. The mechanism allows a web site to read the battery level of a device (smartphone, laptop, etc.). One of the positive use cases may be, for example, stopping the execution of intensive operations if the battery is running low.

Our privacy analysis of Battery Status API revealed interesting results.

Privacy analysis of Battery API

The battery status provides the following information:

  • the current level of battery (format: 0.0–1.0, for empty and full battery, respectively)
  • time to a full discharge of battery (in seconds)
  • time to a full charge of battery, if connected to a charger (in seconds)

These items are updated whenever a new value is supplied by the operating system

It turns out that privacy risks may surface even in this kind of – seemingly innocuous – data and access mechanisms.

Frequency of changes

The frequency of changes in the reported readouts from Battery Status API potentially allow the monitoring of users’ computer use habits; for example, potentially enabling analyzing of how frequently the user’s device is under heavy use. This could lead to behavioral analysis.

Additionally, identical installations of computer deployments in standard environments (e.g. at schools, work offices, etc.) are often are behind a NAT. In simple terms, NAT allows a number of users to browse the Internet with an – externally seen – single IP address. The ability of observing any differences between otherwise identical computer installations – potentially allows particular users to be identified (and targeted?).

Battery readouts as identifiers

The information provided by the Battery Status API is not always subject to rapid changes. In other words, this information may be static for a period of time; this in turn may give rise to a short-lived identifier. The situation gets especially interesting when we consider a scenario of users sometimes clearing standard web identifiers (such as cookies). In such a case, a web script could potentially analyse identifiers provided by Battery Status API, and this information then could possibly even lead to re-creation of other identifiers. A simple sketch follows.

Continue reading Battery Status readout as a privacy risk