On the security and privacy of the ultrasound tracking ecosystem

In April 2016, the US Federal Trade Commission (FTC) sent warning letters to 12 Google Play app developers. The letters were addressed to those who incorporated the SilverPush framework in their apps, and reminded developers who used tracking software to explicitly inform their users (as seen in Section 5 of the Federal Trade Commission Act). The incident was covered by popular press and privacy concerns were raised. Shortly after, SilverPush claimed no active partnerships in the US and the buzz subsided.

Unfortunately, as the incident was resolved relatively fast, very few technical details of the technology were made public. To fill in this gap and understand the potential security implications, we conducted an in-depth study of the SilverPush framework and all the associated technologies.

The development of the framework was motivated by a fast-increasing interest of the marketing industry in products performing high-accuracy user tracking, and their derivative monetization schemes. This resulted in a high demand for cross-device tracking techniques with increased accuracy and reduced prerequisites.

The SilverPush framework fulfilled both of these requirements, as it provided a novel way to track users between their devices (e.g., TV, smartphone), without any user actions (e.g., login to a single platform from all their devices). To achieve that, the framework realized a previously unseen cross-device tracking technique (i.e., ultrasound cross-device tracking, uXDT) that offered high tracking accuracy, and came with various desirable features (e.g., easy to deploy, imperceptible by users). What differentiated that framework from existing ones was the use of high-frequency, inaudible ultrasonic beacons (uBeacons) as a medium/channel for identifier transmission between the user’s devices. This is also offered a major advantage to uXDT, against other competing technologies, as uBeacons can be emitted by most commercial speakers and captured by most commercial microphones. This eliminates the need for specialized emission and/or capturing equipment.

Aspects of a little-known ecosystem

The low deployment cost of the technology fueled the growth of a whole ecosystem of frameworks and applications that use uBeacons for various purposes, such as proximity marketing, audience analytics, and device pairing. The ecosystem is built around the near-ultrasonic transmission channel, and enables marketers to profile users.

Unfortunately, users are often given limited information on the ecosystem’s inner workings. This lack of transparency has been the target of great criticism from the users, the security community and the regulators. Moreover, our security analysis revealed a false assumption in the uBeacon threat model that can be exploited by state-level adversaries to launch complex attacks, including one that de-anonymizes the users of anonymity networks (e.g., Tor).

On top of these, a more fundamental shortcoming of the ecosystem is the violation of the least privilege principle, as a consequence of the access to the device’s microphone. More specifically, any app that wants to employ ultrasound-based mechanisms needs to gain full access to the device’s microphone, as there is currently no way to gain access only to the ultrasound spectrum. This clearly violates the least privilege principle, as the app has now access to all audible frequencies and allows a potentially malicious developer to request access to the microphone for ultrasound-pairing purposes, and then use it to eavesdrop the user’s discussions. This also results in any ultrasound-enabled apps to risk being perceived as “potentially malicious” by the users.

Mitigation

To address these shortcomings, we developed a set of countermeasures aiming to provide protection to the users in the short and medium term. The first one is an extension for the Google Chrome browser, which filters out all ultrasounds from the audio output of the websites loaded by the user. The extension actively prevents web pages from emitting inaudible sounds, and thus completely thwarts any unsolicited ultrasound-tracking attempts. Furthermore, we developed a patch for the Android permission system that allows finer-grained control over the audio channel, and forces applications to declare their intention to capture sound from the inaudible spectrum. This will properly separate the permissions for listening to audible sound and sound in the high-frequency spectrum, and will enable the end users to selectively filter the ultrasound frequencies out of the signal acquired by the smartphone microphone.

More importantly, we argue that the ultrasound ecosystem can be made secure only with the standardization of the ultrasound beacon format. During this process, the threat model will be revised and the necessary security features for uBeacons will be specified. Once this process is completed, APIs for handling uBeacons can be implemented in all major operating systems. Such an API would provide methods for uBeacon discovery, processing, generation and emission, similar to those found in the Bluetooth Low Energy APIs. Thereafter, all ultrasound-enabled apps will need access only to this API, and not to the device’s microphone. Thus, solving the problem of over-privileging that exposed the user’s sensitive data to third-party apps.

Discussion

Our work provides an early warning on the risks looming in the ultrasound ecosystem, and lays the foundations for the secure use of this set of technologies. However, it also raises several questions regarding the security of the audio channel. For instance, in a recent incident a journalist accidentally injected commands to several amazon echo devices, which then allegedly tried to order products online. This underlines the need for security features in the audio channel. Unfortunately, due to the variety of use cases, a universal solution that could be applied to the lower communication layers seems unlikely. Instead, solutions must be sought in the higher communications layers (e.g., application layers), and should be the outcome of careful threat modeling.