Adblocking and Counter-Blocking: A Slice of the Arms Race

anti-adblocking message from WIRED
If you use an adblocker, you are probably familiar with messages of the kind shown above, asking you to either disable your adblocker, or to consider supporting the host website via a donation or subscription. This is the battle du jour in the ongoing adblocking arms race — and it’s one we explore in our new report Adblocking and Counter-Blocking: A Slice of the Arms Race.

The reasons for the rising popularity of adblockers include improved browsing experience, better privacy, and protection against malvertising. As a result, online advertising revenue is gravely threatened by adblockers, prompting publishers to actively detect adblock users, and subsequently block them or otherwise coerce the user to disable the adblocker — practices we refer to as anti-adblocking. While there has been a degree of sound and fury on the topic, until now we haven’t been able to understand the scale, mechanism and dynamics of anti-adblocking. This is the gap we have started to address, together with researchers from the University of Cambridge, Stony Brook University, University College London, University of California Berkeley, Queen Mary University of London and International Computer Science Institute (Berkeley). We address some of these questions by leveraging a novel approach for identifying third-party services shared across multiple websites to present a first characterization of anti-adblocking across the Alexa Top-5K websites.

We find that at least 6.7% of Alexa Top-5K websites employ anti-adblocking, with the practices finding adoption across a diverse mix of publishers; particularly publishers of “General News”, “Blogs/Wiki”, and “Entertainment” categories. It turns out that these websites owe their anti-adblocking capabilities to 14 unique scripts pulled from 12 unique domains. Unsurprisingly, the most popular domains are those that have skin in the game — Google, Taboola, Outbrain, Ensighten and Pagefair — the latter being a company that specialises in anti-adblocking services. Then there are in-house anti-adblocking solutions that are distributed by a domain to client websites belonging to the same organisation: TripAdvisor distributes an anti-adblocking script to its eight websites with different country code top-level domains, while adult websites (all hosted by MindGeek) turn to DoublePimp. Finally, we visited a sample website for each anti-adblocking script via AdBlock Plus, Ghostery and Privacy Badger, and discovered that half of the 12 anti-adblocking suppliers are counter-blocked by at least one adblocker — suggesting that the arms race has already entered the next level.

It is hard to say how many levels deeper the adblocking arms race might go. While anti-adblocking may provide temporary relief to publishers, it is essentially band-aid solution to mask a deeper issue — the disequilibrium between ads (and, particularly, their behavioural tracking back-end) and information. Any long term solution must address the reasons that brought users to adblockers in the first place. In the meantime, as the arms race continues to escalate, we hope that studies such as ours will bring transparency to this opaque subject, and inform policy that moves us out of the current deadlock.

 

“Ad-Blocking and Counter Blocking: A Slice of the Arms Races” by Rishab Nithyanand, Sheharbano Khattak, Mobin Javed, Narseo Vallina-Rodriguez, Marjan Falahrastegar, Julia E. Powles, Emiliano De Cristofaro, Hamed Haddadi, and Steven J. Murdoch. arXiv:1605.05077v1 [cs.CR], May 2016.

This post also appears on the University of Cambridge Computer Laboratory Security Group blog, Light Blue Touchpaper.

On the hunt for Facebook’s army of fake likes

As social networks are increasingly relied upon to engage with people worldwide, it is crucial to understand and counter fraudulent activities. One of these is “like farming” – the process of artificially inflating the number of Facebook page likes. To counter them, researchers worldwide have designed detection algorithms to distinguish between genuine likes and artificial ones generated by farm-controlled accounts. However, it turns out that more sophisticated farms can often evade detection tools, including those deployed by Facebook.

What is Like Farming?

Facebook pages allow their owners to publicize products and events and in general to get in touch with customers and fans. They can also promote them via targeted ads – in fact, more than 40 million small businesses reportedly have active pages, and almost 2 million of them use Facebook’s advertising platform.

At the same time, as the number of likes attracted by a Facebook page is considered a measure of its popularity, an ecosystem of so-called “like farms” has emerged that inflate the number of page likes. Farms typically do so either to later sell these pages to scammers at an increased resale/marketing value or as a paid service to page owners. Costs for like farms’ services are quite volatile, but they typically range between $10 and $100 per 100 likes, also depending on whether one wants to target specific regions — e.g., likes from US users are usually more expensive.

Screenshot from http://www.getmesomelikes.co.uk/
Screenshot from http://www.getmesomelikes.co.uk/

How do farms operate?

There are a number of possible way farms can operate, and ultimately this dramatically influences not only their cost but also how hard it is to detect them. One obvious way is to instruct fake accounts, however, opening a fake account is somewhat cumbersome, since Facebook now requires users to solve a CAPTCHA and/or enter a code received via SMS. Another strategy is to rely on compromised accounts, i.e., by controlling real accounts whose credentials have been illegally obtained from password leaks or through malware. For instance, fraudsters could obtain Facebook passwords through a malicious browser extension on the victim’s computer, by hijacking a Facebook app, via social engineering attacks, or finding credentials leaked from other websites (and dumped on underground forums) that are also valid on Facebook.

Continue reading On the hunt for Facebook’s army of fake likes

Come work with us!

I’m very pleased to announce that — along with George Danezis and Tomaso Aste, head of our Financial Computing group — I’ve been awarded a grant to continue our work on distributed ledgers (aka “blockchain-like things”) for the next three years.

Our group has already done a lot of research in this space, including George’s and my recent paper on centrally banked cryptocurrencies (at NDSS 2016) and Jens’ paper (along with Markulf Kohlweiss, a frequent UCL collaborator) on efficient ring signatures and applications to Zerocoin-style cryptocurrencies (at Eurocrypt 2015).  It’s great to have this opportunity to further investigate the challenges in this space and develop our vision for the future of these technologies, so big thanks to the EPSRC!

Anyway, the point of this post is to advertise, as part of this grant, three positions for postdoctoral researchers.  We are also seeking collaboration with any industrial partners investigating the potential usage of distributed ledgers, and in particular ones looking at the application of these ledgers across the following settings (or with a whole new setting in mind!):

  • Identity management. How can identities be stored, shared, and issued in a way that preserves privacy, prevents theft and fraud, and allows for informal forms of identity in places where no formal ones exist?
  • Supply chain transparency. How can supply chain information be stored in a way that proves integrity, preserves the privacy of individual actors, and can be presented to the end customer in a productive way?
  • Financial settlement. How can banking information be stored in a way that allows banks to easily perform gross settlement, reduces the burden on a central bank, and enables auditability of the proper functioning of the system?
  • Administration of benefits. How can benefits be administered to and used by disadvantaged populations in a way that preserves privacy, provides useful visibility into their spending, and protects against potential abuses of the system?

We expect the postdoctoral researchers to work with us and with each other on the many exciting problems in this space, which are spread across cryptography, computer and network security, behavioural economics, distributed systems, usable security, human-computer interaction, and software engineering (just to name a few!).  I encourage anyone interested to reach out to me (Sarah) to discuss this further, whether or not they’ve already done research on the particular topic of distributed ledgers.

That’s all for now, but please get in touch with me if you have any questions, and in the years to come I hope to invite many people to come work with us in London and to announce the various outcomes of this exciting project!

Bitcoin workshop at Financial Crypto 2016

On 26 February 2016 the 3rd workshop on Bitcoin and Blockchain Research in association with Financial Cryptography 2016 took place in Barbados. This workshop aims to bring together researchers interested in cryptocurrencies to present their latest work and discuss together the future of Bitcoin. The program chairs were Sarah Meiklejohn from University College London and Jeremy Clark from Concordia University. The themes addressed during the workshop included blockchain architecture, anonymity, and proof of work alternatives. This event was also a great way for researchers with similar interests to network and share their ideas.

The workshop consisted of 2 keynotes and 4 plenary sessions: Bitcoin network analysis, Enhancing Bitcoin, Ethereum, and Blockchain Architecture.

Nathaniel Popper kicked off the day with a keynote presentation. Nathaniel is a journalist from the New York Times and author of the book ‘Digital Gold: The Untold story of Bitcoin’. He went on to speak about the history of Bitcoin covering Silk Road, Mt Gox, as well as the role of governments.

Then the first session, about Bitcoin network analysis, included two talks. The first one, Stressing Out: Bitcoin Stress Testing, by Khaled Baqer et al., was about DoS attack on Bitcoin, and was presented by Ross Anderson due to visa issues. The second one was Why buy when you can rent? Bribery attacks on Bitcoin-style consensus, by Joseph Bonneau on bribery attacks and cloud mining.

The next session, Enhancing Bitcoin, started with a talk by Ethan Heilman, Blindly Signed Contracts: Anonymous On-Blockchain and Off-Blockchain Bitcoin Transactions, on how to enhance Bitcoin anonymity. Then Mathieu Turuani gave a talk on Automated Verification of Electrum wallet, followed by Aggelos Kiayias on Proof of Proof of Work. Today many light-weight clients use SPV verification instead of full verification. Is it possible to have an even lighter verification? They introduce a modification of the Bitcoin blockchain protocol with sublinear complexity in the length of the chain.

Continue reading Bitcoin workshop at Financial Crypto 2016

Biometrics for payments

HSBC and First Direct recently announced that they are introducing fingerprint and voice recognition authentication for customers of online and telephone banking. In my own research, I first found nearly 20 years ago that people who have a multitude of passwords and PINs cannot manage them as security experts want them to. As the number of digital devices and services we use has increased rapidly, managing dozens of login details has become a headache for most people. We recently reported that most bank customers juggle multiple PINs, and are unable to follow the rules that banks set in their contracts. Our research also found that many people dislike the 2-factor token solutions that are currently used by many UK banks.

Passwords as most people use them today are not particularly secure. Attackers can easily attempt to collect information on individuals, using leaks of password files not properly protected by some websites, “phishing” scams or malware planted on people’s computers. Reusing a banking password on other websites – something that many of us do because we cannot remember dozens of different passwords – is also a significant security risk.

The introduction of fingerprint recognition on smartphones – such as the iPhone – has delighted many users fed up with entering their PINs dozens of times a day. So the announcement that HSBC and other banks will be able to use the fingerprint sensor on their smartphones for banking means that millions of consumers will finally be able to end their battle with passwords and PINs and use biometrics instead. Other services people access from their smartphones are likely to follow suit. And given the negative impact that cumbersome authentication via passwords and PINs has on staff productivity and morale in many organisations, we can expect to see biometrics deployed in work contexts, too.

But while biometrics – unlike passwords – do not require mental gymnastics from users, there are different usability challenges. Leveraging the biometric from the modality of interaction – e.g. voice recognition phone-based interactions – makes authentication an easy task, but it will work considerably better in quiet environments than noisy ones – such as a train stations or with many people talking in the background. As many smartphone users have learnt, fingerprint sensors have a hard time recognising cold and wet fingers. And – as we report in a paper presented at IEEE Identity, Security and Behavior Analysis last week – privacy concerns mean some users ‘don’t like putting their face on the Internet’. Biometrics can’t come soon enough for most users, but there is still a lot of design and testing work to be done to make biometrics work for different interaction, physical and social contexts.

Privately gathering statistics and training simple models

Last week, Luca Melis has presented our NDSS16 paper “Efficient Private Statistics with Succinct Sketches“, where we show how to privately and efficiently aggregate data from many sources and/or large streams, and then use the aggregate to extract useful statistics and train simple machine learning models.

Our work is motivated by a few “real-world” problems:

  • Media broadcasting providers like the BBC (with which we collaborate) routinely collect data from their users about videos they have watched (e.g., on BBC’s iPlayer) in order to provide users with personalized suggestions for other videos, based on recommender systems like Item k-Nearest Neighbor (ItemKNN)
  • Urban and transport planning committees, such as London’s mass transport operators, need to gather statistics about people’s movements and commutes, e.g., to improve transportation services and predict near-future trends and anomalies on a short notice.
  • Network infrastructures like the Tor network need to gather traffic statistics, like the number of, and traffic generated by, Tor hidden services, in order to tune design decisions as well as convince their founders the infrastructure is used for the intended purposes.

While different in their application, these examples exhibit a common feature: the need for providers to aggregate large amounts of sensitive information from large numbers of data sources, in order to produce aggregate statistics and possibly train machine learning models.

Prior work has proposed a few cryptographic tools for privacy-enhanced computation that could be use for private collection of statistics. For instance, by relying on homomorphic encryption and/or secret sharing, an untrusted aggregator can receive encrypted readings from users and only decrypt their sum. However, these require users to perform a number of cryptographic operations, and transmit a number of ciphertexts, linear in the size of their inputs, which makes it impractical for the scenarios discussed above, whereby inputs to be aggregated are quite large. For instance, if we use ItemKNN for the recommendations, we would need to aggregate values for “co-views” (i.e., videos that have been watched by the same user) of hundreds of videos at the time – thus, each user would have to encrypt and transfer hundreds of thousands of values at the time.

Scaling private aggregation

We tackle the problem from two points of view: an “algorithmic” one and a “system” one. That is, we have worked both on the design of the necessary cryptographic and data structure tools, as well as on making it easy for application developers to easily support these tools in web and mobile applications.

Our intuition is that, in many scenarios, it might be enough to collect estimates of statistics and trade off an upper-bounded error with significant efficiency gains. For instance, the accuracy of a recommender system might not be really affected if the statistics we need to train the model are approximated with a small error.

Continue reading Privately gathering statistics and training simple models

“Do you see what I see?” ask Tor users, as a large number of websites reject them but accept non-Tor users

If you use an anonymity network such as Tor on a regular basis, you are probably familiar with various annoyances in your web browsing experience, ranging from pages saying “Access denied” to having to solve CAPTCHAs before continuing. Interestingly, these hurdles disappear if the same website is accessed without Tor. The growing trend of websites extending this kind of “differential treatment” to anonymous users undermines Tor’s overall utility, and adds a new dimension to the traditional threats to Tor (attacks on user privacy, or governments blocking access to Tor). There is plenty of anecdotal evidence about Tor users experiencing difficulties in browsing the web, for example the user-reported catalog of services blocking Tor. However, we don’t have sufficient detail about the problem to answer deeper questions like: how prevalent is differential treatment of Tor on the web; are there any centralized players with Tor-unfriendly policies that have a magnified effect on the browsing experience of Tor users; can we identify patterns in where these Tor-unfriendly websites are hosted (or located), and so forth.

Today we present our paper on this topic: “Do You See What I See? Differential Treatment of Anonymous Users” at the Network and Distributed System Security Symposium (NDSS). Together with researchers from the University of Cambridge, University College London, University of California, Berkeley and International Computer Science Institute (Berkeley), we conducted comprehensive network measurements to shed light on websites that block Tor. At the network layer, we scanned the entire IPv4 address space on port 80 from Tor exit nodes. At the application layer, we fetch the homepage from the most popular 1,000 websites (according to Alexa) from all Tor exit nodes. We compare these measurements with a baseline from non-Tor control measurements, and uncover significant evidence of Tor blocking. We estimate that at least 1.3 million IP addresses that would otherwise allow a TCP handshake on port 80 block the handshake if it originates from a Tor exit node. We also show that at least 3.67% of the most popular 1,000 websites block Tor users at the application layer.

Continue reading “Do you see what I see?” ask Tor users, as a large number of websites reject them but accept non-Tor users

Are Payment Card Contracts Unfair?

While US bank customers are almost completely protected against fraudulent transactions, in Europe banks are entitled to refuse to reimburse victims of fraud under certain circumstances. The EU Payment Services Directive (PSD) is supposed to protect customers but if the bank can show that the customer has been “grossly negligent” in following the terms and conditions associated with their account then the PSD permits the bank to pass the cost of any fraud on to the customer. The bank doesn’t have to show how the fraud happened, just that the most likely explanation for the fraud is that the customer failed to follow one of the rules set out by the bank on how to protect the account. To be certain of obtaining a refund, a customer must be able to show that he or she complied with every security-related clause of the terms and conditions, or show that the fraud was a result of a flaw in the bank’s security.

The bank terms and conditions, and how customers comply with them, are therefore of critical importance for consumer protection. We set out to answer the question: are these terms and conditions fair, taking into account how customers use their banking facilities? We focussed on ATM payments and in particular how customers manage PINs because ATM fraud losses are paid for by the banks and not retailers, so there is more incentive for the bank to pass losses on to the customer. In our paper – “Are Payment Card Contracts Unfair?” – published at Financial Cryptography 2016 we show that customers have too many PINs to remember them unaided and therefore it is unrealistic to expect customers to comply with all the rules banks set: to choose unguessable PINs, not write them down, and not use them elsewhere (even with different banks). We find that, as a result of these unrealistic expectations, customers do indeed make use of coping mechanisms which reduce security and violate terms and conditions, which puts them in a weak position should they be the victim of fraud.

We surveyed 241 UK bank customers and found that 19% of customers have four or more PINs and 48% of PINs are used at most once a month. As a result of interference (one memory being confused with another) and forgetting over time (if a memory is not exercised frequently it will be lost) it is infeasible for typical customers to remember all their bank PINs unaided. It is therefore inevitable that customers forget PINs (a quarter of our participants had forgot a 4-digit PIN at least once) and take steps to help them recall PINs. Of our participants, 33% recorded their PIN (most commonly in a mobile phone, notebook or diary) and 23% re-used their PIN elsewhere (most commonly to unlock their mobile phone). Both of these coping mechanisms would leave customers at risk of being found liable for fraud.

Customers also use the same PIN on several cards to reduce the burden of remembering PINs – 16% of our participants stated they used this technique, with the same PIN being used on up to 9 cards. Because each card allows the criminal 6 guesses at a PIN (3 on the card itself, and 3 at an ATM) this gives criminals an excellent opportunity to guess PINs and again leave the customer responsible for the losses. Such attacks are made easier by the fact that customers can change their PIN to one which is easier to remember, but also probably easier for criminals to guess (13% of our participants used a mnemonic, most commonly deriving the PIN from a specific date). Bonneau et al. studied in more detail exactly how bank customers select PINs.

Finally we found that PINs are regularly shared with other people, most commonly with a spouse or partner (32% of our participants). Again this violates bank terms and conditions and so puts customers at risk of being held liable for fraud.

Holding customers liable for not being able to follow unrealistic, vague and contradictory advice is grossly unfair to fraud victims. The Payment Services Directive is being revised, and in our submission to the consultation by the European Banking Authority we ask that banks only be permitted to pass fraud losses on to customers if they use authentication mechanisms which are feasible to use without undue effort, given the context of how people actually use banking facilities in normal life. Alternatively, regulators could adopt the tried and tested US model of strong consumer protection, and allow banks to manage risks through fraud detection. The increased trust from this approach might increase transaction volumes and profit for the industry overall.

 

“Are Payment Card Contracts Unfair?” by Steven J. Murdoch, Ingolf Becker, Ruba Abu-Salma, Ross Anderson, Nicholas Bohm, Alice Hutchings, M. Angela Sasse, and Gianluca Stringhini will be presented at Financial Cryptography and Data Security, Barbados, 22–26 February 2016.

Jens Groth – Non-interactive zero knowledge proofs, efficient enough to be used in practice

The UCL information security group’s Jens Groth, a cryptographer, is one of 17 UCL researchers who have been awarded a Starting Grant by the European Research Council. The five-year grant will fund his work on the cryptographic building block known as “zero-knowledge proofs”, a widely applicable technique that underpins both security and trust. ERC Starting Grants are intended to support up-and-coming research leaders who are beginning to set up a research team and conduct independent research. Groth’s focus is on making zero- knowledge proofs more efficient so that they can become cheap enough to become a commonly used, standard security technology. Groth is also the recipient of a second grant from the Engineering and Physical Sciences Research Council to fund his work on another related topic, structure-preserving pairing-based cryptography.

“My line of thinking,” says Groth, “is that there’s been a lot of research into zero-knowledge proofs, but I don’t know of any groups taking entire systems from theory through to very practical implementations. I am hoping to build a group that will cover this entire span, and by covering it thoroughly get some very significant gains in efficiency.” Covering that entire spectrum from the purely abstract to the built system is important, he says, because “Practice can influence theory and give us some insight into what we should be looking at. Also, when you start implementing things, lots of surprising discoveries can come up.”

Unlike other types of cryptographic tools, such as public key cryptography, used in such widely used mass-market applications as SSL (used to secure data passed over the Web while in transit), Groth notes that zero-knowledge proofs are more likely to be a behind-the-scenes technology that end users will never touch directly.

“It will be hidden inside the system,” he says. “The main properties we want are completeness, soundness – and zero-knowledge.” Completeness means the prover can convince the verifier when a statement is true. Soundness means the prover cannot convince the verifier when the statement is false. Finally, zero-knowledge means that there is no leakage of information even if the prover is interacting with a fraudulent verifier.

Continue reading Jens Groth – Non-interactive zero knowledge proofs, efficient enough to be used in practice

Our contributions to the UK Distributed Ledger Technology report

The UK Government Office for Science, has published its report on “Distributed ledger technology: beyond block chain” to which UCL’s Sarah Meiklejohn, Angela Sasse and myself (George Danezis) contributed parts of the security and privacy material. The review, looks largely at economic, innovation and social aspects of these technologies. Our part discusses potential threats to ledgers, as well as opportunities to build robust security systems using ledgers (Certificate Transparency & CONIKS), and overcome privacy challenges, including a mention of the z.cash technology.

You can listen to the podcast interview Sarah gave on the report’s use cases, recommendations, but also more broadly future research directions for distributed ledgers, such as better privacy protection.

In terms of recommendation, I personally welcome the call for the Government Digital Services, and other innovation bodies to building capacity around distributed ledger technologies. The call for more research for efficient and secure ledgers (and the specific mention of cryptography research) is also a good idea, and an obvious need. When it comes to the specific security and privacy recommendation, it simply calls for standards to be established and followed. Sadly this is mildly vague: a standards based approach to designing secure and privacy-friendly systems has not led to major successes. Instead openness in the design, a clear focus on key end-to-end security properties, and the involvement of a wide community of experts might be more productive (and less susceptible to subversion).

The report is well timed: our paper on “Centrally Banked Crypto-Currencies” will be presented in February at a leading security conference, NDSS 2016, by Sarah Meiklejohn, largely inspired by the research agenda published by the Bank of England. It provides some answers to the problems of scalability and eco-friendliness of current proof-of-work based ledger design.