An untapped resource to reproduce studies

Science is generally accepted to operate by conducting specially designed structured observations (such as experiments and case studies) and then interpreting the results to build generalised knowledge (sometimes called theories or models). An important, nay necessary, feature of the social operation of science is transparency in the design, conduct, and interpretation of these structured observations. We’re going to work from the view that security research is science just like any other, though of course as its own discipline it has its own tools, topics, and challenges. This means that studies in security should be replicable, reproducible, or at least able to be corroborated. Spring and Hatleback argue that transparency is just as important for computer science as it is for experimental biology. Rossow et al. also persuasively argue that transparency is a key feature for malware research in particular. But how can we judge whether a paper is transparent enough? The natural test would seem to be whether it is possible to make a replication attempt from the materials and information in the paper. Forget how often the replications succeed for now, although we know that there are publication biases and other factors that mess with that.

So how many security papers published in major conferences contain enough information to attempt a reproduction? In short, we don’t know. Anecdotally, Jono and a couple of students looked through the IEEE S&P 2012 proceedings in 2013, and the results were pretty grim. But heroic effort from a few interested parties is not a sustainable answer to this question. We’re here to propose a slightly more robust solution: master’s students in security should attempt to reproduce published papers as their capstone thesis work. This has several benefits, and several challenges. In the following we hope to convince you that the challenges can be mitigated and the benefits are worth it.

This should be a choice, but one that master’s students should want to make. If anyone has a great new idea to pursue, they should be encouraged to do so. However, here in the UK, the dissertation process is compressed into the summer and there’s not always time to prototype and pilot study designs. Selecting a paper to reproduce, with a documented methodology in place, lets the student get to work faster. There is still a start-up cost; students will likely have to read several abstracts to shortlist a few workable papers, and then read these few papers in detail to select a good candidate. But learning to read, shortlist, and study academic papers is an important skill that all master’s students should be attempting to, well, master. This style of project would provide them with an opportunity to practice these skills.

Briefly, let’s be clear what we mean by reproduction of published work. Reproduction isn’t just one thing. There’s reproduce and replicate and corroborate and controlled variation (see Feitelson for details). Not everything is amenable to reproduction. For example, case studies (such as attack papers) or natural experiments are often interesting because they are unique. Corroborating some aspect of the case may be possible with a new study, and such a study is also valuable. But this is not the sort of reproduction we have in mind to advocate here.


Underground abraCARDabra: Understanding carding forums

Paying for dinner? A taxi ride? A tropical drink? Sure. Swipe or tap your card and it is done. Convenient. Payment cards make it easy for us to make payments at “brick-and-mortar” locations and online marketplaces. However, they are also attractive targets for cybercriminals seeking to steal funds from the accounts linked to payment cards, as seen in this recent high-profile theft of credit cards affecting more than 1,000 hotels, for instance.

Theft of payment card information via phishing, skimming, or hacking is usually the first step in the chain of payment card fraud. Other steps include sales, validation, and monetisation of the stolen data. These illicit deals are aided by underground online forums where cybercriminals actively trade stolen credit card information. To tackle payment card fraud, it is therefore important to understand the characteristics of these forums and the activity of miscreants using them. In our paper, presented at the 2017 APWG Symposium on Electronic Crime Research (eCrime2017), we analyse and discuss the characteristics of underground carding forums. We focus on the available products and prices, characteristics of sellers, and features of the forums. We won the Best Paper Award at eCrime2017.

Products

The main products available on carding forums are credit card numbers, dumps, and fullz. Credit card numbers comprise the information actually printed on credit cards, that is, cardholder name, card number (16 digits on most cards), expiry date, and the security code on the back of the card (usually 3 digits).

Dumps comprise information stolen from the tracks of a credit card’s magnetic stripe. Dumps are usually obtained via skimmers. Skimmers are devices attached to Automated Teller Machines (ATMs) and Point of Sale (POS) terminals by miscreants to steal data from unsuspecting victims. Afterwards, the miscreants create clones of the skimmed credit cards and monetise the clones, for instance, by making illicit purchases with them.

Fullz contain further information about the cardholder. In other words, fullz usually comprise information printed on the card plus additional information such as bank account information, cardholder’s date of birth, Social Security number, etc.

Sellers

Generally, there are several types of participants on carding forums: sellers, buyers, intermediaries, mules, administrators, and others. These roles are not mutually exclusive; sellers may simultaneously be buyers. In this study, we focus on sellers since they come before buyers in the fraud chain.

Our approach

We studied previous work on underground marketplaces and forums, and derived the following hypotheses from the insights gained. We then searched for names of carding forums, found 25 names, and collected data from 5 active forums. We then tested the hypotheses on the data.

Hypothesis 1. Prices of fullz (credit card numbers and additional cardholder information) are higher than prices of credit card numbers.
Hypothesis 2. A small number of traders are responsible for a large proportion of traffic.
Hypothesis 3. Most traders sell only one product type (that is, they are specialised).
Hypothesis 4. Specialised traders sell their products at lower prices than unspecialised traders.
Hypothesis 5. Carding forums have working reputation systems that are as sophisticated as those of legal marketplaces (for instance, eBay).
Hypothesis 6. The vast majority of actors do not operate on more than one forum.
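
To make Hypothesis 2 concrete, one way to measure concentration is the share of total activity produced by the most active fraction of traders. The sketch below is only an illustration: the function name `top_share`, the 20% cut-off, and the post counts are all assumptions made up for the example, not the metric or data used in the paper.

```python
def top_share(activity_per_trader, top_fraction):
    """Share of total activity produced by the most active `top_fraction` of traders."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    # Sort activity counts from the busiest trader to the quietest.
    counts = sorted(activity_per_trader.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Always include at least one trader in the "top" group.
    k = max(1, round(len(counts) * top_fraction))
    return sum(counts[:k]) / total


# Hypothetical post counts per trader (made-up numbers, not from the study).
posts = {"a": 120, "b": 45, "c": 5, "d": 3, "e": 2,
         "f": 1, "g": 1, "h": 1, "i": 1, "j": 1}

share = top_share(posts, 0.2)
print(f"Top 20% of traders account for {share:.0%} of posts")
```

A value close to 1 for a small `top_fraction` would be consistent with a few traders dominating the traffic; a value close to `top_fraction` itself would suggest activity is spread evenly.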

Summary of findings

Our analyses confirmed Hypothesis 1, Hypothesis 2, and Hypothesis 6. In other words, prices of fullz are indeed higher than prices of credit card numbers (credit card numbers: mean = $10.08, median = $10.00; fullz: mean = $31.82, median = $30.00). Also, a small number of traders are responsible for a large proportion of traffic. Finally, most sellers focus their efforts on a single forum, as expected.
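
As a quick check on the Hypothesis 1 figures, the ratio between the two product types can be computed directly from the means and medians reported above. The dictionary layout and the helper `price_ratio` are just illustrative names for this sketch.

```python
# Prices reported in the summary above (USD).
card_numbers = {"mean": 10.08, "median": 10.00}
fullz = {"mean": 31.82, "median": 30.00}


def price_ratio(numerator, denominator, stat):
    """Ratio of one product's price statistic to another's."""
    return numerator[stat] / denominator[stat]


mean_ratio = price_ratio(fullz, card_numbers, "mean")
median_ratio = price_ratio(fullz, card_numbers, "median")
print(f"fullz / card numbers: mean {mean_ratio:.2f}x, median {median_ratio:.2f}x")
```

On both statistics, fullz come out at roughly three times the price of plain credit card numbers.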

Hypothesis 4 was partially rejected, while Hypothesis 3 and Hypothesis 5 were completely rejected. In other words, specialised sellers do not always sell their products at lower prices than the unspecialised ones, most sellers advertise more than one type of product, and most of the carding forums under study do not have working reputation systems that are as elaborate as those of legitimate online marketplaces.

In conclusion, dumps and fullz are relatively expensive; they are more than three times as expensive as credit card numbers. This may be due to the effort needed to obtain or monetise the data, the amount of available information, or differing supply and demand. Sellers have varying success. Even though some sellers complete hundreds of transactions, most sellers do not succeed in selling anything. This means that the trading sections of the forums are profitable distribution channels for high-profile actors. Finally, specialisation is not a key characteristic of sellers, not even of high-profile sellers.

Further details can be found in the full paper All Your Cards Are Belong To Us: Understanding Online Carding Forums, by Andreas Haslebacher, Jeremiah Onaolapo, and Gianluca Stringhini.

Understanding the Use of Leaked Webmail Credentials in the Wild

Online accounts enable us to store and access documents, make purchases, and connect to new friends, among many other capabilities. Even though online accounts are convenient to use, they also expose users to risks such as inadvertent disclosure of private information and fraud. In recent times, data breaches and subsequent exposure of users to attacks have become commonplace. For instance, over the last four years, account credentials of millions of users from Dropbox, Yahoo, and LinkedIn have been stolen in massive attacks conducted by cybercriminals.

After online accounts are compromised by cybercriminals, what happens to the accounts? In our paper, presented today at the 2016 ACM Internet Measurement Conference, we answer this question. To do so, we needed to monitor the compromised accounts. This is hard to do, since only large online service providers have access to data from such compromised accounts, for instance Google or Yahoo. As a result, there is sparse research literature on the use of compromised online accounts. To address this problem, we developed an infrastructure to monitor the activity of attackers on Gmail accounts. We did this to enable researchers to understand what happens to compromised webmail accounts in the wild, despite the lack of access to proprietary data on compromised accounts.

Cybercriminals usually sell the stolen credentials on the underground black market or use them privately, depending on the value of the compromised accounts. Such accounts can be used to send spam messages to other online accounts, or to retrieve sensitive personal or corporate information from the accounts, among a myriad of malicious uses. In the case of compromised webmail accounts, it is not uncommon to find password reset links, financial information, and authentication credentials of other online accounts inside such webmail accounts. This makes webmail accounts particularly attractive to cybercriminals, since they often contain a lot of sensitive information that could potentially be used to compromise other accounts. For this reason, we focus on webmail accounts.

Our infrastructure works as follows. We embed scripts based on Google Apps Script in Gmail accounts, so that the accounts send notifications of activity to us. Such activity includes the opening of email messages, creation of email drafts, sending of email messages, and “starring” of email messages. We also record details of accesses including IP addresses, browser information, and access times of visitors to the accounts. Since we designed the Gmail accounts to lure cybercriminals to interact with them (in the sense of a honeypot system), we refer to the accounts as honey accounts.
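
To give a feel for the kind of data such monitoring produces, here is a minimal sketch of how a single observation might be represented and tallied on the analysis side. It is not our Google Apps Script code: the `AccessRecord` class, its field names, the activity labels, and the sample values are all hypothetical, and folding the activity notification and the access details into one record is a simplification made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone

# Activity types named in the text; the identifiers themselves are illustrative.
ACTIVITY_TYPES = {"opened", "draft_created", "sent", "starred"}


@dataclass(frozen=True)
class AccessRecord:
    """One observed action on a honey account (hypothetical schema)."""
    account: str
    activity: str
    ip_address: str
    browser: str
    accessed_at: datetime

    def __post_init__(self):
        # Reject labels outside the four activity types listed above.
        if self.activity not in ACTIVITY_TYPES:
            raise ValueError(f"unknown activity: {self.activity}")


def activity_counts(records):
    """Count how often each activity type was observed."""
    return Counter(record.activity for record in records)


# Made-up sample records; the IP addresses come from documentation-only ranges.
records = [
    AccessRecord("honey-01", "opened", "192.0.2.10", "ExampleBrowser/1.0",
                 datetime(2016, 1, 1, 12, 0, tzinfo=timezone.utc)),
    AccessRecord("honey-01", "starred", "192.0.2.10", "ExampleBrowser/1.0",
                 datetime(2016, 1, 1, 12, 5, tzinfo=timezone.utc)),
    AccessRecord("honey-02", "opened", "198.51.100.7", "ExampleBrowser/2.0",
                 datetime(2016, 1, 2, 8, 30, tzinfo=timezone.utc)),
]

for activity, count in sorted(activity_counts(records).items()):
    print(activity, count)
```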

To study webmail accounts stolen via malware, we also developed a malware sandbox infrastructure that executes information-stealing malware samples inside virtual machines (VMs). We supply honey credentials to the VMs, which drive web browsers and log in to the honey accounts automatically. The login action triggers the malware in the VMs to steal and exfiltrate the honey credentials to Command-and-Control servers under the control of botmasters.
