
This post looks at the main difficulty faced while using a classifier to block attacks: handling mistakes and uncertainty such that the overall system remains secure and usable.

This is the third post in a series of four dedicated to providing a concise overview of how to use artificial intelligence (AI) to build robust anti-abuse protections. The first post explained why AI is key to building robust protections that meet user expectations and defeat increasingly sophisticated attacks. Following the natural progression of building and launching an AI-based defense system, the second post covered the challenges related to training classifiers. The fourth and final post will look at how attackers go about attacking AI-based defenses.

This series of posts is modeled after the talk I gave at RSA 2018. Here is a re-recording of this talk:

How to Successfully Harness AI to Combat Fraud and Abuse - RSA 2018 - YouTube

You can also get the slides here.

Disclaimer: This series is intended as an overview for everyone interested in the subject of harnessing AI for anti-abuse defense, and it is a potential blueprint for those who are making the jump. Accordingly, this series focuses on providing a clear high-level summary, deliberately not delving into technical details. That being said, if you are an expert, I am sure you will find ideas and techniques that you haven’t heard about before, and hopefully you will be inspired to explore them further.

At a high level, the main difficulty faced when using a classifier to block attacks is how to handle mistakes. More precisely, the need to handle errors correctly can be broken down into two challenges: how to strike the right balance between false positives and false negatives, to ensure that your product remains safe when your classifier makes an error; and how to explain why something was blocked, both to inform users and for debugging purposes.

Let’s get started!

Striking the right balance between false positives and false negatives

The single most important decision you have to make when putting a classifier in production is how to balance your classifier error rates. This decision deeply affects the security and usability of your system. This struggle is best understood through a real-life example, so let’s start by looking at one of the toughest: the account recovery process.

When a user loses access to their account, they have the option to go through the account recovery process, supply information to prove they are who they claim to be, and get their account access back. At the end of the recovery process the classifier has to decide, based on the information provided and other signals, whether or not to let the person claiming to be the user recover the account.

The key question here is what the classifier should do when it is not clear what the decision should be. Technically, this is handled by adjusting the false positive and false negative rates, a trade-off also commonly described in terms of classifier sensitivity and specificity. There are two options:

  1. Make the classifier cautious, which favors reducing false positives (hacker break-ins) at the expense of increasing false negatives (legitimate users denied).
  2. Make the classifier optimistic, which favors reducing false negatives (legitimate users denied) at the expense of increasing false positives (hacker break-ins).

While both types of error are bad, it is clear that for account recovery, letting a hacker break into a user’s account is not an option. Accordingly, for that specific use case, the classifier must be tuned to err on the cautious side. Technically, this means we are willing to reduce the false positive rate at the expense of slightly increasing the false negative rate.

It is important to note that the relation between the false positive and false negative rates is not linear, as illustrated in the figure above. In practical terms, this means that the more you reduce one at the expense of the other, the higher your overall error rate will be. There is no free lunch :-)

For example, you might be able to reduce the false negative rate from 0.3% to 0.2% by slightly increasing the false positive rate from 0.3% to 0.42%. However, reducing the false negative rate further, to 0.1%, will increase your false positive rate to a whopping 2%. Those numbers are made up, but they illustrate how the nonlinear relation between false positives and false negatives plays out.
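
To make this trade-off concrete, here is a minimal Python sketch (not from the original post) of how you could sweep a decision threshold on a held-out set and watch one error rate inflate as the other shrinks; val_scores and val_labels are placeholders for your own validation data.

import numpy as np

def error_rates(scores, labels, threshold):
    # scores: attack probability output by the classifier; labels: 1 = attack, 0 = legitimate
    preds = scores >= threshold
    fpr = np.sum(preds & (labels == 0)) / max(np.sum(labels == 0), 1)   # legitimate items flagged
    fnr = np.sum(~preds & (labels == 1)) / max(np.sum(labels == 1), 1)  # attacks let through
    return fpr, fnr

# Sweep thresholds on a held-out set to see how reducing one error inflates the other.
val_scores = np.random.rand(10_000)                      # placeholder: your classifier's scores
val_labels = (np.random.rand(10_000) < 0.1).astype(int)  # placeholder: ground-truth labels
for t in np.linspace(0.1, 0.9, 9):
    fpr, fnr = error_rates(val_scores, val_labels, t)
    print(f"threshold={t:.1f}  FPR={fpr:.4f}  FNR={fnr:.4f}")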

To sum up, the first challenge faced when using classifiers in production to detect attacks is that:

In fraud and abuse the margin for error is often nonexistent, and some error types are more costly than others.

This challenge is addressed by paying extra attention to how classifier error rates are balanced, to ensure that your systems are as safe and usable as possible. Here are three key points you need to consider when balancing your classifier:

  1. Use manual reviews: When the stakes are high and the classifier is not confident enough, it might be worth relying on a human to make the final determination (a minimal routing sketch follows this list).
  2. Adjust your false positive and false negative rates: Skew your model errors in one direction or the other, so that the model errs on the correct side and keeps your product safe.
  3. Implement catch-up mechanisms: No classifier is perfect, so implementing catch-up mechanisms to mitigate the impact of errors is important. Catch-up mechanisms include an appeal system and in-product warnings.
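
As a rough illustration of how points 1 and 2 combine in practice, here is a minimal sketch (mine, not production logic) of a decision-routing function that only blocks above a high-confidence threshold and sends the uncertain middle band to manual review; the threshold values are purely illustrative.

def route_decision(attack_score, block_threshold=0.9, review_threshold=0.6):
    # Route a request based on the classifier's attack score (thresholds are illustrative).
    if attack_score >= block_threshold:
        return "block"          # confident enough to block automatically
    if attack_score >= review_threshold:
        return "manual_review"  # high stakes and uncertain: let a human make the final call
    return "allow"              # allowed; catch-up mechanisms (appeals, warnings) still apply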

To wrap up this section, let’s consider how Gmail spam classifier errors are balanced.

Gmail users really don’t want to miss an important email but are okay with spending a second or two to get rid of spam in their inboxes, provided it doesn’t happen too often. Based on this insight, we made the conscious decision to bias the Gmail spam classifier to ensure that the false positive rate (good emails that end up in the spam folder) is as low as possible. Reducing the false positive rate to 0.05% is achieved at the expense of a slightly higher false negative rate (spam in user inboxes) of 0.1%.

Predicting is not explaining

Our second challenge is that being able to predict if something is an attack does not mean you are able to explain why it was detected.

Classification is a binary decision. Explaining it requires additional information.

Fundamentally, dealing with attacks and abuse attempts is a binary decision: you either block something or you don’t. However, in many cases, especially when the classifier makes an error, your users will want to know why something was blocked. Being able to explain how the classifier reached a particular decision requires additional information that must be gathered through other means.

Here are three potential directions that will allow you to collect the additional information you need to explain your classifier’s decisions.

1. Use similarity to known attacks

First, you can look at how similar a given blocked attack is to known attacks. If it is very similar to one of them, then it is very likely that the blocked attack is a variation of it. This type of explanation is particularly easy to produce when your model uses embeddings, because you can directly apply distance computations to those embeddings to find related items. This has been applied successfully to words and faces, for example.
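
For instance, if your model already produces embeddings, a nearest-neighbor lookup over known attacks takes only a few lines; this sketch assumes you have an embedding vector for the blocked item and a matrix of embeddings for previously labeled attacks.

import numpy as np

def most_similar_known_attack(blocked_emb, known_embs, known_labels):
    # Cosine similarity between the blocked item and every known, labeled attack.
    a = blocked_emb / np.linalg.norm(blocked_emb)
    b = known_embs / np.linalg.norm(known_embs, axis=1, keepdims=True)
    sims = b @ a
    best = int(np.argmax(sims))
    return known_labels[best], float(sims[best])   # e.g. ("pharma_spam_campaign_42", 0.97)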

2. Train specialized models

Instead of having a single model that classifies all attacks, you can use a collection of more specialized models that target specific classes of attacks. Splitting the detection into multiple classifiers makes it easier to attribute a decision to a specific attack class because there is a one-to-one mapping between the attack type and the classifier that detected it. Also, in general, specialized models tend to be more accurate and easier to train, so you should rely on those if you can.
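
Here is a sketch of what this looks like in code, assuming a hypothetical set of scikit-learn-style binary classifiers, one per attack family; whichever specialized model fires tells you which class of attack was detected.

# Hypothetical per-attack-class models (scikit-learn-style binary classifiers).
specialized_models = {
    "phishing": phishing_model,
    "malware": malware_model,
    "spam": spam_model,
}

def classify_with_attribution(features, threshold=0.5):
    scores = {name: float(model.predict_proba([features])[0, 1])
              for name, model in specialized_models.items()}
    # The verdict is directly attributable: the attack class is whichever model fired.
    return {name: score for name, score in scores.items() if score >= threshold}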

3. Leverage model explainability

Last but not least, you can analyze the inner state of the model to glean insights about why a decision was made. As you can see in the screenshot above, for example, a class-specific saliency map (see this recent paper) helps us understand which section of the image contributed the most to the decision. Model explainability is a very active field of research, and a lot of great tools, techniques and analyses have been released recently.
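
As a simple, hedged sketch of this idea, here is a vanilla input-gradient saliency map computed with TensorFlow's GradientTape, assuming a Keras image classifier; this is the most basic form of saliency, not necessarily the exact method of the cited paper.

import tensorflow as tf

def saliency_map(model, image, class_index):
    # Returns an (H, W) map of how much each pixel influenced the class score.
    x = tf.convert_to_tensor(image[None, ...], dtype=tf.float32)   # add a batch dimension
    with tf.GradientTape() as tape:
        tape.watch(x)
        class_score = model(x)[0, class_index]
    grads = tape.gradient(class_score, x)                      # d(score) / d(pixel)
    return tf.reduce_max(tf.abs(grads), axis=-1)[0].numpy()    # collapse the color channels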

Gmail as an example

Gmail makes use of explainability to help users better understand why something is in the spam folder and why it is dangerous. As visible in the screenshot above, on top of each spam email we add a red banner that explains why the email is dangerous, and does so in simple terms meant to be understandable to every user.

Conclusion

Overall, this post can be summarized as follows:

Successfully applying AI to abuse fighting requires handling classifier errors in a safe way and being able to understand how a particular decision was reached.

The next post will discuss attacks against classifiers and how to mitigate them.

Thank you for reading this post till the end! If you enjoyed it, don’t forget to share it on your favorite social network so that your friends and colleagues can enjoy it too and learn about AI and anti-abuse.

To get notified when my next post is online, follow me on Twitter, Facebook, Google+, or LinkedIn. You can also get the full posts directly in your inbox by subscribing to the mailing list or via RSS.

A bientôt!


This post looks at the four main challenges that arise when training a classifier to combat fraud and abuse.

This is the second post of a series of four dedicated to providing a concise overview of how to harness AI to build robust anti-abuse protections. The first post explains why AI is key to building robust defenses that keep up with user expectations and increasingly sophisticated attackers. Following the natural progression of building and launching an AI-based defense system, the third post will examine classification issues, and the last post will look at how attackers go about attacking AI-based defenses.

This series of posts is modeled after the talk I gave at RSA 2018. Here is a re-recording of this talk:

How to Successfully Harness AI to Combat Fraud and Abuse - RSA 2018 - YouTube

You can also get the slides here.

Disclaimer: This series is meant to provide an overview for everyone interested in the subject of harnessing AI for anti-abuse defense, and it is a potential blueprint for those who are making the jump. Accordingly, this series focuses on providing a clear high-level summary, purposely avoiding delving into technical details. That being said, if you are an expert, I am sure you will find ideas and techniques that you haven’t heard about before, and hopefully you will be inspired to explore them further.

At a high level, what makes training a classifier to detect fraud and abuse unique is that it deals with adversarial data. To my knowledge, no other AI field has to deal with adversaries that actively try to undermine your training.

The first two challenges look at the main consequences of having an adversary: 1) abuse fighting is a non-stationary problem, and 2) it is hard to collect accurate training data. The third challenge relates to dealing with data and taxonomy ambiguity. The fourth and final challenge is how to apply AI to products that are not AI-friendly because they lack rich content and features but still need to be protected.

Let’s get started!

Non-stationary problem

Traditionally, when applying AI to a given problem, you are able to reuse the same data over and over again because the problem definition is stable. This is not the case when combating abuse, because attacks never stop evolving. As a result, to ensure that anti-abuse classifiers remain accurate, their training data needs to be constantly refreshed to incorporate the latest types of attacks.

Let me give you a concrete example so it is clear what the difference between a stable problem and an unstable/non-stationary one is.

Let’s say you would like to create a classifier that recognizes cats and other animals. This is considered to be a stable problem because animals are expected to look roughly the same for the next few hundred years (barring a nuclear war). Accordingly, to train this type of classifier, you only need to collect and annotate animal images once at the beginning of the project.

On the other hand, if you would like to train a classifier that recognizes phishing pages, this “collect once” approach doesn’t work because phishing pages keep evolving and look drastically different over time, as visible in the screenshot above.

More generally, while training classifiers to combat abuse, the first key challenge is that:

Past training examples become obsolete as attacks evolve

While there is no silver bullet for dealing with this obsolescence, here are three complementary strategies that help cope with ever-changing data:

  1. Automate model retraining: You need to automate model retraining on fresh data so your model keeps up with the evolution of attacks. When you automate model retraining, it is a good practice to have a validation set that ensures the new model performs correctly and doesn’t introduce regressions (a minimal sketch of such a retraining loop follows this list). It is also useful to add hyperparameter optimization to your retraining process to maximize your model accuracy.
  2. Build highly generalizable models: Your models have to be designed in a way that ensures they can generalize enough to detect new attacks. While ensuring that a model generalizes well is complex, making sure your model has enough (but not too much) capacity (i.e., enough neurons) and plenty of training data is a good starting point. If you don’t have enough real attack examples, you can supplement your training data with data augmentation techniques that increase the size of your corpus by generating slight variations of your attack examples. As visible in the table above, taken from this paper, data augmentation makes models more robust and increases accuracy significantly. Finally, you should consider other finer, well-documented technical aspects, such as tuning the learning rate and using dropout.

  3. Set up monitoring and in-depth defense: Finally, you have to assume your model will be bypassed at some point, so you need to build defense in depth to mitigate this issue. You also need to set up monitoring that will alert you when this occurs. Monitoring for a drop in the number of detected attacks or a spike in user reports is a good starting point.
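
Here is the minimal retraining-loop sketch mentioned in point 1, with a validation gate guarding against regressions; fetch_fresh_training_data, deploy, and alert are hypothetical placeholders for your own pipeline.

def retrain_and_maybe_deploy(build_model, current_metric, min_gain=0.0):
    # All helper names are placeholders for your own data and serving pipeline.
    X_train, y_train, X_val, y_val = fetch_fresh_training_data()   # refreshed with recent attacks
    model = build_model()              # optionally wrapped in hyperparameter optimization
    model.fit(X_train, y_train)
    new_metric = model.score(X_val, y_val)    # held-out validation guards against regressions
    if new_metric >= current_metric + min_gain:
        deploy(model)                  # promote the new model only when it does not regress
        return new_metric
    alert("Retrained model regressed; keeping the current one.")
    return current_metric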

Quite often, I get asked how quickly attacks are evolving in practice. While I don’t have a general answer, here is a key statistic that I hope will convince you that attackers indeed mutate their attacks incredibly quickly: 97 percent of Gmail malicious attachments blocked today are different from the ones blocked yesterday.

Fortunately those new malicious attachments are variations of recent attacks and therefore can be blocked by systems that generalize well and are trained regularly.

Lack of ground truth data

For most classification tasks, collecting training data is fairly easy because you can leverage human expertise. For example, if you want to build an animal classifier, you could ask people to take pictures of animals and tell you which animals are in them.

On the other hand, collecting ground truth (training data) for anti-abuse purposes is not that easy because bad actors try very hard to impersonate real users. As a result, it is very hard even for humans to tease apart what is real and what is fake. For instance, the screenshot above showcases two Play store reviews. Would you be able to tell me which one is real and which one is fake?

Obviously, telling them apart is impossible because they are both well written and over the top. This struggle to accurately collect abusive content exists across the board, whether it is for reviews, comments, fake accounts or network attacks. By the way, both reviews are real, in case you were wondering. ☺️

Accordingly, the second challenge on the quest to train a successful classifier is that:

Abusers try to hide their activities, which makes it hard to collect ground truth data

While no definitive answers exist on how to overcome this challenge, here are three techniques to collect ground truth data that can help alleviate the issue:

  1. Apply clustering methods: First, you can leverage clustering methods to expand upon known abusive content to find more of it (see the sketch after this list). It is often hard to find the right balance while doing so: if you cluster too much, you end up flagging good content as bad, and if you don’t cluster enough, you won’t collect enough data.

  2. Collect ground truth with honeypots: Honeypots are controlled settings that ensure you only collect attacks. The main difficulty with honeypots is to make sure that the collected data is representative of the attacks experienced by production systems. Overall, honeypots are very valuable, but it takes a significant investment to get them to collect meaningful attacks.

  3. Leverage generative adversarial networks: A new and promising direction is to leverage recent advances in machine learning and use a Generative Adversarial Network (main paper), better known as a GAN, to reliably increase your training dataset. The screenshot above, taken from this paper, shows an example of face generation: only the top-left image is real. While still very experimental (here is one of the latest papers on the topic), this approach is exciting as it paves the way to generating meaningful attack variations at scale.
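
To make the clustering idea from point 1 concrete, here is a hedged sketch that clusters content embeddings with DBSCAN and propagates labels from confirmed-bad examples to the clusters they fall into; the eps parameter is exactly the "how much to cluster" knob discussed above.

from sklearn.cluster import DBSCAN

def expand_abuse_labels(embeddings, known_bad_idx, eps=0.5, min_samples=5):
    # Returns indices of items that cluster together with confirmed abusive examples.
    cluster_ids = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(embeddings)
    bad_clusters = {cluster_ids[i] for i in known_bad_idx if cluster_ids[i] != -1}   # -1 = noise
    known = set(known_bad_idx)
    return [i for i, c in enumerate(cluster_ids)
            if c in bad_clusters and i not in known]   # worth spot-checking before training on them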

Ambiguous data & taxonomy

The third challenge that arises when building a classifier is that what we consider bad is often ill defined, and there are a lot of borderline cases where even humans struggle to make a decision.

For example, the sentence “I am going to kill you” can be viewed either as the sign of healthy competition if you are playing a video game with your buddies, or as a threat if it is used in a serious argument. More generally, it is important to realize that:

Unwanted content is inherently context, culture and settings dependent

Accordingly, it is impossible, except for very specific use cases such as profanity or gibberish detection, to build universal classifiers that will work across all products and for all users.

When you think about it, even the well-established concept of SPAM is ill defined and means different things for different people. For example, countless Gmail users decide that the emails coming from a mailing list they willingly subscribed to a long time ago are now spam because they lost interest in the topic.

Here are three ways to help your classifier deal with ambiguity:

  1. Model context, culture and settings: Easier said than done! Add features that represent the context in which the classification is performed (a small sketch follows this list). This will ensure that the classifier is able to reach a different decision when the same data is used in different settings.

  2. Use personalized models: Your models need to be architected in a way that takes into account user interests and levels of tolerance. This can be done by adding features (pioneering paper) that model user behavior.

  3. Offer users additional meaningful choices: You can reduce ambiguity by providing users with alternative choices that are more meaningful than a generic reporting mechanism. Those more precise choices reduce ambiguity by reducing the number of use cases that are lumped together behind a single ill-defined concept, such as spam.
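
One simple way to implement point 1 is to feed context and settings features alongside the content features, so the model can learn that the same text means different things in different places; here is a small sketch with purely illustrative feature names.

from sklearn.feature_extraction import DictVectorizer

# The same content features appear with different context features (names are illustrative).
examples = [
    {"has_threat_words": 1, "msg_len": 24, "surface": "game_chat", "recipient_is_friend": 1},
    {"has_threat_words": 1, "msg_len": 24, "surface": "news_comments", "recipient_is_friend": 0},
]
vectorizer = DictVectorizer()
X = vectorizer.fit_transform(examples)   # categorical context is one-hot encoded next to content features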

Here is a concrete example of how the addition of meaningful choices reduces ambiguity. Back in 2015, Gmail started offering its users the ability to easily unsubscribe from mailing lists and block senders, giving them more control over their inboxes. Under the hood, these new options help the classifiers, as they reduce the ambiguity of what is marked as spam.

Lack of obvious features

Our fourth and last training challenge is that some products lack obvious features. Until now, we have focused on classifying rich content such as text, binaries and images, but not every product has such rich content.

For example, YouTube has to be defended against fake views, and there are not a lot of obvious features that can be leveraged to do so. Looking at the view count timeline for the famous Gangnam Style video, you will notice two anomalous peaks. These might be from spammers, or simply because the video had huge spikes due to virality. It is impossible to tell just by looking at how the view count grew over time.

In general, AI thrives on feature-rich problems such as text or image classification; however, abuse fighters have to make AI work across the board to protect all users and products. This need to cover the entire attack surface leads us to use AI to tackle use cases that are less than ideal, and sooner or later we have to face a hard truth:

Some products in need of protection don’t have the rich features AI thrives on

Fortunately, you can (partially) work around the lack of rich features. In a nutshell, the way to build an accurate classifier when you don’t have enough content features is to leverage auxiliary data as much as possible. Here are three key sources of auxiliary data you can potentially use:

  1. Context: Everything related to the client software or network can be used, including the user agent, the client IP address and the screen resolution.

  2. Temporal behavior: Instead of looking at an event in isolation, you can model the sequence of actions that is generated by each user. You can also look at the sequence of actions that target a specific artifact, such as a given video. Those temporal sequences provide a rich set of statistical features.

  3. Anomaly detection: It is impossible for an attacker to fully behave like a normal user, so anomaly features can almost always be used to boost detection accuracy.

The last point is not as obvious as it seems, so let’s dive deeper into it.

At its core, what separates rudimentary attackers from advanced ones is their ability to accurately impersonate legitimate user behavior. However, because attackers aim at gaming the system, there will always be some behaviors that they can’t spoof.

It is those unspoofable behaviors that we aim to detect using one-class classification. Introduced circa 1996, the idea behind one-class classification is to use AI to find all the entities belonging to a single class (normal behavior, in our case) out of all the entities that exist in a dataset. Every entity that is not a member of that class is then considered an outlier.

For abuse purposes, one-class classification allows you to detect anomalies and potential attacks even when you have no attack examples. For example, the figure above shows in red a set of malicious IPs attacking Google products that were detected using this type of approach.

Overall, one-class classification is a great complement to more traditional AI systems, as its requirements are fundamentally different. As mentioned earlier, you can even take this one step further and feed the result of your one-class classifier to a standard (binary) classifier to boost its accuracy.
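
As a minimal sketch of this idea, here is a one-class SVM fitted on (presumed) normal behavior only; X_normal and X_live are placeholder feature matrices, for example per-IP behavioral statistics, and the resulting anomaly score can be fed to a downstream binary classifier as suggested above.

from sklearn.svm import OneClassSVM

detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.01)   # nu ~ expected fraction of outliers
detector.fit(X_normal)                                         # trained on normal behavior only

is_outlier = detector.predict(X_live) == -1          # -1 = outlier / potential attack, +1 = inlier
anomaly_score = -detector.decision_function(X_live)  # higher = more anomalous; usable as a feature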

This wraps up our deep dive into the challenges faced while training an anti-abuse classifier. The next post covers the challenges that arise when you start running your classifier in production.

Thank you for reading this post till the end! If you enjoyed it, don’t forget to share it on your favorite social network so that your friends and colleagues can enjoy it too and learn about AI and anti-abuse.

To get notified when my next post is online, follow me on Twitter, Facebook, Google+, or LinkedIn. You can also get the full posts directly in your inbox by subscribing to the mailing list or via RSS.

A bientôt!


While machine learning is integral to innumerable anti-abuse systems, including spam and phishing detection, the road to reaping its benefits is paved with numerous abuse-specific challenges. Drawing from concrete examples, this session will discuss how these challenges are addressed at Google and provide a roadmap for anyone interested in applying machine learning to fraud and abuse problems. Watching this talk will allow you to:

  1. Learn how machine learning helps combat fraud and abuse.
  2. Discover how to overcome the challenges faced when applying machine learning to anti-abuse.
  3. Understand what the unsolved challenges are in the space.

This post explains why artificial intelligence (AI) is the key to building anti-abuse defenses that keep up with user expectations and combat increasingly sophisticated attacks. This is the first post of a series of four dedicated to providing a concise overview of how to harness AI to build robust anti-abuse protections.

The remaining three posts will delve into the top 10 anti-abuse specific challenges encountered while applying AI to abuse fighting, and how to overcome them. Following the natural progression of building and launching an AI-based defense system, the second post will cover the challenges related to training, the third will delve into classification issues and the last one will look at how attackers attempt to attack AI-based defenses.

This series of posts is modeled after the talk I gave at RSA 2018. Here is a re-recording of this talk:

How to Successfully Harness AI to Combat Fraud and Abuse - RSA 2018 - YouTube

You can also get the slides here.

Disclaimer: This series is meant to provide an overview for everyone interested in the subject of harnessing AI for anti-abuse defense, and it is a potential blueprint for those who are making the jump. Accordingly, this series focuses on providing a clear high-level summary, purposely avoiding delving into technical details. That being said, if you are an expert, I am sure you will find ideas and techniques that you haven’t heard about before, and hopefully you will be inspired to explore them further.

Let’s kick off this series with a motivating example.

I am an avid reader of The New York Times, and one of my favorite moments on the site is when I find a comment that offers an insightful perspective that helps me better understand the significance of the news reported. Knowing this, you can imagine my unhappiness, back in September 2017, when The New York Times announced its decision to close the comment sections because it couldn’t keep up with the trolls that relentlessly attempted to derail the conversation :-(

This difficult decision created a backlash from their readership, which felt censored and didn’t understand the reasoning behind it. This led The New York Times to go on record a few days later to explain that it couldn’t keep up with the troll onslaught and felt it had no choice other than closing the comments in order to maintain the quality of the publication.

Conventional protections are failing

The New York Times case is hardly an exception. Many other publications have disallowed comments due to trolling. More generally, many online services, including games and recommendation services, are struggling to keep up with the continuous onslaught of abusive attempts. These struggles are the symptom of a larger issue:

Conventional abuse defenses are falling behind

Three major underlying factors contribute to the failure of conventional protections:

  1. User expectations and standards have dramatically increased. These days, users perceive the mere presence of a single abusive comment, spam email or bad images as a failure of the system to protect them.
  2. The amount and diversity of user-generated content has exploded. Dealing with this explosion requires anti-abuse systems to scale up to cover a large volume of diverse content and a wide range of attacks.
  3. Attacks have become increasingly sophisticated. Attackers never stop evolving, and online services are now facing well-executed, coordinated attacks that systematically attempt to target their defense’s weakest points.
AI is the way forward

So, if conventional approaches are failing, how do we build anti-abuse protection that is able to keep up with those ever-expanding underlying factors? Based on our experience at Google, I argue that:

AI is key to building protections that keep up with user expectations and combat increasingly sophisticated attacks.

I know! The word AI is thrown around a lot these days, and skepticism surrounds it. However, as I am going to explain, there are fundamental reasons why AI is currently the best technology to build effective anti-fraud and abuse protections.

AI to the rescue of The NYT

Before delving into those fundamental reasons, let’s go back to The New York Times story so I can tell you how it ended.

The New York Times story has an awesome ending: not only were the comments reopened, but they were also extended to many more articles.

What made this happy ending possible, under the hood, is an AI system developed by Google and Jigsaw that empowered The NYT to scale up its comment moderation.

This system, called Perspective API, leverages deep learning to assign a toxicity score to the 11,000 comments posted daily on The New York Times site. The NYT comments review team leverages those scores to scale up by focusing only on the potentially toxic comments. Since its release, many websites have adopted Perspective API, including Wikipedia and The Guardian.
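
For readers who want to try it, here is a hedged sketch of a Perspective API toxicity request; the endpoint path and response fields follow the public documentation as I recall it, so double-check them against the current docs, and YOUR_API_KEY is a placeholder.

import requests

API_KEY = "YOUR_API_KEY"   # placeholder: obtained from the Google Cloud console
url = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")
payload = {
    "comment": {"text": "You are an idiot and nobody wants you here."},
    "requestedAttributes": {"TOXICITY": {}},
}
response = requests.post(url, json=payload).json()
score = response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
print(f"toxicity: {score:.2f}")   # roughly 0.0 (benign) to 1.0 (very likely toxic)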

The fundamental reasons behind the ability of AI to combat abuse

Fundamentally, AI allows us to build robust abuse protections because it is able to do the following better than any other system:

  1. Data generalization: Classifiers are able to accurately block content that matches ill-defined concepts, such as spam, by generalizing efficiently from their training examples.
  2. Temporal extrapolation: AI systems are able to identify new attacks based on the ones observed previously.
  3. Data maximization: By nature, an AI is able to optimally combine all the detection signals to come up with the best decision possible. In particular, it is able to exploit the nonlinear relations that exist between the various data inputs.

The final piece of the puzzle that explains why AI is overtaking anti-abuse fighting, and many other fields, is the rise of deep learning. What makes deep learning so powerful is that deep neural networks, in contrast to previous AI algorithms, scale up as more data and computational resources are used.

From an abuse-fighting perspective, this ability to scale up is a game changer because it moves us from a world where more data means more problems to a world where more data means better defense for users.

How deep learning helps Gmail to stay ahead of spammers

Every week, Gmail’s anti-spam filter automatically scans hundreds of billions of emails to protect its billion-plus users from phishing, spam and malware.

The component that keeps Gmail’s filter ahead of spammers is its deep learning classifier. The 3.5% of additional coverage it provides comes mostly from its ability to detect advanced spam and phishing attacks that are missed by the other parts of the filter, including the previous-generation linear classifier.

Deep learning is for everyone

Now, some of you might think that deep learning only works for big companies like Google or that it is still very experimental or too expensive. Nothing could be further from the truth.

Over the last three years, deep learning has become very mature. Between cloud APIs and free frameworks, it is very easy and quick to start benefiting from deep learning. For example, TensorFlow and Keras provide very performant, robust and well-documented frameworks that empower you to build state-of-the-art classifiers with just a few lines of code. You can find pre-trained models here, a list of Keras-related resources here and one for TensorFlow here.
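
To illustrate the "few lines of code" claim, here is a minimal and deliberately generic Keras binary classifier; num_features, X_train and the other data variables are placeholders for your own featurized dataset.

import tensorflow as tf

num_features = 128   # placeholder: size of your feature vector

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(num_features,)),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # probability that the example is abusive
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=10, batch_size=256)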

Challenges ahead

While it is clear that AI is the way forward to build robust defenses, this does not mean that the road to success is without challenges. The next three posts will delve into the top 10 anti-abuse specific challenges encountered while applying AI to abuse fighting, and how to overcome them.

Those 10 challenges are grouped into the following three categories/posts that follow the natural progression of building and launching an AI-based defense system:

  1. Training: This post looks at how to overcome the four main challenges faced while training anti-abuse classifiers, as those challenges are the ones you will encounter first.
  2. Classification: This post delves into the two key problems that arise when you put your classifier in production and start blocking attacks.
  3. Attacks: The last post of the series discusses the four main ways attackers try to derail classifiers and how to mitigate them.

Thank you for reading this post till the end! If you enjoyed it, don’t forget to share it on your favorite social network so that your friends and colleagues can enjoy it too and learn about AI and anti-abuse.

To get notified when my next post is online, follow me on Twitter, Facebook, Google+, or LinkedIn. You can also get the full posts directly in your inbox by subscribing to the mailing list or via RSS.

A bientôt!


This post provides an in-depth analysis of Gooligan’s monetization schemes and recounts how Google took it down with the help of external partners.

This post is the final post of the series dedicated to the hunt and takedown of Gooligan that we did at Google in collaboration with Check Point in November 2016. The first post recounts the Gooligan origin story and offers an overview of how it works. The second one provides an in-depth analysis of Gooligan’s inner workings and its network infrastructure. As this post builds on the previous two, I encourage you to read them if you haven’t done so already.

This series of posts is modeled after the talk I gave on the subject at Botconf in December 2017. Here is a recording of the talk:

Hunting down Gooligan - Botconf 2017 - YouTube

You can also get the slides here, but they are pretty bare.

Monetization

Gooligan’s goal was to monetize the infected devices through two main fraudulent schemes: ad fraud and Android app boosting.

Ad fraud

As shown in the screenshot above, Gooligan periodically uses its root privileges to overlay an ad popup for a legitimate app on top of whatever activity the user is currently doing. Under the hood, Gooligan knows when the user is looking at the phone, as it monitors various key events, including when the screen is turned on.

We don’t have much insight into how effective those ad campaigns were or who was reselling them, as they don’t abuse Google’s ads network, and they use a gazillion HTTP redirects, which makes attribution close to impossible. However, we believe that ad fraud was the main driver of Gooligan’s revenue, given its volume and the fact that we blocked its fake installs, as discussed below.

App Boosting

The second way Gooligan attempted to monetize infected devices was by performing Android app boosting. An app boosting package is a bundle of searches for a specific query on the Play store, followed by an install and a review. The search is used in an attempt to rank the app for a given term. This tactic is commonly peddled in App Store Optimization (ASO) guides.

The reason Gooligan went through the trouble of stealing OAuth tokens and manipulating the Play store is probably that the defenses we put in place are very effective at detecting and discounting fake synthetic installs. Using real devices with real accounts was the Gooligan authors’ attempt to evade our detection systems. Overall, it was a total failure on their side: We caught all the fake installs, and suspended the abusive apps and developers.

As illustrated in the diagram above, the app boosting was done in four steps:

  1. Token stealing: The malware extracts the phone’s long term token from the phone’s accounts.

  2. Taking orders: Gooligan reports phone information to the central command and control system and receives a reply telling it which app to boost, including which search term to use and which comment to leave (if any). Phone information is exfiltrated because the Gooligan authors also had access to non-compromised phones and were trying to use information obtained from Gooligan to fake requests from those phones.

  3. Token exchange: The long term token is exchanged for a short term token that allows Gooligan to access the Play store. We are positive that no user data was compromised, as no other data was ever requested by Gooligan.

  4. Boosting: The fake search, installation, and potential review is carried out through the manipulated Play store app.

Clean-up

Cleaning up Gooligan was challenging for two reasons: First, as discussed in the infection post, its reset persistence mechanism meant that doing a factory reset was not enough to clean up old unpatched devices. Second, the OAuth tokens had been exfiltrated to Gooligan servers.

Asking users to reflash their devices would have been unreasonable, and issuing an OTA (over-the-air) update would have taken too long. Given this difficult context and the need to act quickly to protect our users, we went for an alternative solution that we rarely use: orchestrating a takedown with the help of third parties.

Takedown

With the help of the Shadowserver Foundation and domain registrars, we sinkholed Gooligan domains and got them to point to Shadowserver-controlled IPs instead of IPs controlled by the Gooligan authors. This sinkholing ensured that infected devices couldn’t exfiltrate tokens or receive fraud commands, as they would connect to sinkhole servers instead of the real command and control servers. As shown in the graph above, our takedown was very successful: it blocked over 50M attempts to connect to Gooligan’s control server in 2017.

Notifications

With the sinkhole in place, the second part of the remediation involved resecuring the accounts that were compromised, by disabling the exfiltrated tokens and notifying the users. Notification at that scale is very complex, for three key reasons:

  • Reaching users in a timely fashion across a wide range of devices is difficult. We ended up using a combination of SMS, email, and Android messaging, depending on what communication channel was available.

  • It was important to make the notification understandable and useful to all users. Explaining what was happening clearly and simply took a lot of iteration. We ended up with the notification shown in the screenshot above.

  • Once crafted, the text of the notification and help page had to be translated into the languages spoken by our users. Performing high quality internationalization for over 20 languages very quickly was quite a feat.

Epilogue

Overall, in order to respond to Gooligan, many people, including myself, ended up working long hours through the Thanksgiving weekend (an important holiday in the U.S.). Our commitment to quickly eradicating this threat paid off: On the evening of Monday, November 29th, the takedown took place, followed the next day by the resecuring of the compromised accounts. All in all, this takedown took a mere few days, which is blazing fast when you compare it to other similar ones. For example, the Avalanche botnet takedown took four years of intensive effort.

To conclude, Gooligan was a very challenging malware to tackle, due to its scale and unconventional tactics. We were able to meet this challenge and defeat it, thanks to a cross-industry effort and the involvement of many teams at Google that didn’t go home until users were safe.

Thanks for reading this post all the way to the end. I hope it showcases how we approach botnet fighting and sheds some light on some of the lesser known, yet still critical, activities that our research team assists with.

Thank you for reading this post till the end! If you enjoyed it, don’t forget to share it on your favorite social network so that your friends and colleagues can enjoy it too and learn about Gooligan.

To get notified when my next post is online, follow me on Twitter, Facebook, Google+, or LinkedIn. You can also get the full posts directly in your inbox by subscribing to the mailing list or via RSS.

A bientôt!


This post provides an in-depth analysis of the inner workings of Gooligan, the infamous Android OAuth stealing botnet.

This is the second post of a series dedicated to the hunt and takedown of Gooligan that we did at Google, in collaboration with Check Point, in November 2016. The first post recounts Gooligan’s origin story and provides an overview of how it works. The final post discusses Gooligan’s various monetization schemes and its takedown. As this post builds on the previous one, I encourage you to read it if you haven’t done so already.

This series of posts is modeled after the talk I gave at Botconf in December 2017. Here is a re-recording of the talk:

Hunting down Gooligan - Botconf 2017 - YouTube

You can also get the slides here but they are pretty bare.

Infection

Initially, users are tricked into installing Gooligan’s staging app on their device under one false pretense or another. Once this app is executed, it will fully compromise the device by performing the five steps outlined in the diagram below:

As emphasized in the chart above, the first four stages are mostly borrowed from Ghost Push. The Gooligan authors’ main addition is the code needed to instrument the Play Store app using a complex injection process. This heavy code reuse initially made it difficult for us to separate Ghost Push samples from Gooligan ones. However, as soon as we had analyzed the full kill chain, we were able to write accurate detection signatures.

Payload decoding

Most Gooligan samples hide their malicious payload in a fake image located in assets/close.png. This file is encrypted with a hardcoded XOR encryption function. This encryption is used to escape the signatures that detect the code Gooligan borrows from previous malware. Encrypting malicious payloads is a very old malware trick that has been used by Android malware since at least 2011.

Besides its encryption function, one of the most prominent Gooligan quirks is its weird (and poor) integrity verification algorithm. Basically, the integrity of the close.png file is checked by ensuring that the first ten bytes match the last ten. As illustrated in the diagram above, the oddest part of this scheme is that the first five bytes (val 1) are compared with the last five, while bytes six through ten (val 2) are compared with the first five.
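
For clarity, here is a small Python reconstruction of that integrity check as described above (rewritten from the description, not lifted from the malware), which shows just how redundant the comparison is.

def gooligan_integrity_ok(payload: bytes) -> bool:
    val1 = payload[0:5]    # first five bytes
    val2 = payload[5:10]   # bytes six through ten
    # val1 is compared with the last five bytes, while val2 is compared with... the first five.
    return val1 == payload[-5:] and val2 == payload[0:5]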

Phone rooting

As alluded to earlier, Gooligan, like SnapPea and Ghost Push, weaponizes the Kingroot exploit kit to gain root access. Kingroot operates in three stages: First, the malware gathers information about the phone, which is sent to the exploit server. Next, the server looks up its database of exploits (which only affect Android 3.x and 4.x) and builds a payload tailored for the device. Finally, upon payload reception, the malware runs the payload to gain root access.

The weaponization of known exploits by cyber-criminals who lack exploit development capacity (or don’t want to invest in it) is as old as crimeware itself. For example, DroidDream exploited Exploid and RageAgainstTheCage back in 2011. This pattern is common across every platform. For example, the NSA-leaked exploit EternalBlue was recently weaponized by the fake ransomware NotPetya. If you are interested in ransomware actors, check my posts on the subject.

Persistence setup

Upon rooting the device, Gooligan patches the install-recovery.sh script to ensure that it will survive a factory reset. This resilience mechanism was the most problematic aspect of Gooligan from a remediation perspective, because for the oldest devices it left us with only OTA (over-the-air) updates and device re-flashing as ways to remove it. This situation was due to the fact that very old devices don’t have verified boot, as it was only introduced in Android 4.4.

This difficult context, combined with the urgent need to help our users, led us to resort to a strategy that we rarely use: a coordinated takedown. The goal of this takedown was to disable key elements of the Gooligan infrastructure in a way that would ensure that the malware would be unable to work or update. As discussed in depth at the end of the post, we were able to isolate and take down Gooligan’s core server in less than a week thanks to a wide cross-industry effort. In particular, Kjell from the NorCert worked around the clock with us during the Thanksgiving holidays (thanks for all the help, Kjell!).

Play store app manipulation

The final step of the infection is the injection of a shared library into the Play store app. This shared library allows Gooligan to manipulate the Play store app to download apps and inject reviews.

We traced the injection code back to publicly shared code. The library itself is very bare: the authors added only the code needed to call Play store functions. All the fraud logic is in the main app, probably because the authors are more familiar with Java than C.

Impacted devices

Geo-distribution

Looking at the set of devices infected during the takedown revealed that most of the affected devices were from India, Latin America, and Asia, as visible in the map above. 19% of the infections were from India, and the top eight countries affected by Gooligan accounted for more than 50% of the infections.

Make

In terms of devices, as shown in the bar chart above, the infections are spread across all the big brands, with Samsung and Micromax unsurprisingly being the most affected given their market share. Micromax is the leading Indian phone maker; it is not very well known in the U.S. and Europe because it has no presence there. It started manufacturing Android One devices in 2014 and sells in quite a few countries besides India, most notably Russia.

Attribution

Initial clue

Buried deep inside Gooligan’s patient zero code, Check Point researchers Andrey Polkovnichenko, Yoav Flint Rosenfeld, and Feixiang He, who worked with us during the escalation, found the very unusual text string oversea_adjust_read_redis. This string led to the discovery of a Chinese blog post discussing load balancer configuration, which in turn led to the full configuration file of Gooligan backend services.

#Ads API
        acl is_ads path_beg /overseaads/
        use_backend overseaads if is_ads
        …

#Payment API
        acl is_paystatis path_beg /overseapay/admin/
        use_backend overseapaystatis if is_paystatis
        ...

# Play install
        acl is_appstore path_beg /appstore/
        use_backend overseapaystatis if is_appstore
       ...

Analyzing the exposed HAProxy configuration allowed us to pinpoint where the infrastructure was located and how the backend services were structured. As shown in the annotated configuration snippet above, the backend had APIs for click fraud, receiving payment from clients, and Play store abuse. While not visible above, there was also a complex admin and statistics-related API.

Infrastructure

Combining the API endpoints and IPs exposed in the HAProxy configuration with our knowledge of the Gooligan binary allowed us to reconstruct the infrastructure charted above. Overall, Gooligan was split across two main data centers: one in China and one overseas in the US, which used Amazon AWS IPs. After the takedown, all the infrastructure ended up moving back to China.

Note: In the diagram above, the fraud endpoint appears twice. This is not a mistake: at Gooligan’s peak, its authors split it out to sustain the load and better distribute the requests.

Actor

So, who is behind Gooligan? Based on this infrastructure analysis and other data, we strongly believe that it is a group operating from mainland China. Publicly, the group claims to be a marketing company, while under the hood it is mostly focused on running various fraudulent schemes. The apparent authenticity of its front explains why some reputable companies ended up being scammed by this group. Bottom line: be careful whom you buy ads or installs from: if it is too good to be true...

In the final post of the series, I discuss Gooligan’s various monetization schemes and its takedown. See you there!

Thank you for reading this post till the end! If you enjoyed it, don’t forget to share it on your favorite social network so that your friends and colleagues can enjoy it too and learn about Gooligan.

To get notified when my next post is online, follow me on Twitter, Facebook, Google+, or LinkedIn. You can also get the full posts directly in your inbox by subscribing to the mailing list or via RSS.

A bientôt!


This series of posts recounts how, in November 2016, we hunted for and took down Gooligan, the infamous Android OAuth stealing botnet. What makes Gooligan special is its weaponization of OAuth tokens, something that was never observed in mainstream crimeware before. At its peak, Gooligan had hijacked over 1M OAuth tokens in an attempt to perform fraudulent Play store installs and reviews.

Gooligan marks a turning point in Android malware evolution as the first large scale OAuth crimeware

While I rarely talk publicly about it, a key function of our research team is to assist product teams when they face major attacks. Gooligan’s very public nature and the extensive cross-industry collaboration around its takedown provided the perfect opportunity to shed some light on this aspect of our mission.

Being part of the emergency response task force is a central aspect of our team’s work, as it allows us to focus on helping our users when they need it the most and exposes us to tough challenges in real time, as they occur. Overcoming these challenges fuels our understanding of the security and abuse landscape. Quite a few of our most successful research projects started because of these escalations, including our work on fake phone-verified accounts, the study of HTTPS interception, and the analysis of mail delivery security.

Given the complexity of this subject, I broke it down into three posts to ensure that I can provide a full debrief of what went down and cover all the major aspects of the Gooligan escalation. This first post recounts the Gooligan origin story and offers an overview of how Gooligan works. The second post provides an in-depth analysis of Gooligan’s inner workings and its network infrastructure. The final post discusses Gooligan’s various monetization schemes and its takedown.

This series of posts is modeled after the talk on the subject that I gave with Oren Koriat from Check Point at Botconf in December 2017. You can get the slides here.

As OAuth token abuse is Gooligan’s key innovation, let’s start by quickly summarizing how OAuth tokens work, so it is clear why this is such a game changer.

What are OAuth tokens?

OAuth tokens are the de facto standard for granting apps and devices restricted access to online accounts without sharing passwords and with a limited set of privileges. For example, you can use an OAuth token to only allow an app to read your Twitter timeline, while preventing it from changing your settings or posting on your behalf.

Under the hood, the service provides the app, on your behalf, with an OAuth token that is tied to the exact privileges you want to grant. In a way that is similar but not exactly the same, when you sign up with your Google account on an Android device, Google gives the device a token that allows it to access Google services on your behalf. This is the long term token that Gooligan stole in order to impersonate users on the Play Store. You can read more about Android long term tokens here.

Overview

Overall, Gooligan is made of six key components:

  • Repackaged app: This is the initial payload, which is usually a popular repackaged app that was weaponized. This APK embedded a secondary hidden/encrypted payload.
  • Registration server: Records device information when the device joins the botnet after being rooted.
  • Exploit server: The exploit server is the system that will deliver the exact exploit needed to root the device, based on the information provided by the secondary payload. Having the device information is essential, as Kingroot only targeted unpatched older devices (4.x and below). The post-rooting process is also responsible for backdooring the phone recovery process to enable persistence.
  • Fraudulent app and ads C&C: This infrastructure is responsible for collecting exfiltrated data and telling the malware which (non-Google related) ads to display and which Play store app to boost.
  • Play Store app module: This is an injected library that allows the malware to issue commands to the Play store through the Play store app. This complex process was set up in an attempt to avoid triggering Play store protection.
  • Ads fraud module: This is a module that would regularly display ads to the users as an overlay. The ads were benign and came from an ad company that we couldn’t identify.
Genesis

Analyzing Gooligan’s code allowed us to trace it back to earlier malware families, as it built upon their codebase. While those families are clearly related code-wise, we can't ascertain whether the same actor is behind all of them, because a lot of the shared features were extensively discussed in Chinese blogs.

SnapPea the precursor

As visible in the timeline above, Gooligan’s genesis can be traced back to the SnapPea adware that emerged in March 2015 and was discovered by Check Point in July of the same year. SnapPea’s key innovation was the weaponization of the Kingroot exploit kit, which was until then used by enthusiasts to root their phones and install custom ROMs.

SnapPea’s straightforward weaponization of Kingroot led to a rather unusual infection vector: its authors resorted to backdooring the SnapPea backup application to be able to infect victims. After an Android device was physically connected to an infected PC, the malicious SnapPea application used Kingroot to root the device in order to install malware on it. Gooligan is related to SnapPea because Gooligan also uses Kingroot exploits to root devices, but in an untethered way, via a custom remote server.

Following in SnapPea’s footsteps, Gooligan weaponizes the Kingroot exploits to root old, unpatched Android devices.

Ghost Push the role model

A few months after SnapPea appeared, Cheetah Mobile uncovered Ghost Push, which quickly became one of the largest Android (off-market) botnets. What set Ghost Push apart technically from SnapPea was the addition of code that allowed it to persist through a device reset. This persistence was accomplished by patching, among other things, the recovery script located in the system partition after Ghost Push gained root access in the same way SnapPea did. Gooligan reused the same persistence code.

Gooligan borrowed from Ghost Push the code used to ensure its persistence across device resets.

Wrap-up

As outlined in this post, Gooligan is a complex piece of malware that builds on previous malware generations and extends them to a brand-new attack vector: OAuth token theft.

Gooligan marks a turning point in Android malware evolution as the first large scale OAuth crimeware

Building on this post, the next one in the series will provide an in-depth analysis of Gooligan’s inner workings and its network infrastructure. The final post will discuss Gooligan’s various monetization schemes and its takedown.

Thank you for reading this post till the end! If you enjoyed it, don’t forget to share it on your favorite social network so that your friends and colleagues can enjoy it too and learn about Gooligan.

To get notified when my next post is online, follow me on Twitter, Facebook, Google+, or LinkedIn. You can also get the full posts directly in your inbox by subscribing to the mailing list or via RSS.

A bientôt!

This talk provides a retrospective on how, during 2017, Check Point and Google jointly hunted down Gooligan, one of the largest Android botnets at the time. Besides its scale, what makes Gooligan a worthwhile case study is its heavy reliance on stolen OAuth tokens to attack Google Play’s API, an approach previously unheard of in malware.

This talk starts with an in-depth analysis of how Gooligan’s kill chain works, from infection and exploitation to system-wide compromise. Then, building on various telemetry, we shed light on which devices were infected and how the botnet attempted to monetize the stolen OAuth tokens. Next, we discuss how we uncovered the Gooligan infrastructure and tied it to another prominent malware family: Ghost Push. Last but not least, we recount how we went about re-securing the affected users and taking down the infrastructure.

Ransomware is a type of malware that encrypts the files of infected hosts and demands payment, often in a crypto-currency such as Bitcoin. In this paper, we create a measurement framework that we use to perform a large-scale, two-year, end-to-end measurement of ransomware payments, victims, and operators. By combining an array of data sources, including ransomware binaries, seed ransom payments, victim telemetry from infections, and a large database of Bitcoin addresses annotated with their owners, we sketch the outlines of this burgeoning ecosystem and associated third-party infrastructure.

In particular, we trace the financial transactions, from the moment victims acquire bitcoins, to when ransomware operators cash them out. We find that many ransomware operators cashed out using BTC-e, a now-defunct Bitcoin exchange. In total we are able to track over $16 million in likely ransom payments made by 19,750 potential victims during a two-year period.

While our study focuses on ransomware, our methods are potentially applicable to other cybercriminal operations that have similarly adopted Bitcoin as their payment channel.
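
The measurement pipeline itself is detailed in the paper; purely as a conceptual illustration of the tracing step, the sketch below follows funds from seed ransom addresses through a toy transaction graph until they reach addresses annotated as belonging to an exchange. All addresses, edges, and labels in it are fabricated for the example, not data from the study.

```python
from collections import deque

# Toy illustration of tracing ransom payments through a transaction graph.
# Directed edges: payer address -> list of payee addresses it sent bitcoin to.
tx_graph = {
    "ransom_addr_1": ["mixer_a"],
    "mixer_a": ["mixer_b", "exchange_hot_1"],
    "mixer_b": ["exchange_hot_2"],
}

# Annotated addresses (e.g. known exchange hot wallets), all made up here.
owner_labels = {
    "exchange_hot_1": "BTC-e",
    "exchange_hot_2": "OtherExchange",
}

def trace_cashouts(seed_addresses):
    """Follow outgoing transactions from seed ransom addresses and report
    which labeled owners the funds eventually reach."""
    seen, cashouts = set(seed_addresses), set()
    queue = deque(seed_addresses)
    while queue:
        addr = queue.popleft()
        if addr in owner_labels:  # reached an annotated address: record and stop
            cashouts.add(owner_labels[addr])
            continue
        for nxt in tx_graph.get(addr, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return cashouts

print(trace_cashouts(["ransom_addr_1"]))  # -> {'BTC-e', 'OtherExchange'}
```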

The "Right To Be Forgotten" (RTBF) is the landmark European ruling that governs the delisting of personal information from search results. This ruling establishes a right to privacy, whereby individuals can request that search engines delist URLs from across the Internet that contain “inaccurate, inadequate, irrelevant or excessive” information surfaced by queries containing the name of the requester. What makes this ruling unique and challenging is that it requires search engines to decide whether an individual's right to privacy outweighs the public's right to access lawful information when delisting URLs.

Since this ruling came into effect a little over three years ago (May 2014), Google has received requests to delist ~2.4 million URLs. 43% of these URLs ended up being delisted. Each delisting decision requires careful consideration to strike the right balance between respecting user privacy and ensuring open access to information via Google Search.

To be as transparent as possible about this removal process and to help the public understand how RTBF requests impact Search results, Google has documented it as part of its Transparency Report since 2014.

This initial RTBF transparency report was a great first step toward detailing how the RTBF is used in practice. However, inside Google we felt we could do better and needed to find a way to make more information available. A key challenge was ensuring that we respected users’ privacy and avoided surfacing any details that could lead to de-anonymization or attract attention to specific URLs that were delisted.

So in January 2016, our RTBF reviewers started manually annotating each requested URL with additional category data, including category of site, type of content on the page, and requesting entity. By December 2017, with two full years of carefully categorized additional data, it was clear that we had the means to deliver an improved transparency dashboard, which we made publicly available earlier this week. Together with the data that we have previously published about the Right To Be Forgotten, the new data allowed us to conduct an extensive analysis of how Europe’s right to be forgotten is being used, and how Google is implementing the European Court’s decision. The results of this analysis were published in a paper that we released alongside the improved transparency dashboard. This blog post summarizes our paper’s key findings.

Who uses the right to be forgotten?

89% of requesters were private individuals, the default label when no other special category applied. That being said, in the last two years, non-government public figures such as celebrities requested to delist 41,213 URLs; politicians and government officials requested to delist another 33,937 URLs.

89% of right to be forgotten requests originate from private individuals, but public figures use the RTBF too. In the last two years, government officials requested to delist ~33k URLs; celebrities requested to delist ~41k URLs.

The top 1,000 requesters, 0.25% of individuals filing RTBF requests, were responsible for 15% of the requests. Many of these frequent requesters are in fact not individuals themselves, but law firms and reputation management services representing individuals.

A minority of requesters (0.25%) are responsible for a large fraction of the Right To Be Forgotten requests (~15%).

What is the RTBF used for?

Breaking down removal requests by site type revealed that 31% of the requested URLs related to social media and directory services that contained personal information, while 21% of the URLs related to news outlets and government websites that, in a majority of cases, covered the requester's legal history. The remaining 48% of requested URLs covered a broad diversity of content on the Internet.

The two dominant intents behind the Right To Be Forgotten delisting requests are removing personal information and removing legal history.

What type of information is targeted?

The most commonly targeted content related to professional information, which rarely met the criteria for delisting: only 16.7% of these requested URLs ended up being delisted. Many of these requests pertained to information that was directly relevant or connected to the requester’s current profession and was therefore in the public interest to remain indexed by Google Search.

Different countries, different usages

The way the RTBF is exercised across Europe varies by country. Variations in regional attitudes toward privacy, local laws, and media norms strongly influence the types of URLs requested for delisting. Notably, citizens of France and Germany frequently requested delisting of social media and directory pages, while requesters from Italy and the United Kingdom were three times more likely to target news sites.

Right To Be Forgotten use is country-specific. French citizens frequently request social media delistings, whereas UK requesters are 3x more likely to target news sites.

RTBF requests mostly target local content

Over 77% of the requests are for URLs hosted on domains whose top-level domain is associated with the requester's country. For example, peoplecheck.de lives under .de, the German top-level domain. At least 86% of the requests targeting the top 25 news outlets came from requesters in the same country.

Right To Be Forgotten delisting requests are mostly used to remove local content.
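
As a rough illustration of how a "local content" share like the 77% figure above can be computed, here is a minimal sketch that compares each requested URL's top-level domain with the requester's country code. The request records are fabricated examples, and the bare last-label comparison ignores real-world ccTLD subtleties, so this is not the methodology used for the report.

```python
from urllib.parse import urlparse

# Toy illustration of the "local content" statistic: the share of requested
# URLs whose country-code TLD matches the requester's country. The records
# below are made-up examples.
requests = [
    {"requester_country": "de", "url": "https://peoplecheck.de/profile/123"},
    {"requester_country": "fr", "url": "https://example.fr/annuaire"},
    {"requester_country": "uk", "url": "https://news.example.com/story"},
]

def is_local(request):
    """Naively treat the last hostname label as the TLD and compare it
    with the requester's country code."""
    host = urlparse(request["url"]).hostname or ""
    tld = host.rsplit(".", 1)[-1]
    return tld == request["requester_country"]

local_share = sum(is_local(r) for r in requests) / len(requests)
print(f"{local_share:.0%} of requests target local content")  # -> 67%
```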

Thank you for reading this post till the end! If you enjoyed it, don’t forget to share it on your favorite social network so that your friends and colleagues can enjoy it too and learn how the Right To Be Forgotten is used across Europe.

To get notified when my next post is online, follow me on Twitter, Facebook, Google+, or LinkedIn. You can also get the full posts directly in your inbox by subscribing to the mailing list or via RSS.

A bientôt!
