Loading...

FireEye recently posted some research about an attack leveraging the NetSupport Remote Access tool. The first stage of this attack uses a lot of obfuscation tricks to try to make reverse engineering more difficult.

David Ledbetter and I were chatting about some of the lengths the malware authors went through to obfuscate the content.

One of the major sources of complication is a complicated, iterative XOR:

(Image credit FireEye)

Unlike most malware that obfuscates content by XORing the content with a single byte / key, this malware appears to do something much more clever. See the content starting at ‘var tmpKeyLength = 1;’?

  1. XOR each character of the content with the first byte of the encryption key
  2. XOR characters of the content with bytes from the encryption key in the following pattern: 1, 2, 1, 2, 1, 2, 1, 2, …
  3. XOR characters of the content with bytes from the encryption key in the following pattern: 1, 2, 3, 1, 2, 3, 1, 2, 3, …

When malware uses step #1 alone -- or even a repeating single-key XOR -- I like to call it “Encraption”. It appears complicated, but is vulnerable to many forms of cryptanalysis and can be easily broken. Given that this malware did several levels of Encraption, did they manage to finally invent something more secure than a basic repeating key XOR?

Not even close.

XOR is Associative

One of the biggest challenges with using XOR in cryptography is that it is associative: you can rearrange parenthesis without impacting the final result. For example, consider again a single byte key and the following double XOR encryption:

  1. Take the content
  2. XOR each character by the value ‘123’
  3. XOR each character by the value ‘321’
  4. Emit the result

If we were to add parenthesis to describe the order of operations:

(Content XOR 123) XOR 321

Because XOR is associative, you can rearrange parenthesis (order of operations) to make it:

Content XOR (123 XOR 321)

Which gives 314:

So, encraption with two keys is still just encraption with a single key and is vulnerable to all of the same attacks.

But what about that rolling key?

The malware above used something more like a rolling key, however. It didn’t take a couple of single bytes. If the content was 100 bytes, it did 100 rounds of XOR based on the key. Surely that must be secure.

Fortunately? Unfortunately? The answer is no. If we remember the malware’s algorithm:

    1. XOR each character of the content with the first byte of the encryption key
    2. XOR characters of the content with bytes from the encryption key in the following pattern: 1, 2, 1, 2, 1, 2, 1, 2, …
    3. XOR characters of the content with bytes from the encryption key in the following pattern: 1, 2, 3, 1, 2, 3, 1, 2, 3, …

Consider the perspective of a single character. It gets encrapted by one byte of the key, and then a different byte of the key, and then a different byte of the key… and so on. And because XOR is associative, as we demonstrated above, that is the same thing as the single character being encrapted by a single byte.

A PowerShell Demonstration

Let’s take a look at a demonstration of this in PowerShell.

First, let’s look at a faithful re-implementation of the original algorithm:

001
002
003
004
005
006
007
008
009
010
011
012
013
014
015
016
017

$stringToEncrypt = [char[]] "Hello World!"
$encrypted = $stringToEncrypt.Clone()
$key = 97,4,13,252,119,31,208,156,196,56

$tmpKeyLength = 1
while($tmpKeyLength -le $key.Length)
{
    $tmpKey = $key[0..$tmpKeyLength]
    for($i = 0; $i -lt $encrypted.Length; $i++)
    {
        $encrypted[$i] = $encrypted[$i] -bxor $tmpKey[$i % $tmpKey.Length]
    }
    $tmpKeyLength++
}

"Encrypted:"
([byte[]]$encrypted) | Format-Hex | Out-String

When you take a look at the result, here’s what you get:

Pretty impressive! Look at all those non-ASCII characters. This must be unbreakable!

To get the equivalent single-pass encraption key, we can just XOR the encrapted string with the original string. How?

XOR is Commutative

We can do this because XOR is commutative as well: you can rearrange the order of terms without impacting the result.

If Encrapted is:

Content XOR Key1 XOR Key2 XOR Key3

then we do:

Encrapted XOR Content

then we get:

Content XOR Key1 XOR Key2 XOR Key3 XOR Content

Because XOR is commutative, we can rearrange terms to get:

Content XOR Content XOR Key1 XOR Key2 XOR Key3

Anything XOR’d with itself can be ignored

One of the reasons XOR encraption works is that anything XOR’d with itself can be ignored. For example:

Encrypted = Content XOR Key

Decrypted = Encrypted XOR Key

By XORing some content with a key twice, you get back the original content. So back to where we got with the last section, if we XOR the final result with the original content and rearrange, we get:

Content XOR Content XOR Key1 XOR Key2 XOR Key3

That gives us an equivalent single key that we can use: Key1 XOR Key2 XOR Key3.

Here’s an example of figuring out this new single-pass key:

001
002
003
004
005
006
007
008

$newKey = New-Object 'char[]' $stringToEncrypt.Length
for($i = 0; $i -lt $stringToEncrypt.Length; $i++)
{
    $newKey[$i] = $encrypted[$i] -bxor $stringToEncrypt[$i] 
}

""
"Equivalent key: " + (([int[]]$newKey) -join ",")

And the result:

And to prove they give equivalent results:

001
002
003
004
005
006
007
008
009

$encrypted = $stringToEncrypt.Clone()
for($i = 0; $i -lt $encrypted.Length; $i++)
{
    $encrypted[$i] = $encrypted[$i] -bxor $newKey[$i]
}

""
"Easy encrypted:"
([byte[]]$encrypted) | Format-Hex | Out-String

The Nelson Moment

This is where we get to point and laugh a little. Remember how the malware repeatedly XORs the content with various bits of the key? It did far more damage than the malware author realized. Let’s consider the character in the content at position 0 for a moment:

    1. XOR with byte 0 of the key
    2. XOR with byte 0 of the key (thereby stripping the encraption altogether!)
    3. XOR with byte 0 of the key
    4. XOR with byte 0 of the key (thereby stripping the encraption altogether!)

Let’s consider byte 1 of the content

    1. XOR with byte 0 of the key
    2. XOR with byte 1 of the key
    3. XOR with byte 1 of the key (thereby stripping the work done in step 2)
    4. XOR with byte 1 of the key
    5. XOR with byte 1 of the key (thereby stripping the work done in step 4)

Depending on the length of the key and the content, this pattern of alternating doing work and removing the work that was previously done continues. This now takes what could have been a potentially strong key of 100s of bytes down to a super expensive way to compute single key!! Here’s a demo of encrapting some GUIDs:

Notice that some characters (28 at the beginning, c9 at offset 0x48) didn’t even get encrypted at all?

Remember folks, don’t roll your own crypto, and especially don’t roll your own crapto.

Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

K-Means clustering is a popular technique to find clusters of data based only on the data itself. This is most commonly applied to data that you can somehow describe as a series of numbers.

When you can describe the data points as a series of numbers, K-Means clustering (Lloyd’s Algorithm) takes the following steps:

  1. Randomly pick a set of group representatives. Lloyd’s algorithm generally picks random coordinates, although sometimes picks specific random data points.
  2. Assign all of the items to the nearest representative.
  3. Update the group representative to more accurately represent its members. In Lloyd’s algorithm, this means updating the location of the representative to represent the average location of every item assigned to it.
  4. Revisit all of the items, assigning them to their nearest group representative.
  5. If any items shifted groups, repeat steps 3-5.

Applying this technique directly to words is not possible, as words don’t have coordinates. Because of that:

  • Randomly picking a coordinate cannot be used to randomly create a group representative.
  • Refining a group representative based on its current word cluster is more complicated than simply averaging the coordinates of the items in the cluster.

If we follow the philosophy of Lloyd’s algorithm, however, we can still create a version of K-Means Clustering for Words. In our implementation, we:

  1. Pick random words from the provided list as group representatives.
  2. Use Levenshtein Distance (string similarity) to measure "nearest group representative".
  3. Use word averaging to update the nearest group representative. Word averaging is a new word of the "average" word length, with characters at each position created by taking the most common letter at that position.

This is very computationally expensive for large data sets, but can provide some very reasonable clustering for small data sets.

To explore this further, you can download Get-WordCluster from the PowerShell Gallery. It’s as simple as:

  1. Install-Script Get-WordCluster –Scope CurrentUser
  2. (-split "Hello world Hell word SomethingElse") | Get-WordCluster -Count 3
Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

When you first get your ham radio license, the FCC gives you a random call sign based on your location and roughly your date of application. The resulting call sign is usually pretty impersonal, but the FCC lets you apply for a “vanity” call sign for free.

While the rules for these vanity call signs change depending on your license class (Technician, General, Extra), most of the good (shorter) vanity call signs that fall under the “extra” rules are taken. So realistically, your options will be likely be a 2x3 callsign (group C or group D).

One place to look for available call signs is at http://callsign.ualr.edu. Please be kind and don’t pound their server! If you want to do hundreds / thousands of lookups, you can download the FCC database directly.

But if you only want to look at tens / hundreds, here’s a PowerShell script to help out.

001
002
003
004
005
006
007
008
009
010
$final3 = "LEE"
$prefixes = "WX","KZ"

foreach($prefix in $prefixes) {
    foreach($number in 0..9) {
        $call = $prefix + $number + $final3
        $wr = Invoke-WebRequest http://callsign.ualr.edu/detail.php -Method Post -Body @{ call = $call }
        if($wr.Content -match "no records were found!") { $call }
    }
}

Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

You might have run into situations in the past where you’re looking for some specific text or binary sequence, but that content is encoded with Base-64. Base-64 is an incredibly common encoding format in malware, and all kinds of binary obfuscation tools alike.

The basic idea behind Base-64 is that it takes arbitrary binary data and encodes it into 64 (naturally) ASCII characters that can be transmitted safely over any normal transmission channel. Wikipedia goes into the full details here: https://en.wikipedia.org/wiki/Base64.

Some tooling supports decoding of Base-64 automatically, but that requires some pretty detailed knowledge of where the Base-64 starts and stops.

The Problem

Pretend you’re looking for the string, “Hello World” in a log file or SIEM system, but you know that it’s been Base-64 encoded. You might use PowerShell’s handy Base-64 classes to tell you what to search for:

That seems useful. But what if “Hello World” is in the middle of a longer string? Can you still use ‘SGVobG8gV29fbGQ=’? It turns out, no. Adding a single character to the beginning changes almost everything:

Now, we’ve got ‘IEhlbGxvIFdvcmxk’.

The main problem here is the way that Base-64 works. When we’re encoding characters, Base-64 takes 3 characters (24 bits) and re-interprets them as 4 segments of 6 bits each. It then encodes each of those 6 bits into the 64 characters that you know and love. Here’s a graphical example from Wikipedia:

So when we add a character to the beginning, we shift the whole bit pattern to the right and change the encoding of everything that follows!

Another feature of Base-64 is padding. If your content isn’t evenly divisible by 24 bits, Base-64 encoding will pad the remainder with null bytes. It will use the “=” character to denote how many extra padding blocks were used:

When final padding is added, you can’t just remove those "=” characters. If additional content is added to the end of your string (i.e.: “Hello World!”), that additional content will influence both the padding bytes, as well as the character before them.

Another major challenge is when the content is Unicode rather than ASCII. All of these points still apply – but the bit patterns change. Unicode usually represents characters as two bytes (16 bits). This is why the Base-64 encoding of Unicode content representing ASCII text has so many of the ‘A’ character: that is the Base-64 representation of a NULL byte.

The Solution

When you need to search for content that’s been Base-64 encoded, then, the solution is to generate the text at all possible three-byte offsets, and remove the characters that might be influenced by the context: content either preceding what you are looking for, or the content that follows. Additionally, you should do this for both the ASCII as well as Unicode representations of the string.

An Example

One example of Base-64 content is in PowerShell’s –EncodedCommand parameter. This shows up in Windows Event Logs if you have command-line logging enabled (and of course shows up directly if you have PowerShell logging enabled).

Here’s an example of an event log like that:

Here’s an example of launching a bunch of PowerShell instances with the –EncodedCommand parameter, as well as the magical Get-Base64RegularExpression command. That command will generate a regular expression that you can use to match against that content:

As you can see in this example, searching for the Base-64 content of “My voice is my” returned all four log entries, while the “My voice is my passport” search returned the single event log that contained the whole expression.

The Script

Get-Base64RegularExpression is a pretty simple script. You can use this in PowerShell, or any event log system that supports basic regular expression searches.

001
002
003
004
005
006
007
008
009
010
011
012
013
014
015
016
017
018
019
020
021
022
023
024
025
026
027
028
029
030
031
032
033
034
035
036
037
038
039
040
041
042
043
044
045
046
047
048
049
050
051
052
053
054
055
056
057
058
059
060

## Get-Base64RegularExpression.ps1
## Get a regular expression that can be used to search for content that has been
## Base-64 encoded

param(
    ## The value that we would like to search for in Base64 encoded content
    [Parameter(Mandatory)]
    $Value
)

## Holds the various byte representations of what we're searching for
$byteRepresentations = @()

## If we got a string, look for the Unicode and ASCII representations of the string
if($Value -is [String])
{
    $byteRepresentations += 
        [System.Text.Encoding]::Unicode.GetBytes($Value),
        [System.Text.Encoding]::ASCII.GetBytes($Value)
}

## If it was a byte array directly, look for the byte representations
if($Value -is [byte[]])
{
    $byteRepresentations += ,$Value
}

## Find the safe searchable sequences for each Base64 representation of input bytes
$base64sequences = foreach($bytes in $byteRepresentations)
{
    ## Offset 0. Sits on a 3-byte boundary so we can trust the leading characters.
    $offset0 = [Convert]::ToBase64String($bytes)

    ## Offset 1. Has one byte from preceeding content, so we need to throw away the
    ## first 2 leading characters
    $offset1 = [Convert]::ToBase64String( (New-Object 'Byte[]' 1) + $bytes ).Substring(2)

    ## Offset 2. Has two bytes from preceeding content, so we need to throw away the
    ## first 4 leading characters
    $offset2 = [Convert]::ToBase64String( (New-Object 'Byte[]' 2) + $bytes ).Substring(4)

    ## If there is any terminating padding, we must remove the characters mixed with that padding. That
    ## ends up being the number of equals signs, plus one.
    $base64matches = $offset0,$offset1,$offset2 | % {
        if($_ -match '(=+)$')
        {
            $_.Substring(0, $_.Length - ($matches[0].Length + 1))
        }
        else
        {
            $_
        }
    }

    $base64matches | ? { $_ }
}

## Output a regular expression for these sequences
"(" + (($base64sequences | Sort-Object -Unique) -join "|") + ")"

Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

Separate tags by commas
To access this feature, please upgrade your account.
Start your free month
Free Preview