Using binsnitch.py to detect files touched by malware

Yesterday, we released binsnitch.py – a tool you can use to detect unwanted changes to the file system. The tool and its documentation are available here: https://github.com/NVISO-BE/binsnitch.

Binsnitch can be used to detect silent (unwanted) changes to files on your system. It will scan a given directory recursively for files and keep track of any changes it detects, based on the SHA256 hash of the file. You have the option to either track executable files (based on a static list available in the source code), or all files.
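To make the idea concrete, here is a minimal Python sketch of recursive SHA256 hashing. It is not the actual binsnitch.py code: the function names, the chunk size and the extension list are assumptions made for illustration (binsnitch.py ships its own static list of executable extensions).

```python
import hashlib
import os

# Hypothetical extension list for this sketch; binsnitch.py has its own list.
EXECUTABLE_EXTENSIONS = {".exe", ".dll", ".sys", ".bat", ".ps1"}


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root, all_files=False):
    """Map every tracked file path under root to its SHA256 hash."""
    hashes = {}
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            if not all_files and extension not in EXECUTABLE_EXTENSIONS:
                continue
            path = os.path.join(directory, name)
            try:
                hashes[path] = sha256_of_file(path)
            except OSError:
                # Locked or unreadable files are simply skipped in this sketch.
                continue
    return hashes
```

Calling snapshot("C:/") would return a dictionary of path-to-hash pairs for executables only, while snapshot("C:/", all_files=True) would cover every readable file.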

Binsnitch.py can be used for a variety of use cases, including:

  • Use binsnitch.py to create a baseline of trusted files for a workstation (golden image) and use it again later on to automatically generate a list of all modifications made to that system (for example caused by rogue executables installed by users, or dropped malware files). The baseline could also be used for other detection purposes later on (e.g., in a whitelist);
  • Use binsnitch.py to automatically generate hashes of executables (or all files if you are feeling adventurous) in a certain directory (and its subdirectories);
  • Use binsnitch.py during live malware analysis to carefully track which files are touched by malware (this is the topic of this blog post).

In this blog post, we will use binsnitch.py during the analysis of a malware sample (VirusTotal link: https://virustotal.com/en/file/adb63fa734946d7a7bb7d61c88c133b58a6390a1e1cb045358bfea04f1639d3a/analysis/).

A summary of the options available in binsnitch.py at the time of writing:

usage: binsnitch.py [-h] [-v] [-s] [-a] [-n] [-b] [-w] dir

positional arguments:
  dir               the directory to monitor

optional arguments:
  -h, --help        show this help message and exit
  -v, --verbose     increase output verbosity
  -s, --singlepass  do a single pass over all files
  -a, --all         keep track of all files, not only executables
  -n, --new         alert on new files too, not only on modified files
  -b, --baseline    do not generate alerts (useful to create baseline)
  -w, --wipe        start with a clean db.json and alerts.log file

We are going to use binsnitch.py to detect which files are created or modified by the sample. We start our analysis by creating a “baseline” of all the executable files in the system. We will then execute the malware and run binsnitch.py again to detect changes to disk.

Creating the baseline

[Screenshot: the command used to create the baseline of our entire system.]

We only need a single pass of the file system to generate the clean baseline of our system (using the “-s” option). In addition, we are not interested in generating any alerts yet (again: we are merely generating a baseline here!), hence the “-b” option (baseline). Finally, we run with the “-w” argument to start with a clean database file.
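Put together, the three options lead to an invocation along the following lines. This is a minimal Python sketch that only assembles and prints the argument list. The helper name, the location of binsnitch.py and the use of the current Python interpreter are assumptions; the flags and the “C:/” target come from the description in this post.

```python
import sys


def baseline_command(directory, script="binsnitch.py"):
    """Build the arguments for a single-pass (-s), alert-free (-b), clean (-w) run."""
    return [sys.executable, script, "-s", "-b", "-w", directory]


# Print the command line instead of running it; the resulting list could be
# passed to subprocess.run() once binsnitch.py is in the working directory.
print(" ".join(baseline_command("C:/")))
```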

After launching the command, binsnitch.py will start hashing all the executable files it discovers, and write the results to a folder called binsnitch_data. This can take a while, especially if you scan an entire drive (“C:/” in this case).

[Screenshot: baseline creation in progress … time to fetch some cheese in the meantime! 🐀 🧀]

After the command has completed, we check the alerts file in “binsnitch_data/alerts.log”. As we ran with the “-b” option to generate a baseline, we don’t expect to see alerts:

[Screenshot: baseline successfully created! No alerts in the file, as we expected.]

Looks good! The baseline was created in 7 minutes.

We are now ready to launch our malware and let it do its thing (of course, we do this step in a fully isolated sandbox environment).

Running the malware sample and analyzing changes

Next, we run the malware sample. After that, we can run binsnitch.py again to check which executable files have been created (or modified):

[Screenshot: scanning our system again to detect changes to disk performed by the sample.]

We again use the “-s” flag to do a single pass of all executable files on the “C:/” drive. In addition, we also provide the “-n” flag: this ensures we are not only alerted on modified executable files, but also on new files that might have been created since the creation of the baseline. Don’t run using the “-w” flag this time, as this would wipe the baseline results. Optionally, you could also add the “-a” flag, which would track ALL files (not only executable files). If you do so, make sure your baseline is also created using the “-a” flag (otherwise, you will be facing a ton of alerts in the next step!).
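The difference between modified and new files can be pictured with a small Python sketch. It is not the binsnitch.py implementation: the function name, the dictionary layout (path mapped to SHA256 hash) and the sample paths and hashes are assumptions for illustration.

```python
def compare_snapshots(baseline, current, alert_on_new=False):
    """Return (modified, new) path lists for two {path: sha256} mappings."""
    modified = sorted(
        path
        for path, digest in current.items()
        if path in baseline and baseline[path] != digest
    )
    new = []
    if alert_on_new:
        # Corresponds to the idea behind "-n": files absent from the baseline.
        new = sorted(path for path in current if path not in baseline)
    return modified, new


# Hypothetical data: one unchanged file, one modified file, one new file.
baseline = {"C:/a.exe": "1111", "C:/b.exe": "2222"}
current = {"C:/a.exe": "1111", "C:/b.exe": "9999", "C:/Temp/c.exe": "3333"}

print(compare_snapshots(baseline, current))
print(compare_snapshots(baseline, current, alert_on_new=True))
```

The same picture explains the warning about “-a”: if the second mapping covers all files while the baseline only covers executables, every non-executable file shows up as new.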

Running the command above will again take a few minutes (in our example, it took 2 minutes to rescan the entire “C:/” drive for changes). The resulting alerts file (“binsnitch_data/alerts.log”) looks as follows:

[Screenshot: bingo! Observing the alerts.log file, we can now clearly spot suspicious behaviour. 🔥]

A few observations based on the above:

  • The malware file itself was detected in “C:/malware”. This is expected, of course: the file was not present in our baseline, as we had to copy it onto the system in order to run it;
  • A bunch of new files are detected in the “C:/Program Files (x86)/” folder;
  • More suspicious though are the new executable files created in “C:/Users/admin/AppData/Local/Temp” and the startup folder.

The SHA256 hash of the newly created startup item is readily available in the alerts.log file: 8b030f151c855e24748a08c234cfd518d2bae6ac6075b544d775f93c4c0af2f3

Doing a quick VirusTotal search for this hash results in a clear “hit” confirming our suspicion that this sample is malicious (see below). The filename on VirusTotal also matches the filename of the executable created in the C:/Users/admin/AppData/Local/Temp folder (“A Bastard’s Tale.exe”).

[Screenshot: VirusTotal confirms that the dropped file is malicious.]
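When the alerts file contains many entries, pulling out the hashes by hand gets tedious. The following Python sketch collects every token that looks like a SHA256 hash (64 hexadecimal characters) from a piece of text. It makes no assumption about the exact alerts.log layout; the sample line below is a hypothetical format, and only the hash in it is taken from this post.

```python
import re

# A SHA256 digest in hex is exactly 64 hexadecimal characters.
SHA256_PATTERN = re.compile(r"\b[0-9a-fA-F]{64}\b")


def extract_sha256(text):
    """Return the unique SHA256-looking tokens in text, in order of appearance."""
    seen = []
    for match in SHA256_PATTERN.findall(text):
        digest = match.lower()
        if digest not in seen:
            seen.append(digest)
    return seen


# Hypothetical log line; the real alerts.log layout may differ.
sample_line = (
    "new file detected: C:/example/startup_item.exe "
    "8b030f151c855e24748a08c234cfd518d2bae6ac6075b544d775f93c4c0af2f3"
)
print(extract_sha256(sample_line))
```

The resulting list can then be checked one hash at a time on VirusTotal.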

You can also dive deeper into the details of the scan by opening “binsnitch_data/db.json” (warning: this file can grow huge over time, especially when using the “-a” option!):

[Screenshot: details on the scanned files. If a file is modified over time, the different hashes per file are tracked here, too.]

From here on, you would continue your investigation into the behaviour of the sample (network, services, memory, etc.) but this is outside the scope of this blog post.

We hope you find binsnitch.py useful during your own investigations. Let us know on GitHub if you have any suggestions for improvements, or if you want to contribute yourself!

Squeak out! 🐁

Daan

Developing complex Suricata rules with Lua – part 2

In part 1 we showed a Lua program to have Suricata detect PDF documents with obfuscated /JavaScript names. In this second part we provide some tips to streamline the development of such programs.

When it comes to developing Lua programs, Suricata is not the best development environment. The “write code & test”-cycle with Suricata can be quite tedious. One of the reasons is that it takes time. It can take 1 minute or more to start Suricata with the new rule, and have it process a test pcap file. And if there are errors in your Lua script, Suricata will not be of much help to identify and fix these errors.

Inspired by Emerging Threats Lua scripts, we adopted the following development method:

Test the script with a standalone Lua interpreter, and move to Suricata for the final tests.

This is one of the reasons why, in part 1, we put the logic of our test in function PDFCheckName which takes a string as input and is called by the match function. By doing this, we can also call (and test) the function from other functions with a standalone Lua interpreter as shown below:

function Test()
    print()
    print(PDFCheckName("testing !!!", true))
    print()
    print(PDFCheckName("testing /JavaScript and more /J#61vaScript !!!", true))
    print()
    print(PDFCheckName("testing /JavaScript and !!!", true))
    print()
    print(PDFCheckName("testing /J#61vaScript !!!", true))
    print()
end

This Test function calls PDFCheckName with different strings as input. We also added extra print statements to the function (see complete source code below), which are activated by the second argument of function PDFCheckName. This boolean argument, bVerbose, adds verbosity to our function when the argument is true.

We can load the Lua program in a Lua interpreter, and then call function Test. One way to do this is to type the command “lua -i PDFCheckName.lua”, and then type Test() at the Lua prompt. This can all be scripted in a single command like this:

echo Test() | lua -i PDFCheckName.lua

With the following result:

[Screenshot: output of the Test function in the Lua interpreter.]

This “code & run”-cycle is faster than using Suricata, and can be more verbose. Of course, you can also do this with an IDE like Eclipse.

We also added a function TestFile that reads a file (the PDFs we want to test), and then calls PDFCheckName with the content of the PDF file as the argument:

[Screenshot: the TestFile function; its source is part of the complete code at the end of this post.]

This produces the following output:

[Screenshot: output of the TestFile function.]

Being able to test a PDF file directly is also a big advantage compared to having to create a PCAP file with an HTTP request that downloads the PDF file to test.

Conclusion

By using functions and a standalone Lua interpreter, we can significantly improve the development process of Lua programs for Suricata.

Code

-- 2017/02/20

-- echo Test() | lua53.exe -i test.lua
-- echo TestFile() | lua53.exe -i test.lua javascript.pdf

tBlacklisted = {["/JavaScript"] = true}

function PDFCheckName(sInput, bVerbose)
    if bVerbose then
        print('sInput: ' .. sInput)
    end
    for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do
        if bVerbose then
            print('sMatchedName: ' .. sMatchedName)
        end
        if sMatchedName:find("#") then
            local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)
            if bVerbose then
                print('sNormalizedName: ' .. sNormalizedName)
            end
            if tBlacklisted[sNormalizedName] then
                if bVerbose then
                    print('Blacklisted!')
                end
                return 1
            end
        end
    end
    if bVerbose then
        print('Not blacklisted!')
    end
    return 0
end

function init(args)
    return {["http.response_body"] = tostring(true)}
end

function match(args)
    return PDFCheckName(tostring(args["http.response_body"]), false)
end

function Test()
    print()
    print(PDFCheckName("testing !!!", true))
    print()
    print(PDFCheckName("testing /JavaScript and more /J#61vaScript !!!", true))
    print()
    print(PDFCheckName("testing /JavaScript and !!!", true))
    print()
    print(PDFCheckName("testing /J#61vaScript !!!", true))
    print()
end

function TestFile()
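    -- Open in binary mode so the PDF content is read unaltered (matters on Windows);
    -- assert reports a clear error when the file cannot be opened.
    local file = assert(io.open(arg[1], "rb"))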
    print()
    print(PDFCheckName(file:read("*all"), true))
    file:close()
end

Developing complex Suricata rules with Lua – part 1

The Suricata detection engine supports rules written in the embeddable scripting language Lua. In this post we give a PoC Lua script to detect PDF documents with name obfuscation.

One of the elements that make up a PDF is a name. A name is a reserved word that starts with the character / followed by alphanumerical characters. Example: /JavaScript. The presence of the name /JavaScript is an indication that the PDF contains scripts (written in JavaScript).

The PDF specification allows for the substitution of alphanumerical characters in a name by a hexadecimal representation: /J#61vaScript. #61 is the hexadecimal representation of the letter a. We call the use of this hexadecimal representation in names “name obfuscation”, because it is a simple technique to evade detection by engines that just look for the normal, unobfuscated name (/JavaScript).

There is no limit to the number of characters in a name that can be replaced by their hexadecimal representation. That makes it impossible to write a Suricata/Snort rule (using content, pcre, …) that will detect all possible obfuscations of the name /JavaScript. However, it is easy to write a program that normalizes obfuscated names (pdfid does this, for example).
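To make the idea of normalization concrete, here is a minimal sketch in Python. Python is used only for illustration; this is not how pdfid or Suricata implement it, and the names HEX_ESCAPE and normalize_pdf_name are our own.

```python
import re

# A "#" followed by exactly two hexadecimal digits encodes one character.
HEX_ESCAPE = re.compile(r"#([0-9a-fA-F]{2})")


def normalize_pdf_name(name):
    """Replace every #xx escape in a PDF name with the character it encodes."""
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), name)


# Unobfuscated, partially obfuscated and fully obfuscated variants.
for candidate in ("/JavaScript", "/J#61vaScript", "/#4A#61#76#61#53#63#72#69#70#74"):
    print(candidate, "->", normalize_pdf_name(candidate))
```

All three variants normalize to /JavaScript, which is why a comparison after normalization catches obfuscations that a fixed content match cannot enumerate.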

Fortunately, Suricata has supported the programming language Lua for some time now. Let’s take a look at how we can use this to detect PDF files that contain the obfuscated name /JavaScript (FYI: all PDF files we observed with an obfuscated /JavaScript name were malicious, so it’s a good test to detect malicious PDFs).

A Suricata Lua script has to implement 2 functions: init and match.

The init function declares the data we need from Suricata to be able to do our analysis. For the PDF document, we need the HTTP response body:

function init(args)
    return {["http.response_body"] = tostring(true)}
end

The match function needs to contain the actual logic to analyze the payload. We need to retrieve the HTTP response body, analyze it, and return 1 if we detect something. When nothing is detected, we need to return 0.

In this example of the match function, we detect if the HTTP response body is equal to string test:

function match(args)
    local a = tostring(args["http.response_body"])
    if a == "test" then
        return 1
    else
        return 0
    end
end

To detect obfuscated /JavaScript names we use this code:

tBlacklisted = {["/JavaScript"] = true}

function PDFCheckName(sInput)
    for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do
        if sMatchedName:find("#") then
            local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)
            if tBlacklisted[sNormalizedName] then
                return 1
            end
        end
    end
    return 0
end

Function PDFCheckName takes a string as input (sInput) and then starts to search for names in the input:

sInput:gmatch("/[a-zA-Z0-9_#]+")

For each name we find, we check if it contains a # character (i.e. if it could be obfuscated):

if sMatchedName:find("#") then

When this is the case, we try to normalize the name (replace each #hexadecimal sequence with the corresponding character):

local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)

And finally, we check if the normalized name is in our blacklist:

if tBlacklisted[sNormalizedName] then
    return 1
end

In that case we return 1. And otherwise we return 0.

The complete Lua script:

tBlacklisted = {["/JavaScript"] = true}

function PDFCheckName(sInput)
    for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do
        if sMatchedName:find("#") then
            local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)
            if tBlacklisted[sNormalizedName] then
                return 1
            end
        end
    end
    return 0
end

function init(args)
    return {["http.response_body"] = tostring(true)}
end

function match(args)
    return PDFCheckName(tostring(args["http.response_body"]))
end

To get Suricata to run our Lua script, we need to copy it into the rules directory and add a rule to call the script, like this:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"NVISO PDF file lua"; flow:established,to_client; luajit:pdfcheckname.lua; classtype:policy-violation; sid:1000000; rev:1;)

Rule option luajit allows us to specify the Lua script we want to execute (pdfcheckname.lua).

That’s all there is to do to get this running.

But on production systems, we will quickly get into trouble because of performance issues. The rule that we wrote will get the Lua script to execute on all HTTP traffic with incoming data. To avoid this, it is best to add pre-conditions to the rule so that the program will only run on downloaded PDF files:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"NVISO PDF file lua"; flow:established,to_client; file_data; content:"%PDF-"; within:5; luajit:pdfcheckname.lua; classtype:policy-violation; sid:1000000; rev:1;)

This updated rule checks that the file starts with %PDF- (that’s an easy trick to detect a PDF file, but be aware that there are ways to bypass this simple detection).

For some environments, checking all downloaded PDF files might still cause performance problems. This updated rule uses a regular expression to check if the downloaded PDF file contains a (potentially) obfuscated name:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"NVISO PDF file lua"; flow:established,to_client; file_data; content:"%PDF-"; within:5; pcre:"/\/.{0,10}#[a-f0-9][a-f0-9]/i"; luajit:pdfcheckname.lua; classtype:policy-violation; sid:1000000; rev:1;)

Note that in the regular expression of this rule we expect that the name is not longer than 11 characters (that’s the case with the name we want to detect, /JavaScript). So if you add your own names to the blacklist, and they are longer than 11 characters, then update the regular expression in the rule.
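The effect of the length limit can be checked outside Suricata. The sketch below applies the same pattern with Python's re module, which behaves like PCRE for an expression this simple (an assumption worth re-checking for more complex patterns). The name /EmbeddedFile#73 is a hypothetical example of a longer name, not one from our blacklist.

```python
import re

# Same expression as in the rule: "/", up to 10 characters, then "#" and two hex digits.
PREFILTER = re.compile(r"/.{0,10}#[a-f0-9][a-f0-9]", re.IGNORECASE)


def passes_prefilter(data):
    """Return True when the rule's regular expression would match the data."""
    return PREFILTER.search(data) is not None


print(passes_prefilter("/JavaScript"))       # no escape at all
print(passes_prefilter("/J#61vaScript"))     # escape early in the name
print(passes_prefilter("/JavaScrip#74"))     # escape in the last position
print(passes_prefilter("/EmbeddedFile#73"))  # escape after 12 characters
```

The first and last calls print False and the two in between print True: the escape in the 13-character name sits beyond the 10-character window, so the Lua script would never be invoked for it unless the window in the rule is widened.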

Conclusion

Support for Lua in Suricata makes it possible to develop complex analysis methods that would not be possible with simple rules. However, performance needs to be taken into account.

In part 2 of this blog post, we will provide some tips to help with the development and testing of Lua scripts for Suricata.