
Hunting malware with metadata

A while ago Michel wrote a blog post Tracking threat actors through .LNK files.

In this post, we want to illustrate how VirusTotal (retro) hunting can be leveraged to extract malware samples and metadata linked to a single threat actor. We use the power of YARA rules to pinpoint the metadata we are looking for.

With some of the metadata extracted from the .LNK file we wrote about in our previous blog post (Volume ID and MAC address), we’re going to search on VirusTotal for samples with that metadata. It is clear from the MAC address 00:0C:29:5A:39:04 that the threat actor used a virtual machine to build malware: 00:0C:29 is an OUI owned by VMware. We wonder if the same VM was used to create other samples.
With a VirusTotal Intelligence subscription, one can search through the VirusTotal sample database, for example with YARA rules. We use the following YARA rule for the metadata:

rule MALDOC_LNK {
    strings:
        $BirthObjectId = {C2 CC 13 98 18 B9 E2 41 82 40 54 A8 AD E2 0A 9A}
        $MACAddress = {00 0C 29 5A 39 04}
    condition:
        all of them
}

VTI supports hunting and retro-hunting with YARA rules. With hunting, you are notified each time a newly submitted sample matches one of your YARA rules on the VT servers. With retro-hunting, your YARA rules are used to scan through 75TB of samples in the VT database, which corresponds more or less to the set of samples submitted in the last three months.
Here is the result from a retro-hunt using YARA rule MALDOC_LNK:

The next step is to download and analyse all these samples. Since we did not include a file type condition in our YARA rule, we get different types of files: Word .doc files, .lnk files, raw OLE streams containing .lnk files, and MIME files (e-mails with Word documents as attachments).
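Triage of such a mixed set can be automated by checking magic bytes; a minimal sketch (the signature list is ours and illustrative, not part of the original analysis):

```python
# Rough triage of downloaded samples by their leading bytes.
# Illustrative signatures only; real triage tools check far more.
SIGNATURES = [
    (b"\xd0\xcf\x11\xe0", "OLE file (e.g. Word .doc)"),
    (b"L\x00\x00\x00", ".lnk file"),
    (b"MIME-Version", "MIME file (e-mail)"),
]

def triage(data):
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"
```

The .lnk signature is the HeaderSize field (0x4C, little-endian) that every shell link file starts with.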
With this command we search for strings containing “http” in the samples:

So we see that the same virtual machine has been used to create several samples. Here we extract the commands launched via the .lnk file:

There are 2 types of commands: downloading one executable, and downloading one executable plus a decoy document.

The metadata from the OLE files reveals that the virtual machine has been used for a couple of weeks:

Conclusion

With metadata and VirusTotal, it is possible to identify samples created by the same actor over a period of 3 months. These samples can provide new metadata and IOCs.

Analysis of a CVE-2017-0199 Malicious RTF Document

There is a new exploit (CVE-2017-0199) going around for which a patch was released by Microsoft on April 11, 2017. In this post, we analyze an RTF document exploiting this vulnerability and provide a YARA rule for detection.

rtfdump.py is a Python tool to analyze RTF documents. Running it on our sample produces a list with all “entities” in the RTF document (text enclosed between {}):

This is often a huge list with a lot of information. But here, we are interested in OLE 1.0 objects embedded within this RTF file. We can use the filter with option -f O for such objects:

There are 2 entities (objdata and datastore) with indices 153 and 249 (the index is a number generated by rtfdump; it is not part of the RTF code). The content of an object is encoded with hexadecimal characters in an RTF file; entity 153 contains 5448 hexadecimal characters. So let’s take a look by selecting this entity for deeper analysis with option -s 153:

In this hex/ascii dump, we can see that the text starts with 01050000 02000000, indicating an OLE 1.0 object. As the second line starts with d0cf11e0, we can guess it contains an OLE file.

With option -H, we can convert the hexadecimal characters to binary:
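What -H does boils down to stripping whitespace and decoding the hex characters; a minimal Python sketch of the same conversion:

```python
import binascii
import re

def hex_to_binary(objdata_text):
    # RTF writers insert whitespace and newlines inside the hex dump;
    # remove them before decoding.
    cleaned = re.sub(r"\s", "", objdata_text)
    return binascii.unhexlify(cleaned)
```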

Now we can see the string OLE2Link, which has often been referred to when talking about this zero-day. With option -i, we can get more information about the embedded object:

So it is clearly an embedded OLE file, and the name OLE2Link followed by a zero byte was chosen to identify this embedded OLE file. With option -E, we can extract the embedded object:

Since this is an OLE file, we can analyze it with oledump.py: we dump the file with option -d and pipe it into oledump:

The OLE file contains 2 streams. Let’s take a look at the first stream:

We can recognize a URL, let’s extract it with strings:

Because of vulnerability CVE-2017-0199, this URL will automatically be downloaded. The web server serving this document will identify it as an HTA file via a Content-Type header:
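The response headers vary per campaign, but the relevant part would look along these lines (hypothetical values):

```
HTTP/1.1 200 OK
Content-Type: application/hta
```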

Because this download is performed by the URL Moniker, this moniker will recognize the content-type and open the downloaded file with Microsoft’s HTA engine. The downloaded HTA file might look to us like an RTF file, but the HTA parser will find the VBS script and execute it:

This VBS script performs several actions, ultimately downloading and executing a malicious executable.

Detection

Let’s take a second look at the first stream in the OLE file (the stream with the malicious URL):

The byte sequence that we selected here (E0 C9 EA 79 F9 BA CE 11 8C 82 00 AA 00 4B A9 0B) is the binary representation of the URL Moniker GUID: {79EAC9E0-BAF9-11CE-8C82-00AA004BA90B}. Notice that the binary byte sequence and the text representation of the GUID are partially reversed; this is typical for GUIDs.
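Python’s uuid module knows about this mixed byte order, so we can verify the equivalence directly:

```python
import uuid

# The first three GUID fields are stored little-endian, the last two
# big-endian; bytes_le interprets the raw bytes accordingly.
raw = bytes.fromhex("E0C9EA79F9BACE118C8200AA004BA90B")
print(uuid.UUID(bytes_le=raw))  # 79eac9e0-baf9-11ce-8c82-00aa004ba90b
```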

After the URL Moniker GUID, there is a length field, followed by the malicious URL (and then followed by a file closing sequence, …).
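This layout makes the URL easy to extract programmatically: find the GUID, read the 32-bit length that follows, and take that many bytes. A sketch under the assumptions just described (the helper name is ours):

```python
import struct

URL_MONIKER_GUID = bytes.fromhex("E0C9EA79F9BACE118C8200AA004BA90B")

def extract_url(stream):
    # Locate the URL Moniker GUID, then read the length-prefixed,
    # NULL-terminated URL that follows it.
    offset = stream.find(URL_MONIKER_GUID)
    if offset == -1:
        return None
    length_offset = offset + len(URL_MONIKER_GUID)
    (length,) = struct.unpack_from("<I", stream, length_offset)
    url = stream[length_offset + 4:length_offset + 4 + length]
    return url.rstrip(b"\x00").decode("ascii", "replace")
```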

We use the following YARA rule to hunt for these RTF documents:

rule rtf_objdata_urlmoniker_http {
 strings:
 $header = "{\\rtf1"
 $objdata = "objdata 0105000002000000" nocase
 $urlmoniker = "E0C9EA79F9BACE118C8200AA004BA90B" nocase
 $http = "68007400740070003a002f002f00" nocase
 condition:
 $header at 0 and $objdata and $urlmoniker and $http
 }
 

Remark 1: we do not search for the string OLE2Link.

Remark 2: with a bit of knowledge of the RTF language, it is trivial to modify documents to bypass detection by this rule.

Remark 3: the search for http:// (string $http) is case-sensitive; you can omit it if you want (it will not trigger on https, for example).

Remark 4: there is no test for the order in which these strings appear.

Happy hunting!

New Hancitor maldocs keep on coming…

Didier Stevens will provide NVISO training on malicious documents at Brucon Spring: Malicious Documents for Blue and Red Teams.

For more than half a year now, we have seen malicious Office documents delivering Hancitor malware via a combination of VBA, shellcode and an embedded executable. The VBA code decodes and executes the shellcode; the shellcode hunts for the embedded executable, decodes it and executes it.

From the beginning, the embedded executable was encoded with a bit more complexity than a simple XOR operation. Here in the shellcode we see that the embedded executable is decoded by adding 3 to each byte and XORing the result with 17. After base64 decoding, the EXE is recovered.

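With those parameters (add 3, XOR with 17, then base64-decode), the decoding can be sketched in a few lines of Python:

```python
import base64

def decode_embedded_exe(encoded):
    # Add 3 to each byte (mod 256), XOR the result with 17,
    # then base64-decode to recover the EXE.
    transformed = bytes(((b + 3) & 0xFF) ^ 17 for b in encoded)
    return base64.b64decode(transformed)
```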

The gang behind Hancitor steadily delivered new maldocs, without changing much about this encoding method. Until, about 2 months ago, we started to see samples where the XOR key was a WORD (2 bytes) instead of a single byte.

Recently we received a sample that changed the encoding of the embedded executable again. This sample still uses macros, shellcode and an embedded executable:


The encoded shellcode is still in a form (stream 16), and the embedded executable is still in data (stream 5), appended after a PNG image:


If we look at the embedded executable, we see that the pattern has changed: in the beginning, we see a pattern of 4 repeating bytes. This is a strong indication that the group started to adopt a DWORD (4 bytes) key:


We can try to recover the XOR key by performing a known-plaintext attack: up until now, the embedded executables were base64 encoded and started with TVqQAA… Let’s use xor-kpa to try to recover the key:

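The idea behind a known-plaintext attack on repeating-key XOR: XOR the ciphertext against the expected plaintext; where they line up, the result is the repeating key. A simplified sketch (not the actual xor-kpa implementation):

```python
def kpa_candidates(ciphertext, plaintext=b"TVqQAA", keylen=4):
    # Slide the known plaintext over the ciphertext; wherever the XOR of
    # the two repeats with the given period, the repeating part is a
    # candidate key.
    candidates = set()
    for i in range(len(ciphertext) - len(plaintext) + 1):
        stream = bytes(c ^ p for c, p in zip(ciphertext[i:], plaintext))
        if all(stream[j] == stream[j % keylen] for j in range(len(stream))):
            candidates.add(stream[:keylen])
    return candidates
```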

We still find no key after trying all add values between 1 and 16. Could it be that this time it is just XOR encoded, without addition? Let’s try:


Indeed! The key is xP4?.

We can now decode and extract the embedded executable:


Conclusion

The gang behind Hancitor has been creating complex malicious documents to deliver their malware, and we constantly have to keep our analysis techniques up to date.

Developing complex Suricata rules with Lua – part 2

In part 1 we showed a Lua program to have Suricata detect PDF documents with obfuscated /JavaScript names. In this second part we provide some tips to streamline the development of such programs.

When it comes to developing Lua programs, Suricata is not the best development environment. The “write code & test”-cycle with Suricata can be quite tedious. One of the reasons is that it takes time. It can take 1 minute or more to start Suricata with the new rule, and have it process a test pcap file. And if there are errors in your Lua script, Suricata will not be of much help to identify and fix these errors.

Inspired by Emerging Threats Lua scripts, we adopted the following development method:

Test the script with a standalone Lua interpreter, and move to Suricata for the final tests.

This is one of the reasons why, in part 1, we put the logic of our test in the function PDFCheckName, which takes a string as input and is called by the match function. By doing this, we can also call (and test) the function from other functions with a standalone Lua interpreter, as shown below:

function Test()
    print()
    print(PDFCheckName("testing !!!", true))
    print()
    print(PDFCheckName("testing /JavaScript and more /J#61vaScript !!!", true))
    print()
    print(PDFCheckName("testing /JavaScript and !!!", true))
    print()
    print(PDFCheckName("testing /J#61vaScript !!!", true))
    print()
end

This Test function calls PDFCheckName with different strings as input. We also added extra print statements to the function (see complete source code below), which are activated by the second argument of function PDFCheckName. This boolean argument, bVerbose, adds verbosity to our function when the argument is true.

We can load the Lua program in a Lua interpreter, and then call function Test. One way to do this is type command “lua -i PDFCheckName.lua”, and then type Test() at the Lua prompt. This can all be scripted in a single command like this:

echo Test() | lua -i PDFCheckName.lua

With the following result:


This “code & run”-cycle is faster than using Suricata, and can be more verbose. Of course, you can also do this with an IDE like Eclipse.

We also added a function TestFile that reads a file (the PDFs we want to test), and then calls PDFCheckName with the content of the PDF file as the argument:


This produces the following output:


Being able to test a PDF file directly is also a big advantage, compared to having to create a PCAP file with an HTTP request downloading the PDF file to test.

Conclusion

By using functions and a standalone Lua interpreter, we can significantly improve the development process of Lua programs for Suricata.

Code

-- 2017/02/20

-- echo Test() | lua53.exe -i test.lua
-- echo TestFile() | lua53.exe -i test.lua javascript.pdf

tBlacklisted = {["/JavaScript"] = true}

function PDFCheckName(sInput, bVerbose)
    if bVerbose then
        print('sInput: ' .. sInput)
    end
    for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do
        if bVerbose then
            print('sMatchedName: ' .. sMatchedName)
        end
        if sMatchedName:find("#") then
            local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)
            if bVerbose then
                print('sNormalizedName: ' .. sNormalizedName)
            end
            if tBlacklisted[sNormalizedName] then
                if bVerbose then
                    print('Blacklisted!')
                end
                return 1
            end
        end
    end
    if bVerbose then
        print('Not blacklisted!')
    end
    return 0
end

function init(args)
    return {["http.response_body"] = tostring(true)}
end

function match(args)
    return PDFCheckName(tostring(args["http.response_body"]), false)
end

function Test()
    print()
    print(PDFCheckName("testing !!!", true))
    print()
    print(PDFCheckName("testing /JavaScript and more /J#61vaScript !!!", true))
    print()
    print(PDFCheckName("testing /JavaScript and !!!", true))
    print()
    print(PDFCheckName("testing /J#61vaScript !!!", true))
    print()
end

function TestFile()
    local file = io.open(arg[1])
    print()
    print(PDFCheckName(file:read("*all"), true))
    file:close()
end

Developing complex Suricata rules with Lua – part 1

The Suricata detection engine supports rules written in the embeddable scripting language Lua. In this post we give a PoC Lua script to detect PDF documents with name obfuscation.

One of the elements that make up a PDF, is a name. A name is a reserved word that starts with character / followed by alphanumerical characters. Example: /JavaScript. The presence of the name /JavaScript is an indication that the PDF contains scripts (written in JavaScript).

The PDF specification allows for the substitution of alphanumerical characters in a name by a hexadecimal representation: /J#61vaScript. #61 is the hexadecimal representation of the letter a. We call the use of this hexadecimal representation in names “name obfuscation”, because it is a simple technique to evade detection by engines that just look for the normal, unobfuscated name (/JavaScript).

There is no limit to the number of characters in a name that can be replaced by their hexadecimal representation. That makes it impossible to write a Suricata/Snort rule (using content, pcre, …) that detects all possible obfuscations of the name /JavaScript. However, it is easy to write a program that normalizes obfuscated names (pdfid does this, for example).
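Normalizing an obfuscated name is indeed simple in a general-purpose language; for example, in Python (equivalent in spirit to what pdfid does):

```python
import re

def normalize_pdf_name(name):
    # Replace every #xx escape with the character it encodes.
    return re.sub(r"#([0-9a-fA-F]{2})",
                  lambda m: chr(int(m.group(1), 16)),
                  name)
```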

Fortunately, Suricata has supported the programming language Lua for some time now. Let’s take a look at how we can use this to detect PDF files that contain the obfuscated name /JavaScript (FYI: all PDF files we observed with an obfuscated /JavaScript name were malicious, so it’s a good test to detect malicious PDFs).

A Suricata Lua script has to implement 2 functions: init and match.

The init function declares the data we need from Suricata to be able to do our analysis. For the PDF document, we need the HTTP response body:

function init(args)
    return {["http.response_body"] = tostring(true)}
end

The match function needs to contain the actual logic to analyze the payload. We need to retrieve the HTTP response body, analyze it, and return 1 if we detect something. When nothing is detected, we need to return 0.

In this example of the match function, we detect if the HTTP response body is equal to string test:

function match(args)
    a = tostring(args["http.response_body"])
    if a == "test" then
        return 1
    else
        return 0
    end
end

To detect obfuscated /JavaScript names we use this code:

tBlacklisted = {["/JavaScript"] = true}

function PDFCheckName(sInput)
    for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do
        if sMatchedName:find("#") then
            local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)
            if tBlacklisted[sNormalizedName] then
                return 1
            end
        end
    end
    return 0
end

Function PDFCheckName takes a string as input (sInput) and then starts to search for names in the input:

sInput:gmatch("/[a-zA-Z0-9_#]+")

For each name we find, we check if it contains a # character (i.e. if it could be obfuscated):

if sMatchedName:find("#") then

When this is the case, we try to normalize the name (replace #hexadecimal with corresponding ANSI code):

local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)

And finally, we check if the normalized name is in our blacklist:

if tBlacklisted[sNormalizedName] then
    return 1
end

In that case we return 1. And otherwise we return 0.

The complete Lua script:

tBlacklisted = {["/JavaScript"] = true}

function PDFCheckName(sInput)
    for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do
        if sMatchedName:find("#") then
            local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)
            if tBlacklisted[sNormalizedName] then
                return 1
            end
        end
    end
    return 0
end

function init(args)
    return {["http.response_body"] = tostring(true)}
end

function match(args)
    return PDFCheckName(tostring(args["http.response_body"]))
end

To get Suricata to run our Lua script, we need to copy it in the rules directory and add a rule to call the script, like this:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"NVISO PDF file lua"; flow:established,to_client; luajit:pdfcheckname.lua; classtype:policy-violation; sid:1000000; rev:1;)

Rule option luajit allows us to specify the Lua script we want to execute (pdfcheckname.lua).

That’s all there is to do to get this running.

But on production systems, we will quickly get into trouble because of performance issues. The rule that we wrote will get the Lua script to execute on all HTTP traffic with incoming data. To avoid this, it is best to add pre-conditions to the rule so that the program will only run on downloaded PDF files:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"NVISO PDF file lua"; flow:established,to_client; file_data; content:"%PDF-"; within:5; luajit:pdfcheckname.lua; classtype:policy-violation; sid:1000000; rev:1;)

This updated rule checks that the file starts with %PDF- (that’s an easy trick to detect a PDF file, but be aware that there are ways to bypass this simple detection).

For some environments, checking all downloaded PDF files might still cause performance problems. This updated rule uses a regular expression to check if the downloaded PDF file contains a (potentially) obfuscated name:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"NVISO PDF file lua"; flow:established,to_client; file_data; content:"%PDF-"; within:5; pcre:"/\/.{0,10}#[a-f0-9][a-f0-9]/i"; luajit:pdfcheckname.lua; classtype:policy-violation; sid:1000000; rev:1;)

Note that in the regular expression of this rule we expect that the name is not longer than 11 characters (that’s the case with the name we want to detect, /JavaScript). So if you add your own names to the blacklist, and they are longer than 11 characters, then update the regular expression in the rule.

Conclusion

Support for Lua in Suricata makes it possible to develop complex analysis methods that would not be possible with simple rules, however performance needs to be taken into account.

In part 2 of this blog post, we will provide some tips to help with the development and testing of Lua scripts for Suricata.

Hunting with YARA rules and ClamAV

Did you know the open-source anti-virus ClamAV supports YARA rules? What benefits can this bring us? One of the important features ClamAV has is its file decomposition capability. Say the file you want to analyze resides in an archive, or is a packed executable: ClamAV will unarchive/unpack the file and run the YARA engine on it.

Let’s start with a simple YARA rule to detect the string “!This program cannot be run in DOS mode”:

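Based on the rule name (test1) and the string mentioned above, the rule is along these lines (our reconstruction; the string identifier is assumed):

```yara
rule test1
{
    strings:
        $a = "!This program cannot be run in DOS mode"
    condition:
        $a
}
```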

When we scan the notepad.exe PE file with this YARA rule, the rule (test1) triggers.

We can do the same with clamscan:


With option -d (database), we bypass ClamAV’s signature database defined in clamd.conf and instruct clamscan to use the YARA rule test1.yara.

As shown in the example above, using clamscan on the PE file notepad.exe also triggers the previously created YARA rule test1.yara: YARA.test1.UNOFFICIAL.

In this example we decided to use just one YARA rule for simplicity, but of course you can use several YARA rules together with ClamAV’s signature database. Just put your YARA rules (extension .yara or .yar) in the database folder.

As mentioned in the introduction, ClamAV can also look inside ZIP files and apply the YARA rules on all files found in archives:


This is something the standard YARA tool cannot do:


ClamAV’s YARA rules support does, however, have some limitations. You cannot use modules (like the PE file module), or use YARA rule sets that contain external variables, tags, private rules, global rules, … Every rule must also have strings to search for (at least 2 bytes long). Rules with a condition but without strings are not supported.

Let us take a look at a rule to detect if a file is a PE file (see appendix for the details of the rule):


We get a warning from ClamAV: “yara rule contains no supported string”.

As ClamAV does not support rules without a strings: section, we must add a string to search for, even if the rule logic itself does not need it. Since a PE file contains the string MZ, let’s search for that:
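Reconstructed, the ClamAV-friendly version of the rule would look something like this (the rule and string names are assumed):

```yara
rule pe_file
{
    strings:
        $mz = "MZ"
    condition:
        $mz at 0 and uint32(uint32(0x3C)) == 0x00004550
}
```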


This time the rule triggers.

Now, a tricky case: how do we design a rule when we have no single string to search for? The ClamAV developers offer a work-around for such cases: search for any string, and add a condition checking for the presence OR absence of the string. Like this:


We search for string $a = “any string will do”, and we add condition ($a or not $a). It’s a bit of a hack, but it works.
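Written out, such a work-around rule would look something like this (rule name assumed; the PE checks come from the condition discussed below):

```yara
rule pe_file_workaround
{
    strings:
        $a = "any string will do"
    condition:
        ($a or not $a) and uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x00004550
}
```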

ClamAV’s file decomposition features bring a lot to the table when it comes to YARA scanning, but in some cases it can be a bit too much. For example, ClamAV decompresses the VBA macro streams in Office documents for scanning. This means that we can use YARA rules to scan VBA source code. A simple rule searching for the words AutoOpen and Declare would trigger on all Word documents with macros that run automatically and use the Windows API, which is very useful to detect potential maldocs. However, ClamAV applies this YARA rule to all files and to decomposed/contained files. So if we feed ClamAV all kinds of files (not only MS Office files), the rule could also trigger on, for example, text files or e-mails that contain the words AutoOpen and Declare.

If we could limit the scope of selected YARA rules to certain file types, this would help. Currently ClamAV supports signatures that are only applied to given file types (PE files, OLE files, …); unfortunately, this is not supported for YARA rules.

ClamAV is an interesting engine to run our YARA rules instead of the standard YARA engine. It has some limitations however, that can also generate false positives if we are not careful with the rules we use or design.

Deconstructing the YARA rule

Our example rule to detect a PE file contains just a condition:

uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x00004550

This rule does not use string searches. It checks a couple of values to determine if a file is a PE file. The checks it performs are:

  • see if the file starts with an MZ header, and
  • whether it contains a PE header.

First check: the first 2 bytes of the file are equal to MZ: uint16(0) == 0x5A4D.

Second check: the field (a 32-bit integer) at position 0x3C contains a pointer to the PE header. A PE header starts with the bytes PE followed by 2 NULL bytes: uint32(uint32(0x3C)) == 0x00004550.

Functions uint16 and uint32 are little-endian, so we have to write the bytes in reverse order: MZ = 0x4D5A -> 0x5A4D
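For clarity, the same two checks expressed in Python:

```python
import struct

def looks_like_pe(data):
    # MZ magic at offset 0, then follow e_lfanew (the 32-bit little-endian
    # value at offset 0x3C) to the PE\0\0 signature.
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    e_lfanew = struct.unpack_from("<I", data, 0x3C)[0]
    return data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"
```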

Maldoc: It’s not all VBA these days

Since late 2014 we have witnessed a resurgence of campaigns spamming malicious Office documents with VBA macros. Sometimes, however, we also see malicious Office documents exploiting relatively recent vulnerabilities.

In this blog post we look at a malicious MS Office document that uses an exploit instead of VBA.

The sample we received is 65495b359097c8fdce7fe30513b7c637. It exploits vulnerability CVE-2015-2545 which allows remote attackers to execute arbitrary code via a crafted EPS image, aka “Microsoft Office Malformed EPS File Vulnerability”. In this blog post we want to focus on extracting the payload.

A more detailed explanation on the exploit itself can be found here (pdf).

Analysis

The sample we received is a .docx file, with oledump.py we can confirm it doesn’t contain VBA code:


With zipdump.py (remember that the MS Office 2007+ file format uses ZIP as a container) we can see what’s inside the document:


Looking at the extensions, we see that most files are XML files. There’s also a .gif and .eps file. The .eps file is unusual. Let’s check the start of each file to see if the extensions can be trusted:


This confirms the extensions we see: 12 XML files, a GIF file and an EPS file. As we know there are exploits for EPS, we took a look at this file first:


The file contains a large number of lines like the one above, so let’s get some stats first:


byte-stats.py gives us all kinds of statistics about the content of the file. First of all, it’s a large file for an MS Office document (10MB). And it’s a pure text file (it only contains printable characters (96%) and whitespace (4%)).

There is a large number of bytes that are hexadecimal characters (66%) and BASE64 characters (93%). Since the hexadecimal character set is a subset of the BASE64 character set, we need more info to determine if the file contains hexadecimal strings or BASE64 strings. But it very likely contains some, as there are parts of the file (buckets of 10240 bytes) that contain only hexadecimal/BASE64 characters (100%).

base64dump.py is a tool to search for BASE64 strings (and other encodings like hexadecimal). We will use it to search for a payload in the EPS file. Since the file is large, we can expect to have a lot of hits. So let’s set a minimum sequence length of 1000:

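The core of that search can be approximated with a regular expression that looks for long, even-length runs of hexadecimal characters (a sketch; base64dump itself does considerably more):

```python
import re

def find_hex_sequences(text, minimum=1000):
    # Runs of hex characters at least `minimum` long; odd-length runs
    # cannot be decoded to bytes, so keep only even-length ones.
    pattern = r"[0-9a-fA-F]{%d,}" % minimum
    return [m.group(0) for m in re.finditer(pattern, text)
            if len(m.group(0)) % 2 == 0]
```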

There are 4 large sequences of BASE64 characters in the document. But as the start of each sequence (field Encoded) contains only hexadecimal characters, it’s necessary to check for hexadecimal encoding too:


With this output, it’s clear that the EPS file contains 4 large hexadecimal strings.

Near the start of decoded strings 2 and 4, we can see the characters MZ: this could indicate a PE file. So let’s check:


This certainly looks like a PE file. Let’s pipe it through pecheck.py (we need to skip the first 8 bytes: UUUUffff):


This tells us that it is definitively a PE file. With more details from pecheck’s output, we can say it’s a 64-bit DLL. It has a small overlay:


Since this overlay is actually just 8 bytes (UUUUffff), it’s not a real overlay but a “sentinel”, like the one at the start of the hexadecimal sequence. So let’s remove it:


We did submit this DLL to VirusTotal: 30ec672cfcde4ea6fd3b5b14d6201c43.

It has some interesting strings:


Like the string of the PDB file: GetDownLoader. And a PowerShell command to download and execute an .exe (the URL is readable).

Also notice that the string “This program cannot be run in DOS mode.” appears twice. This is a strong indication that this DLL contains another PE file.

Let’s search for it. By using the --cut operator to search for another instance of the string MZ, we can cut out the embedded PE file:

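In essence, the cut operation here looks for the next occurrence of MZ after the first one and keeps everything from there on; a simplified equivalent:

```python
def carve_from_second_mz(data):
    # Find the second MZ header and return the data from there onwards.
    first = data.find(b"MZ")
    if first == -1:
        return None
    second = data.find(b"MZ", first + 2)
    return data[second:] if second != -1 else None
```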

We also submitted this file to VirusTotal: 2938d6eda6cd941e59df3dd54bf8dad8. It is a 32-bit EXE file.

The hexadecimal string with Id 4 found with base64dump also contains a PE file. It is a 32-bit DLL (ce95faf23621a0a705b796c19d9fec44), containing the same 32-bit EXE as the 64-bit DLL:  2938d6eda6cd941e59df3dd54bf8dad8.

Conclusion

With a steady flow of VBA maldocs for more than 2 years now, one would almost forget that Office maldocs with exploits are also found in-the-wild. If you just look for VBA macros in documents you receive, you will miss these exploits.

In this sample, detecting a payload was not too difficult: we found an unusual file (large .eps file) with long hexadecimal strings that decode to PE-files. It’s not always that easy, especially if we are dealing with binary MS Office files (like .doc).

In this post we focus on a static analysis method to extract the payload. When performing analysis on this file yourself, be aware that this maldoc also contains shellcode (strings 1 and 3 found by base64dump) and an exploit to break out of the sandbox (protected view).