Malicious Document Targets Belgian Users

In this blog post I want to show how a malicious document (maldoc) behaves and how it can be analyzed with free tools.

A couple of weeks ago many users in Belgium received an e-mail, supposedly from a courier company, informing them that a package was waiting for them (article in Dutch).

This is an example of the e-mail:

20161114-142948

This e-mail contains a link to a Word document:

20161114-142226

The Word document contains VBA macro code to download and execute malware (downloader behavior). But MS Word contains protection features that prevent the code from running when the document is opened in Word.

First of all, since the Word document was downloaded from the Internet, it will be marked as such, and MS Word will open the document in Protected View:

20161114-143404

The user is social-engineered into clicking the Enable Editing button. Because the Word document contains VBA macros, another protection kicks in:

20161114-143421

By default, MS Word disables macros for documents from untrusted sources. The VBA macros will only run after the user clicks the Enable Content button.

The user is presented with an empty document, but meanwhile malware was downloaded and executed invisibly to the user:

20161114-143442

The VBA macro code can be extracted with a free open-source tool: oledump.py.

20161114-153022

When looking at the VBA code (streams 8 and 9), we find subroutine Document_Open in stream 9:

20161114-153526

This subroutine is automatically executed when Word opens the document. Subroutine Document_Open contains a call to subroutine TvoFLxE in Module1:

20161114-155109

Subroutine TvoFLxE removes the content of the document (this causes the document to become blank, see screenshot), saves the document and calls function HuEJcCj.

20161114-155123

In this function we see a call to CreateObject. This is always interesting and requires further analysis. CreateObject takes a string as argument: the name of the object to be created. In this code, the string is returned by function JFZzIeCKcjgPWI, which takes 2 arguments: 2 strings that look like gibberish. We see this often in maldocs (malicious documents): strings are obfuscated, i.e. made unreadable. Function JFZzIeCKcjgPWI is a string decoding function, taking strings “MWqSBYcnRrviVpGRtY.ASJhGneqYlVl” and “FYqRnVNvJB1GqMA” and converting them to a meaningful string.

In this maldoc, the string obfuscation method is rather simple. Function JFZzIeCKcjgPWI removes all characters found in string “FYqRnVNvJB1GqMA” from string “MWqSBYcnRrviVpGRtY.ASJhGneqYlVl”. What is left is the string “WScript.Shell”. This Shell object can be used to make Windows execute commands. So we need to deobfuscate the remaining strings as well.
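The decoding step described above can be reproduced outside of Word. The following is a minimal Python sketch of the same character-removal idea, not the VBA code from the maldoc; the function name remove_chars is an illustrative choice.

```python
def remove_chars(obfuscated: str, junk: str) -> str:
    """Return `obfuscated` with every character that occurs in `junk` removed."""
    return "".join(ch for ch in obfuscated if ch not in junk)


# The two strings passed to the decoding function in the maldoc.
decoded = remove_chars("MWqSBYcnRrviVpGRtY.ASJhGneqYlVl", "FYqRnVNvJB1GqMA")
print(decoded)  # prints WScript.Shell
```

The same helper can be applied to the other obfuscated string pairs in the macro once they have been copied out of the oledump.py output.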

20161114-155207

When we deobfuscate these strings, we get this PowerShell command:

20161114-162354

This PowerShell command downloads an executable (malware) to disk and executes it. The downloaded malware seems to be ransomware; we’ll write another blog post if it has interesting features.

To protect yourself from this kind of attack, never activate the document (Enable Editing and Enable Content). Anti-virus can also protect you by 1) detecting the maldoc and 2) detecting the executable written to disk. When you don’t trust a document, you can always upload it to VirusTotal.

 

Testimonial of Stefaan Truijen

Hi, I’m Stefaan Truijen and in 2014-2015 I did my master thesis at the department of computer science at KULeuven. I assessed the susceptibility of modern web browsers to RAM scrapers in collaboration with NVISO. Security had always been one of my passions, so I was excited to get started.

Writing a thesis is an intensive process. Happily, I was able to rely on both Arne (NVISO) and Raoul (KULeuven) throughout the entire year for advice/brainstorming.

First, I needed to get an overview of prior research on memory scraping. Arne supplied me with a couple of initial research documents and references, and I reviewed any new material I found with Arne and Raoul almost weekly.

After some preliminary tests, I had to determine how I would continue and I wanted to contribute at least a little bit to fighting memory scrapers. I was able to bounce a few ideas off Arne and Raoul. In the end we decided that, since I was unable to find any prior research that had already assessed the size of the problem – i.e. memory scraping web browsers – measuring the degree of susceptibility of each of the three most commonly used web browsers (Chrome, Firefox, IE) was the most interesting angle.

In order to get a sufficient amount of data to form a solid conclusion, I ran thousands of experiments. Of course, running thousands of experiments manually is not very efficient and it affects reproducibility of the results. Therefore I learned how to work with new tools. Most relevant were Selenium’s automated testing framework for web browsers and the Windows API. Whenever I had questions, Arne and Raoul gladly answered them.

Now that the dust has settled, I can say that I have acquired a deeper understanding of low level security, more specifically memory scraping, and the consequences of relatively relaxed memory and API access policies that I did not have before. I am very satisfied with the result of my thesis and NVISO played an important role in realizing it!

Testimonial of Nick Van Haver

Hi, I’m Nick Van Haver and I want to reflect briefly on my master thesis, which I completed in cooperation with NVISO and Ghent University. NVISO helped me in many ways while providing me with a lot of freedom to choose the course of my thesis. They showed me a lot of trust and respect, which I truly appreciate.

The topic of my thesis research was “The Detection of Client-side Vulnerabilities in Web Applications through the Browser”. This topic is deeply rooted in the field of web application security, and thus led me far beyond its basics. At first I had quite some experience with the development of web applications, but far less with their security aspects.

When looking into a new field or topic, it is hard to find the right sources and high quality references. The right resources can turn a week’s worth of work into a single day. NVISO provided me with these resources and handed me tools, enabling me to educate myself in the web application security field and to make the most out of my thesis. Thanks to NVISO, I had contact with some of the big names in the industry such as Google, Minded Security, Portswigger and many others. Furthermore they assisted me with their expertise in security during meetings.

In the end, my research resulted in a fairly high score of 16 out of 20. Because of this grade I graduated magna cum laude as a Civil Engineer in Computer Sciences. At the beginning of my thesis my knowledge of web application security was rather limited. Now I feel accomplished in this field of security and I know where to find the most accurate information when dealing with web application vulnerabilities. I also feel more confident when contacting external parties.

I can highly recommend working with NVISO. Choosing to work together with them for your master thesis ensures that the topic will be both challenging and interesting. You will receive the support and resources you need to achieve your goal. It really is a worthwhile experience! Once the results of my thesis are public, they will be shared with the community!

Cyber Security Challenge Belgium 2015 – Solving the NVISO Lottery challenge

This is the fourth and final blog post in the Cyber Security Challenge Belgium 2015 (CSCBE) solutions series. This time, we’re taking a look at one of the more programming oriented challenges: The NVISO Lottery.

The NVISO Lottery

The students were given the following info:

Come and throw away your money at the NViso Lottery!

They also received the IP address for the NVISO Lottery service.

Gathering information

Once again we take out our trusty pocket knife named netcat.
We have to guess the correct number from a set of 1000 possibilities. If we guess the right number, we get $75, but each guess costs us $10. If we want to win the prize, we have to earn $1337. This means we have to guess correctly at least 20 times without making too many mistakes. Let’s try!
We weren’t able to guess the correct number. We do get an ID, which we can use to get feedback from the NVISO casino. The ID looks completely random, but the last character (=) is a typical telltale sign of Base64 encoding. The equals sign is used as padding when the number of bytes to encode is not divisible by 3. Decoding this using the Base64 algorithm gives the following:
The decoded string doesn’t give us the answer to the random number, but the content does appear to be structured and further decoding may be necessary.
As was explained at the beginning of this write-up, this challenge is programming oriented. If you’ve worked with the Python language a lot, you may already recognize the decoded string as being a specific Python file format.
In Python, you can use the pickle module to serialize data objects. Serializing (or marshaling) objects is the process of converting arbitrary data to a byte stream. This byte stream can then be transported over a network, or stored in a file.
Serializing is a reversible process. That means we can deserialize (or ‘unpickle’) the data we got from the Base64 decode:
This is very promising. The unpickled value consists of a nested list with three random numbers.

Random number generators

Let’s take a look at how random numbers are usually generated. Algorithms cannot generate truly random numbers. An algorithm will always perform the exact same steps given the same input. Many software implementations therefore rely on Pseudo-Random Number Generators (PRNGs). These algorithms do not generate true random numbers, but they do share many properties with true random numbers. For example, a good PRNG will make it extremely difficult to determine the next random number based on the random number that was just received.
An example of a PRNG is a Linear Congruential Generator (LCG). The simplest LCG needs three constants (a, c and m) and a starting value, called the seed, to calculate a random number sequence. Given these numbers, the LCG will calculate the sequence as follows:
The next number in the sequence is calculated by multiplying the current number by a, adding the result to c and taking the remainder of division by m.
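The recurrence described above can be written down in a few lines of Python. This is a minimal sketch; the constants a=5, c=3, m=16 and the starting value 7 are toy values chosen for illustration and have nothing to do with the lottery service.

```python
from itertools import islice


def lcg(start, a, c, m):
    """Yield the LCG sequence: next = (a * current + c) mod m."""
    x = start
    while True:
        x = (a * x + c) % m
        yield x


# Toy parameters (illustrative only): the same inputs always give the same sequence.
print(list(islice(lcg(7, a=5, c=3, m=16), 4)))  # prints [6, 1, 8, 11]
```

Because every output is fully determined by the previous one, anyone who learns the current value and the constants can compute all future values.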
From a programming point of view, PRNGs are very useful as they can be reverted to a certain state. If the application suddenly crashes based on a specific random input, it would be very hard to debug the application if the same random input cannot be generated. For security critical implementations, of course, a PRNG should not be used.
Since we have to guess a random number, it may be a good guess to say that the decoded value is the seed for a PRNG.

Exploiting the vulnerability

Python allows the programmer to set the state of the random number generator. To confirm we’re on the right track, let’s print out the current state of the default random number generator:
Unfortunately, this seed appears to be a lot bigger than the seed we recovered from the lottery service. Python’s random module actually uses the Mersenne Twister algorithm, which is not an LCG.
But there is good news: the output of the getstate() command is very similar to our decoded value. Python has a few other random number generators: random.SystemRandom() and random.WichmannHill(). According to the documentation, SystemRandom() doesn’t have a getstate() method. WichmannHill() does:
This is exactly what we were looking for. By using setstate() with our decoded lottery ID, we should be able to predict the number that will be generated:
Great! That was the solution we were looking for. Because we get the ticket ID before we have to enter our guess, we can predict the value that the server will expect and get our prize!
We could do this manually since there’s no timeout for our answer, but we can just as easily create a Python script that does this for us:
We got the flag, which is “I’m_going_to_be_a_professional_gambler!”.
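The prediction step can be sketched without the lottery server. random.WichmannHill only exists in Python 2, so the sketch below re-implements the Wichmann-Hill step by hand in Python 3. Several things are assumptions: the state layout (version, (x, y, z), gauss_next) is what Python 2's WichmannHill.getstate() returns, the ticket shown is a made-up state and not a real lottery ID, and mapping the float to 1..1000 the way randint(1, 1000) would is a guess about how the server draws its number.

```python
import base64
import pickle


def wichmann_hill_next(x, y, z):
    """One step of the Wichmann-Hill generator: returns (float in [0, 1), new state)."""
    x = (171 * x) % 30269
    y = (172 * y) % 30307
    z = (170 * z) % 30323
    return (x / 30269.0 + y / 30307.0 + z / 30323.0) % 1.0, (x, y, z)


def predict_from_ticket(ticket_id, low=1, high=1000):
    """Decode a Base64 ticket ID, unpickle the state and predict the next number."""
    # Note: pickle.loads can run arbitrary code; only unpickle data in a sandbox.
    _version, (x, y, z), _gauss_next = pickle.loads(base64.b64decode(ticket_id))
    value, _ = wichmann_hill_next(x, y, z)
    # Assumption: the server behaves like randint(low, high).
    return low + int(value * (high - low + 1))


# Hypothetical ticket built from a made-up state, for illustration only.
ticket = base64.b64encode(pickle.dumps((1, (1, 1, 1), None), protocol=2))
print(predict_from_ticket(ticket))
```

In the real challenge the ticket string comes from the server, so only predict_from_ticket would be needed, wrapped in a loop that reads the ID and sends the guess.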

Statistics

We had many different connections to the server, so a lot of teams tried to solve the challenge. Most teams told us they managed to decipher the Base64 encoding, and some teams also found the Python pickle format. In the end, only four teams were able to completely solve this challenge: HacknamStyle Jr, ISW, Turla Tech Support and Vrije Universiteit Leuven. All of these teams made it to the finals.

Final thoughts

This challenge was partly aimed at testing the students’ programming skills. Although Python is a very popular programming language, some students may have never used it before, making this challenge a little bit harder. Even so, a security researcher will often encounter unknown file formats or protocols, and finding out what the data means or how to use it may be critical to a successful security audit or forensics investigation. Being able to automate custom tasks can often save lots of time or solve problems that would be impossible to do manually. Having some experience with any programming language is an invaluable tool in every security expert’s toolkit!

Cyber Security Challenge Belgium 2015 – Solving the One Way challenge

This is the third blog post in the Cyber Security Challenge Belgium 2015 (CSCBE) solutions series. This time, we’re taking on a very technical challenge: One Way.

One Way

The challenge

The following challenge description was given to the students:

We want our employees to be able to send us confidential information which only we can decrypt. Since we don’t believe in PKI (we have our reasons!), we made our own crypto system (homemade is always better, right!). To prevent tampering, we took some precautions: A salt is added to each request and the IV is chosen at random for every connection. Take a look at the given clientFramework.py file for more info on how to use our crypto system.

The accompanying clientFramework.py file contains some helper methods so that the students could focus on the actual encryption logic instead of fighting with Python to communicate correctly with the server.

The details

The Python file contains some information about the server, from which the following can be deduced:

  • The Initialization Vector (IV) is chosen at random for every session
  • The IV is updated after every encryption request according to a known algorithm
  • The server encrypts the given plain text as follows: encryption = encrypt(plain text + FLAG, IV)
  • The encryption protocol is AES in CBC mode with blocks of 16 characters
  • The FLAG consists of 8 lowercase ASCII characters
  • The used IV is returned together with the encrypted string

The IV is randomly chosen at the start of the session, but the client can request multiple encryption operations during each session. After each encryption, the IV is updated according to a known function. That means that we can calculate the IV that will be used for the next iteration. This will prove to be very important in what follows.

Encryption 101

Let’s take a look at how the Cipher Block Chaining (CBC) algorithm works, which is what the challenge is using.
The following image shows the working of CBC:
Image taken from Wikipedia
The plaintext is split up into blocks of 16 bytes each and each block is encrypted separately. In order to counter certain attacks which are possible against the Electronic CodeBook algorithm (ECB), each plaintext block is first XOR’ed with the ciphertext of the previous block. Because the first block doesn’t have a previous block to XOR with, an IV is used. The IV should always be random and unpredictable.
After the plaintext has been encrypted, the IV has served its purpose and it no longer has to be secret. In this challenge, the IV is returned to the client together with the encrypted text.

Finding the flaw

You may have already noticed a small but very important mismatch between how CBC should be implemented, and how the challenge server implements CBC: the IV should always be random and unpredictable. The server’s IV is completely random and unpredictable, but only for the first encryption request. For every subsequent request, the IV can be calculated from the original IV, which creates a serious security flaw.
Take another look at the CBC diagram. By knowing which IV will be used to XOR with the plaintext, we can prevent the IV from having effect. If we XOR the plaintext with the predicted IV before sending it to the server, the server will apply the XOR again which undoes our original XOR:
plaintext ⊕ IV ⊕ IV = plaintext ⊕ (IV ⊕ IV) = plaintext ⊕ 0 = plaintext.
The second flaw is that the flag is appended to the given plaintext. Since we have full control over the plaintext, we can decide at which position in the plaintext the flag will be, and hence we can control where it will end up in the encrypted string.

Exploiting the flaw

If we have complete control over which plaintext is entered into the first encrypted block, we can get the encrypted value of any given plaintext. This means we can create a rainbow table for every possible plaintext consisting of 16 bytes:
aaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaab
aaaaaaaaaaaaaaac
//…
zzzzzzzzzzzzzzzx
zzzzzzzzzzzzzzzy
zzzzzzzzzzzzzzzz
Remember that we have to XOR the plaintext string with the predicted IV before sending it to the server.
Before encrypting the plaintext, the server appends the flag to our input. If we only send 15 characters to the server, the server will encrypt aaaaaaaaaaaaaaaX where X is the first character of the flag. 
We can now look up the encrypted value of aaaaaaaaaaaaaaaX in our rainbow table. This will match aaaaaaaaaaaaaaas and we now know that the first character of the flag is an ‘s’.
To get the second character, we need to create a rainbow table based on the aaaaaaaaaaaaaas prefix (which has 14 a’s). When the table is complete, we can ask the server to encrypt “aaaaaaaaaaaaaa” (14 a’s, one character fewer than before). The first encrypted block will then contain the second character of the flag in the last position and we can look it up in our rainbow table. The encrypted block will match aaaaaaaaaaaaaasa, so ‘a’ is the next character of the flag. We can keep doing this for every character:
aaaaaaaaaaaaaaas
aaaaaaaaaaaaaasa
aaaaaaaaaaaaasal
aaaaaaaaaaaasalt
aaaaaaaaaaasaltm
aaaaaaaaaasaltmi
aaaaaaaaasaltmin
aaaaaaaasaltmine
aaaaaaasaltmine0
aaaaaasaltmine00
aaaaasaltmine000
aaaasaltmine0000
aaasaltmine00000
aasaltmine000000
asaltmine0000000
saltmine00000000
After a few iterations, the padding zeros start showing up in the solution. These extra zeros after the flag are just padding that was added by the server in order to have a complete block to encrypt. When we’ve removed all the prefixed a’s, we end up with the flag, which is saltmine.
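The whole attack can be simulated locally. The sketch below is a toy model under stated assumptions, not the challenge server: block_encrypt is a keyed hash standing in for AES (the attack only needs a deterministic block function), next_iv is a hypothetical IV update rule (the real one is in clientFramework.py), and candidates are compared one at a time instead of being stored in a table first. Note that the server also XORs the flag byte in the first block with the IV, so each guess is XOR’ed with both the IV of the target request and the IV predicted for the guess request.

```python
import hashlib
import os

BLOCK = 16
FLAG = b"saltmine"      # secret appended by the toy server
KEY = os.urandom(16)    # server-side key, unknown to the attacker


def block_encrypt(block):
    """Stand-in for AES: a deterministic keyed 16-byte function."""
    return hashlib.sha256(KEY + block).digest()[:BLOCK]


def next_iv(iv):
    """Hypothetical public IV update rule."""
    return hashlib.md5(iv).digest()


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


class Server:
    def __init__(self):
        self.iv = os.urandom(BLOCK)

    def encrypt(self, plaintext):
        """CBC-encrypt plaintext + FLAG, return (IV used, ciphertext)."""
        data = plaintext + FLAG
        data += b"0" * (-len(data) % BLOCK)          # pad with '0' characters
        iv_used, prev, out = self.iv, self.iv, b""
        for i in range(0, len(data), BLOCK):
            prev = block_encrypt(xor(data[i:i + BLOCK], prev))
            out += prev
        self.iv = next_iv(self.iv)                   # predictable update
        return iv_used, out


def recover_flag(server, flag_len=8):
    iv, _ = server.encrypt(b"")                      # throwaway request reveals the IV
    iv = next_iv(iv)                                 # IV of the next request
    known = b""
    while len(known) < flag_len:
        pad = b"a" * (BLOCK - 1 - len(known))
        iv_target = iv
        _, ct = server.encrypt(pad)                  # block 1 = pad + known + next char
        target, iv = ct[:BLOCK], next_iv(iv)
        for c in b"abcdefghijklmnopqrstuvwxyz":
            guess = pad + known + bytes([c])
            # Cancel the upcoming IV and re-apply the IV of the target request.
            _, ct = server.encrypt(xor(xor(guess, iv_target), iv))
            iv = next_iv(iv)
            if ct[:BLOCK] == target:
                known += bytes([c])
                break
        else:
            raise ValueError("no candidate matched")
    return known


print(recover_flag(Server()))
```

At most 26 guesses are needed per flag character, so the full flag is recovered in a couple of hundred requests within a single session.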

Padding attack

The attack we used above is a form of padding oracle attack. This attack is possible because of two distinct vulnerabilities in the server algorithm: We can predict the IV, and we can modify the padding in front of the flag. By combining these two flaws, we are able to get the flag, which would have been impossible without either of them.
In October 2014, the POODLE attack was disclosed, which uses a padding oracle attack against SSL 3.0.

Statistics

Nine of the participating teams were able to solve this challenge. Eight of these teams were able to secure a place in the CSCBE finals. There were a lot of random guesses for the solution of this challenge. Some even came close (“saltflag” or “salted00”) but luckily, only the teams who actually solved the challenge were able to get the points.

Final thoughts

A strong cryptographic algorithm is only effective when it is used correctly. The challenge demonstrated that small flaws can have disastrous effects. Although cryptography can be very daunting at first, it certainly pays off to invest some time into understanding how different algorithms work and how they should be used. Even if you don’t fully understand the internal workings of the AES encryption method, you may still be able to find flaws in the way it is used and thereby be able to break the encryption.

Cyber Security Challenge Belgium 2015 – Solving the Data Extraction challenge

This is the second blog post in the Cyber Security Challenge Belgium 2015 (CSCBE) solutions series. This time, we’re taking a look at the Data Extraction challenge.

Data Extraction

The challenge

The following challenge description was given to the students:

We messed up and contacted the wrong forensic department. They say they found data, but we can’t really make anything out of it. Can you?

The students were also given the following image:

The challenge was designed to test the students’ out-of-the-box thinking, as well as their ability to research a certain subject.

Analyzing the image

Steganography is the art of hiding information inside another file. This can take on many different forms. Images are often used as a container for other kinds of information and the possibilities are endless. Steganography challenges are often part of a CTF and of course, we had to create one too. These challenges are usually one of the more feared challenges, as the number of possible approaches is literally endless.

There are two major categories for hiding information inside an image. Either you modify the internals of the image to add some extra data, or you visually encrypt your information and just add it to the image where everyone can see it. Because the image appears to contain a certain pattern, it is most likely that the second approach was taken.

Four different kinds of shapes were used to create the pattern and every shape has its own color. If you paid attention in your high school biology class, you may recognize the shapes. When you learned about the birds and the bees, you should have also learned about deoxyribonucleic acid, which is just a fancy way of saying DNA.

Research

DNA consists of two biopolymer strands coiled around each other, forming the well known double helix:

Image taken from classroom.synonym.com

Each strand consists of many different nucleotides which lock together in a certain way. There are two base pairs: Adenine (A) matches with Thymine (T) and Guanine (G) matches with Cytosine (C).

Image taken from thinglink.com

The colors of the different nucleotides may be different depending on which textbook you use, but the shapes are usually the same. That means we can convert our image into a string of letters consisting of A, T, G and C using a little bit of Python:

This is already looking good, but we can’t really see anything that resembles a flag.

Digging a little deeper

If we do some more research, we can find that different nucleotides are combined into amino acids when they are processed by your cells. Each triplet is combined into one of the amino acids according to the following diagram:

Image taken from commons.wikimedia.org
For example, AAA will be transformed into Lysine (K) and UCA will be transformed into Serine (S).
We could write a Python script that translates these nucleotides into amino acids, but it’s even easier to let the internet do it for us. The Bioinformatics Resource Portal hosts a tool that will do the hard work for us. If we enter our nucleotide string into the tool, we get the following:

This tool also automatically solves the last hurdle of the challenge: If you start combining from the first nucleotide, you only get a bunch of random letters. However, if you start combining from the second nucleotide, you can read “THE PASS IS METAPHYSIC LIGHTYEARS”, which is the solution to this challenge.
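For readers who prefer a script over the online tool, the translation and the choice of starting nucleotide can be sketched in Python. The table below is the standard genetic code in its usual compact form (bases in T, C, A, G order, * marking a stop codon); the function name translate and the frame parameter are illustrative choices, and the short input strings are made-up examples rather than the challenge data.

```python
BASES = "TCAG"
# Standard genetic code, one letter per codon, ordered by first, second, third base.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(nucleotides, frame=0):
    """Translate a nucleotide string into one-letter amino acid codes.

    `frame` is the index of the nucleotide where the first triplet starts.
    """
    seq = nucleotides.upper().replace("U", "T")   # accept RNA notation too
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "?") for codon in codons)


print(translate("ACTCATGAA"))            # prints THE
print(translate("GACTCATGAA", frame=1))  # prints THE
```

Trying frame=0, frame=1 and frame=2 on the challenge string shows which starting position yields readable text.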

Statistics

At the end of the qualifiers, only five teams were able to solve this challenge. Apparently many teams recognized the image as being a DNA sequence and they successfully transformed and combined the nucleotides into amino acids. Only a handful of teams got past the last hurdle of starting with the correct nucleotide. 

Final thoughts

Not a lot of teams managed to solve this challenge. As steganography can be done in a million ways, it’s not always easy to see a path towards the solution. This can discourage teams and have them invest their time in more practical challenges that have a more well-defined scope. Choosing which challenges you invest your time in is an important decision in any CTF and apparently most teams preferred not to invest their time in this one.

Cyber Security Challenge Belgium 2015 – Solving the SFTP challenge

Two weeks ago, we proudly organised the Cyber Security Challenge Belgium 2015 (CSCBE). The CSCBE was a typical Capture-The-Flag (CTF) competition aimed at students from universities and colleges all over Belgium. During the competition, teams of three or four students had to tackle different technical challenges in order to prove their skills. In the following weeks, we will discuss some of the challenges that the students had to solve.

SFTP

The challenge

The following challenge description was given to the students:

“One of our employees uses an SFTP server to store sensitive company files. He ensured us it’s safe. I mean… Why wouldn’t SFTP be safe?”

They were also given a target IP, a target port and some user credentials (Kermit:MissPiggy).

The challenge was designed to test a few essential skills for every cyber security enthusiast: problem solving, using the right tool for the job, gathering information and thinking critically.

Connecting to the server

A first approach would be to connect directly to the SFTP server:
After executing the command, the terminal remains empty. This is not normal behavior for an SFTP server, which normally asks for a password, as shown below:
At this point, a lot of the students were complaining that the server was down, or that our firewall was blocking their campus networks. Every complaint was taken seriously, but every time we checked the server, it was up and running. Each time, we suggested that the students should try a different approach.
Netcat is like the Swiss Army knife of network tools. It can do many different things, but most importantly, it gives us complete control over which data is sent and received. Using netcat (command: nc) gives us the following output:
Now we can see that the server is actually up and running and sending us data. The server introduces itself as “+NVISO SFTP server”, which is not what you would expect from a normal SFTP server. On to the next step!

Gathering information

Although SFTP is a widely known and used acronym for the SSH File Transfer Protocol, acronyms can often mean many different things. A great resource for working out acronyms is Wikipedia:
As we noted, SFTP stands for SSH File Transfer Protocol. However, SFTP also stands for the Simple File Transfer Protocol. This protocol is “an unsecured and rarely used protocol”, which sounds exactly like something we’d be interested in.
The Simple File Transfer Protocol is fully documented in RFC 913. According to the RFC, the server should greet a client with “+MIT-XX SFTP Service”, which is very similar to what our SFTP server sent.

Using the server

The RFC lists the following commands: USER ! ACCT ! PASS ! TYPE ! LIST ! CDIR ! KILL ! NAME ! DONE ! RETR ! STOR. Let’s try some of these commands:
Great! We were able to log in, and there appears to be a “documents” directory. Inside the documents directory, there is a file called flag, as we can see here:

According to the documentation, we should use ‘RETR flag’ followed by ‘SEND’. But first, let’s set the transfer mode to binary, so that we are sure we get a correct file:

If we send the ‘RETR flag’ command now, the server will print the size of the flag. We can then indicate we are ready to receive the flag by sending ‘SEND’ to the server. This can most easily be done through a Python script, which will store away the file correctly:

This little script uses the Pwn library, which makes communicating with the server a lot easier. Running this script will produce a file called ‘flag’ in the current directory. By using the Linux ‘file’ command, we can find out what kind of file this is:

Renaming flag to flag.jpg lets us see the image:

If you look closely, in the upper right corner of the image, it says “KermitLovesBacon”, which was the flag needed to solve this challenge.
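For readers without the Pwn library, the TYPE, RETR and SEND exchange described above can also be sketched with Python's standard socket module. This is a sketch under assumptions, not the script used in the write-up: it assumes commands and replies are NULL-terminated and that the reply to RETR is the file size in ASCII digits, as RFC 913 describes, and the challenge server may deviate from that. The helper names, and the host and port passed to fetch_flag, are hypothetical.

```python
import socket


def send_command(sock, command):
    """Send one command, terminated by a NULL byte."""
    sock.sendall(command.encode("ascii") + b"\x00")


def read_reply(sock):
    """Read one NULL-terminated message and return it as text."""
    data = bytearray()
    while not data.endswith(b"\x00"):
        chunk = sock.recv(1)
        if not chunk:
            break
        data += chunk
    return data.rstrip(b"\x00").decode("ascii", "replace")


def retrieve(sock, filename):
    """Download `filename` in binary mode over an already logged-in connection."""
    send_command(sock, "TYPE B")
    read_reply(sock)
    send_command(sock, "RETR " + filename)
    reply = read_reply(sock)
    if reply.startswith("-"):
        raise IOError(reply)
    size = int(reply.strip())          # the server announces the file size
    send_command(sock, "SEND")
    payload = bytearray()
    while len(payload) < size:
        chunk = sock.recv(size - len(payload))
        if not chunk:
            break
        payload += chunk
    return bytes(payload)


def fetch_flag(host, port):
    """Log in with the challenge credentials and save documents/flag to disk."""
    with socket.create_connection((host, port)) as sock:
        read_reply(sock)                               # server greeting
        for command in ("USER Kermit", "PASS MissPiggy", "CDIR documents"):
            send_command(sock, command)
            read_reply(sock)
        with open("flag", "wb") as handle:
            handle.write(retrieve(sock, "flag"))
```

Calling fetch_flag with the target IP and port handed out during the challenge would write the file to the current directory, after which the ‘file’ command can be used as before.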

Statistics

Fourteen teams managed to solve the challenge during the qualifiers. The first team that solved the challenge was HacknamStyle JR. HacknamStyle JR emerged as the victors in the Finals, so this may not come as a surprise. Out of the fourteen teams that managed to solve this challenge, seven earned their spot in the Finals. 

Hurdles

The biggest hurdle to this challenge was actually communicating correctly with the server. Many students immediately decided that SFTP could only stand for SSH File Transfer Protocol. They quickly realized that their favourite SFTP application didn’t work, but only a few of the teams investigated the root cause of this by looking at the actual data that was sent by the server and doing some research.
One of the teams actually opened netcat and tried nearly every possible combination of four letters to see which commands worked and which didn’t. Even though they failed to find the correct protocol documentation, they did manage to get the flag. Using brute-force to solve this challenge is definitely allowed, but they could have probably saved a lot of time by doing some more research first.

Final thoughts

This challenge was definitely a success. Many teams fell into the trap of thinking it was an SSH FTP Server, which was exactly what we were hoping for. The challenge wasn’t meant to be very difficult, so we were glad that quite a number of teams were able to solve it.