Developing complex Suricata rules with Lua – part 1

The Suricata detection engine supports rules written in the embeddable scripting language Lua. In this post we give a PoC Lua script to detect PDF documents with name obfuscation.

One of the elements that make up a PDF is a name. A name is a reserved word that starts with the character / followed by alphanumeric characters. Example: /JavaScript. The presence of the name /JavaScript is an indication that the PDF contains scripts (written in JavaScript).

The PDF specification allows alphanumeric characters in a name to be replaced by a hexadecimal representation: /J#61vaScript. #61 is the hexadecimal representation of the letter a. We call the use of this hexadecimal representation in names “name obfuscation”, because it is a simple technique to evade detection by engines that just look for the normal, unobfuscated name (/JavaScript).

There is no limit to the number of characters in a name that can be replaced by their hexadecimal representation. That makes it impractical to write and maintain a Suricata/Snort rule (using content, pcre, …) that will detect all possible obfuscations of the name /JavaScript. However, it is easy to write a program that normalizes obfuscated names (pdfid does this, for example).
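To give a feel for how quickly the number of spellings grows, here is a minimal Lua sketch that generates every hex-escaped spelling of a short name and normalizes each one back. The helper names normalize and variants are made up for this illustration; they are not part of Suricata or pdfid.

```lua
-- Decode every #xx escape in a PDF name back to the character it encodes.
local function normalize(name)
    -- The outer parentheses keep only gsub's first return value (the new string).
    return (name:gsub("#(%x%x)", function(hex)
        return string.char(tonumber(hex, 16))
    end))
end

-- Build all spellings of a name in which each character after the leading "/"
-- is either left alone or replaced by its lowercase #xx escape.
local function variants(name)
    local results = {"/"}
    for i = 2, #name do
        local ch = name:sub(i, i)
        local escaped = string.format("#%02x", ch:byte())
        local nextResults = {}
        for _, prefix in ipairs(results) do
            nextResults[#nextResults + 1] = prefix .. ch
            nextResults[#nextResults + 1] = prefix .. escaped
        end
        results = nextResults
    end
    return results
end

local all = variants("/JS")
print(#all)  -- 4
for _, v in ipairs(all) do
    print(v, normalize(v))  -- every spelling normalizes to /JS
end
```

A name with two characters after the slash already has 4 spellings; /JavaScript, with ten, has 2^10 = 1024, and that is before counting uppercase hex digits such as #4A.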

Fortunately, Suricata has supported the programming language Lua for some time now. Let’s take a look at how we can use this to detect PDF files that contain the obfuscated name /JavaScript (FYI: all PDF files we observed with an obfuscated /JavaScript name were malicious, so it’s a good test to detect malicious PDFs).

A Suricata Lua script has to implement 2 functions: init and match.

The init function declares the data we need from Suricata to be able to do our analysis. For the PDF document, we need the HTTP response body:

function init(args)
    return {["http.response_body"] = tostring(true)}
end

The match function needs to contain the actual logic to analyze the payload. We need to retrieve the HTTP response body, analyze it, and return 1 if we detect something. When nothing is detected, we need to return 0.

In this example of the match function, we check whether the HTTP response body is equal to the string test:

function match(args)
    local a = tostring(args["http.response_body"])
    if a == "test" then
        return 1
    else
        return 0
    end
end
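
Outside Suricata, a match function like this can be tried out with a stand-alone Lua interpreter by passing it a hand-built table. This is only a sketch: the tables below contain just the one key requested in init, which is an assumption about what the script needs and not a reproduction of everything Suricata passes in.

```lua
-- Same logic as the example above, repeated so the snippet runs on its own.
function match(args)
    local a = tostring(args["http.response_body"])
    if a == "test" then
        return 1
    else
        return 0
    end
end

-- Hand-built stand-ins for the table Suricata would pass to match.
print(match({["http.response_body"] = "test"}))    -- 1
print(match({["http.response_body"] = "test\n"}))  -- 0, not an exact match
print(match({}))                                   -- 0, tostring(nil) is "nil"
```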

To detect obfuscated /JavaScript names we use this code:

tBlacklisted = {["/JavaScript"] = true}

function PDFCheckName(sInput)
    for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do
        if sMatchedName:find("#") then
            local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)
            if tBlacklisted[sNormalizedName] then
                return 1
            end
        end
    end
    return 0
end

Function PDFCheckName takes a string as input (sInput) and then starts to search for names in the input:

sInput:gmatch("/[a-zA-Z0-9_#]+")

For each name we find, we check if it contains a # character (i.e. if it could be obfuscated):

if sMatchedName:find("#") then

When this is the case, we try to normalize the name (replace each #xx hexadecimal escape with the corresponding character):

local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)

And finally, we check if the normalized name is in our blacklist:

if tBlacklisted[sNormalizedName] then
    return 1
end

In that case we return 1; otherwise we return 0.
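
A few sample inputs make the behaviour concrete. The snippet repeats the function so that it runs on its own in a plain Lua interpreter; the sample strings are invented for illustration and are not taken from real PDF files.

```lua
local tBlacklisted = {["/JavaScript"] = true}

local function PDFCheckName(sInput)
    for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do
        if sMatchedName:find("#") then
            -- Assigning to a single variable keeps only the string gsub returns.
            local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex)
                return string.char(tonumber(hex:sub(2), 16))
            end)
            if tBlacklisted[sNormalizedName] then
                return 1
            end
        end
    end
    return 0
end

print(PDFCheckName("<< /S /J#61vaScript /JS (app.alert(1)) >>"))  -- 1
print(PDFCheckName("<< /S /JavaScript /JS (app.alert(1)) >>"))    -- 0: name is not obfuscated
print(PDFCheckName("<< /Type /C#61talog >>"))                     -- 0: obfuscated, but not blacklisted
print(PDFCheckName("<< /#4a#61#76#61#53#63#72#69#70#74 >>"))      -- 1: every character escaped
```

Note the second case: the function only reports names that are both obfuscated and blacklisted, so a plain /JavaScript does not trigger it.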

The complete Lua script:

tBlacklisted = {["/JavaScript"] = true}

function PDFCheckName(sInput)
    for sMatchedName in sInput:gmatch("/[a-zA-Z0-9_#]+") do
        if sMatchedName:find("#") then
            local sNormalizedName = sMatchedName:gsub("#[a-fA-F0-9][a-fA-F0-9]", function(hex) return string.char(tonumber(hex:sub(2), 16)) end)
            if tBlacklisted[sNormalizedName] then
                return 1
            end
        end
    end
    return 0
end

function init(args)
    return {["http.response_body"] = tostring(true)}
end

function match(args)
    return PDFCheckName(tostring(args["http.response_body"]))
end

To get Suricata to run our Lua script, we need to copy it into the rules directory and add a rule to call the script, like this:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"NVISO PDF file lua"; flow:established,to_client; luajit:pdfcheckname.lua; classtype:policy-violation; sid:1000000; rev:1;)

Rule option luajit allows us to specify the Lua script we want to execute (pdfcheckname.lua).

That’s all there is to do to get this running.

On production systems, however, this will quickly cause performance problems: the rule as written executes the Lua script on all incoming HTTP traffic. To avoid this, it is best to add pre-conditions to the rule so that the script only runs on downloaded PDF files:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"NVISO PDF file lua"; flow:established,to_client; file_data; content:"%PDF-"; within:5; luajit:pdfcheckname.lua; classtype:policy-violation; sid:1000000; rev:1;)

This updated rule checks that the file starts with %PDF- (that’s an easy trick to detect a PDF file, but be aware that there are ways to bypass this simple detection).
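
When testing the script offline, the same pre-condition can be mirrored in plain Lua. The helper name looksLikePDF is hypothetical. A plain string comparison is used on purpose, because % is a special character in Lua patterns.

```lua
-- Hypothetical helper mirroring the rule's check that the data starts with %PDF-.
local function looksLikePDF(body)
    -- Compare the first five bytes literally instead of using a Lua pattern.
    return body:sub(1, 5) == "%PDF-"
end

print(looksLikePDF("%PDF-1.4\n1 0 obj"))  -- true
print(looksLikePDF("\n%PDF-1.4"))         -- false: header is not at offset 0
print(looksLikePDF("<html>"))             -- false
```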

For some environments, checking all downloaded PDF files might still cause performance problems. This updated rule uses a regular expression to check if the downloaded PDF file contains a (potentially) obfuscated name:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"NVISO PDF file lua"; flow:established,to_client; file_data; content:"%PDF-"; within:5; pcre:"/\/.{0,10}#[a-f0-9][a-f0-9]/i"; luajit:pdfcheckname.lua; classtype:policy-violation; sid:1000000; rev:1;)

Note that the regular expression in this rule assumes that the name is no longer than 11 characters (which is the case for the name we want to detect, /JavaScript). If you add your own names to the blacklist and they are longer than 11 characters, update the regular expression in the rule accordingly.
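
If the blacklist grows, the bound in .{0,10} can be derived from the longest name instead of being counted by hand. The sketch below uses a hypothetical helper, prefilterBound, and an illustrative extra name (/EmbeddedFile) that is not part of the original script. It follows the convention of the rule above: the length of the name minus the leading slash.

```lua
-- Hypothetical helper: derive N for ".{0,N}" in the pcre from the blacklist.
local function prefilterBound(blacklist)
    local longest = 0
    for name in pairs(blacklist) do
        if #name > longest then
            longest = #name
        end
    end
    -- Name length without the leading "/"; never negative for an empty table.
    return math.max(longest - 1, 0)
end

print(prefilterBound({["/JavaScript"] = true}))                            -- 10
print(prefilterBound({["/JavaScript"] = true, ["/EmbeddedFile"] = true}))  -- 12
```

This value errs on the safe side: in an obfuscated name at least one character is escaped, so the first # can be preceded by at most one character fewer than the bound.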

Conclusion

Support for Lua in Suricata makes it possible to develop complex analysis methods that would not be possible with simple rules; however, performance needs to be taken into account.

In part 2 of this blog post, we will provide some tips to help with the development and testing of Lua scripts for Suricata.
