Over the last couple of weeks, we have received quite a few user-submitted Android application samples (.apk files) through our online scanning service ApkScan. The auto-generated analysis reports have already highlighted some interesting results, and we have observed a broad range of malicious activities launched by the uploaded samples: from premium-rate phone scams to full-fledged encrypted command-and-control channels and phishing malware targeting clients of mobile banking applications.
Although the reports are certainly very useful for understanding the impact of a given malware sample, they don’t always explain how the malware developer pulls off the attack:
- Is the sample obfuscated, and if so, which techniques were used to hide steps?
- Does the malware rely on the official Android APIs, or does it, for example, target the kernel using native exploit code?
- Is there any hidden code in the application that requires a special “trigger” before being executed (e.g. only attack the device if the user is using a known mobile banking application)?
Automated tools (including ApkScan) are good for a “quick glance” at how a (malicious) application behaves. However, these automated tools will only rarely be able to give a full answer to the abovementioned (and many other) questions – for this, we need to get our hands dirty and perform a manual analysis.
In this blog post, we will analyze an Android package in the hunt for malicious behavior. The sample we picked is deliberately simple, but the steps we go through can be applied to the analysis of much more complex applications, too.
Step 1 – Getting our sample
For this post, we will be using a sample that has previously been uploaded to ApkScan. Although we are currently not offering malware samples for download, Contagio Mini Dump is offering tons of samples for download on which you can practice.
Step 2 – Our environment
It goes without saying that malware analysis should never be performed in a production environment, as you never know what you will be dealing with, and we certainly do not want to infect our own machines! For this article, we will be using a BackTrack virtual machine. Although the image already ships with quite a few tools suitable for analyzing APK archives, it is best to double-check that the free tools used in the steps below are installed on your system:
- android-apktool
- dex2jar
- jd-gui
Once you have these up and running, it’s time to get our hands dirty!
Step 3 – The Android manifest
The first thing we do is figure out if the sample we are analyzing has the correct format. APK packages are nothing more than ZIP files with a predefined structure (including for example a manifest file).
In the screenshot above, we first used the “file” command to confirm that the APK file is indeed a valid ZIP archive. Using “file” can be useful, especially when dealing with password-protected ZIP archives. Next, we used the “unzip” command to extract the archive to the current folder. Nothing special so far! As we can see from the extracted file list, a file “AndroidManifest.xml” was also unzipped. This already suggests that we are indeed dealing with a valid Android application, but let’s have a closer look.
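The same sanity check can be scripted. The sketch below is a minimal Python equivalent of the two commands, assuming only the standard library; the function names `inspect_apk` and `looks_like_apk` are made up for this example.

```python
import zipfile


def inspect_apk(path):
    """Return (is_zip, entry_names) for the file at `path`."""
    # is_zipfile() looks for the ZIP end-of-central-directory record,
    # so it does not rely on the file extension.
    if not zipfile.is_zipfile(path):
        return False, []
    with zipfile.ZipFile(path) as archive:
        return True, archive.namelist()


def looks_like_apk(path):
    """True if `path` is a ZIP archive that contains a manifest entry."""
    is_zip, names = inspect_apk(path)
    return is_zip and "AndroidManifest.xml" in names
```

Listing the entries this way does not extract anything to disk, which is convenient when you only want a first impression of an unknown sample.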
If we simply “cat” the unzipped manifest file, we get the output from the screenshot above. That looks like a binary file! But wasn’t the manifest file an XML document that could be edited freely by developers? When an Android application is built and packaged, the XML manifest (AndroidManifest.xml) is converted into a compact binary format. This explains the unprintable characters in the output above. What we need to do is convert the binary manifest back into a text file, and for this we can use android-apktool:
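To tell the two forms apart without eyeballing the output, you can look at the first few bytes. The sketch below assumes that a compiled manifest starts with the Android binary XML chunk header (type 0x0003, header size 0x0008, both little-endian) and that a text manifest starts with “<”; `manifest_encoding` is a hypothetical helper name.

```python
def manifest_encoding(data):
    """Classify raw AndroidManifest.xml bytes as 'binary', 'text' or 'unknown'."""
    # Assumption: compiled Android XML begins with the bytes 03 00 08 00.
    if data[:4] == b"\x03\x00\x08\x00":
        return "binary"
    # Plain XML: skip an optional UTF-8 byte order mark and leading whitespace.
    stripped = data.lstrip(b"\xef\xbb\xbf").lstrip()
    if stripped.startswith(b"<"):
        return "text"
    return "unknown"
```

A result of “binary” means the file still needs to be decoded before it can be read.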
“apktool d” (short for “decode”), as illustrated above, performs several steps of the decoding process and converts the contents of the APK archive into human-readable files. This includes the manifest file, printed above. As we can see from the last few lines of the manifest file, the application is asking permission to write to the external storage (a memory card), identified by the permission identifier WRITE_EXTERNAL_STORAGE. Additionally, the application is asking permission to receive and send text messages (RECEIVE_SMS and SEND_SMS). This looks a bit suspicious already, but let’s have a closer look.
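Once the manifest is decoded to text, pulling out the requested permissions is easy to automate. The following is a minimal sketch that assumes the decoded manifest is passed in as a string; the helper names and the contents of `SMS_PERMISSIONS` are choices made for this example, not part of any tool.

```python
import xml.etree.ElementTree as ET

# XML namespace used for the android: attribute prefix in manifests.
ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Permissions this example treats as SMS-related (an illustrative selection).
SMS_PERMISSIONS = {
    "android.permission.SEND_SMS",
    "android.permission.RECEIVE_SMS",
    "android.permission.READ_SMS",
}


def requested_permissions(manifest_xml):
    """Return every android:name found on a <uses-permission> element."""
    root = ET.fromstring(manifest_xml)
    name_attr = "{%s}name" % ANDROID_NS
    return [
        element.get(name_attr)
        for element in root.iter("uses-permission")
        if element.get(name_attr)
    ]


def sms_permissions(manifest_xml):
    """Return only the SMS-related permissions, in manifest order."""
    return [p for p in requested_permissions(manifest_xml) if p in SMS_PERMISSIONS]
```

A non-empty result from `sms_permissions` is not proof of malice, since legitimate messaging applications request the same permissions, but it tells you where to look first.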
Step 4 – Static source code analysis
After analyzing the Android manifest in the previous step, we noticed that the application is asking for somewhat unusual permissions, including the permissions to send and receive text messages. Quite a large portion of Android malware consists of premium-rate scams in which the malicious application sends text messages to premium-rate numbers without the user’s consent, resulting in a huge phone bill! The permission to receive text messages is often abused to hide incoming messages from those premium-rate numbers, so that the victim does not become suspicious when he or she suddenly sees a response from such a number. But let’s not jump ahead too much yet; we’ll first have a look at the source code to see if our suspicion is confirmed.
Using dex2jar, we have converted the APK package to a standard Java Archive (JAR) file. Being able to convert Android Dalvik byte-code into Class files can come in very handy if you are more comfortable analyzing standard Java code instead of Dalvik instructions. However, never forget that automated conversion steps (by making use of tools such as dex2jar) might introduce a certain amount of “translation” errors – analyzing binaries “closer to the source” (e.g. by analyzing the byte-code directly in the .dex files) is certainly advised for more complex samples, but for this article we will stick with the converted JAR file.
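Before handing the sample to a converter, it can be worth confirming that the archive really contains Dalvik byte-code. The sketch below assumes the code lives in an entry called “classes.dex” and relies on the DEX header layout, which starts with the bytes “dex\n”, a three-character version and a NUL byte; `dex_version` is a hypothetical helper name.

```python
import zipfile


def dex_version(apk_path, entry="classes.dex"):
    """Return the DEX format version string, or None if the entry is missing or invalid."""
    with zipfile.ZipFile(apk_path) as archive:
        if entry not in archive.namelist():
            return None
        with archive.open(entry) as fh:
            header = fh.read(8)
    # DEX magic: b"dex\n" + three version characters + b"\x00"
    if len(header) == 8 and header[:4] == b"dex\n" and header[7:8] == b"\x00":
        return header[4:7].decode("ascii", errors="replace")
    return None
```

If this returns None, the conversion step is unlikely to produce anything useful and the sample deserves a closer look at the raw archive contents.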
After converting the APK archive to a JAR package, we can decompile the Java .class files into readable source code using a tool such as jd-gui.
After launching jd-gui, we load the JAR (previously converted using dex2jar) from the “File” menu. After loading the JAR file, we identify two packages: “Android” and “com.fake.site”. How do we know which code will be executed when the application is loaded? If you look again at the text version of the Android Manifest we obtained in Step 3, we can see that the default launcher activity is defined by the developer in the “com.fake.site” package:
We can see from the manifest extract that the class “com.fake.site.StartActivity” declares the MAIN action and the LAUNCHER category in its intent filter, which means that this class will serve as the application’s entry point. So let’s have a look at what’s happening in there:
Above you can see the code of the “onCreate” method, located in the “com.fake.site.StartActivity” class. Android calls this lifecycle method when the activity is started. Just by glancing at the code, we can already see that it is doing something fishy:
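Finding the entry point can also be scripted against the decoded manifest. The sketch below assumes the manifest text is passed in as a string and only considers “activity” elements (launcher aliases declared with “activity-alias” are ignored); `launcher_activities` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME = "{%s}name" % ANDROID_NS


def launcher_activities(manifest_xml):
    """Return fully qualified names of activities with a MAIN/LAUNCHER intent filter."""
    root = ET.fromstring(manifest_xml)
    package = root.get("package", "")
    found = []
    for activity in root.iter("activity"):
        for intent_filter in activity.findall("intent-filter"):
            actions = {a.get(NAME) for a in intent_filter.findall("action")}
            categories = {c.get(NAME) for c in intent_filter.findall("category")}
            if ("android.intent.action.MAIN" in actions
                    and "android.intent.category.LAUNCHER" in categories):
                name = activity.get(NAME, "")
                # Relative names are resolved against the manifest package.
                if name.startswith("."):
                    name = package + name
                elif "." not in name:
                    name = package + "." + name
                found.append(name)
                break  # one match per activity is enough
    return found
```

The classes returned here are the ones to open first in the decompiler.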
- The code checks whether the boolean “hasVisited” in the shared preferences has not yet been set to true. If this is the case:
  - A reference to the SmsManager is stored in localSmsManager.
  - Using the Android API call “sendTextMessage”, a text message is sent to the number “+79258539996”. The content of the message is provided as the third parameter and says “Ya Tut :)”.
  - The shared preferences are updated and the “hasVisited” property is set to “true”.
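The flow described in the list above can be modeled in a few lines. This is a Python stand-in for illustration, not the decompiled Java: `on_create`, `prefs` and `send_sms` are hypothetical names, and the sketch assumes the message is meant to go out only while “hasVisited” has not yet been set, which is what updating the flag afterwards suggests.

```python
def on_create(prefs, send_sms):
    """Model of the launch logic: send one text message, then remember that we did."""
    # `prefs` stands in for the shared preferences, `send_sms` for SmsManager.sendTextMessage.
    if not prefs.get("hasVisited", False):
        send_sms("+79258539996", "Ya Tut :)")
        prefs["hasVisited"] = True


# Usage: simulate two launches of the application.
sent = []
prefs = {}
on_create(prefs, lambda number, text: sent.append((number, text)))
on_create(prefs, lambda number, text: sent.append((number, text)))
```

Under that assumption, `sent` holds a single entry after both launches: the flag acts as a run-once guard, so the victim’s phone reports in only the first time the application is started.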
Given that the application sends a text message to this number as soon as it is launched, we can suspect that this is (possibly intentional) malicious behavior. After investigating a bit further, we found that “Ya Tut :)” appears to be transliterated Russian for “I am here”. The number does not appear to be a premium-rate number.
Now that we know that the application will send a text message to a Russian number upon launch, what is next? What else is the application doing? There are a few options: you could continue with the manual static analysis in jd-gui and other tools (“reading” through the classes as we did in the previous step and try to figure out what the application is doing) or you could perform a dynamic analysis to see what else is happening “in real time” when the application is running in a (sandboxed) environment (using for example DroidBox). Performing dynamic analysis and more advanced analysis techniques (bypassing obfuscation, encrypted content, …) will be covered in future articles on this blog so keep an eye out for updates!
If you have any questions regarding this article or any of the techniques discussed, you can get in touch with the author at firstname.lastname@example.org.