The growing popularity of Android in recent years has increased both the number of mobile developers and the number of apps they produce. However, many of these apps handle sensitive data without taking security into account. In fact, it is not difficult to recover an app's source code and other sensitive data by reverse engineering its APK package (the archive containing all the files needed to run the app on Android).
In this article, we will focus on the contents of a generic APK package and illustrate the process of decompiling and obfuscating the code contained in it. To do this, we will work on Windows, although this process can be appropriately adapted to other operating systems.
Anatomy of an APK
After completing the process of compiling an Android project, the output generated by our IDE will be a file with the .apk extension consisting of four elements:
|classes.dex||Contains all the app's .class files converted to the bytecode format interpreted by the Dalvik Virtual Machine (Android's own register-based virtual machine, replaced by the Android Runtime, ART, from Android 5.0 onward)|
|resources.arsc||Contains the compiled resources of the application|
|Uncompiled resources||All the application resources that are not compiled (for example, the contents of the assets folder in the project)|
|AndroidManifest.xml||Contains essential information about the app, such as its package name, components, and required permissions|
Following is the process of creating an .apk and its contents.
An .apk file is simply an archive containing the aforementioned files. To access them, it is enough to change the file extension to .zip and extract the archive, which yields the files listed above.
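Because an APK is an ordinary ZIP archive, the same inspection can also be scripted instead of renaming the file. The following Python sketch lists the entries of an archive using the standard zipfile module. The helper names and the in-memory sample archive (with placeholder contents) are illustrative assumptions, not part of any Android tooling.

```python
import io
import zipfile


def list_apk_entries(apk_file):
    """Return the sorted names of all entries in an APK (a ZIP archive).

    apk_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(apk_file) as archive:
        return sorted(archive.namelist())


def build_sample_apk():
    """Build a tiny in-memory stand-in for an APK (placeholder contents only)."""
    buffer = io.BytesIO()
    names = (
        "AndroidManifest.xml",
        "classes.dex",
        "resources.arsc",
        "assets/readme.txt",
    )
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, b"placeholder")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Replace build_sample_apk() with the path to a real .apk file.
    for entry in list_apk_entries(build_sample_apk()):
        print(entry)
```

Passing the path of a real APK to list_apk_entries should show the same four kinds of elements described in the table above.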
Although you can browse the contents of an APK in this way, you cannot always read them, because the .xml, .dex and .arsc files are stored in binary formats. It is therefore necessary to decompile the APK.
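To see why these files cannot simply be opened in a text editor, it helps to look at the first bytes of classes.dex. As a minimal sketch, assuming the header layout described in the Dalvik Executable format documentation (the ASCII bytes "dex", a newline, a three-digit version and a zero byte), the hypothetical helper below recognizes a DEX header and reports its format version:

```python
def dex_version(data):
    """Return the DEX format version (e.g. "035"), or None if data is not a DEX header."""
    # Expected layout of the first 8 bytes: b"dex\n" + three ASCII digits + b"\x00".
    if len(data) >= 8 and data[:4] == b"dex\n" and data[7:8] == b"\x00":
        version = data[4:7]
        if version.isdigit():
            return version.decode("ascii")
    return None


if __name__ == "__main__":
    # Hand-written sample header; with a real file, read the first 8 bytes of classes.dex.
    sample_header = b"dex\n035\x00"
    print(dex_version(sample_header))
    # A ZIP signature is not a DEX header, so this prints None.
    print(dex_version(b"PK\x03\x04\x00\x00\x00\x00"))
```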
Before proceeding, it is worth pointing out that decompiling an APK is not, in itself, an illegal operation, provided it is not used for piracy or similar purposes.
Decompilation of an APK
Decompiling an APK can be done using tools such as Android APK Decompiler and ApkAnalyser. These tools share the same basic software core, which consists of three fundamental elements:
- Dex2jar, a converter of .dex files into .class files stored in a .jar file (also allows you to perform the reverse operation);
- Apktool, a tool that decodes the project resources to nearly their original form and rebuilds the APK after changes have been made;
- Java Decompiler (JD), a Java decompiler equipped with a graphical interface that allows analyzing the bytecode contained in a .jar file.
Combining these elements appropriately makes it possible to reproduce the decompilation process without relying on the all-in-one tools mentioned above. Let’s see how.
We create a new folder in our work environment that will contain:
- the three tools downloaded for our operating system;
- the APK we want to analyze;
- the decompressed archive of the aforementioned APK, contained in the folder of the same name.
Once Dex2jar has converted the extracted classes.dex into classes-dex2jar.jar, we can read the source code of our application: launch JD by double-clicking its icon and load the newly created classes-dex2jar.jar file.
As shown in Figure 5, we can now browse the decompiled source of the APK and work with it.
Dex2jar recovers only the source code, not the resources.arsc file. To decode the resources we use Apktool, which requires the following files:
- apktool.bat;
- apktool.jar;
- the APK to decompile.
The apktool.bat and apktool_2.0.3.jar files are available on the official Apktool website. The apktool_2.0.3.jar file must be renamed to apktool.jar and placed in the same folder as the .bat file. Then navigate to the folder containing these files, open a command prompt there, and type:
apktool d "[FILE_PATH]\NAME.apk"
Below is the result of the command prompt confirming successful decompilation and showing the steps taken to perform it.
The result of this operation is a folder named after the APK, which contains the smali code, the original encoded files, the resources in the res folder and the AndroidManifest.xml, all perfectly legible and editable. Smali is an assembler for the .dex format used by the Dalvik VM, with a syntax similar to that of Jasmin and Dedexer but specific to Android.
With these tools it is therefore possible to reverse engineer an APK without relying on an all-in-one decompiler, and to obtain all the information of interest.
Now that we have seen how simple it is to analyze the contents of an APK, let us look at how to protect the source code so that malicious users cannot get hold of the core of our app.
One of the most widespread solutions is ProGuard, a tool that can:
- shrink the code by removing the classes and variables that the application does not use;
- optimize the bytecode, reducing the size of the APK;
- obfuscate the code by renaming classes, attributes, and methods with semantically meaningless names.
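The renaming step can be illustrated with a toy example. The Python sketch below is not ProGuard's actual algorithm. It only shows the idea of replacing meaningful identifiers with short, meaningless ones (a, b, ..., z, aa, ...) while recording the correspondence. All function names and sample identifiers are hypothetical.

```python
import itertools
import string


def short_names():
    """Yield a, b, ..., z, aa, ab, ... in order."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


def build_rename_map(identifiers):
    """Assign each distinct identifier the next short name, in first-seen order."""
    names = short_names()
    mapping = {}
    for identifier in identifiers:
        if identifier not in mapping:
            mapping[identifier] = next(names)
    return mapping


if __name__ == "__main__":
    # Hypothetical identifiers from an app that handles credentials.
    original = ["LoginManager", "checkPassword", "LoginManager", "sessionToken"]
    for name, obfuscated in build_rename_map(original).items():
        print(f"{name} -> {obfuscated}")
```

After such a renaming, decompiled code still runs the same way, but names like a.b() no longer reveal what a class or method is for.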
This solution yields an APK that is optimized in terms of code and size and, above all, makes decompiling and reverse engineering the APK more difficult. In addition, ProGuard can be integrated into the Android build system, so that it is applied automatically every time a new APK is generated. Enabling ProGuard is not mandatory, but it is strongly recommended when the application has security-sensitive features.
If enabled, ProGuard is run by the build system only in release mode (that is, when the production APK is created), so the developer can keep working in debug mode and read any reported errors without having to decode obfuscated names. To enable it in your project, just change the value of the minifyEnabled property in the build.gradle file from false to true.
Once the APK has been built in release mode, the project will contain a new folder called mapping, located inside the outputs folder of build (Figure 8), with the following files:
|dump.txt||Describes the internal structure of the class files in the APK|
|mapping.txt||Contains the mapping between the original and obfuscated names of classes, methods, and attributes; it is essential for interpreting bug reports from the Play Store|
|seeds.txt||Lists the classes and members that have not been obfuscated|
|usage.txt||Lists the portions of code that have been removed|
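To give an idea of how mapping.txt is used when reading an obfuscated bug report, the Python sketch below parses the class entries of such a file. It assumes the usual ProGuard layout, in which a class entry is an unindented line of the form "original.Name -> obfuscated.name:" and member entries are indented beneath it. The sample content and the function name are invented for illustration.

```python
def parse_class_mappings(mapping_text):
    """Return {obfuscated_class: original_class} from mapping.txt-style text."""
    classes = {}
    for raw_line in mapping_text.splitlines():
        line = raw_line.rstrip()
        # Skip blank lines, comments, and indented member entries.
        if not line or line[0].isspace() or line.startswith("#"):
            continue
        # Class entries look like "original.Name -> obfuscated.name:".
        if " -> " in line and line.endswith(":"):
            original, obfuscated = line[:-1].split(" -> ", 1)
            classes[obfuscated.strip()] = original.strip()
    return classes


# Hypothetical excerpt; a real file is generated by the release build.
SAMPLE_MAPPING = """\
com.example.LoginManager -> a.a.a:
    java.lang.String sessionToken -> a
    boolean checkPassword(java.lang.String) -> a
com.example.MainActivity -> a.a.b:
    void onCreate(android.os.Bundle) -> onCreate
"""

if __name__ == "__main__":
    classes = parse_class_mappings(SAMPLE_MAPPING)
    # Translate an obfuscated class name from a stack trace back to the original.
    print(classes.get("a.a.a", "unknown"))
```

This sketch handles class names only. Resolving method names and line numbers requires the member entries as well, which is one reason to keep the mapping.txt of every released build.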
For further details on ProGuard, please refer to the relevant page of the official Android documentation.