Reverse Engineering Android: Disassembling Hello World

October 16, 2012

davtbaum

When it comes to learning Android, it’s amazing how easy it is to find tutorials, code samples, and documentation to immerse yourself in. Interestingly, I’ve found the inverse to be true for the, dare I say, way cooler world of hacking Android. Reverse engineering Android applications can be really fun and give you a decent understanding of the inner workings of the Dalvik Virtual Machine. This post will be an all-out, start-to-finish, beginner’s* tutorial on the tools and practices of reverse engineering Android through the disassembly and code injection of the Android Hello World application.

*Beginner means that you know a bit about Android and Java in general. If not, learn a bit first and come back. Some experience with the terminal on your machine is also probably necessary.

The Apk

In order to start reverse engineering, you must first understand what you’re working with. So what exactly is an apk? (hint: not American Parkour.) An Android package, or apk, is the container for an Android app’s resources and executables. It’s a zip archive that simply contains:

  • AndroidManifest.xml (serialized, non human readable)
  • classes.dex
  • res/
  • lib/ (sometimes)
  • META-INF/

The meat of the application is the classes.dex file, or the Dalvik executable (get it, dex) that runs on the device. The application’s resources (e.g. images, sound files) reside in the res directory, and the AndroidManifest.xml is more or less the link between the two, providing some additional information about the application to the OS. The lib directory contains native libraries that the application may use via the NDK, and the META-INF directory contains information regarding the application’s signature.
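Since an apk is just a zip archive, you can peek inside one with nothing but the JDK. Here is a minimal sketch, assuming the apk sits on your local disk. The ApkLister class and its describe() helper are hypothetical names made up for this example, and the labels simply mirror the list above.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Lists the entries of an apk and labels the ones described above. */
final class ApkLister {

    /** Maps a zip entry name to the role this post gives it (hypothetical helper). */
    static String describe(String entryName) {
        if (entryName.equals("AndroidManifest.xml")) return "manifest (serialized XML)";
        if (entryName.equals("classes.dex")) return "Dalvik executable";
        if (entryName.startsWith("res/")) return "resource";
        if (entryName.startsWith("lib/")) return "native library";
        if (entryName.startsWith("META-INF/")) return "signature data";
        return "other";
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ApkLister.java <path-to-apk>");
            return;
        }
        // An apk is a zip archive, so the standard zip classes can read it.
        try (ZipFile apk = new ZipFile(args[0])) {
            Enumeration<? extends ZipEntry> entries = apk.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                System.out.println(name + "  ->  " + describe(name));
            }
        }
    }
}
```

On a recent JDK (11 or later) this should run directly with "java ApkLister.java HelloWorld.apk". Real apks usually contain a few entries beyond the ones listed here, which the sketch just labels as "other".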

You can grab the HelloWorld apk we will be hacking here. The source to this apk is available from the developer docs tutorial, and when compiled looks something like this:

Flashy, huh

The Tools

In order to complete this tutorial, you’ll need to download and install the following tools:

  • apktool
  • jarsigner
  • keytool

Apktool does all of the disassembling/reassembling and wraps functionality from a lot of tools in the reverse engineering realm (smali/baksmali assembler, XML deserializers, etc). I’m not a _huge_ fan of the tool, but it’s a great way to get started. Jarsigner and keytool allow you to re-sign the application after it’s been disassembled. We’ll get into what the signing process does later on.

Disassembling the Apk

Once you’ve installed apktool, go ahead and open up your terminal and change directory into where you’ve placed the downloaded apk.

$ cd ~/Desktop/HelloWorld

Running the apktool binary without arguments prints its usage, but we will only use the ‘d’ (decode) and ‘b’ (build) command-line options for this tutorial. Decode the apk using the apktool ‘d’ option:

$ apktool d HelloWorld.apk

This will tell the tool to decode the assets and disassemble the .dex file in the apk. When finished, you will see the ./HelloWorld directory, containing:

  • AndroidManifest.xml (decoded, human readable)
  • res/ (decoded)
  • smali/
  • apktool.yml

The AndroidManifest.xml is now readable, the resources have been decoded, and a smali directory has been created (ignore the apktool.yml, as it’s just configuration for the tool itself). The smali directory is probably the most important of the three, as it contains a set of smali files: a human-readable assembly representation of the bytecode in the application’s dex file. You can think of smali as an intermediate form between the .java source and the executable.

So let’s take a look at what’s in the smali directory. ‘ls’ yields:

$ ls HelloWorld/smali/com/test/helloworld/
HelloWorldActivity.smali
R$attr.smali
R$drawable.smali
R$layout.smali
R$string.smali
R.smali

Immediately we notice that the smali directory contains subdirectories defining the application’s namespace (com.test.helloworld). Additionally, we can see an individual smali file for each java class. There’s one catch – any ‘$’ in the smali file’s name means it’s an inner class in Java. Here we see the bytecode representation of the following classes:

  • HelloWorldActivity.java
  • R.java

R.java contains the inner classes attr, string, and so on. It’s evident that HelloWorldActivity is the activity that’s displayed when the app launches, so what exactly is R?

R.java is a file generated automatically at build time that maps each resource to an associated id. When a developer wants to use anything in the res folder, he/she must use the R class to reference that resource. Because of this, we’ll omit R.java from our investigation, as it really only contains a bunch of constants that no one cares about.
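The file-naming rule from a moment ago (directories mirror the package, and a ‘$’ marks an inner class) is easy to express in code. This is only a sketch: SmaliNames and its methods are hypothetical helpers, and it assumes that every ‘$’ in a file name really does separate an outer class from an inner one.

```java
/** Maps smali file paths (relative to the smali/ directory) to Java class names. */
final class SmaliNames {

    /** Turns "com/test/helloworld/R$attr.smali" into "com.test.helloworld.R$attr". */
    static String toClassName(String relativePath) {
        String suffix = ".smali";
        String path = relativePath.endsWith(suffix)
                ? relativePath.substring(0, relativePath.length() - suffix.length())
                : relativePath;
        // Directories under smali/ mirror the package, so '/' becomes '.'.
        return path.replace('/', '.');
    }

    /** Assumption: a '$' in the name marks a class nested inside another class. */
    static boolean isInnerClass(String className) {
        return className.indexOf('$') >= 0;
    }

    /** For "com.test.helloworld.R$attr" returns "com.test.helloworld.R". */
    static String outerClass(String className) {
        int dollar = className.indexOf('$');
        return dollar < 0 ? className : className.substring(0, dollar);
    }

    public static void main(String[] args) {
        String[] files = {
            "com/test/helloworld/HelloWorldActivity.smali",
            "com/test/helloworld/R$attr.smali",
            "com/test/helloworld/R.smali",
        };
        for (String file : files) {
            String name = toClassName(file);
            String note = isInnerClass(name) ? "  (inner class of " + outerClass(name) + ")" : "";
            System.out.println(name + note);
        }
    }
}
```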

Reading the Smali

Now that we’ve disassembled our apk, let’s take a look at the java and smali representations of our impressive HelloWorldActivity.

package com.test.helloworld;

import android.app.Activity;
import android.os.Bundle;
import android.widget.TextView;

public class HelloWorldActivity extends Activity {
    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        TextView text = new TextView(this);
        text.setText("Hello World, Android");
        setContentView(text);
    }
}

.class public Lcom/test/helloworld/HelloWorldActivity;
.super Landroid/app/Activity;
.source "HelloWorldActivity.java"

# direct methods
.method public constructor <init>()V
    .locals 0

    .prologue
    .line 7
    invoke-direct {p0}, Landroid/app/Activity;-><init>()V

    return-void
.end method

# virtual methods
.method public onCreate(Landroid/os/Bundle;)V
    .locals 2
    .parameter "savedInstanceState"

    .prologue
    .line 11
    invoke-super {p0, p1}, Landroid/app/Activity;->onCreate(Landroid/os/Bundle;)V

    .line 13
    new-instance v0, Landroid/widget/TextView;

    invoke-direct {v0, p0}, Landroid/widget/TextView;-><init>(Landroid/content/Context;)V

    .line 14
    .local v0, text:Landroid/widget/TextView;
    const-string v1, "Hello World, Android"

    invoke-virtual {v0, v1}, Landroid/widget/TextView;->setText(Ljava/lang/CharSequence;)V

    .line 15
    invoke-virtual {p0, v0}, Lcom/test/helloworld/HelloWorldActivity;->setContentView(Landroid/view/View;)V

    .line 17
    return-void
.end method

It should be pretty evident which one of these files is written in java. Nonetheless, the smali representation shouldn’t be too intimidating.

Let’s break down what’s going on here in java first. In line 7, we define our HelloWorldActivity class that extends android.app.Activity, and within that class, override the onCreate() method. Inside the method, we create an instance of the TextView class and call the TextView.setText() method with our message. Finally, in line 15 we set the view by calling setContentView(), passing in the TextView instance.

In smali, we can see that we have a bit more going on. Let’s break it up into sections. We have:

  1. class declarations (the .class, .super, and .source lines)
  2. a constructor method (the first .method block)
  3. a bigger onCreate() method (the second .method block)

Declarations and Constructor

The class declarations in smali are essentially the same as in java, just in a different syntax. They give the virtual machine the class and superclass names via the .class and .super directives. Additionally, the compiler throws in the source file name for…shits and gigs? Nope, stack traces.

The constructor has seemingly appeared out of nowhere, but it was really inserted by the compiler because we didn’t declare one ourselves. You can see that its invoke-direct instruction tells the virtual machine to make a direct invocation of the superclass’s constructor. This follows the nature of subclasses: they must call their superclass’s constructor.

Data Types

In the onCreate() method, we can see that the smali method definition isn’t that far off from its java counterpart. The method’s parameter types are listed within the parentheses with no separator between them (each object type ends in its own semicolon), and the return type is discreetly placed on the end of the .method line. Object types are easy to recognize, given they begin with an L and are written in full namespace. Java primitives, however, are represented as single capital chars and follow the format:

V	 void
Z	 boolean
B	 byte
S	 short
C	 char
I	 int
J	 long (64 bits)
F	 float
D	 double (64 bits)

So for our onCreate() definition in smali, we can expect a void return value.
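To make the descriptor format concrete, here is a small decoder for method signatures like the ones in our smali file. Treat it as a sketch rather than a full parser: the Descriptors class is a hypothetical helper, error handling is minimal, and the ‘[’ prefix for arrays is part of the Dalvik descriptor syntax even though this post doesn’t cover it.

```java
import java.util.ArrayList;
import java.util.List;

/** Decodes smali/Dalvik method descriptors into something closer to Java. */
final class Descriptors {

    /** Reads one type starting at pos[0] and advances pos[0] past it. */
    private static String readType(String d, int[] pos) {
        char c = d.charAt(pos[0]++);
        switch (c) {
            case 'V': return "void";
            case 'Z': return "boolean";
            case 'B': return "byte";
            case 'S': return "short";
            case 'C': return "char";
            case 'I': return "int";
            case 'J': return "long";
            case 'F': return "float";
            case 'D': return "double";
            case '[': return readType(d, pos) + "[]"; // array of the following type
            case 'L': {
                // Object types run from 'L' up to the next ';'.
                int end = d.indexOf(';', pos[0]);
                if (end < 0) throw new IllegalArgumentException("unterminated object type in " + d);
                String name = d.substring(pos[0], end).replace('/', '.');
                pos[0] = end + 1;
                return name;
            }
            default:
                throw new IllegalArgumentException("unknown type char '" + c + "' in " + d);
        }
    }

    /** Turns "(Landroid/os/Bundle;)V" into "void (android.os.Bundle)". */
    static String describeMethod(String descriptor) {
        if (!descriptor.startsWith("(") || descriptor.indexOf(')') < 0) {
            throw new IllegalArgumentException("not a method descriptor: " + descriptor);
        }
        int[] pos = {1};
        List<String> params = new ArrayList<>();
        while (descriptor.charAt(pos[0]) != ')') {
            params.add(readType(descriptor, pos));
        }
        pos[0]++; // skip ')'
        String returnType = readType(descriptor, pos);
        return returnType + " (" + String.join(", ", params) + ")";
    }

    public static void main(String[] args) {
        // onCreate() from HelloWorldActivity.smali
        System.out.println(describeMethod("(Landroid/os/Bundle;)V"));
        // TextView.setText() from the same file
        System.out.println(describeMethod("(Ljava/lang/CharSequence;)V"));
    }
}
```

The first call prints "void (android.os.Bundle)", which matches what we read off the .method line by hand.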

Registers

Moving one line down, we see the ‘.locals’ directive. This determines how many registers the Dalvik vm will use for this method _without_ including registers allocated to the parameters of the method. Additionally, the number of parameter registers for any non-static method will always be the number needed for its input parameters + 1. This is due to an implicit reference to the current object that resides in parameter register 0 or p0 (in java this is called the “this” reference). Each register is a 32-bit slot that can hold either a primitive value or a reference to a java object; the 64-bit types, long and double, take two registers each. Given 2 local registers, 1 parameter register, and 1 “this” reference, the onCreate() method uses an effective 4 registers.

For convenience, smali uses a ‘v’ and ‘p’ naming convention for local vs. parameter registers. Essentially, parameter (p) registers are just aliases for the highest-numbered (v) registers of the method. For this example, onCreate() has 2 local registers and 2 parameter registers, so the naming scheme will look something like this:

v0 - local 0
v1 - local 1
v2/p0 - local 2 or parameter 0 (this)
v3/p1 - local 3 or parameter 1 (android/os/Bundle)

Note: You may see the .registers directive as opposed to the .locals directive. The only difference is that the .registers directive includes parameter registers (including “this”) in the count. Given the onCreate() example, .locals 2 == .registers 4
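The register bookkeeping above boils down to a little arithmetic. Here is a sketch of it; RegisterMath and its method names are hypothetical, and the rule that long (J) and double (D) parameters occupy two registers comes from how Dalvik handles 64-bit values rather than from anything in our Hello World example.

```java
/** Arithmetic behind the .locals / .registers directives and the p/v aliases. */
final class RegisterMath {

    /** Registers needed by the declared parameters; long (J) and double (D) take two. */
    static int parameterRegisters(String... paramTypes) {
        int count = 0;
        for (String type : paramTypes) {
            count += (type.equals("J") || type.equals("D")) ? 2 : 1;
        }
        return count;
    }

    /** What a .registers directive would say for a given .locals value. */
    static int totalRegisters(int locals, boolean isStatic, String... paramTypes) {
        int thisRegister = isStatic ? 0 : 1; // p0 holds "this" on non-static methods
        return locals + thisRegister + parameterRegisters(paramTypes);
    }

    /** pN is an alias for v(locals + N). */
    static String vNameOf(int locals, int paramIndex) {
        return "v" + (locals + paramIndex);
    }

    public static void main(String[] args) {
        int locals = 2; // from ".locals 2" in onCreate()
        int total = totalRegisters(locals, false, "Landroid/os/Bundle;");
        System.out.println(".locals " + locals + " == .registers " + total);
        System.out.println("p0 is " + vNameOf(locals, 0));
        System.out.println("p1 is " + vNameOf(locals, 1));
    }
}
```

For onCreate() this prints ".locals 2 == .registers 4", then "p0 is v2" and "p1 is v3", which is the same table as above.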

Opcodes

Dalvik opcodes are relatively straightforward, but there are a lot of them. For the sake of this post’s length, we’ll only go over the basic (yet important) opcodes found in our example HelloWorldActivity.smali. In the onCreate method in HelloWorldActivity the following opcodes are used:

  1. invoke-super vx, vy, … invokes the parent class’s method on object vx, passing in parameter(s) vy, …
  2. new-instance vx creates a new object instance and places its reference in vx
  3. invoke-direct vx, vy, … invokes a method on object vx with parameters vy, … without virtual method resolution
  4. const-string vx creates a string constant and places its reference in vx
  5. invoke-virtual vx, vy, … invokes the virtual method on object vx, passing in parameters vy, …
  6. return-void returns void

Hacking the App

Now that we know what we’re looking at, let’s inject some code and rebuild the app. The code we will inject is only one line in java and presents the user with the toast message “Hacked!”.

Toast.makeText(getApplicationContext(), "Hacked!", Toast.LENGTH_SHORT).show();

How do we do this in smali? Easy, let’s just compile this into another application and disassemble. The end result is something like this:

    .line 18
    invoke-virtual {p0}, Lcom/test/helloworld/HelloWorldActivity;->getApplicationContext()Landroid/content/Context;

    move-result-object v1

    const-string v2, "Hacked!"

    const/4 v3, 0x0

    invoke-static {v1, v2, v3}, Landroid/widget/Toast;->makeText(Landroid/content/Context;Ljava/lang/CharSequence;I)Landroid/widget/Toast;

    move-result-object v1

    invoke-virtual {v1}, Landroid/widget/Toast;->show()V

Now, let’s ensure we have enough registers in our original onCreate() to support these method calls. We can see that the highest register in the code we want to patch is v3. We have it, but using v2 and v3 will overwrite both of our parameter registers (p0 and p1). Given that the original code is done with both of them after setContentView(), and the patch reads p0 only once before overwriting it, this number is appropriate. Our final patched HelloWorldActivity.smali should look like:

.class public Lcom/test/helloworld/HelloWorldActivity;
.super Landroid/app/Activity;
.source "HelloWorldActivity.java"

# direct methods
.method public constructor <init>()V
    .locals 0

    .prologue
    .line 8
    invoke-direct {p0}, Landroid/app/Activity;-><init>()V

    return-void
.end method

# virtual methods
.method public onCreate(Landroid/os/Bundle;)V
    .locals 2
    .parameter "savedInstanceState"

    .prologue
    .line 12
    invoke-super {p0, p1}, Landroid/app/Activity;->onCreate(Landroid/os/Bundle;)V

    .line 14
    new-instance v0, Landroid/widget/TextView;

    invoke-direct {v0, p0}, Landroid/widget/TextView;-><init>(Landroid/content/Context;)V

    .line 15
    .local v0, text:Landroid/widget/TextView;
    const-string v1, "Hello World, Android"

    invoke-virtual {v0, v1}, Landroid/widget/TextView;->setText(Ljava/lang/CharSequence;)V

    .line 16
    invoke-virtual {p0, v0}, Lcom/test/helloworld/HelloWorldActivity;->setContentView(Landroid/view/View;)V

    # Patches Start

    invoke-virtual {p0}, Lcom/test/helloworld/HelloWorldActivity;->getApplicationContext()Landroid/content/Context;

    move-result-object v1

    const-string v2, "Hacked!"

    const/4 v3, 0x0

    invoke-static {v1, v2, v3}, Landroid/widget/Toast;->makeText(Landroid/content/Context;Ljava/lang/CharSequence;I)Landroid/widget/Toast;

    move-result-object v1

    invoke-virtual {v1}, Landroid/widget/Toast;->show()V

    # Patches End

    return-void
.end method

The injected code sits between the “# Patches Start” and “# Patches End” comments.
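The register check we did by eye before patching can also be roughed out in code. This is a quick sketch and not a real smali parser: RegisterCheck is a hypothetical helper that just looks for vN tokens with a regex, so it would be fooled by a string constant or package name that happens to contain something like “v4”.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Rough check that a smali snippet only touches registers the method has. */
final class RegisterCheck {

    // Matches v0, v1, ... as whole words (a heuristic, see the caveat above).
    private static final Pattern V_REGISTER = Pattern.compile("\\bv(\\d+)\\b");

    /** Returns the highest vN index used in the snippet, or -1 if there is none. */
    static int highestV(String smali) {
        int max = -1;
        Matcher m = V_REGISTER.matcher(smali);
        while (m.find()) {
            max = Math.max(max, Integer.parseInt(m.group(1)));
        }
        return max;
    }

    /** True when every vN in the snippet is below the method's total register count. */
    static boolean fits(String smali, int totalRegisters) {
        return highestV(smali) < totalRegisters;
    }

    public static void main(String[] args) {
        String patch =
              "move-result-object v1\n"
            + "const-string v2, \"Hacked!\"\n"
            + "const/4 v3, 0x0\n"
            + "invoke-static {v1, v2, v3}, Landroid/widget/Toast;->makeText("
            + "Landroid/content/Context;Ljava/lang/CharSequence;I)Landroid/widget/Toast;\n";
        int total = 4; // .locals 2 plus "this" plus one parameter
        System.out.println("highest register: v" + highestV(patch));
        System.out.println("fits in " + total + " registers: " + fits(patch, total));
    }
}
```

Note that this only tells you the registers exist. Whether it is safe to clobber them (as we do with p0 and p1) is still something you have to reason about yourself.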

Rebuilding the Apk

Now all that’s left is to rebuild the app!

$ apktool b ./HelloWorld

This will instruct apktool to rebuild the app into ./HelloWorld/dist/. However, this rebuilt app will not be signed. We will need to sign the app before it can be successfully installed on any device or emulator.

Signing the Apk

In order to sign the apk, you’ll need jarsigner and keytool (or a platform specific alternative, like signapk for windows). With jarsigner and keytool, however, the steps are pretty easy. First create the key:

$ keytool -genkey -v -keystore my-release-key.keystore -alias alias_name -keyalg RSA -validity 10000

Then use jarsigner to sign your apk, referencing that key:

$ jarsigner -verbose -keystore my-release-key.keystore ./HelloWorld/dist/HelloWorld.apk alias_name
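If you want a quick sanity check that the signing step actually touched the apk, you can look for signature files in its META-INF directory. The sketch below only checks that such files are present and says nothing about whether the signature is valid. SignatureEntries is a hypothetical helper, and the list of extensions is an assumption about what jarsigner typically writes (an .SF file plus a signature block such as .RSA).

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Looks for signature files inside an apk's META-INF directory. */
final class SignatureEntries {

    /** True for META-INF entries that look like signature or signature block files. */
    static boolean isSignatureEntry(String entryName) {
        String upper = entryName.toUpperCase(Locale.ROOT);
        if (!upper.startsWith("META-INF/")) return false;
        return upper.endsWith(".SF") || upper.endsWith(".RSA")
                || upper.endsWith(".DSA") || upper.endsWith(".EC");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SignatureEntries.java <path-to-apk>");
            return;
        }
        int found = 0;
        try (ZipFile apk = new ZipFile(args[0])) {
            Enumeration<? extends ZipEntry> entries = apk.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (isSignatureEntry(name)) {
                    System.out.println("found " + name);
                    found++;
                }
            }
        }
        System.out.println(found == 0 ? "no signature files found" : found + " signature file(s) found");
    }
}
```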

Then you’re done! Install the app onto your device or emulator and impress the shit out of yourself!

damn…impressive

That’s it for this tutorial, but stay tuned. There will definitely be more in the future. Feel free to leave any questions in the comment section, or contact me directly.

Happy hacking,

David Teitelbaum

12 Comments

  1. Chris Arriola #
    October 18, 2012

    Wow, great post! After looking into this myself, I’m really surprised that not many apps use ProGuard to obfuscate their code. It’s so simple to enable yet I wonder why developers haven’t used it as a standard.

  2. shubhangiverma #
    November 23, 2012

    thanks for sharing such a nice tutorial on reverse engineering android

  3. February 2, 2013

    Nice stuff and the first good writeup of how to read smali we found on the web!

  4. February 14, 2013

    (unable to access apktool.jar) in cmd plz help me

  5. AsRaj #
    March 19, 2013

    For reading out java source take a look at here..

    http://stackoverflow.com/questions/1249973/decompiling-dex-into-java-sourcecode

    Impressive results with dex2jar and jd-gui!.

  6. subha #
    April 5, 2013

    While trying to re-compile, application shows below error..any idea

    C:\apktool>apktool b C:\HelloWorld
    I: Checking whether sources has changed…
    I: Smaling…
    Exception in thread "main" java.lang.NullPointerException
    at org.jf.util.PathUtil.getRelativeFile(PathUtil.java:44)
    at org.jf.smali.smaliFlexLexer.getSourceName(smaliFlexLexer.java:2922)
    at org.antlr.runtime.CommonTokenStream.getSourceName(CommonTokenStream.java:345)
    at org.antlr.runtime.Parser.getSourceName(Parser.java:88)
    at org.jf.smali.smaliParser.getErrorHeader(smaliParser.java:358)
    at org.antlr.runtime.BaseRecognizer.displayRecognitionError(BaseRecognizer.java:192)
    at org.antlr.runtime.BaseRecognizer.reportError(BaseRecognizer.java:186)
    at org.jf.smali.smaliParser.smali_file(smaliParser.java:736)
    at brut.androlib.mod.SmaliMod.assembleSmaliFile(SmaliMod.java:71)
    at brut.androlib.src.DexFileBuilder.addSmaliFile(DexFileBuilder.java:43)
    at brut.androlib.src.DexFileBuilder.addSmaliFile(DexFileBuilder.java:33)
    at brut.androlib.src.SmaliBuilder.buildFile(SmaliBuilder.java:64)
    at brut.androlib.src.SmaliBuilder.build(SmaliBuilder.java:48)
    at brut.androlib.src.SmaliBuilder.build(SmaliBuilder.java:35)
    at brut.androlib.Androlib.buildSourcesSmali(Androlib.java:222)
    at brut.androlib.Androlib.buildSources(Androlib.java:179)
    at brut.androlib.Androlib.build(Androlib.java:170)
    at brut.androlib.Androlib.build(Androlib.java:154)
    at brut.apktool.Main.cmdBuild(Main.java:182)
    at brut.apktool.Main.main(Main.java:67)

  7. Girish #
    April 29, 2013

    I want an app which use this reverse enginering concept

  8. July 29, 2013

    very cool reverse engineering tut!

  9. Nick Abbott #
    August 29, 2013

    I’m trying to understand some smali and you say this: “overwrite both of our parameter registers. Given we won’t be using either of those registers after setContentView(), this number is appropriate.” BUT, what if you did want to re-use your registers? Would you do:

    .line 18
    invoke-virtual {p1}, Lcom/test/helloworld/HelloWorldActivity;->getApplicationContext()Landroid/content/Context;

    move-result-object v4

    const-string v5, “Hacked!”

    const/4 v6, 0x0

    invoke-static {v4, v5, v6}, Landroid/widget/Toast;->makeText(Landroid/content/Context;Ljava/lang/CharSequence;I)Landroid/widget/Toast;

    move-result-object v4

    invoke-virtual {v4}, Landroid/widget/Toast;->show()V

    Then set .locals to 4 (as we’re using new registers). I think I tried this on something myself and it caused an APK crash!

  10. stanto #
    September 14, 2013

    what functions do .local vx / .end local vx and .restart local vx have? Can u give me some advices!

  11. September 28, 2014

    Thanks designed for sharing such a nice opinion, paragraph is fastidious, thats
    why i have read it entirely
