When studying the behavior of an Android application, you often need to determine whether a specific Java method is called, either by the application's own code or by third-party code. A simple way to tackle this problem is to use Androguard. We will see how easy it is to statically extract the call graph of a given Java method from an Android application.
This article is for educational purposes only.
Entire call graph extraction
Androguard comes with a handy script, androcg.py, meant to dump the entire call graph of a given application into a gml or dot file that can be displayed with Gephi.
androcg.py is really simple to use:
usage: androcg.py [-h] [--output OUTPUT] [--show] [--verbose]
                  [--classname CLASSNAME] [--methodname METHODNAME]
                  [--descriptor DESCRIPTOR] [--accessflag ACCESSFLAG]
                  [--no-isolated]
                  APK

Create a call graph based on the dataof Analysis and export it into a graph
format.

positional arguments:
  APK                   The APK to analyze

optional arguments:
  -h, --help            show this help message and exit
  --output OUTPUT, -o OUTPUT
                        Filename of the output file, the extension is used to
                        decide which format to use (default callgraph.gml)
  --show, -s            instead of saving the graph, print it with mathplotlib
                        (you might not see anything!
  --verbose, -v         Print more output
  --classname CLASSNAME
                        Regex to filter by classname
  --methodname METHODNAME
                        Regex to filter by methodname
  --descriptor DESCRIPTOR
                        Regex to filter by descriptor
  --accessflag ACCESSFLAG
                        Regex to filter by accessflags
  --no-isolated         Do not store methods which has no xrefs
You can also compute this call graph programmatically:
from androguard.misc import AnalyzeAPK
a, d, dx = AnalyzeAPK('path/to/an/application.apk')
call_graph = dx.get_call_graph()
Once you have the entire call graph, you could dig into it and manually extract the call graph of a given Java method, but that is really painful. Fortunately, a tiny bit of graph theory will help us.
Specific call graph extraction
A call graph is a directed graph whose vertices are Java methods and whose edges are calls from one method to another. The call graph of a given method is the sub-graph of the entire call graph that contains only this method and its ancestors. The ancestors are all the vertices reached by following the edges backward from the method, up to one or more roots of the directed call graph. In other words, they are its direct and indirect callers.
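To make the notion of ancestors concrete, here is a minimal sketch on a toy call graph, written with the standard library only. The method names and the CALLS dictionary are hypothetical and are not taken from any real application. The walk follows the edges backward from the target, which is the idea behind the ancestors of a vertex.

```python
from collections import deque

# Hypothetical toy call graph: each key calls the methods in its list.
CALLS = {
    "MainActivity.onCreate": ["Tracker.init", "Ui.render"],
    "Tracker.init": ["Tracker.collect"],
    "Tracker.collect": ["TelephonyManager.getDeviceId"],
    "Ui.render": [],
    "TelephonyManager.getDeviceId": [],
}


def ancestors(calls, target):
    """Return every method that can reach `target` through one or more calls."""
    # Reverse the edges: callee -> set of direct callers.
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    # Breadth-first walk along the reversed edges, starting from the target.
    seen = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for caller in callers.get(node, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)

    # The target itself is not one of its own ancestors, even inside a cycle.
    seen.discard(target)
    return seen


print(sorted(ancestors(CALLS, "TelephonyManager.getDeviceId")))
# ['MainActivity.onCreate', 'Tracker.collect', 'Tracker.init']
```

Ui.render is left out of the result because no chain of calls leads from it to getDeviceId.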
Androguard allows you to find a specific method in a specific class. To do so, you can use the following script:
from androguard.misc import AnalyzeAPK
a, d, dx = AnalyzeAPK('path/to/an/application.apk')
methods = dx.find_methods(methodname='a method regex', classname='a class regex')
If you are looking for the method getDeviceId declared in the class Landroid/telephony/TelephonyManager, the tiny script looks like this:
from androguard.misc import AnalyzeAPK
a, d, dx = AnalyzeAPK('path/to/an/application.apk')
methods = dx.find_methods(methodname='getDeviceId', classname='Landroid/telephony/TelephonyManager')
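Since both arguments are regular expressions, a bare name may match more than intended. The sketch below assumes the patterns are applied with the semantics of Python's re.match, which anchors the pattern at the start of the string only. This is an assumption worth checking against the Androguard version you use. The list of method names is hypothetical.

```python
import re

# Hypothetical method names, for illustration only.
names = ["getDeviceId", "getDeviceIdentifier", "getSubscriberId"]


def matching(pattern, candidates):
    """Return the candidates accepted by re.match (anchored at the start only)."""
    return [c for c in candidates if re.match(pattern, c)]


loose = matching("getDeviceId", names)
strict = matching("getDeviceId$", names)
print(loose)   # ['getDeviceId', 'getDeviceIdentifier']
print(strict)  # ['getDeviceId']
```

Under that assumption, ending the method pattern with $ and the class pattern with the ; that terminates Dalvik class names (for example 'Landroid/telephony/TelephonyManager;') should restrict the search to exact matches.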
Finally, you just have to get the ancestors of the vertices contained in methods, that is, the methods that match the two regexes:
from androguard.misc import AnalyzeAPK
import matplotlib.pyplot as plt
import networkx as nx

a, d, dx = AnalyzeAPK('path/to/an/application.apk')
call_graph = dx.get_call_graph()

# Collect every matching method together with all of its direct and indirect callers
nodes = set()
for m in dx.find_methods(methodname='getDeviceId', classname='Landroid/telephony/TelephonyManager'):
    nodes.add(m.get_method())
    nodes |= nx.ancestors(call_graph, m.get_method())

# Sub-graph restricted to the matching methods and their ancestors
graph = call_graph.subgraph(nodes)

# Drawing
pos = nx.spring_layout(graph, iterations=500)
nx.draw_networkx_nodes(graph, pos=pos, node_color='r')
nx.draw_networkx_edges(graph, pos, arrows=True)
nx.draw_networkx_labels(graph, pos=pos, labels={x: str(x) for x in graph.nodes}, font_size=8)
plt.axis('off')
plt.draw()
plt.show()
Adapt the script to your needs, run it and see the result.
This example shows the call graph of the getDeviceId method when it is called by obfuscated code.
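When the drawing becomes too dense to read, the same information can be listed as text, one call chain per line. The following is a minimal sketch on hypothetical toy data, using a plain dictionary that maps each method to its direct callers. The names CALLERS and call_chains are introduced here for illustration and are not part of Androguard.

```python
# Hypothetical toy data: each key is a method, each value lists its direct callers.
CALLERS = {
    "TelephonyManager.getDeviceId": ["a.b.c", "Tracker.collect"],
    "a.b.c": ["a.b.a"],
    "a.b.a": ["MainActivity.onCreate"],
    "Tracker.collect": ["MainActivity.onCreate"],
}


def call_chains(callers, target):
    """Return every acyclic chain of calls ending at `target`, outermost caller first."""
    chains = []

    def walk(node, suffix):
        chain = [node] + suffix
        # Ignore callers already on the chain so the recursion cannot loop.
        parents = [p for p in callers.get(node, ()) if p not in chain]
        if not parents:
            # No further caller: the chain is complete.
            chains.append(chain)
            return
        for parent in parents:
            walk(parent, chain)

    walk(target, [])
    return chains


for chain in call_chains(CALLERS, "TelephonyManager.getDeviceId"):
    print(" -> ".join(chain))
# MainActivity.onCreate -> a.b.a -> a.b.c -> TelephonyManager.getDeviceId
# MainActivity.onCreate -> Tracker.collect -> TelephonyManager.getDeviceId
```

With a NetworkX directed graph, the direct callers of a vertex should be available through its predecessors method, so the dictionary could be replaced by lookups on the extracted sub-graph. The recursion is fine for small sub-graphs, but very deep call chains may need an iterative version.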