This section presents the design of DROIDECHO. As shown in Fig. 2, DROIDECHO takes as input an Android application, which consists of the class files, the manifest file and the description of its functionality. It generates an attack report that contains the identified malicious behaviors and the corresponding traces of these behaviors for forensic use. DROIDECHO uses the attack model presented in Section Semantic model of attack as the guidance for attack detection, and proceeds in four phases: disclaimer learning, ICCG construction, attack detection and attack confirmation. First, disclaimer learning receives the descriptive text of the application as input and generates, in a supervised manner, a white list of “necessary” behaviors (i.e., the disclaimer of the application). The white list is used to exclude the claimed functionality of the application from detection. Second, ICCG construction takes the class files and the manifest file of the application as input and constructs an ICCG, which is then passed to the third phase. Third, attack detection finds the attacks present in the application, if any, together with the traces that cause them. Finally, attack confirmation receives the candidate attacks and determines, by a trace-guided dynamic execution, whether each candidate is a false positive.
Disclaimer learning
Some Android applications perform seemingly suspicious behaviors that are in fact required to accomplish their functionality. The required functionality and the risks it may bring are usually stated in the descriptive text of the application. We regard such a behavior as benign (henceforth a disclaimer), and do not consider it as an attack candidate. For example, TripAdvisor is a travel application that recommends nearby restaurants and hotels while the user is travelling. To do so, it acquires the permission ACCESS_FINE_LOCATION to learn the user’s location, so that it can provide the most suitable information. Although we detect that TripAdvisor sends the user’s location to a remote server from time to time, which is a potential privacy issue, we regard this behavior as benign and harmless.
As shown in Fig. 3, we obtain the descriptions of applications and perform a description-to-permission fidelity analysis (Qu et al. 2014). The fidelity analysis builds a description-to-permission relatedness model in which each permission is associated with a list of noun phrases. Given the description of an application, we use this model to produce the list of permissions that the description implies. Then, we employ PScout (Au et al. 2012) to obtain the APIs that require these permissions. For example, the sentence “Your location: These permissions are needed to obtain your location so we can help you discover hotels, restaurants, and attractions around you” in the description of TripAdvisor implies that the application requests the permissions android.permission.ACCESS_COARSE_LOCATION and android.permission.ACCESS_FINE_LOCATION to obtain the user’s current location. Therefore, by the permission-to-API mapping, 21 Android APIs (e.g., void requestLocationUpdates(float, LocationListener) and Location getLastKnownLocation(String)) are regarded as necessary to invoke.
The produced Android APIs serve as disclaimers to refine the attack model. During attack detection (see Section Attack detection), these APIs will not be considered as attack actions.
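The two mappings described above can be illustrated with a minimal Python sketch. The tables `RELATEDNESS` and `PERMISSION_TO_APIS`, their entries, and the substring matching are assumptions made for illustration only; they stand in for the learned relatedness model and the PScout mapping and are not taken from either tool.

```python
# Hypothetical relatedness model: permission -> noun phrases (illustrative only).
RELATEDNESS = {
    "android.permission.ACCESS_FINE_LOCATION": {"your location", "around you"},
    "android.permission.READ_CONTACTS": {"your contacts", "address book"},
}

# Hypothetical permission-to-API mapping in the style of a PScout result.
PERMISSION_TO_APIS = {
    "android.permission.ACCESS_FINE_LOCATION": {
        "LocationManager.requestLocationUpdates",
        "LocationManager.getLastKnownLocation",
    },
    "android.permission.READ_CONTACTS": {"ContentResolver.query"},
}


def claimed_permissions(description):
    """Return the permissions whose noun phrases occur in the description."""
    text = description.lower()
    return {perm for perm, phrases in RELATEDNESS.items()
            if any(phrase in text for phrase in phrases)}


def disclaimer_whitelist(description):
    """Map the claimed permissions to the APIs exempted from detection."""
    apis = set()
    for perm in claimed_permissions(description):
        apis |= PERMISSION_TO_APIS.get(perm, set())
    return apis
```

In this sketch, an API that appears in the returned set would simply be skipped when attack actions are matched.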
ICCG construction
The construction of the ICCG takes the class files and the manifest file of the application under check as inputs. First, DROIDECHO employs Soot (Vallée-Rai et al. 1999) to generate a rough call graph of the whole application and a control flow graph for each method. DROIDECHO then proceeds in three successive steps: pointer analysis, link analysis and graph assembling. The first two steps provide the auxiliary information needed to assemble an ICCG.
Pointer analysis
Pointer analysis is a static analysis that infers which variables are pointed to by pointer references or heap references. In this step, we identify all references that point to variables in the application, and all possible values that these variables can be assigned. The result of this step is a PointerTable, which contains mappings from variables to concrete values: Set(variables)→Set(values). Set(variables) denotes a set of variables that are pointed to by the same reference at a time, and Set(values) denotes the set of possible values that these variables can be assigned. PointerTable plays a critical role in link analysis and action recognition. During link analysis, PointerTable is used to infer the action and class of an Intent object, so that DROIDECHO can identify which components are able to receive this Intent. During action recognition, DROIDECHO needs PointerTable to recognize the semantics of actions. For example, when DROIDECHO encounters an operation that queries a content provider, it needs the value of the URI argument to distinguish between content providers.
Parts of our pointer analysis are based on SPARK (Lhoták and Hendren 2003), a pointer analysis framework. It clusters the variables into sets, i.e., Set(variables), where all variables in the same set are pointed to by the same reference at a time. Since we already have a rough call graph and the control flow graphs of all methods, we traverse the call graph and descend into the control flow graphs to perform value inference. We evaluate each node in a control flow graph and infer the possible values of the variables. The value inference handles basic arithmetic and String operations. We do not evaluate all types of variables, since doing so would be computationally expensive and of no use to attack detection. We only consider the values of primitive types (e.g., boolean, int, double), String, ComponentName, URI/URL and Intent. The values of ComponentName and URI/URL objects can be expressed by a String, whereas we construct a more complex structure for Intent objects, which contains four fields: action, class, data and category.
The pointer analysis used in this work is type-sensitive but flow-insensitive. That is, all variables in the same set must share the same data type. To reduce the cost of storage and computation, we store all possible values that the set of variables can be assigned anywhere in the program, rather than the subset of values that holds after a particular statement.
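As a rough illustration of the Set(variables)→Set(values) mapping, the following Python sketch models a flow-insensitive table. The class layout and the method names (`add_group`, `record`, `values_of`, `type_of`) are hypothetical and only mirror the description above; they are not the data structure used by DROIDECHO or SPARK.

```python
class PointerTable:
    """Flow-insensitive map from a set of aliased variables to possible values."""

    def __init__(self):
        self._group_of = {}   # variable name -> id of its alias group
        self._values = {}     # group id -> set of possible values
        self._types = {}      # group id -> data type shared by the whole group

    def add_group(self, variables, var_type):
        """Register variables that are pointed to by the same reference."""
        gid = len(self._values)
        self._values[gid] = set()
        self._types[gid] = var_type
        for name in variables:
            self._group_of[name] = gid
        return gid

    def record(self, variable, value):
        """Accumulate a value; nothing is overwritten because order is ignored."""
        self._values[self._group_of[variable]].add(value)

    def values_of(self, variable):
        """All values the variable (and its aliases) may hold."""
        return frozenset(self._values[self._group_of[variable]])

    def type_of(self, variable):
        """The single data type shared by the variable's alias group."""
        return self._types[self._group_of[variable]]
```

Because values are only ever added, a lookup returns every value seen at any program point, which is the over-approximation that flow-insensitivity implies.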
Link analysis
Link analysis establishes all links between methods or components in an application, i.e., the edges of the ICCG. The call graph generated by Soot initially contains only the call relationships between Java methods. As introduced in Section The inter-component communication graph, Android also has implicit invocations and a variety of communication mechanisms. On the basis of the call graph, we analyze all links between methods and build a complete communication graph for the application.
There are two kinds of links between two methods: invocation links (either explicit or implicit) and communication links via an Android medium (e.g., Intent and message). We first build call chains for the lifecycle of Android components. For example, one of the call chains of an Android Activity is onCreate → onStart → onResume, which shows the implicit invocations after the start of the Activity. Accordingly, consecutive methods of this chain are linked with an invocation edge in the call graph. For communication links, we recognize the media used in the methods together with their attributes, and identify which components or methods can receive these media. Take the Intent medium as an example: if we find an action that starts activities, such as startActivity(Intent), we retrieve the attributes (e.g., class and action) of the Intent object and identify which activities can be triggered by it. We then add a new link between the method that sends out the Intent and the constructor of each target activity.
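The two kinds of links can be sketched as follows. This is a simplified illustration under stated assumptions: the manifest is reduced to a hypothetical dictionary from activity names to the actions in their intent filters, an Intent is a dictionary with optional `class` and `action` keys, and category and data matching are ignored.

```python
LIFECYCLE_CHAIN = ["onCreate", "onStart", "onResume"]  # the chain quoted in the text


def lifecycle_links(component):
    """Implicit invocation edges between consecutive lifecycle callbacks."""
    methods = [f"{component}.{m}" for m in LIFECYCLE_CHAIN]
    return [(a, b, "invocation") for a, b in zip(methods, methods[1:])]


def resolve_intent(intent, manifest):
    """Return the activities that can receive the intent.

    `manifest` maps an activity name to the set of actions it declares.
    """
    target = intent.get("class")
    if target:                                   # explicit Intent
        return [target] if target in manifest else []
    return sorted(name for name, actions in manifest.items()
                  if intent.get("action") in actions)   # implicit Intent


def intent_links(sender_method, intent, manifest):
    """Communication edges from the sender to each target's constructor."""
    return [(sender_method, f"{target}.<init>", "communication")
            for target in resolve_intent(intent, manifest)]
```

In a full analysis, the `class` and `action` values would come from the possible values stored in PointerTable rather than from a literal dictionary.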
Graph assembling
So far, we have obtained the control flow graph of each method of the application, and all links between these methods. We take the control flow graphs as nodes and the links as edges, and assemble them into an ICCG. The graph depicts the execution order and communications between different methods at the system level, and the control flow at the method level. Together with PointerTable, the ICCG is passed to the attack detection phase, which searches the graph for any existing attack.
Attack detection
To reduce the search space of attack detection, we do not analyze the program from its entry points. Instead, we first quickly recognize the attack-related actions in the program, and then perform a bidirectional flow analysis starting from these actions, which effectively speeds up the search process.
Algorithm 1 shows the whole process of checking whether an attack is contained in the application. The algorithm takes the ICCG of an application and one attack model as input, and outputs whether the attack model exists in the ICCG. Lines 1-3 recognize all actions existing in the ICCG. If any action of the attack is not contained in the ICCG, DROIDECHO concludes that the application does not contain this attack. In our implementation, we conduct a one-time retrieval of the ICCG for each application and store all recognized actions. By comparing them with the actions included in each attack, we can quickly eliminate the attacks that definitely cannot happen.
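The early elimination step amounts to a set-inclusion test, sketched below. The function name, the action labels, and the representation of an attack model as a plain set of required actions are assumptions for illustration; the flows between actions, which Algorithm 1 checks afterwards, are not modelled here.

```python
def candidate_attacks(recognized_actions, attack_models):
    """Keep only the attack models whose actions all occur in the ICCG.

    `recognized_actions` is the set stored after the one-time retrieval;
    `attack_models` maps an attack name to its set of required actions.
    """
    return [name for name, required in attack_models.items()
            if required <= recognized_actions]
```

Only the models returned by this filter need the more expensive reachability and taint analyses.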
If all actions in the attack model are found in the ICCG, we proceed to reachability analysis and program slicing. Since two kinds of flows are defined in our attack model (corresponding to control flow and data flow in program analysis, respectively), we carry out ForwardControlFlowAnalysis (Line 10) and TaintAnalysis (Line 6) to determine whether the flows are satisfied. Finally, we obtain the trigger causing this attack (Line 13), and check whether it is a kind of environmental input, e.g., the initialization of the application, a system broadcast message or a timer task. In the following, we describe each step in more detail.
Action recognition
We use actions to describe the basic elements of an attack; they are semantic but domain-independent. We therefore need to define a system of notations in a specific domain (here Android) to capture these actions and triggers in the ICCG. On Android, we recognize an action by its corresponding constraints. We define three kinds of predicates to express the APIs and constraints of the actions encountered in the code: sig(api), type(arg) and value(arg), where api is an Android API and arg is a variable; each predicate returns a comparable constant value. As a consequence, action recognition can be transformed into a satisfiability problem,
$$ action \models \mathbf{sig}(api) \tag{1} $$
$$ \mathbf{sig}(api) \models \mathbf{type}(arg) \cap \mathbf{value}(arg) \tag{2} $$
An action is recognized if we detect APIs that satisfy the above constraints progressively. Equation 1 states that the action can be recognized by an API with the specific signature; moreover, the arguments or the base, if any, need to satisfy two kinds of predicates, type and value. As shown in Eq. 2, arg is either the base of the API (static methods do not have a base) or one of its arguments. In particular, arg may itself be another API invocation, i.e., a sig. Therefore, we recursively solve the constraints until the action is recognized. Taking the action of obtaining contacts as an example, the essential code of this action at the language level can be described as follows:
$$ \frac{sig(api) = obj.query(uri, *)}{obtain~contact} \tag{3} $$
$$ \frac{\begin{aligned} type(obj) &= ContentResolver, \\ type(uri) &= Uri, \\ value(uri) &=``content://contacts" \end{aligned}} {sig(api) = obj.query(uri, *)} \tag{4} $$
As shown in Eq. 3, we first need to find a pivotal function whose signature matches obj.query(uri, *). As shown in Eq. 4, the invocation then needs to meet three constraints: the base obj of the invocation needs to be an object of the class android.content.ContentResolver, uri needs to be of type android.net.Uri, and the value of uri needs to be content://contacts. The code statements that together form a behavior might depend on each other or follow a particular execution order. We treat recognition as a constraint satisfaction problem and recognize a behavior by reasoning. The benefit is that we do not need to care about the execution order of the code in a behavior, so our approach is more general and can identify more variations.
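The recursive constraint check of Eqs. 1-4 can be sketched in Python as follows. The dictionary encoding of rules and invocations, the names `OBTAIN_CONTACT`, `matches` and `satisfies`, and the decision to accept a value constraint when any possible value matches are all assumptions made for this illustration.

```python
# Hypothetical encoding of Eqs. 3 and 4 for the action "obtain contact".
OBTAIN_CONTACT = {
    "sig": "query",
    "base": {"type": "android.content.ContentResolver"},
    "args": [{"type": "android.net.Uri", "value": "content://contacts"}],
}


def satisfies(operand, constraint):
    """Check the type and value predicates; recurse if the constraint is an API."""
    if "sig" in constraint:                      # the operand must itself be a call
        return operand.get("kind") == "call" and matches(operand, constraint)
    if "type" in constraint and operand.get("type") != constraint["type"]:
        return False
    if "value" in constraint and constraint["value"] not in operand.get("values", ()):
        return False
    return True


def matches(call, rule):
    """True if the call satisfies sig, then the base and argument constraints."""
    if call.get("sig") != rule["sig"]:
        return False
    if "base" in rule and not satisfies(call.get("base", {}), rule["base"]):
        return False
    args = call.get("args", [])
    constrained = rule.get("args", [])
    # Only the leading arguments are constrained; the rest play the role of '*'.
    if len(args) < len(constrained):
        return False
    return all(satisfies(a, c) for a, c in zip(args, constrained))
```

The `values` field of an operand is meant to hold the possible values that PointerTable reports for it, so the check does not depend on where in the method the URI was built.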
Reachability analysis & slicing
If the ICCG contains all necessary elements of an attack, we start program slicing from these elements. The slicing consists of backward and forward control flow analysis. The backward control flow analysis has three tasks: 1) find the root cause that leads to the action, i.e., its entry points. Based on the entry points, we can infer the type of the triggers and thus know whether the attack is triggered by a user interaction or by environmental inputs; 2) obtain all conditions on a trace from the entry points to the action. The conditions are used in attack confirmation to guide the dynamic execution of the application; 3) identify the search space for the potential taint analysis.
The forward control flow analysis has two tasks: 1) determine the occurrence of the subsequent actions in an attack model; 2) similar to the backward control flow analysis, identify the search space for the taint analysis. As a result, the taint analysis does not need to search the entire ICCG, which would be computationally expensive.
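A minimal sketch of the bidirectional slicing, assuming the ICCG is flattened into a hypothetical adjacency dictionary `successors` from a node to the nodes it can reach in one step. Path conditions, which the backward analysis also collects, are omitted.

```python
from collections import deque


def reachable(start, edges):
    """Breadth-first set of nodes reachable from `start` along `edges`."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def slice_around(action, successors):
    """Return (entry points, backward slice, forward slice) for an action node."""
    predecessors = {}
    for src, dsts in successors.items():
        for dst in dsts:
            predecessors.setdefault(dst, set()).add(src)
    backward = reachable(action, predecessors)
    forward = reachable(action, successors)
    # Entry points are the nodes of the backward slice that nothing calls.
    entries = {n for n in backward if not predecessors.get(n)}
    return entries, backward, forward
```

The union of the two slices is the region a later taint analysis would have to inspect; nodes outside it can be ignored.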
Taint analysis
Taint analysis tracks the flow of data during detection. Taking privacy leakage as an example, we need taint analysis to determine whether sensitive data flows to a sink action and is eventually sent out. After the above steps, we have a set of domains in control-flow order, SearchDomain = D1→D2→...→Dn, with the source action located at D_sr. We then perform a forward data flow analysis on SearchDomain. Figure 4 illustrates how data can be tainted across domains. First, data in the domain D_s can influence data in its previous domains in three ways: returning the data at the call site in the previous domains (①); the data flow ②, which shows how data in the latter domain influences data in its previous domains; and assigning the data to a variable commonly shared with D_s (③). Likewise, there are three ways for data in domain D_s to influence data in the successive domains: enclosing the data in a communication medium and passing it to the next domains (④); passing the data as an argument to the successive domains, where it is used (⑤); and assigning the data to a commonly shared variable in between (⑥). In addition, we adopt a coarse-grained aliasing analysis. For example, suppose a string variable is passed to a function that encrypts it with a cryptographic scheme and returns the encrypted value. Although we do not know how the original string is converted to the encrypted one (we do not infer the meaning of cryptographic schemes), we assume that the operation is reversible and therefore regard the returned data as sensitive as well.
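The forward propagation over SearchDomain can be sketched as below. This is a deliberately simplified model: the statement tuples, the function name and the use of one global variable namespace are assumptions. The global namespace stands in for arguments, return values, shared variables and communication media that carry data across domains. The single pass follows control-flow order only, so influence on previous domains would require iterating to a fixed point.

```python
def propagate_taint(domains, source_var, sink_calls):
    """Forward taint propagation over domains given in control-flow order.

    Each domain is a list of statements of the form
      ("assign", target, [operands])   target is derived from the operands
      ("call", name, [arguments])      an API invocation
    """
    tainted = {source_var}
    leaks = []
    for domain in domains:
        for kind, name, operands in domain:
            if not tainted.intersection(operands):
                continue
            if kind == "assign":
                # Coarse-grained: any value derived from tainted data stays
                # sensitive, e.g. the result of an encryption routine.
                tainted.add(name)
            elif kind == "call" and name in sink_calls:
                leaks.append(name)
    return tainted, leaks
```

Treating every derived value as tainted mirrors the coarse-grained handling of encryption described above, at the cost of possible over-reporting.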
Dynamic attack confirmation
As discussed before, DROIDECHO’s ICCG construction and attack detection are based on static program analysis, which is less precise than dynamic analysis. As a result, the attacks reported by DROIDECHO may be false positives. Therefore, we introduce a confirmation step to reduce false positives, and the attack confirmation is based on the technique of dynamic testing.
An attack candidate, which is passed from the attack detection phase to the attack confirmation phase, contains an attack trace and the conditions that guarantee the occurrence of attacks. Given that, we simulate the inputs to drive the dynamic execution of the application and check whether the attack trace can occur in the real execution. In order to activate the attack candidate and capture malicious behaviors, we first instrument Android OS by hooking specific Android APIs which are included in our attack model, and then generate the triggers which are used to activate the contained malicious behaviors.
- Instrumentation. Since the actions in the attack model are recognized as invocations of specific Android APIs, we instrument Android OS to monitor these invocations. In this paper, we leverage TaintDroid (Enck et al. 2010) to determine whether these APIs are invoked.
- Triggers. We leverage IntelliDroid (Wong and Lie 2016) to generate all triggers leading to specific malicious behaviors, and subsequently schedule these triggers to drive the execution of the application. We feed the application with the possible trigger sequences; to eliminate impossible sequences (which never occur during real executions), we exploit the “happens-before” relations among these triggers when generating the sequences.
With these inputs, DROIDECHO is able to execute the suspicious application to determine whether the attack is reachable. To make the exploration faster, DROIDECHO prunes the paths that rarely lead to the attack trace, which can significantly reduce the search space of the program.
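The filtering of trigger sequences by ordering constraints can be illustrated with a brute-force Python sketch. The trigger names and the encoding of the relation as (earlier, later) pairs are hypothetical, and enumerating all permutations is only practical for a handful of triggers; it is not the scheduling strategy of IntelliDroid.

```python
from itertools import permutations


def feasible_sequences(triggers, happens_before):
    """Enumerate orderings of `triggers` that respect every (earlier, later) pair."""
    result = []
    for order in permutations(triggers):
        position = {t: i for i, t in enumerate(order)}
        if all(position[a] < position[b]
               for a, b in happens_before
               if a in position and b in position):
            result.append(list(order))
    return result
```

For three triggers with one constraint, half of the six orderings are discarded before any dynamic execution takes place.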