Table 2 Overview of static features extracted from Android APKs by the reviewed papers

From: On building machine learning pipelines for Android malware detection: a procedural survey of practices, challenges and opportunities

| Feature | APK part | Rationale | Tools | Used by |
| --- | --- | --- | --- | --- |
| Requested permissions | Manifest | Malware tends to request more permissions, and more dangerous ones. | APKTool (2021), aapt (Portal AD 2021b), androguard (Project A 2021) | Liu and Liu (2014), Arp et al. (2014), Yuan et al. (2014, 2016), Alzaylaee et al. (2020), Sahs and Khan (2012), Li et al. (2018), Wang et al. (2014), Wu et al. (2012), Demontis et al. (2019), Yerima (2013), Kim et al. (2019), Zhu et al. (2018), Zhang et al. (2018), Yerima et al. (2014, 2015), Wang et al. (2016), Aafer et al. (2013), Peiravian and Zhu (2013), Saracino et al. (2018), Sanz et al. (2013), Zarni Aung (2013), Lindorfer et al. (2015), Suarez-Tangil et al. (2017) |
| Used permissions | DEX | Not all requested permissions are necessarily used. Unused permissions introduce noise and should be eliminated. | APKTool (2021), PScout (Project P 2021b), baksmali (Project B 2021) | Liu and Liu (2014), Arp et al. (2014), Demontis et al. (2019), Lindorfer et al. (2015) |
| Hardware requirements | Manifest | Malware tends to request more sensitive hardware (e.g., camera). | aapt (Portal AD 2021b) | Arp et al. (2014), Demontis et al. (2019), Sanz et al. (2013) |
| Names and types of app components | Manifest | To detect code reuse by malware (common services, broadcast receivers, or other app components). | | Arp et al. (2014), Wu et al. (2012), Demontis et al. (2019), Kim et al. (2019), Suarez-Tangil et al. (2017) |
| Filtered intents | Manifest | Malware tends to subscribe to sensitive system broadcasts, such as BOOT_COMPLETED. | | Arp et al. (2014), Wu et al. (2012), Demontis et al. (2019), Zhu et al. (2018), Zhang et al. (2018), Lindorfer et al. (2015), Suarez-Tangil et al. (2017) |
| API calls | DEX | Malware may call sensitive or suspicious APIs, such as those used to access SMS. | baksmali (Project B 2021), soot (Project S 2021), androguard (Project A 2021), dexdump (Man Pages U 2021) | Arp et al. (2014), Yuan et al. (2016), Sahs and Khan (2012), Wu et al. (2012), Yuan et al. (2014), Demontis et al. (2019), Yerima (2013), Kim et al. (2019), Zhu et al. (2018), Zhang et al. (2018), Yerima et al. (2014, 2015), Karbab et al. (2018), Aafer et al. (2013), Peiravian and Zhu (2013), Gascon et al. (2013), Yang et al. (2014), Suarez-Tangil et al. (2017) |
| Network addresses | DEX | Malware may communicate with untrustworthy internet hosts. | | Arp et al. (2014), Demontis et al. (2019) |
| Opcodes | DEX, Shared libraries | Certain sequences of opcodes may reveal malicious intent in apps. | baksmali (Project B 2021), IDA Pro (Hex-rays 2021) | McLaughlin et al. (2017), Kim et al. (2019) |
| Bytecodes | DEX | Certain bytecode sequences may reveal malicious intent in apps. | | Grace et al. (2012), Xu et al. (2018), Bakour and Ünver (2021) |
| Decompiled Java code | DEX | Certain patterns of code may reveal malicious intent. | dex2jar (Project D 2021b), Procyon (Project P 2021a) | Milosevic et al. (2017), Wang et al. (2016) |
| Linux command strings | DEX & Resources | Malware may use dangerous commands to exploit the phone and gain privileged access. | | Yerima (2013), Yerima et al. (2014, 2015) |
| Use of encryption routines | DEX | Malware may use encryption to hide its intent. | | Yerima (2013), Lindorfer et al. (2015), Suarez-Tangil et al. (2017) |
| Presence of secondary APK or shell scripts | Assets | Malware may hide APK files that are installed after infection. Shell scripts might be used for exploitation. | | Yerima (2013), Lindorfer et al. (2015) |
| Environmental information | Manifest | Malware may target a specific vulnerable execution environment (e.g., Android version). | | Kim et al. (2019), Suarez-Tangil et al. (2017) |
| Constant strings | Resources | Malware may contain suspicious strings (e.g., fake ads). | APKTool (2021) | Kim et al. (2019), Zhang et al. (2018), Xu et al. (2018), Suarez-Tangil et al. (2017) |
| Use of Java reflection | DEX | Malware may use reflection to dynamically load code and thwart static analysis efforts. | | Lindorfer et al. (2015), Suarez-Tangil et al. (2017) |
| Signing certificate data | META-INF | The fingerprint, serial number, owner, or other data from the certificate may correspond to known malware authors. | | Lindorfer et al. (2015), Suarez-Tangil et al. (2017) |
| Presence of native executables or libraries | Lib | Malware often uses native code to perform exploits or make reverse-engineering harder. | | Lindorfer et al. (2015), Suarez-Tangil et al. (2017) |
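
As an illustration of the manifest-based rows in the table (requested permissions, hardware requirements, app components, and filtered intents), the following is a minimal sketch and not the method of any reviewed paper. It assumes the manifest has already been decoded to plain-text XML, for example by APKTool, since the manifest inside an APK is stored in a binary XML format. The function name `manifest_features` and the sample manifest are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Attributes such as android:name live in the Android XML namespace.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME = f"{{{ANDROID_NS}}}name"

# The four app component types declared in a manifest.
COMPONENT_TAGS = ("activity", "service", "receiver", "provider")


def manifest_features(manifest_xml: str) -> dict:
    """Collect manifest-based static features from decoded manifest text.

    Hypothetical helper for illustration; assumes plain-text XML input.
    """
    root = ET.fromstring(manifest_xml)

    # Requested permissions: <uses-permission android:name="..."/>
    permissions = sorted(
        {e.get(NAME) for e in root.iter("uses-permission") if e.get(NAME)}
    )
    # Hardware requirements: <uses-feature android:name="..."/>
    hardware = sorted(
        {e.get(NAME) for e in root.iter("uses-feature") if e.get(NAME)}
    )
    # Names and types of app components, as (type, name) pairs.
    components = sorted(
        (e.tag, e.get(NAME))
        for tag in COMPONENT_TAGS
        for e in root.iter(tag)
        if e.get(NAME)
    )
    # Filtered intents: actions listed inside <intent-filter> elements.
    intents = sorted(
        {
            a.get(NAME)
            for f in root.iter("intent-filter")
            for a in f.iter("action")
            if a.get(NAME)
        }
    )
    return {
        "permissions": permissions,
        "hardware": hardware,
        "components": components,
        "intents": intents,
    }


# Made-up manifest used only to exercise the sketch.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.demo">
  <uses-permission android:name="android.permission.SEND_SMS"/>
  <uses-permission android:name="android.permission.RECEIVE_BOOT_COMPLETED"/>
  <uses-feature android:name="android.hardware.camera"/>
  <application>
    <service android:name=".SyncService"/>
    <receiver android:name=".BootReceiver">
      <intent-filter>
        <action android:name="android.intent.action.BOOT_COMPLETED"/>
      </intent-filter>
    </receiver>
  </application>
</manifest>
""".strip()

features = manifest_features(SAMPLE_MANIFEST)
for key, values in features.items():
    print(key, values)
```

In a detection pipeline, lists like these would typically be turned into a fixed-length vector (for instance, one binary indicator per known permission) before being passed to a classifier. That encoding step is left out here because it depends on the feature vocabulary chosen for the dataset.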