A static method for detecting Android malware based on directed API call
Manh Vu Minh, Huy-Trung Nguyen, H. Viet Le, Tri Duc Nguyen, Xuan Cho Do
International Journal of Web Information Systems, Vol. 21, No. 3, pp. 183-204
The openness of the Android operating system offers users convenience but also exposes them to a multitude of malicious applications. Consequently, analyzing applications before installation has become a crucial research area in mobile security. Static analysis, known for its accuracy and low cost, is a prominent method within this field. This paper proposes an ML/DL-based approach to distinguish benign from malicious applications in APK format.
The analysis method, detailed further in the paper, consists of five steps. First, each APK file in the sample set is decompiled into source code. Second, a directed API call graph (DACG) generator analyzes the decompiled source code and extracts the API calls. Third, the authors apply the graph2vec method to convert the DACG data set into characteristic subgraphs. Fourth, only the features that each model needs to learn are retained from the resulting vector set, which reduces the vector dimensionality for each model type and cuts training time and noise by eliminating unnecessary features. Finally, models are trained and evaluated on Android malware detection using popular machine learning algorithms: random forest, support vector machine, K-nearest neighbor, logistic regression and gradient boosting regression, one of the most powerful machine learning algorithms currently available.
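The graph construction and subgraph extraction steps can be illustrated with a minimal sketch. It is not the authors' implementation: the call edges below are hypothetical stand-ins for the output of a DACG generator, the function names are invented for illustration, and the relabeling follows the Weisfeiler-Lehman scheme that underlies graph2vec, here restricted to outgoing (callee) edges as an assumption about how direction might be respected.

```python
from collections import Counter, defaultdict
import hashlib

# Hypothetical caller -> callee pairs; a real DACG generator would
# extract these from the decompiled source of one APK.
CALL_EDGES = [
    ("MainActivity.onCreate", "TelephonyManager.getDeviceId"),
    ("MainActivity.onCreate", "SmsManager.sendTextMessage"),
    ("TelephonyManager.getDeviceId", "HttpURLConnection.connect"),
]


def build_dacg(edges):
    """Build a directed adjacency map: node -> sorted list of callees."""
    graph = defaultdict(set)
    for caller, callee in edges:
        graph[caller].add(callee)
        graph[callee]  # touching the key registers leaf nodes as well
    return {node: sorted(callees) for node, callees in graph.items()}


def wl_subgraph_counts(graph, iterations=2):
    """Count Weisfeiler-Lehman rooted-subgraph labels up to a given depth."""
    labels = {node: node for node in graph}  # depth-0 label is the API name
    counts = Counter(labels.values())
    for _ in range(iterations):
        new_labels = {}
        for node, callees in graph.items():
            # A node's new label summarizes its own label and its callees' labels.
            signature = labels[node] + "|" + ",".join(
                sorted(labels[c] for c in callees)
            )
            new_labels[node] = hashlib.sha1(signature.encode()).hexdigest()[:12]
        labels = new_labels
        counts.update(labels.values())
    return counts


dacg = build_dacg(CALL_EDGES)
features = wl_subgraph_counts(dacg)
```

In graph2vec, subgraph labels of this kind play the role of words and each graph plays the role of a document when the embedding is learned; the resulting fixed-length vectors are what a downstream classifier would consume.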
The authors conclude that graph features derived from API call graphs are effective in detecting Android malware. Experimental results on a data set of 7,000 samples show that the proposed method outperforms existing detection methods, achieving TPR > 97%, FPR < 1% and AUC ≈ 0.98. At the end of the pipeline, a final classification determines whether the tested application is safe, helping users avoid installing malware.
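For readers unfamiliar with the reported metrics, the following minimal sketch shows how the true positive rate (TPR) and false positive rate (FPR) are computed from predictions. The labels are hypothetical toy values, not the paper's data, and the helper name is invented for illustration; 1 denotes malware and 0 denotes a benign application.

```python
def tpr_fpr(y_true, y_pred):
    """Return (TPR, FPR); assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malware caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malware missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical ground truth and predictions for eight applications.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
tpr, fpr = tpr_fpr(y_true, y_pred)
```

A high TPR means few malicious applications slip through, while a low FPR means few benign applications are wrongly blocked.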
Although some limitations remain to be addressed, the DACG construction method holds significant potential for further exploration. Future research will focus on integrating dynamic analysis techniques to broaden the range of Android malware categories that can be detected and classified. In addition, the authors aim to adapt the methodology to other system types, including the widely used ELF format on Linux.
In this study, the authors address the problem of generating graph-based features for Android malware detection in a meaningful, practical and efficient way. The results can serve as a pattern for similar scenarios and applications.