FABLDroid: Malware detection based on hybrid analysis with factor analysis and broad learning methods for android applications
Authors: Kazım Kılıç, İsmail Atacak, İbrahim Alper Doğru
Journal: Engineering Science and Technology, an International Journal (JESTECH), volume 62, Article 101945
Publication date: 2025-02-01
Publication type: Journal Article
DOI: 10.1016/j.jestch.2024.101945
URL: https://www.sciencedirect.com/science/article/pii/S2215098624003318
Impact factor: 5.1
JCR: Q1 (ENGINEERING, MULTIDISCIPLINARY)
Region: 2 (Engineering Technology)
Open access: no
Citations: 0
Abstract
The Android operating system, which is popular on mobile devices, raises concerns for users because of the malware it is exposed to. Android allows applications to be downloaded and installed from outside the official application store, and applications installed from third-party sources threaten users’ privacy and security. Deep learning-based methods are popular for detecting Android malware. However, deep learning models contain a large number of parameters, consume a lot of memory, and depend on graphics cards. To overcome these difficulties, we present a detection architecture that uses the lightweight broad learning method, which provides high detection performance as an alternative to the layer stacking found in deep structures. Our method is a lightweight deep neural network architecture built on broad learning that reveals hidden factors to detect Android malware. The proposed architecture applies the Factor Analysis (FA) dimensionality reduction method to reveal hidden factors within the hybrid features of Android applications. The features extracted by factor analysis are expanded using the broad learning method and fed to a deep neural network with two hidden layers. In the proposed method, the learning ability of the deep neural network, which has strong computational ability, is increased by the broad learning technique. The Kronodroid dataset is used to validate our approach. Kronodroid consists of malware and benign applications and is specifically designed for studying concept drift and cross-device detection in this problem domain. It contains separate datasets obtained from real devices and from emulator runtimes. Our method was tested separately on the features extracted in the real-device and emulator runtimes, which allowed the behaviors of malicious applications in the two environments to be compared. 
To verify the effectiveness of the factor analysis method, classification performance was measured by extracting 32, 64, 128, and 256 features with different dimensionality reduction techniques. In experiments using different expansion rates with the broad learning method, the proposed architecture achieved 98.20% accuracy on the real-device dataset and 97.90% accuracy on the emulator dataset. To evaluate the proposed method on a different dataset, 4000 applications were downloaded from the Androzoo environment to create a hybrid feature dataset, on which the method achieved 98.40% accuracy. The experimental results show that the broad learning method improves performance compared with using the raw features. The findings also show that the proposed broad learning-based method performs well compared with similar deep learning-based studies that use ensemble learning methods and layer stacking.
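The abstract describes expanding the factor-analysis features with broad learning before they reach the two-hidden-layer network. A minimal sketch of such an expansion step in plain Python is given here. The abstract does not give the exact construction, so the random mapped nodes, the tanh enhancement nodes, the node counts, the choice to keep the original factors in the output, and the names `make_weights`, `affine`, and `broad_expand` are all illustrative assumptions. The random toy matrix only stands in for real factor-analysis output.

```python
import math
import random


def make_weights(n_in, n_out, rng):
    """Random weights (n_in x n_out) and bias (n_out), uniform in [-1, 1]."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]
    bias = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, bias


def affine(row, weights, bias):
    """Compute row @ weights + bias for a single sample."""
    return [
        sum(x * weights[i][j] for i, x in enumerate(row)) + bias[j]
        for j in range(len(bias))
    ]


def broad_expand(factors, n_mapped=8, n_enhance=16, seed=0):
    """Widen each factor vector to [factors | mapped nodes | enhancement nodes].

    Assumed layout, not taken from the paper: fixed random projections
    produce linear "mapped" nodes, and a second random projection followed
    by tanh produces nonlinear "enhancement" nodes.
    """
    rng = random.Random(seed)
    n_in = len(factors[0])
    w_map, b_map = make_weights(n_in, n_mapped, rng)
    w_enh, b_enh = make_weights(n_mapped, n_enhance, rng)
    expanded = []
    for row in factors:
        mapped = affine(row, w_map, b_map)
        enhanced = [math.tanh(v) for v in affine(mapped, w_enh, b_enh)]
        expanded.append(list(row) + mapped + enhanced)
    return expanded


# Toy stand-in for factor-analysis output: 5 applications x 4 latent factors.
toy_rng = random.Random(42)
factors = [[toy_rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(5)]
wide = broad_expand(factors)
print(len(wide), len(wide[0]))  # 5 28  (4 factors + 8 mapped + 16 enhancement)
```

In a sketch like this, `n_mapped` and `n_enhance` play the role of an expansion rate: raising them widens the input to the classifier without adding hidden layers.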
About the journal:
Engineering Science and Technology, an International Journal (JESTECH) (formerly Technology), a peer-reviewed quarterly engineering journal, publishes both theoretical and experimental high quality papers of permanent interest, not previously published in journals, in the field of engineering and applied science which aims to promote the theory and practice of technology and engineering. In addition to peer-reviewed original research papers, the Editorial Board welcomes original research reports, state-of-the-art reviews and communications in the broadly defined field of engineering science and technology.
The scope of JESTECH includes a wide spectrum of subjects including:
-Electrical/Electronics and Computer Engineering (Biomedical Engineering and Instrumentation; Coding, Cryptography, and Information Protection; Communications, Networks, Mobile Computing and Distributed Systems; Compilers and Operating Systems; Computer Architecture, Parallel Processing, and Dependability; Computer Vision and Robotics; Control Theory; Electromagnetic Waves, Microwave Techniques and Antennas; Embedded Systems; Integrated Circuits, VLSI Design, Testing, and CAD; Microelectromechanical Systems; Microelectronics, and Electronic Devices and Circuits; Power, Energy and Energy Conversion Systems; Signal, Image, and Speech Processing)
-Mechanical and Civil Engineering (Automotive Technologies; Biomechanics; Construction Materials; Design and Manufacturing; Dynamics and Control; Energy Generation, Utilization, Conversion, and Storage; Fluid Mechanics and Hydraulics; Heat and Mass Transfer; Micro-Nano Sciences; Renewable and Sustainable Energy Technologies; Robotics and Mechatronics; Solid Mechanics and Structure; Thermal Sciences)
-Metallurgical and Materials Engineering (Advanced Materials Science; Biomaterials; Ceramic and Inorganic Materials; Electronic-Magnetic Materials; Energy and Environment; Materials Characterization; Metallurgy; Polymers and Nanocomposites)