Swagatam Jay Sankar, Utkrisht Singh, I. Ali, M. Naskar, Mahendra Kumar Gourisaria
2022 OPJU International Technology Conference on Emerging Technologies for Sustainable Development (OTCON), published 2023-02-08
DOI: 10.1109/OTCON56053.2023.10113988 (https://doi.org/10.1109/OTCON56053.2023.10113988)
Success and failure rate prediction of Android Application using Machine Learning
This paper presents a machine learning model for predicting the likelihood of success of Android applications. Because Android applications play an important role in the software industry, the field merits study. There is currently no archived collection or method for predicting the success rate of these applications. In this work, we build a dataset of 30,000 apps drawn from the Google Play Store, third-party app sources, and the Apple store. The dataset is complex, with about 184 features per application. The data fall into two classes: malware applications and benign applications. Redundant information is dropped from the dataset, after which data cleaning, dimension reduction, and mathematical analysis are performed. Three challenges arise: the dataset contains missing values, it contains outliers, and its class distribution is imbalanced. Missing values and outliers are treated using standard techniques. For the imbalanced class distribution, various sampling methods and a cost-sensitive approach are considered. Machine learning algorithms including Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost) are used. The highest accuracy, 84.44%, is achieved using the ADASYN sampling technique with the XGBoost classifier.
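The abstract gives no implementation details for the ADASYN step, so the following is only a minimal pure-Python sketch of ADASYN-style oversampling as it is generally described, not the authors' code. The function names `adasyn` and `k_nearest`, and the parameters `k`, `beta`, and `seed`, are illustrative assumptions. The idea is that minority samples surrounded by more majority-class neighbours receive more synthetic samples, each created by interpolating toward a nearby minority sample.

```python
import math
import random


def k_nearest(point, labelled, k):
    """Return the k (features, label) pairs closest to `point` (Euclidean)."""
    return sorted(labelled, key=lambda pair: math.dist(point, pair[0]))[:k]


def adasyn(minority, majority, k=3, beta=1.0, seed=0):
    """Generate synthetic minority samples in the style of ADASYN.

    minority, majority: lists of equal-length feature tuples.
    beta: desired balance level (1.0 aims for equal class sizes).
    Returns a list of new synthetic minority feature tuples.
    """
    rng = random.Random(seed)
    # Minority points come first, so index i in `minority` is index i here.
    labelled = [(x, 1) for x in minority] + [(x, 0) for x in majority]
    total_needed = (len(majority) - len(minority)) * beta

    # r_i: share of majority points among the k nearest neighbours of x_i.
    ratios = []
    for i, x in enumerate(minority):
        others = labelled[:i] + labelled[i + 1:]
        neighbours = k_nearest(x, others, k)
        ratios.append(sum(1 for _, label in neighbours if label == 0) / k)

    ratio_sum = sum(ratios)
    if ratio_sum == 0:
        return []  # no minority point has a majority-class neighbour

    synthetic = []
    for i, x in enumerate(minority):
        # Harder-to-learn points (higher r_i) get a larger share of samples.
        count = round(ratios[i] / ratio_sum * total_needed)
        peers = [(m, 1) for j, m in enumerate(minority) if j != i]
        near_peers = k_nearest(x, peers, k)
        if not near_peers:
            continue  # a lone minority point has nothing to interpolate with
        for _ in range(count):
            partner = rng.choice(near_peers)[0]
            lam = rng.random()  # interpolation weight in [0, 1)
            synthetic.append(
                tuple(a + lam * (b - a) for a, b in zip(x, partner))
            )
    return synthetic
```

Calling `adasyn(minority, majority)` returns only the new points; they would be appended to the minority class before training a classifier such as XGBoost. Because each per-sample count is rounded, the number of synthetic points may differ slightly from the exact class gap.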