The need for a systematic machine-learning process: A proposal via a mobile malware classification case study

Gürol Canbek
2021 International Conference on Information Security and Cryptology (ISCTURKEY)
Published: 2021-12-02
DOI: 10.1109/ISCTURKEY53027.2021.9654378 (https://doi.org/10.1109/ISCTURKEY53027.2021.9654378)
Citations: 1

Abstract

Machine learning (ML) seems to be a highly promising solution for many problems in many domains, including healthcare and cyber security. Researchers and practitioners turn to ML with high expectations of a return on investment, not only in money but also in effort and time. Those expectations can slide into an “if your only tool is a hammer, then every problem looks like a nail” mindset. In reality, conducting an ML workflow efficiently and correctly is difficult, given both the challenges of ML itself and domain-specific issues. Hence, the interactions and dependencies between ML and the domain should be clearly addressed, and the steps should be planned and conducted according to certain requirements. This study provides insights into achieving these goals through a systematic ML process that is conducted from beginning to end. The process is designed as a cycle of eight sub-processes that pass through a set of introduced spaces (file, sample, class, feature, dataset, model, and finally metric spaces). The dataset quality analysis/comparison sub-process is specifically designed as a quality-control gateway. The proposed process is explained via a case study in the Android mobile malware classification problem domain, in which practical and research problems, as well as possible solutions, are presented.
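
The progression through spaces described in the abstract can be pictured with a minimal sketch. It is only an illustration and not the paper's implementation: the abstract does not name the eight sub-processes, so the code models only the seven spaces it lists. The names `Space`, `run_cycle`, and `dataset_ok` are hypothetical, and placing the quality gate right after the dataset space is an assumption.

```python
from enum import Enum
from typing import Callable, List


class Space(Enum):
    """The seven spaces named in the abstract, in the order listed there."""
    FILE = 1
    SAMPLE = 2
    CLASS = 3
    FEATURE = 4
    DATASET = 5
    MODEL = 6
    METRIC = 7


def run_cycle(dataset_ok: Callable[[], bool]) -> List[Space]:
    """Walk the spaces once and return the ones that were visited.

    Assumption: the dataset quality check acts as a gate right after the
    dataset space; if it fails, this pass stops before any model is built.
    """
    visited: List[Space] = []
    for space in Space:  # Enum members iterate in definition order
        visited.append(space)
        if space is Space.DATASET and not dataset_ok():
            break  # gate closed: the data must be fixed in a later pass
    return visited


if __name__ == "__main__":
    print([s.name for s in run_cycle(lambda: False)])
    print([s.name for s in run_cycle(lambda: True)])
```

In this sketch a failed check ends the pass at the dataset space, so no model or metric is produced from a dataset that did not pass the quality gateway.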