Micam: Visualizing Feature Extraction of Nonnatural Data

Randy Klepetko, R. Krishnan
Published in Machine Learning and Soft Computing (journal article), 2023-01-28.
DOI: 10.5121/csit.2023.130201 (https://doi.org/10.5121/csit.2023.130201)
Citations: 0

Abstract

Convolutional Neural Networks (CNNs) continue to revolutionize image recognition technology and are being used in non-image fields such as cybersecurity. They are known to work as feature extractors, identifying patterns within large data sets, but when dealing with nonnatural data, what these features represent is not understood. Several class activation map (CAM) visualization tools help explain CNN decisions on images, but their output is not intuitive when applied to nonnatural security data. Understanding what the extracted features represent should enable the data analyst and model architect to tailor a model that maximizes the extracted features while minimizing the computational parameters. In this paper we offer a new tool, Model integrated Class Activation Maps (MiCAM), which allows the analyst to visually compare extracted feature intensities at individual-layer detail. We explore using this new tool to analyse several data sets. First, we analyse the MNIST handwriting data set to gain a baseline understanding. We then analyse two security data sets, computer process metrics from cloud-based application servers infected with malware and the CIC-IDS-2017 IP data traffic set, and identify how reordering nonnatural security-related data affects feature extraction performance.
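For readers unfamiliar with class activation maps, the following is a minimal sketch of the classic CAM computation that the abstract refers to as prior tooling. It is not MiCAM itself, whose layer-by-layer method is not described in the abstract. The sketch assumes a network whose final convolutional layer is followed by global average pooling and a linear classifier, so that the map for one class is the weighted sum of the final feature maps. The function names `class_activation_map` and `normalize_for_display` and the toy arrays are hypothetical, chosen only for illustration.

```python
import numpy as np


def class_activation_map(feature_maps, class_weights):
    """Classic CAM for one class (illustrative, not the paper's MiCAM).

    feature_maps:  array of shape (K, H, W), the K final-layer feature maps.
    class_weights: array of shape (K,), the classifier weights that connect
                   each pooled feature map to the class of interest.
    Returns an (H, W) array: sum over k of class_weights[k] * feature_maps[k].
    """
    feature_maps = np.asarray(feature_maps, dtype=float)
    class_weights = np.asarray(class_weights, dtype=float)
    if feature_maps.ndim != 3:
        raise ValueError("feature_maps must have shape (K, H, W)")
    if class_weights.shape != (feature_maps.shape[0],):
        raise ValueError("class_weights must have shape (K,)")
    # Contract the channel axis: (K,) with (K, H, W) gives (H, W).
    return np.tensordot(class_weights, feature_maps, axes=1)


def normalize_for_display(cam):
    """Rescale a map to [0, 1] so it can be rendered as a heat map."""
    cam = np.asarray(cam, dtype=float)
    lo, hi = cam.min(), cam.max()
    if hi == lo:
        # A constant map carries no contrast; show it as all zeros.
        return np.zeros_like(cam)
    return (cam - lo) / (hi - lo)


if __name__ == "__main__":
    # Toy input: two 2x2 feature maps and their weights for one class.
    maps = np.array([[[1.0, 0.0], [0.0, 0.0]],
                     [[0.0, 0.0], [0.0, 2.0]]])
    weights = np.array([0.5, 1.0])
    print(normalize_for_display(class_activation_map(maps, weights)))
```

On image data the resulting map is usually upsampled and overlaid on the input. For nonnatural inputs such as process metrics or traffic records, the rows and columns of the map correspond to feature positions rather than pixels, which is why the ordering of the input features matters for interpretation.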