Dealing with Distribution Shift in Acoustic Mosquito Datasets

H. Y. Nkouanga, Suresh Singh
Published in: 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 2022-12-01
DOI: 10.1109/ICMLA55696.2022.00246 (https://doi.org/10.1109/ICMLA55696.2022.00246)
Citations: 0

Abstract

In recent years, detecting the presence of mosquitoes from acoustic data has drawn the attention of many researchers. As in any other detection task, however, these researchers often face the distribution shift problem: the training and test datasets do not share the same distribution. When this happens, a detection system is almost certain to fail at test time. Solutions have been proposed over the years, but they are often computationally expensive and complex to implement. In this paper, we propose a simple solution that consists of (1) identifying and removing the noise present in the input data, (2) performing dimensionality reduction, and (3) classifying the data. We tested our technique on a large, publicly available dataset of mosquito recordings (HumBugDB), and the results showed a maximum improvement of nearly 28% over a baseline classification system.
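
The three steps named in the abstract can be sketched on synthetic data. The abstract does not say which denoising, dimensionality-reduction, or classification methods the authors use, so everything below is an assumption made for illustration: a per-bin percentile noise floor stands in for step (1), PCA via SVD for step (2), and a nearest-centroid rule for step (3). The function names, the 32-bin feature vectors, and the synthetic "background" profiles are all hypothetical and are not taken from the paper or from HumBugDB.

```python
import numpy as np

def remove_noise_floor(X, percentile=10.0):
    """Step 1 (assumed method): subtract a per-bin noise floor estimated
    from the set itself, then clip negative values to zero."""
    floor = np.percentile(X, percentile, axis=0)
    return np.clip(X - floor, 0.0, None)

def fit_pca(X, n_components):
    """Step 2 (assumed method): PCA computed from the SVD of centered data."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    return (X - mean) @ components.T

def fit_centroids(Z, y):
    """Step 3 (assumed method): one centroid per class in the reduced space."""
    classes = np.unique(y)
    centroids = np.stack([Z[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(Z, classes, centroids):
    dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

def run(X_train, y_train, X_test, y_test, denoise):
    """Fit on the training set, score on the test set; return accuracy."""
    if denoise:
        # Each set gets its own floor, estimated without using labels.
        X_train = remove_noise_floor(X_train)
        X_test = remove_noise_floor(X_test)
    mean, comps = fit_pca(X_train, n_components=2)
    classes, centroids = fit_centroids(project(X_train, mean, comps), y_train)
    y_pred = predict(project(X_test, mean, comps), classes, centroids)
    return float((y_pred == y_test).mean())

def make_set(rng, n, background):
    """Hypothetical data: label 1 adds a 'mosquito' bump in bins 10..13."""
    y = np.arange(n) % 2
    X = background + rng.normal(0.0, 0.1, size=(n, background.size))
    X[y == 1, 10:14] += 3.0
    return X, y

rng = np.random.default_rng(0)
bg_train = np.ones(32)
bg_test = np.ones(32)
bg_test[8:16] += 3.0  # shifted background that overlaps the signal bins

X_train, y_train = make_set(rng, 200, bg_train)
X_test, y_test = make_set(rng, 200, bg_test)

acc_baseline = run(X_train, y_train, X_test, y_test, denoise=False)
acc_pipeline = run(X_train, y_train, X_test, y_test, denoise=True)
print(f"baseline: {acc_baseline:.2f}, with noise removal: {acc_pipeline:.2f}")
```

In this toy setup the test background is louder in the same bins that carry the signal, so a classifier fitted on the raw training data tends to mistake background for signal, while subtracting a floor estimated separately on each set brings the two sets back into alignment. The floor estimate assumes each set contains enough samples without a mosquito, which may not hold for real recordings.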