Acoustic fingerprints in nature: A self-supervised learning approach for ecosystem activity monitoring

IF 5.8 | Zone 2, Environmental Science and Ecology | JCR Q1, Ecology
Dario Dematties, Samir Rajani, Rajesh Sankaran, Sean Shahkarami, Bhupendra Raut, Scott Collis, Pete Beckman, Nicola Ferrier
Journal: Ecological Informatics, volume 83, Article 102823
Published: 2024-09-20
DOI: 10.1016/j.ecoinf.2024.102823
Source: https://www.sciencedirect.com/science/article/pii/S1574954124003650
Citations: 0

Abstract


Acoustic fingerprints in nature: A self-supervised learning approach for ecosystem activity monitoring

According to the World Health Organization, healthy communities rely on well-functioning ecosystems. Clean air, fresh water, and nutritious food are inextricably linked to ecosystem health. Changes in biological activity convey important information about ecosystem dynamics, and understanding such changes is crucial for the survival of our species. Scientific edge cyberinfrastructures collect distributed data and process it in situ, often using machine learning algorithms. Most current machine learning algorithms deployed on edge cyberinfrastructures, however, are trained on data that does not accurately represent the real stream of data collected at the edge. In this work we explore the applicability of two new self-supervised learning algorithms for characterizing an insufficiently curated, imbalanced, and unlabeled dataset collected by using a set of nine microphones at different locations at the Morton Arboretum, an internationally recognized tree-focused botanical garden and research center in Lisle, IL. Our implementations showed completely autonomous characterization capabilities, such as the separation of spectrograms by recording location, month, week, and hour of the day. The models also showed the ability to discriminate spectrograms by biological and atmospheric activity, including rain, insects, and bird activity, in a completely unsupervised fashion. We validated our findings using a supervised deep learning approach and with a dataset labeled by experts, confirming competitive performance in several features. Toward explainability of our self-supervised learning approach, we used acoustic indices and false color spectrograms, showing that the topology and orientation of the clouds of points in the output space over a 24-h period are strongly linked to the unfolding of biological activity. 
Our findings show that self-supervised learning has the potential to learn from and process data collected at the edge, characterizing it with minimal human intervention. We believe that further research is crucial to extending this approach for complete autonomous characterization of raw data collected on distributed sensors at the edge.
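As a rough illustration of the kind of acoustic index mentioned in the abstract, the sketch below computes a magnitude spectrogram and a basic Acoustic Complexity Index (ACI) with NumPy. The abstract does not say which indices the authors used or how they computed them, so the choice of ACI, the function names `magnitude_spectrogram` and `acoustic_complexity_index`, and the frame and hop sizes are all assumptions made for this example only.

```python
import numpy as np


def magnitude_spectrogram(signal, frame_len=512, hop=256):
    """Return a (frequency bins, time frames) magnitude spectrogram.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, Hann-windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Real FFT per frame, then transpose so rows are frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1)).T


def acoustic_complexity_index(spec, eps=1e-12):
    """Basic ACI over one temporal step.

    For each frequency bin, sum the absolute intensity differences between
    adjacent frames, divide by the bin's total intensity, then sum over bins.
    """
    diffs = np.abs(np.diff(spec, axis=1)).sum(axis=1)
    totals = spec.sum(axis=1) + eps  # eps guards against empty bins
    return float((diffs / totals).sum())


if __name__ == "__main__":
    sr = 16000  # assumed sample rate for the synthetic example
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 1000 * t)  # one second of a 1 kHz tone
    spec = magnitude_spectrogram(tone)
    print(spec.shape, acoustic_complexity_index(spec))
```

A spectrogram whose intensity never changes over time scores 0 under this definition, while rapidly fluctuating intensity raises the score. That property is why indices of this kind are often used as a proxy for biological sound activity.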
Source journal
Ecological Informatics (Environmental Science, Ecology)
CiteScore: 8.30
Self-citation rate: 11.80%
Articles published: 346
Review time: 46 days
Journal description: The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data as well as the critical need for informing sustainable management in view of global environmental and climate change. The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.