vak: a neural network framework for researchers studying animal acoustic communication

D. Nicholson, Y. Cohen
{"title":"vak: a neural network framework for researchers studying animal acoustic communication","authors":"D. Nicholson, Y. Cohen","doi":"10.25080/gerudo-f2bc6f59-008","DOIUrl":null,"url":null,"abstract":"—How is speech like birdsong? What do we mean when we say an animal learns their vocalizations? Questions like these are answered by studying how animals communicate with sound. As in many other fields, the study of acoustic communication is being revolutionized by deep neural network models. These models enable answering questions that were previously impossible to address, in part because the models automate analysis of very large datasets. Acoustic communication researchers have developed multiple models for similar tasks, often implemented as research code with one of several libraries, such as Keras and Pytorch. This situation has created a real need for a framework that allows researchers to easily benchmark multiple models, and test new models, with their own data. To address this need, we developed vak (https://github.com/vocalpy/vak), a neural network framework designed for acoustic communication researchers. (\"vak\" is pronounced like \"talk\" or \"squawk\" and was chosen for its similarity to the Latin root voc , as in \"vocal\".) Here we describe the design of the vak, and explain how the framework makes it easy for researchers to apply neural network models to their own data. We highlight enhancements made in version 1.0 that significantly improve user experience with the library. To provide researchers without expertise in deep learning access to these models, vak can be run via a command-line interface that uses configuration files. Vak can also be used directly in scripts by scientist-coders. To achieve this, vak adapts design patterns and an API from other domain-specific PyTorch libraries such as torchvision, with modules representing neural network operations, models, datasets, and transformations for pre-and post-processing. 
vak also leverages the Lightning library as a backend, so that vak developers and users can focus on the domain. We provide proof-of-concept results showing how vak can be used to test new models and compare existing models from multiple model families. In closing we discuss our roadmap for development and vision for the community","PeriodicalId":364654,"journal":{"name":"Proceedings of the Python in Science Conference","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Python in Science Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.25080/gerudo-f2bc6f59-008","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

How is speech like birdsong? What do we mean when we say an animal learns their vocalizations? Questions like these are answered by studying how animals communicate with sound. As in many other fields, the study of acoustic communication is being revolutionized by deep neural network models. These models enable answering questions that were previously impossible to address, in part because the models automate analysis of very large datasets. Acoustic communication researchers have developed multiple models for similar tasks, often implemented as research code with one of several libraries, such as Keras and PyTorch. This situation has created a real need for a framework that allows researchers to easily benchmark multiple models, and test new models, with their own data. To address this need, we developed vak (https://github.com/vocalpy/vak), a neural network framework designed for acoustic communication researchers. ("vak" is pronounced like "talk" or "squawk" and was chosen for its similarity to the Latin root voc, as in "vocal".) Here we describe the design of vak, and explain how the framework makes it easy for researchers to apply neural network models to their own data. We highlight enhancements made in version 1.0 that significantly improve user experience with the library. To provide researchers without expertise in deep learning access to these models, vak can be run via a command-line interface that uses configuration files. Vak can also be used directly in scripts by scientist-coders. To achieve this, vak adapts design patterns and an API from other domain-specific PyTorch libraries such as torchvision, with modules representing neural network operations, models, datasets, and transformations for pre- and post-processing. vak also leverages the Lightning library as a backend, so that vak developers and users can focus on the domain.
We provide proof-of-concept results showing how vak can be used to test new models and compare existing models from multiple model families. In closing we discuss our roadmap for development and vision for the community.
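As a concrete illustration of the configuration-file workflow the abstract describes: vak is driven by TOML files passed to subcommands such as `vak prep` and `vak train`. The table and key names below are a sketch of the version 1.0 layout, not an authoritative schema; consult the vak documentation for the exact keys.

```toml
# Illustrative vak 1.0-style configuration (key names are a sketch,
# not an authoritative schema).
[vak.prep]
data_dir = "./data/bird1"      # directory of audio + annotation files
output_dir = "./prep/bird1"    # where prepared datasets are written
audio_format = "wav"

[vak.train]
model = "TweetyNet"            # name of a built-in model family
batch_size = 8
num_epochs = 2
```

A researcher would then run something like `vak prep config.toml` followed by `vak train config.toml`, without writing any Python.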
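To make the torchvision-style design pattern concrete, here is a minimal, self-contained Python sketch of the idea the abstract describes: separate components for transforms, datasets, and models, composed together. All names here (`standardize`, `FrameDataset`, `Model`) are hypothetical stand-ins for illustration, not vak's actual API.

```python
# Sketch of the torchvision-style composition pattern vak adapts:
# a transform callable, a dataset that applies it, and a model
# wrapper that bundles the two. Names are illustrative only.
from dataclasses import dataclass
from typing import Callable, List, Optional


def standardize(frame: List[float]) -> List[float]:
    """Toy 'transform': center one spectrogram frame at zero mean."""
    mean = sum(frame) / len(frame)
    return [x - mean for x in frame]


@dataclass
class FrameDataset:
    """A 'datasets'-style class: pairs raw samples with a transform."""
    samples: List[List[float]]
    transform: Optional[Callable] = None

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, idx: int) -> List[float]:
        item = self.samples[idx]
        return self.transform(item) if self.transform else item


@dataclass
class Model:
    """A 'models'-style wrapper bundling a network with its dataset."""
    name: str
    dataset: FrameDataset

    def predict(self, idx: int) -> List[float]:
        # A real model would run a neural network forward pass here;
        # we just return the transformed frame to show the composition.
        return self.dataset[idx]


ds = FrameDataset(samples=[[1.0, 2.0, 3.0]], transform=standardize)
model = Model(name="tweetynet-like", dataset=ds)
print(model.predict(0))  # → [-1.0, 0.0, 1.0]
```

The design choice this mirrors is that each layer (transform, dataset, model) can be swapped independently, which is what lets a framework like vak benchmark multiple model families on the same prepared data.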