The Red Hen Audio Tagger

Sabyasachi Ghosal, Austin Bennett, Mark Turner
Linguistics Vanguard. Published 2024-04-17. DOI: 10.1515/lingvan-2022-0130 (https://doi.org/10.1515/lingvan-2022-0130)

Abstract

The International Distributed Little Red Hen Lab, usually called “Red Hen Lab” or just “Red Hen”, is dedicated to research into multimodal communication. In this article, we introduce the Red Hen Audio Tagger (RHAT), a novel, publicly available open source platform developed by Red Hen Lab. RHAT employs deep learning models to tag audio elements frame by frame, generating metadata tags that can be utilized in various data formats for analysis. RHAT seamlessly integrates with widely used linguistic research tools like ELAN: the researcher can use RHAT to tag audio content automatically and display those tags alongside other ELAN annotation tiers. RHAT additionally complements existing Red Hen pipelines devoted to natural language processing, speech-to-text processing, body pose analysis, optical character recognition, named entity recognition, computer vision, semantic frame recognition, and so on. These cooperating Red Hen pipelines are research tools to advance the science of multimodal communication.
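The abstract describes tagging audio frame by frame and then showing the tags next to other ELAN annotation tiers. One way to picture the step in between is to collapse runs of identical per-frame tags into time intervals and write them as tab-delimited text, a format ELAN can import. The sketch below is illustrative only and is not RHAT's code: the function names, the tier name `audio_tags`, the fixed frame duration, the example labels, and the column layout are all assumptions, since the abstract does not specify RHAT's output format.

```python
from itertools import groupby


def frames_to_intervals(frame_tags, frame_seconds=1.0):
    """Collapse consecutive frames sharing a tag into (start, end, tag) tuples.

    frame_tags: one label per audio frame, or None for an untagged frame.
    frame_seconds: assumed fixed frame duration (hypothetical value).
    """
    intervals = []
    index = 0  # index of the first frame in the current run
    for tag, run in groupby(frame_tags):
        length = sum(1 for _ in run)
        if tag is not None:  # untagged runs produce no interval
            start = index * frame_seconds
            end = (index + length) * frame_seconds
            intervals.append((start, end, tag))
        index += length
    return intervals


def to_tab_delimited(intervals, tier="audio_tags"):
    """Render intervals as tier, start, end, label rows (assumed layout)."""
    return "\n".join(
        f"{tier}\t{start:.3f}\t{end:.3f}\t{tag}" for start, end, tag in intervals
    )


# Hypothetical per-frame labels at 0.5 s per frame.
frame_tags = ["Speech", "Speech", "Music", "Music", "Music", None, "Speech"]
intervals = frames_to_intervals(frame_tags, frame_seconds=0.5)
print(to_tab_delimited(intervals))
```

Merging frames into intervals keeps the annotation tier compact: a researcher sees one "Music" span rather than a separate annotation for every frame.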
About the journal: Linguistics Vanguard is a new channel for high-quality articles and innovative approaches in all major fields of linguistics. This multimodal journal is published solely online and provides an accessible platform supporting both traditional and new kinds of publications. Linguistics Vanguard seeks to publish concise and up-to-date reports on the state of the art in linguistics as well as cutting-edge research papers. With its topical breadth of coverage and anticipated quick rate of production, it is one of the leading platforms for scientific exchange in linguistics. Its broad theoretical range, international scope, and diversity of article formats engage students and scholars alike. All topics within linguistics are welcome. The journal especially encourages submissions that take advantage of its new multimodal platform, which is designed to integrate interactive content, including audio and video, images, maps, software code, raw data, and any other media that enhance the traditional written word. The novel platform and concise article format allow for rapid turnaround of submissions. Full peer review assures quality and enables authors to receive appropriate credit for their work. The journal publishes general submissions as well as special collections. Ideas for special collections may be submitted to the editors for consideration.