Janssen 2.0: Audio Inpainting in the Time-frequency Domain

arXiv - EE - Audio and Speech Processing | Pub Date: 2024-09-10 | DOI: arxiv-2409.06392

Ondřej Mokrý, Peter Balušík, Pavel Rajmic


Citations: 0


Janssen 2.0: Audio Inpainting in the Time-frequency Domain

The paper focuses on inpainting missing parts of an audio signal spectrogram. First, a recent successful approach based on an untrained neural network is revised and several modifications to it are proposed, improving the signal-to-noise ratio of the restored audio. Second, the Janssen algorithm, the autoregression-based state of the art for time-domain audio inpainting, is adapted to the time-frequency setting. This novel method, coined Janssen-TF, is compared to the neural network approach using both objective metrics and a subjective listening test, and Janssen-TF proves superior in all the considered measures.
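The abstract reports restoration quality as a signal-to-noise ratio. As a rough illustration only, the sketch below computes the common whole-signal SNR in decibels, 10·log10(‖x‖² / ‖x − x̂‖²). The function name `snr_db` and the choice to evaluate over the entire signal are assumptions made here; the paper may use a different variant (for example, one restricted to the reconstructed regions).

```python
import numpy as np

def snr_db(reference, restored):
    """Whole-signal SNR in dB between a clean reference and a restored signal.

    Hypothetical helper for illustration; not taken from the paper.
    """
    reference = np.asarray(reference, dtype=float)
    restored = np.asarray(restored, dtype=float)
    # Energy of the restoration error
    noise_energy = np.sum((reference - restored) ** 2)
    if noise_energy == 0.0:
        # Perfect restoration: the ratio is unbounded
        return float("inf")
    return 10.0 * np.log10(np.sum(reference ** 2) / noise_energy)

# Toy example: a sine wave whose "restoration" is uniformly 10% too quiet
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
clean = np.sin(2.0 * np.pi * 5.0 * t)
print(snr_db(clean, 0.9 * clean))
```

In the toy example the error is one tenth of the reference in amplitude, so the energy ratio is 100 and the result is approximately 20 dB.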
