RF-GML: Reference-Free Generative Machine Listener
Arijit Biswas, Guanxin Jiang
arXiv - EE - Audio and Speech Processing, 2024-09-16
https://doi.org/arxiv-2409.10210
Citations: 0
Abstract
This paper introduces a novel reference-free (RF) audio quality metric called
the RF-Generative Machine Listener (RF-GML), designed to evaluate coded mono,
stereo, and binaural audio at a 48 kHz sample rate. RF-GML leverages transfer
learning from a state-of-the-art full-reference (FR) Generative Machine
Listener (GML) with minimal architectural modifications. The term "generative"
refers to the model's ability to generate an arbitrary number of simulated
listening scores. Unlike existing RF models, RF-GML accurately predicts
subjective quality scores across diverse content types and codecs. Extensive
evaluations demonstrate its superiority in rating unencoded audio and
distinguishing different levels of coding artifacts. RF-GML's performance and
versatility make it a valuable tool for coded audio quality assessment and
monitoring in various applications, all without the need for a reference
signal.
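
As a rough illustration of what "generating an arbitrary number of simulated listening scores" could look like in practice, the sketch below draws scores from a predicted distribution and summarizes them. It is not the authors' implementation: the Gaussian form, the 0-100 score range, the function names, and the example values for `mean` and `scale` are all assumptions made for the sake of the example.

```python
import random
import statistics


def simulate_listening_scores(mean, scale, n, seed=0):
    """Draw n simulated scores from a Gaussian with the given parameters.

    `mean` and `scale` stand in for distribution parameters a model might
    predict for one audio clip. The Gaussian form and the 0-100 score
    range are assumptions, not details taken from the paper.
    """
    rng = random.Random(seed)  # seeded so the draw is reproducible
    # Clip each draw to the assumed 0-100 rating scale.
    return [min(100.0, max(0.0, rng.gauss(mean, scale))) for _ in range(n)]


def summarize(scores):
    """Return the sample mean and an approximate 95% confidence interval."""
    m = statistics.mean(scores)
    half_width = 1.96 * statistics.stdev(scores) / len(scores) ** 0.5
    return m, (m - half_width, m + half_width)


# Hypothetical predicted parameters for a single coded clip.
scores = simulate_listening_scores(mean=72.0, scale=8.0, n=500)
mean_score, ci = summarize(scores)
print(f"mean={mean_score:.1f}, 95% CI=({ci[0]:.1f}, {ci[1]:.1f})")
```

Because the number of draws `n` is a free parameter, the same predicted distribution can stand in for a listening panel of any size, and the confidence interval narrows as `n` grows.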