Automatic design optimization of preference-based subjective evaluation with online learning in crowdsourcing environment
Authors: Yusuke Yasuda, Tomoki Toda
Journal: Computer Speech and Language, volume 96, Article 101888
Publication date: 2025-09-12
Publication type: Journal Article
DOI: 10.1016/j.csl.2025.101888
URL: https://www.sciencedirect.com/science/article/pii/S0885230825001135
Impact factor: 3.4; JCR: Q2 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Preference-based subjective evaluation is a key method for reliably evaluating generative media. However, the huge number of pair combinations makes it prohibitively difficult to apply to large-scale crowdsourced evaluation. To address this issue, we propose an automatic optimization method for preference-based subjective evaluation that uses online learning in a crowdsourcing environment to select pair combinations and allocate evaluation volumes. We use a preference-based online learning method built on a sorting algorithm to identify the total order of systems with the minimum sample volume. Our online learning algorithm supports parallel and asynchronous execution under the fixed-budget conditions required for crowdsourcing. Our experiment on preference-based subjective evaluation of the naturalness of synthetic speech shows that our method successfully optimizes the preference-based test: it reduces the number of pair combinations from 351 to 83 and allocates an optimal evaluation volume to each pair, ranging from 30 to 663, without increasing evaluation error or wasting budget.
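To make the pair-count argument in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It rests on several assumptions: 351 is the number of unordered pairs among 27 items, so the sketch assumes 27 systems; the `quality` scores and the noise-free `prefer` oracle are hypothetical stand-ins for crowdsourced listener votes; and plain merge sort stands in for the paper's sorting-based online learner, which additionally handles noisy preferences, per-pair evaluation volumes, and parallel asynchronous execution.

```python
from itertools import combinations


def count_all_pairs(n_systems: int) -> int:
    """Number of unordered system pairs in an exhaustive preference test."""
    return len(list(combinations(range(n_systems), 2)))


def merge_sort_pairs(items, prefer, compared):
    """Sort items best-first with merge sort, recording each compared pair."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort_pairs(items[:mid], prefer, compared)
    right = merge_sort_pairs(items[mid:], prefer, compared)
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        a, b = left[i], right[j]
        compared.add(frozenset((a, b)))  # this pair would be sent to listeners
        if prefer(a, b):
            merged.append(a)
            i += 1
        else:
            merged.append(b)
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


# Assumption: 27 systems, because 27 * 26 / 2 = 351.
N_SYSTEMS = 27

# Hypothetical latent quality scores; 7 is coprime to 27, so all are distinct.
quality = {s: (s * 7) % N_SYSTEMS for s in range(N_SYSTEMS)}


def prefer(a, b):
    """Hypothetical noise-free oracle: True if system a is preferred over b."""
    return quality[a] > quality[b]


compared = set()
ranking = merge_sort_pairs(list(range(N_SYSTEMS)), prefer, compared)
total_pairs = count_all_pairs(N_SYSTEMS)

print(f"exhaustive pairs: {total_pairs}")  # 351
print(f"pairs actually compared: {len(compared)}")
```

With a noise-free oracle, the sort recovers the total order while querying only a fraction of the 351 pairs. The exact count depends on the sorting algorithm and the data, so it should not be expected to match the 83 pairs reported in the abstract.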
About the journal:
Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language.
The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing has become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.