Praditor: A DBSCAN-based automation for speech onset detection
Zhengyuan Liu, Xinqi Yu, Wing Chung Hu, Yunxiao Ma, Ruiming Wang, Haoyun Zhang
Behavior Research Methods 57(9), 247. Journal Article. Published 2025-08-04.
DOI: https://doi.org/10.3758/s13428-025-02776-2
Abstract
Speech onset time (SOT) is a critical parameter in speech production research, marking the transition from background noise to the start of the speech signal. Manual annotation remains the gold standard for identifying SOT, but it is labor-intensive, and the resulting fatigue can jeopardize annotation accuracy. Here, we present Praditor, a semi-automatic speech onset detection tool that combines density-based spatial clustering of applications with noise (DBSCAN) with first-derivative thresholding. Praditor runs on major platforms, including Windows and macOS, requires no complex setup, and offers a GUI that facilitates tuning. It processes both single-onset and multiple-onset audio files regardless of language and generates a TextGrid file for subsequent verification. To assess its accuracy, we compared time difference (TD) scores and ran a linear regression analysis between manual and automatic annotations. Praditor was highly accurate on both the Mandarin and English datasets: about 90% of the annotations fell within ±20 ms, and corpus-level tuning achieved slightly lower but still acceptable accuracy than file-level tuning. This semi-automatic method is expected to offer a general, language-independent solution for speech onset annotation, serving both experienced programmers and users with little to no prior experience. Praditor is openly available on its official GitHub repository (https://github.com/Paradeluxe/Praditor).
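The evaluation described in the abstract (TD scores, the share of annotations within ±20 ms, and a linear regression between manual and automatic onsets) can be illustrated with a minimal sketch. This is not Praditor's own code: the function names, the sign convention (automatic minus manual), and the onset values are all assumptions made up for illustration.

```python
# Illustrative sketch only; not taken from the Praditor repository.
# Assumption: TD = automatic onset minus manual onset, both in milliseconds.

def td_scores(manual_ms, auto_ms):
    """Return the time difference for each pair of annotations."""
    if len(manual_ms) != len(auto_ms):
        raise ValueError("manual and automatic lists must have equal length")
    return [a - m for m, a in zip(manual_ms, auto_ms)]


def fraction_within(tds, tolerance_ms=20.0):
    """Share of TD scores whose absolute value is within the tolerance."""
    return sum(abs(td) <= tolerance_ms for td in tds) / len(tds)


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical onsets (ms); these numbers are invented, not from the paper.
manual = [512.0, 830.0, 1204.0, 655.0, 980.0]
auto = [518.0, 825.0, 1230.0, 660.0, 975.0]

tds = td_scores(manual, auto)
share = fraction_within(tds)
slope, intercept = linear_fit(manual, auto)
print(f"TD scores (ms): {tds}")
print(f"Share within ±20 ms: {share:.0%}")
print(f"Regression: auto = {slope:.3f} * manual + {intercept:.1f}")
```

Under this convention, a slope near 1 and an intercept near 0 would indicate that automatic onsets track manual ones closely, while the share within ±20 ms summarizes how often the two agree at the tolerance used in the abstract.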
About the journal:
Behavior Research Methods publishes articles concerned with the methods, techniques, and instrumentation of research in experimental psychology. The journal focuses particularly on the use of computer technology in psychological research. An annual special issue is devoted to this field.