An analysis of acoustic features for accented speech classification

IF 4.3, Zone 3 Computer Science, Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Egyptian Informatics Journal, Pub Date: 2025-07-22, DOI: 10.1016/j.eij.2025.100743

Apar Garg, Yassine Aribi, Turke Althobaiti, Tanmay Bhowmik

Volume 31, Article 100743. Full text: https://www.sciencedirect.com/science/article/pii/S1110866525001367


Abstract

Spoken language has attracted researchers' attention for a long time. With the wide variety of voice-based products now available, spoken-language applications appear in many settings. Home assistant systems have become an integral part of daily life because they simplify mundane tasks such as setting reminders and checking email. However, non-native English speakers frequently have trouble using automated assistants because of their accented speech. This study analyzes acoustic features for accented speech classification. The aim is to identify which speech features matter most for accurately classifying accents in spoken language. We collected a dataset of accented speech samples and applied several feature extraction techniques to the speech signal. The extracted features included mel-frequency cepstral coefficients (MFCCs), zero-crossing rate, spectral features, and chroma features, among others. Machine learning algorithms were then used to classify the accents from the extracted features, achieving an overall accuracy of 86.67%. The work is motivated by the growing need for robust speech recognition systems that generalize across regional accents, since the performance of standard automatic speech recognition systems often drops on accented speech. Many studies favor deep learning-based solutions; however, focused analysis of how traditional acoustic features perform in accent discrimination tasks is lacking. This study aims to bridge that gap through a comparative study of selected acoustic features. The analysis can support the development of robust accent classification systems for applications such as language learning, speech recognition, and accent identification.
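
To make two of the named features concrete, here is a minimal sketch of zero-crossing rate and one common spectral feature, the spectral centroid, computed on a single frame. The abstract does not describe the authors' implementation, so everything below is an assumption for illustration: the function names, the naive DFT, the 8000 Hz sample rate, and the synthetic sine tones that stand in for real speech frames. MFCC and chroma extraction are not shown.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, using a naive DFT."""
    n = len(frame)
    half = n // 2 + 1  # keep only the non-negative frequency bins
    magnitudes = []
    for k in range(half):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(frame)
        )
        magnitudes.append(abs(acc))
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    weighted = sum(k * sample_rate / n * m for k, m in enumerate(magnitudes))
    return weighted / total


def sine_frame(freq_hz, sample_rate=8000, n=256):
    """Hypothetical test tone; the small phase offset avoids exact zeros."""
    return [
        math.sin(2 * math.pi * freq_hz * t / sample_rate + 0.1)
        for t in range(n)
    ]


if __name__ == "__main__":
    for freq in (250, 1000):
        frame = sine_frame(freq)
        print(freq, zero_crossing_rate(frame), spectral_centroid(frame, 8000))
```

A higher-pitched tone yields both a higher zero-crossing rate and a higher centroid. In a pipeline like the one the abstract outlines, per-frame values such as these would typically be aggregated over an utterance and passed to a classifier alongside the other features.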


Source journal

Egyptian Informatics Journal
Subject area: Decision Sciences, Management Science and Operations Research
CiteScore: 11.10
Self-citation rate: 1.90%
Review time: 110 days

About the journal: The Egyptian Informatics Journal is published by the Faculty of Computers and Artificial Intelligence, Cairo University. The journal provides a forum for state-of-the-art research and development in the fields of computing, including computer science, information technology, information systems, operations research, and decision support. Submissions of innovative, previously unpublished work in subjects covered by the journal are encouraged, whether from academic, research, or commercial sources.