{"title":"一种新的三次样条插值分类方法","authors":"Husam Ali Abdulmohsin, H. A. Wahab, A. Hossen","doi":"10.32604/IASC.2022.018045","DOIUrl":null,"url":null,"abstract":"Classification is the last, and usually the most time-consuming step in recognition. Most recently proposed classification algorithms have adopted machine learning (ML) as the main classification approach, regardless of time consumption. This study proposes a statistical feature classification cubic spline interpolation (FC-CSI) algorithm to classify emotions in speech using a curve fitting technique. FC-CSI is utilized in a speech emotion recognition system (SERS). The idea is to sketch the cubic spline interpolation (CSI) for each audio file in a dataset and the mean cubic spline interpolations (MCSIs) representing each emotion in the dataset. CSI interpolation is generated by connecting the features extracted from each file in the feature extraction phase. The MCSI is generated by connecting the mean features of 70% of the files of each emotion in the dataset. Points on the CSI are considered the new generated features. To classify each audio file according to emotion, the Euclidian distance (ED) is found between each CSI and all MCSIs of all emotions in the dataset. Each audio file is classified according to the nearest MCSI to the CSI representing it. The three datasets used in this work are Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Berlin (Emo-DB), and Surrey Audio-Visual Expressed Emotion (SAVEE). The proposed work shows fast classification and high accuracy of results. The classification accuracy, i.e., the proportion of samples assigned to the correct class, using FC-CSI without feature selection (FS), was 69.08%, 92.52%, and 89.1% with RAVDESS, Emo-DB, and SAVEE, respectively. The results of the proposed method were compared to those of a designed neural network called SER-NN. Comparisons were made with and without FS. FC-CSI outperformed SER-NN on Emo-DB and SAVEE, and underperformed on RAVDESS, without using an FS algorithm. It was noticed from experiments that FC-CSI operated faster than the same system utilizing SER-NN.","PeriodicalId":50357,"journal":{"name":"Intelligent Automation and Soft Computing","volume":"31 1","pages":"339-355"},"PeriodicalIF":2.0000,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A Novel Classification Method with Cubic Spline Interpolation\",\"authors\":\"Husam Ali Abdulmohsin, H. A. Wahab, A. Hossen\",\"doi\":\"10.32604/IASC.2022.018045\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Classification is the last, and usually the most time-consuming step in recognition. Most recently proposed classification algorithms have adopted machine learning (ML) as the main classification approach, regardless of time consumption. This study proposes a statistical feature classification cubic spline interpolation (FC-CSI) algorithm to classify emotions in speech using a curve fitting technique. FC-CSI is utilized in a speech emotion recognition system (SERS). The idea is to sketch the cubic spline interpolation (CSI) for each audio file in a dataset and the mean cubic spline interpolations (MCSIs) representing each emotion in the dataset. CSI interpolation is generated by connecting the features extracted from each file in the feature extraction phase. The MCSI is generated by connecting the mean features of 70% of the files of each emotion in the dataset. Points on the CSI are considered the new generated features. To classify each audio file according to emotion, the Euclidian distance (ED) is found between each CSI and all MCSIs of all emotions in the dataset. Each audio file is classified according to the nearest MCSI to the CSI representing it. The three datasets used in this work are Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Berlin (Emo-DB), and Surrey Audio-Visual Expressed Emotion (SAVEE). The proposed work shows fast classification and high accuracy of results. The classification accuracy, i.e., the proportion of samples assigned to the correct class, using FC-CSI without feature selection (FS), was 69.08%, 92.52%, and 89.1% with RAVDESS, Emo-DB, and SAVEE, respectively. The results of the proposed method were compared to those of a designed neural network called SER-NN. Comparisons were made with and without FS. FC-CSI outperformed SER-NN on Emo-DB and SAVEE, and underperformed on RAVDESS, without using an FS algorithm. It was noticed from experiments that FC-CSI operated faster than the same system utilizing SER-NN.\",\"PeriodicalId\":50357,\"journal\":{\"name\":\"Intelligent Automation and Soft Computing\",\"volume\":\"31 1\",\"pages\":\"339-355\"},\"PeriodicalIF\":2.0000,\"publicationDate\":\"2022-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Intelligent Automation and Soft Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.32604/IASC.2022.018045\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent Automation and Soft Computing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.32604/IASC.2022.018045","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Computer Science","Score":null,"Total":0}
A Novel Classification Method with Cubic Spline Interpolation
Classification is the last, and usually the most time-consuming step in recognition. Most recently proposed classification algorithms have adopted machine learning (ML) as the main classification approach, regardless of time consumption. This study proposes a statistical feature classification cubic spline interpolation (FC-CSI) algorithm to classify emotions in speech using a curve fitting technique. FC-CSI is utilized in a speech emotion recognition system (SERS). The idea is to sketch the cubic spline interpolation (CSI) for each audio file in a dataset and the mean cubic spline interpolations (MCSIs) representing each emotion in the dataset. CSI interpolation is generated by connecting the features extracted from each file in the feature extraction phase. The MCSI is generated by connecting the mean features of 70% of the files of each emotion in the dataset. Points on the CSI are considered the new generated features. To classify each audio file according to emotion, the Euclidian distance (ED) is found between each CSI and all MCSIs of all emotions in the dataset. Each audio file is classified according to the nearest MCSI to the CSI representing it. The three datasets used in this work are Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Berlin (Emo-DB), and Surrey Audio-Visual Expressed Emotion (SAVEE). The proposed work shows fast classification and high accuracy of results. The classification accuracy, i.e., the proportion of samples assigned to the correct class, using FC-CSI without feature selection (FS), was 69.08%, 92.52%, and 89.1% with RAVDESS, Emo-DB, and SAVEE, respectively. The results of the proposed method were compared to those of a designed neural network called SER-NN. Comparisons were made with and without FS. FC-CSI outperformed SER-NN on Emo-DB and SAVEE, and underperformed on RAVDESS, without using an FS algorithm. It was noticed from experiments that FC-CSI operated faster than the same system utilizing SER-NN.
期刊介绍:
An International Journal seeks to provide a common forum for the dissemination of accurate results about the world of intelligent automation, artificial intelligence, computer science, control, intelligent data science, modeling and systems engineering. It is intended that the articles published in the journal will encompass both the short and the long term effects of soft computing and other related fields such as robotics, control, computer, vision, speech recognition, pattern recognition, data mining, big data, data analytics, machine intelligence, cyber security and deep learning. It further hopes it will address the existing and emerging relationships between automation, systems engineering, system of systems engineering and soft computing. The journal will publish original and survey papers on artificial intelligence, intelligent automation and computer engineering with an emphasis on current and potential applications of soft computing. It will have a broad interest in all engineering disciplines, computer science, and related technological fields such as medicine, biology operations research, technology management, agriculture and information technology.