Text Independent Speaker Recognition and Classification using KNN Algorithm

2022 5th International Conference on Contemporary Computing and Informatics (IC3I). Pub Date: 2022-12-14. DOI: 10.1109/IC3I56241.2022.10072615

Sanjay S. Tippannavar, R. Shashidhar, H. R. Sathvik, S. Varun, G. V. Punith, H. G. Nikshep


Citations: 1

Abstract


Text Independent Speaker Recognition and Classification using KNN Algorithm

Speaker recognition is the task of automatically identifying a speaker from the speaker-specific information contained in speech signals. A variety of applications have been investigated, including monitoring, speech-activated secure access control, voice-activated customization of services or information for particular users, and the use of recorded voice samples in forensic and criminal investigations. The most frequently cited application is access control, which also covers voice dialing, banking, telephone shopping, and database access services. Speaker recognition technology is therefore expected to enable new services in smart environments and make daily life more comfortable. Research has also been done on speaker diarization, in which an input audio channel is automatically annotated with speaker labels. Diarization makes speech recognition easier, simplifies searching and indexing audio archives, and makes machine transcriptions richer and more readable. Another important application of voice recognition technology is as a forensic tool. The speaker's short-time spectral coefficients are represented by a vector-quantization codebook. The techniques are evaluated for robustness against utterance variation, such as differences in content, variation over time, and changes in speaking rate. Each individual's voice is recorded three times. In the experiment, the double-distance measurement yields 96.97%, whereas the KNN technique with a single data center yields 84.85%. The outcome shows that the double-distance method improves the accuracy of voice recognition.
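The two building blocks named in the abstract, a vector-quantization codebook and k-nearest-neighbour classification, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the feature vectors, the speaker labels, the value of k, and the function names `avg_distortion` and `knn_predict` are all hypothetical, and the "double distance" measure is left out because the abstract does not define it.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def avg_distortion(frames, codebook):
    """Mean distance from each frame to its nearest codeword.

    A lower value means the codebook represents the frames better,
    which is the usual matching score in VQ-based speaker recognition.
    """
    return sum(min(euclidean(f, c) for c in codebook) for f in frames) / len(frames)


def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query.

    train is a list of (feature_vector, speaker_label) pairs.
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three utterances per speaker (mirroring the three
# recordings per individual mentioned in the abstract), each reduced to a
# made-up 3-dimensional feature vector.
TRAIN = [
    ([1.0, 2.0, 0.5], "speaker_a"),
    ([1.1, 1.9, 0.6], "speaker_a"),
    ([0.9, 2.1, 0.4], "speaker_a"),
    ([4.0, 0.5, 2.0], "speaker_b"),
    ([4.2, 0.4, 2.1], "speaker_b"),
    ([3.9, 0.6, 1.9], "speaker_b"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, [1.05, 2.0, 0.5]))
    print(avg_distortion([[0.0, 0.0], [1.0, 0.0]], [[0.0, 0.0], [1.0, 1.0]]))
```

In a real system the vectors would come from short-time spectral analysis of the recordings rather than hand-written numbers. Using an odd k with two speakers avoids tied votes in this toy setup.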

