{"title":"基于最可能音素序列的口语术语检测","authors":"G. Gosztolya, L. Tóth","doi":"10.1109/SAMI.2011.5738856","DOIUrl":null,"url":null,"abstract":"The aim of the spoken term detection task is to find the occurrence of user-entered keywords in an archive of audio recordings. In this area, besides the accuracy of hits returned, the speed of search is also very important, for which an intermediate representation of recordings is normally used. In this paper we evaluate a spoken term detection method which represents the speech signals by their most probable phoneme sequence, on which a dynamic search is then performed. As the accuracy of the phoneme recognizer used is vital, we shall test this method by using several approaches of phoneme identification. We found that our method already achieves satisfactory accuracy, although its run time is still rather high. We also found that this approach is heavily dependent on the performance of the phoneme recognizer.","PeriodicalId":202398,"journal":{"name":"2011 IEEE 9th International Symposium on Applied Machine Intelligence and Informatics (SAMI)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Spoken term detection based on the most probable phoneme sequence\",\"authors\":\"G. Gosztolya, L. Tóth\",\"doi\":\"10.1109/SAMI.2011.5738856\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The aim of the spoken term detection task is to find the occurrence of user-entered keywords in an archive of audio recordings. In this area, besides the accuracy of hits returned, the speed of search is also very important, for which an intermediate representation of recordings is normally used. In this paper we evaluate a spoken term detection method which represents the speech signals by their most probable phoneme sequence, on which a dynamic search is then performed. As the accuracy of the phoneme recognizer used is vital, we shall test this method by using several approaches of phoneme identification. We found that our method already achieves satisfactory accuracy, although its run time is still rather high. We also found that this approach is heavily dependent on the performance of the phoneme recognizer.\",\"PeriodicalId\":202398,\"journal\":{\"name\":\"2011 IEEE 9th International Symposium on Applied Machine Intelligence and Informatics (SAMI)\",\"volume\":\"3 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-03-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE 9th International Symposium on Applied Machine Intelligence and Informatics (SAMI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SAMI.2011.5738856\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE 9th International Symposium on Applied Machine Intelligence and Informatics (SAMI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SAMI.2011.5738856","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Spoken term detection based on the most probable phoneme sequence
The aim of the spoken term detection task is to find the occurrence of user-entered keywords in an archive of audio recordings. In this area, besides the accuracy of hits returned, the speed of search is also very important, for which an intermediate representation of recordings is normally used. In this paper we evaluate a spoken term detection method which represents the speech signals by their most probable phoneme sequence, on which a dynamic search is then performed. As the accuracy of the phoneme recognizer used is vital, we shall test this method by using several approaches of phoneme identification. We found that our method already achieves satisfactory accuracy, although its run time is still rather high. We also found that this approach is heavily dependent on the performance of the phoneme recognizer.