Word-lattice based spoken-document indexing with standard text indexers
F. Seide, K. Thambiratnam, Roger Peng Yu
2008 IEEE Spoken Language Technology Workshop, December 2008
DOI: 10.1109/SLT.2008.4777898 (https://doi.org/10.1109/SLT.2008.4777898)
Indexing the spoken content of audio recordings requires automatic speech recognition, which remains unreliable today. Unlike with text, a speech recognizer cannot tell us with certainty whether a word is present at a given point in the audio; it can only give a probability for it. Using these probabilities correctly significantly improves spoken-document search accuracy. In this paper, we first describe how to improve accuracy for "web-search style" (AND/phrase) queries into audio by using speech recognition alternates and word posterior probabilities derived from word lattices. We then present an end-to-end approach to doing so with standard text indexers, which by design cannot handle probabilities or unaligned alternates. We present a sequence of approximations that transforms the numeric lattice-matching problem into a symbolic, text-based one that a commercial full-text indexer can implement. Experiments on a 170-hour lecture set show an accuracy improvement of 30-60% for phrase searches and of 130% for two-term AND queries, compared to indexing linear text.
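To make the role of alternates and word posteriors concrete, the following is a minimal sketch of scoring a two-term AND query. It is not the method of the paper, whose approximations are not spelled out in the abstract. It assumes a simplified, already aligned representation (a list of slots, each mapping candidate words to posteriors) and treats slots as independent; the data, the function names `one_best_words`, `word_presence_prob`, and `and_score`, and the scoring rule are all hypothetical.

```python
# Illustrative sketch only: toy data and scoring rule are assumptions,
# not taken from the paper.

# One document as a list of aligned slots; each slot maps a candidate
# word to its (invented) posterior probability.
doc = [
    {"speech": 0.9, "beach": 0.1},
    {"wreck": 0.6, "recognition": 0.4},
]


def one_best_words(slots):
    """Words a linear (1-best) transcript would contain: top word per slot."""
    return {max(slot, key=slot.get) for slot in slots}


def word_presence_prob(slots, word):
    """Probability that `word` occurs at least once, assuming independent slots."""
    p_absent = 1.0
    for slot in slots:
        p_absent *= 1.0 - slot.get(word, 0.0)
    return 1.0 - p_absent


def and_score(slots, terms):
    """Score an AND query as the product of per-term presence probabilities."""
    score = 1.0
    for term in terms:
        score *= word_presence_prob(slots, term)
    return score


query = ["speech", "recognition"]
# The 1-best transcript picks "wreck" over "recognition", so a plain text
# index of the transcript cannot match the AND query.
matches_one_best = all(term in one_best_words(doc) for term in query)
# The posterior-based score stays above zero because the alternate is kept.
lattice_score = and_score(doc, query)
print(matches_one_best, lattice_score)
```

In this toy case the linear transcript drops "recognition" entirely, while the posterior-based score still ranks the document as a plausible match. A standard text indexer cannot store such numeric scores directly, which is the gap the paper's approximations address.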