Philip Hanna, Ian M. O'Neill, C. Wootton, M. McTear. "Promoting extension and reuse in a spoken dialog manager: An evaluation of the Queen's Communicator." ACM Trans. Speech Lang. Process., July 2007. doi:10.1145/1255171.1255173
Abstract: This article describes how an object-oriented approach can be applied to the architectural design of a spoken language dialog system with the aim of facilitating the modification, extension, and reuse of discourse-related expertise. The architecture of the developed system is described and a functionally similar VoiceXML system is used to provide a comparative baseline across a range of modification and reuse scenarios. It is shown that the use of an object-oriented dialog manager can provide a capable means of reusing existing discourse expertise in a manner that limits the degree of structural decay associated with system change.
Jiajun Yan, D. Bracewell, S. Kuroiwa, F. Ren. "Chinese semantic dependency analysis: Construction of a treebank and its use in classification." ACM Trans. Speech Lang. Process., May 2007. doi:10.1145/1233912.1233914
Abstract: Semantic analysis is a standard tool in the Natural Language Processing (NLP) toolbox with widespread applications. In this article, we look at tagging part of the Penn Chinese Treebank with semantic dependency. Then we take this tagged data to train a maximum entropy classifier to label the semantic relations between headwords and dependents to perform semantic analysis on Chinese sentences. The classifier was able to achieve an accuracy of over 84%. We then analyze the errors in classification to determine the problems and possible solutions for this type of semantic analysis.
A. Nenkova, R. Passonneau, Kathleen McKeown. "The Pyramid Method: Incorporating human content selection variation in summarization evaluation." ACM Trans. Speech Lang. Process., May 2007. doi:10.1145/1233912.1233913
Abstract: Human variation in content selection in summarization has given rise to some fundamental research questions: How can one incorporate the observed variation in suitable evaluation measures? How can such measures reflect the fact that summaries conveying different content can be equally good and informative? In this article, we address these very questions by proposing a method for analysis of multiple human abstracts into semantic content units. Such analysis allows us not only to quantify human variation in content selection, but also to assign empirical importance weight to different content units. It serves as the basis for an evaluation method, the Pyramid Method, that incorporates the observed variation and is predictive of different equally informative summaries. We discuss the reliability of content unit annotation, the properties of Pyramid scores, and their correlation with other evaluation methods.
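The Pyramid scoring idea summarized above — weight each semantic content unit by how many human model summaries express it, then score a candidate against the best total achievable with the same number of units — can be sketched as follows. This is a minimal illustration of the scoring arithmetic, not the authors' implementation; the function name and the sets-of-unit-labels input format are assumptions.

```python
from collections import Counter

def pyramid_score(model_summaries, candidate_units):
    # Weight each content unit by how many human (model) summaries express it.
    weights = Counter(u for summary in model_summaries for u in set(summary))
    # Observed score: total weight of the units the candidate conveys.
    observed = sum(weights[u] for u in set(candidate_units))
    # Ideal score: best achievable total for a summary expressing the
    # same number of content units (take the heaviest units first).
    top = sorted(weights.values(), reverse=True)[:len(set(candidate_units))]
    ideal = sum(top)
    return observed / ideal if ideal else 0.0
```

With three model summaries expressing units {a,b,c}, {a,b}, and {a,d}, a candidate expressing {a,c} earns weight 3+1=4 against an ideal of 3+2=5, i.e. a score of 0.8.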
Simona Gandrabur, George F. Foster, G. Lapalme. "Confidence estimation for NLP applications." ACM Trans. Speech Lang. Process., October 2006. doi:10.1145/1177055.1177057
Abstract: Confidence measures are a practical solution for improving the usefulness of Natural Language Processing applications. Confidence estimation is a generic machine learning approach for deriving confidence measures. We give an overview of the application of confidence estimation in various fields of Natural Language Processing, and present experimental results for speech recognition, spoken language understanding, and statistical machine translation.
Dilek Z. Hakkani-Tür, G. Riccardi, Gökhan Tür. "An active approach to spoken language processing." ACM Trans. Speech Lang. Process., October 2006. doi:10.1145/1177055.1177056
Abstract: State of the art data-driven speech and language processing systems require a large amount of human intervention ranging from data annotation to system prototyping. In the traditional supervised passive approach, the system is trained on a given number of annotated data samples and evaluated using a separate test set. Then more data is collected arbitrarily, annotated, and the whole cycle is repeated. In this article, we propose the active approach where the system itself selects its own training data, evaluates itself and re-trains when necessary. We first employ active learning which aims to automatically select the examples that are likely to be the most informative for a given task. We use active learning for both selecting the examples to label and the examples to re-label in order to correct labeling errors. Furthermore, the system automatically evaluates itself using active evaluation to keep track of the unexpected events and decides on-demand to label more examples. The active approach enables dynamic adaptation of spoken language processing systems to unseen or unexpected events for nonstationary input while reducing the manual annotation effort significantly. We have evaluated the active approach with the AT&T spoken dialog system used for customer care applications. In this article, we present our results for both automatic speech recognition and spoken language understanding.
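The example-selection step of the active approach described above is commonly realized as uncertainty sampling: annotate the utterances the current model is least confident about. A minimal sketch of that selection rule, assuming each candidate carries a model confidence score (e.g. the posterior of the top recognition hypothesis); the function name and data layout are illustrative, not from the paper:

```python
def select_for_labeling(candidates, k):
    # candidates: list of (utterance, model_confidence) pairs.
    # Rank ascending by confidence and annotate the k least certain
    # utterances -- these are likely the most informative to label next.
    ranked = sorted(candidates, key=lambda pair: pair[1])
    return [utterance for utterance, _ in ranked[:k]]
```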
Pascale Fung, G. Ngai. "One story, one flow: Hidden Markov Story Models for multilingual multidocument summarization." ACM Trans. Speech Lang. Process., July 2006. doi:10.1145/1149290.1151099
Abstract: This article presents a multidocument, multilingual, theme-based summarization system based on modeling text cohesion (story flow). Conventional extractive summarization systems which pick out salient sentences to include in a summary often disregard any flow or sequence that might exist between these sentences. We argue that such inherent text cohesion exists and is (1) specific to a particular story and (2) specific to a particular language. Documents within the same story, and in the same language, share a common story flow, and this flow differs across stories, and across languages. We propose using Hidden Markov Models (HMMs) as story models. An unsupervised segmental K-means method is used to iteratively cluster multiple documents into different topics (stories) and learn the parameters of parallel Hidden Markov Story Models (HMSM), one for each story. We compare story models within and across stories and within and across languages (English and Chinese). The experimental results support our "one story, one flow" and "one language, one flow" hypotheses. We also propose a Naïve Bayes classifier for document summarization. The performance of our summarizer is superior to conventional methods that do not incorporate text cohesion information. Our HMSM method also provides a simple way to compile a single metasummary for multiple documents from individual summaries via state-labeled sentences.
Chao Wang, S. Seneff. "High-quality speech-to-speech translation for computer-aided language learning." ACM Trans. Speech Lang. Process., July 2006. doi:10.1145/1149290.1149291
Abstract: This article describes our research on spoken language translation aimed toward the application of computer aids for second language acquisition. The translation framework is incorporated into a multilingual dialogue system in which a student is able to engage in natural spoken interaction with the system in the foreign language, while speaking a query in their native tongue at any time to obtain a spoken translation for language assistance. Thus the quality of the translation must be extremely high, but the domain is restricted. Experiments were conducted in the weather information domain with the scenario of a native English speaker learning Mandarin Chinese. We were able to utilize a large corpus of English weather-domain queries to explore and compare a variety of translation strategies: formal, example-based, and statistical. Translation quality was manually evaluated on a test set of 695 spontaneous utterances. The best speech translation performance (89.9% correct, 6.1% incorrect, and 4.0% rejected) is achieved by a system which combines the formal and example-based methods, using parsability by a domain-specific Chinese grammar as a rejection criterion.
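The best-performing combination reported above — prefer the formal translation, fall back to the example-based one, and reject any output the domain-specific Chinese grammar cannot parse — can be outlined as a simple cascade. The function names below are hypothetical stand-ins for the system's components, not the authors' API:

```python
def translate(query, formal_translate, example_translate, parses):
    # Cascade: try the formal (grammar-based) method first, then the
    # example-based method; accept the first candidate that the
    # domain-specific target grammar can parse (the rejection criterion).
    for method in (formal_translate, example_translate):
        candidate = method(query)
        if candidate is not None and parses(candidate):
            return candidate
    return None  # rejected: no parsable translation produced
```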
C. Sporleder, Mirella Lapata. "Broad coverage paragraph segmentation across languages and domains." ACM Trans. Speech Lang. Process., July 2006. doi:10.1145/1149290.1151098
Abstract: This article considers the problem of automatic paragraph segmentation. The task is relevant for speech-to-text applications whose output transcripts do not usually contain punctuation or paragraph indentation and are naturally difficult to read and process. Text-to-text generation applications (e.g., summarization) could also benefit from an automatic paragraph segmentation mechanism which indicates topic shifts and provides visual targets to the reader. We present a paragraph segmentation model which exploits a variety of knowledge sources (including textual cues, syntactic and discourse-related information) and evaluate its performance in different languages and domains. Our experiments demonstrate that the proposed approach significantly outperforms our baselines and in many cases comes to within a few percent of human performance. Finally, we integrate our method with a single document summarizer and show that it is useful for structuring the output of automatically generated text.
Ling Ma, B. Milner, Dan J. Smith. "Acoustic environment classification." ACM Trans. Speech Lang. Process., July 2006. doi:10.1145/1149290.1149292
Abstract: The acoustic environment provides a rich source of information on the types of activity, communication modes, and people involved in many situations. It can be accurately classified using recordings from microphones commonly found in PDAs and other consumer devices. We describe a prototype HMM-based acoustic environment classifier incorporating an adaptive learning mechanism and a hierarchical classification model. Experimental results show that we can accurately classify a wide variety of everyday environments. We also show good results classifying single sounds, although classification accuracy is influenced by the granularity of the classification.
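Classification with one generative model per environment, as in the HMM-based classifier above, reduces at decision time to picking the class whose model scores the recording highest. A minimal sketch with the per-class scorers left abstract (in the paper these would be trained HMM likelihoods; the names here are illustrative):

```python
def classify_environment(clip_features, class_models):
    # class_models maps an environment label to a scoring function that
    # returns a (log-)likelihood for the clip; choose the best-scoring class.
    return max(class_models, key=lambda label: class_models[label](clip_features))
```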
Henri Avancini, A. Lavelli, F. Sebastiani, Roberto Zanoli. "Automatic expansion of domain-specific lexicons by term categorization." ACM Trans. Speech Lang. Process., May 2006. doi:10.1145/1138379.1138380
Abstract: We discuss an approach to the automatic expansion of domain-specific lexicons, that is, to the problem of extending, for each c_i in a predefined set C = {c_1, …, c_m} of semantic domains, an initial lexicon L^i_0 into a larger lexicon L^i_1. Our approach relies on term categorization, defined as the task of labeling previously unlabeled terms according to a predefined set of domains. We approach this as a supervised learning problem in which term classifiers are built using the initial lexicons as training data. Dually to classic text categorization tasks in which documents are represented as vectors in a space of terms, we represent terms as vectors in a space of documents. We present the results of a number of experiments in which we use a boosting-based learning device for training our term classifiers. We test the effectiveness of our method by using WordNetDomains, a well-known large set of domain-specific lexicons, as a benchmark. Our experiments are performed using the documents in the Reuters Corpus Volume 1 as implicit representations for our terms.
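The dual representation described above — terms as vectors in a space of documents, rather than documents as vectors in a space of terms — can be sketched with raw occurrence counts. This toy version is only illustrative: the paper uses the Reuters Corpus Volume 1 documents and a boosting-based learner on top of such vectors, neither of which appears here.

```python
from collections import defaultdict

def term_vectors(documents):
    # Dual of the usual bag-of-words: each term becomes a vector indexed
    # by document, holding its occurrence count in that document.
    vecs = defaultdict(lambda: [0] * len(documents))
    for d, doc in enumerate(documents):
        for term in doc.split():
            vecs[term][d] += 1
    return dict(vecs)
```

These vectors can then be fed to any standard classifier to label terms with domains, exactly as documents are labeled with topics in text categorization.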