{"title":"Emerging Applications of Natural Language Generation in Information Visualization, Education, and Health Care","authors":"Barbara Maria Di Eugenio, N. Green","doi":"10.1201/9781420085938-c23","DOIUrl":"https://doi.org/10.1201/9781420085938-c23","url":null,"abstract":"","PeriodicalId":361311,"journal":{"name":"Handbook of Natural Language Processing","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131416669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiword Expressions","authors":"Timothy Baldwin, Su Nam Kim","doi":"10.1201/9781420085938-c12","DOIUrl":"https://doi.org/10.1201/9781420085938-c12","url":null,"abstract":"","PeriodicalId":361311,"journal":{"name":"Handbook of Natural Language Processing","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116984889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Word Sense Disambiguation","authors":"David Yarowsky","doi":"10.1201/9781420085938-c14","DOIUrl":"https://doi.org/10.1201/9781420085938-c14","url":null,"abstract":"This paper describes a program that disambiguates English word senses in unrestricted text using statistical models of the major Roget's Thesaurus categories. Roget's categories serve as approximations of conceptual classes. The categories listed for a word in Roget's index tend to correspond to sense distinctions; thus selecting the most likely category provides a useful level of sense disambiguation. The selection of categories is accomplished by identifying and weighting words that are indicative of each category when seen in context, using a Bayesian theoretical framework. Other statistical approaches have required special corpora or hand-labeled training examples for much of the lexicon. Our use of class models overcomes this knowledge acquisition bottleneck, enabling training on unrestricted monolingual text without human intervention. Applied to the 10 million word Grolier's Encyclopedia, the system correctly disambiguated 92% of the instances of 12 polysemous words that have been previously studied in the literature. 1. Problem Formulation This paper presents an approach to word sense disambiguation that uses classes of words to derive models useful for disambiguating individual words in context. \"Sense\" is not a well defined concept; it has been based on subjective and often subtle distinctions in topic, register, dialect, collocation, part of speech and valency. For the purposes of this study, we will define the senses of a word as the categories listed for that word in Roget's International Thesaurus (Fourth Edition, Chapman, 1977) [1]. Sense disambiguation will constitute selecting the listed category which is most probable given the surrounding context. This may appear to be a particularly crude approximation, but as shown in the example below and in the table of results, it is surprisingly successful. [1] Note that this edition of Roget's Thesaurus is much more extensive than the 1911 version, though somewhat more difficult to obtain in electronic form. One could use other concept hierarchies, such as WordNet (Miller, 1990) or the LDOCE subject codes (Slator, 1991). All that is necessary is a set of semantic categories and a list of the words in each category.","PeriodicalId":361311,"journal":{"name":"Handbook of Natural Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129795705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical Machine Translation","authors":"Abraham Ittycheriah","doi":"10.1201/9781420085938-c17","DOIUrl":"https://doi.org/10.1201/9781420085938-c17","url":null,"abstract":"","PeriodicalId":361311,"journal":{"name":"Handbook of Natural Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124342413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Report Generation","authors":"E. Riloff, Leo Wanner","doi":"10.1201/9781420085938-c22","DOIUrl":"https://doi.org/10.1201/9781420085938-c22","url":null,"abstract":"","PeriodicalId":361311,"journal":{"name":"Handbook of Natural Language Processing","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123529254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Alignment","authors":"Dekai Wu","doi":"10.1201/9781420085938-c16","DOIUrl":"https://doi.org/10.1201/9781420085938-c16","url":null,"abstract":"2016Professor, Department of Computer Science, Princeton University 2013-2016 Director, Center for Computational Molecular Biology, Brown University 2011-2016 Associate Professor, Department of Computer Science & Center for Computational Molecular Biology, Brown University 2006-2011 Assistant Professor, Department of Computer Science & Center for Computational Molecular Biology, Brown University 2005-2006 Burroughs Wellcome Postdoctoral Fellowship in Computer Science (Bioinformatics), University of California, San Diego. Sponsor: Professor Pavel Pevzner. 2002-2004 Alfred P. Sloan Postdoctoral Fellowship in Computer Science (Bioinformatics), University of California, San Diego. Sponsor: Professor Pavel Pevzner.","PeriodicalId":361311,"journal":{"name":"Handbook of Natural Language Processing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128623493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Part-of-Speech Tagging","authors":"Tunga Güngör","doi":"10.1201/9781420085938-c10","DOIUrl":"https://doi.org/10.1201/9781420085938-c10","url":null,"abstract":"","PeriodicalId":361311,"journal":{"name":"Handbook of Natural Language Processing","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128850024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical Parsing","authors":"Joakim Nivre","doi":"10.1201/9781420085938-c11","DOIUrl":"https://doi.org/10.1201/9781420085938-c11","url":null,"abstract":"","PeriodicalId":361311,"journal":{"name":"Handbook of Natural Language Processing","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124738003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chinese Machine Translation","authors":"Pascale Fung","doi":"10.1201/9781420085938-c18","DOIUrl":"https://doi.org/10.1201/9781420085938-c18","url":null,"abstract":"","PeriodicalId":361311,"journal":{"name":"Handbook of Natural Language Processing","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117344149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Information Extraction","authors":"Jerry R. Hobbs","doi":"10.1201/9781420085938-c21","DOIUrl":"https://doi.org/10.1201/9781420085938-c21","url":null,"abstract":"Information Extraction (IE) techniques aim to extract the names of entities and objects from text and to identify the roles that they play in event descriptions. IE systems generally focus on a specific domain or topic, searching only for information that is relevant to a user's interests. In this chapter, we first give historical background on information extraction and discuss several kinds of information extraction tasks that have emerged in recent years. Next, we outline the series of steps that are involved in creating a typical information extraction system, which can be encoded as a cascaded finite-state transducer. Along the way, we present examples to illustrate what each step does. Finally, we present an overview of different learning-based methods for information extraction, including supervised learning approaches, weakly supervised and bootstrapping techniques, and discourse-oriented approaches. Information extraction (IE) is the process of scanning text for information relevant to some interest, including extracting entities, relations, and, most challenging, events, or who did what to whom when and where. It requires deeper analysis than keyword searches, but its aims fall short of the very hard and long-term problem of text understanding, where we seek to capture all the information in a text, along with the speaker's or writer's intention.","PeriodicalId":361311,"journal":{"name":"Handbook of Natural Language Processing","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116076630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}