Proceedings of the 20th international conference on Computational Linguistics - COLING '04: Latest Papers

Semantic Role Labeling Via Integer Linear Programming Inference
Vasin Punyakanok, D. Roth, Wen-tau Yih, Dav Zimak
{"title":"Semantic Role Labeling Via Integer Linear Programming Inference","authors":"Vasin Punyakanok, D. Roth, Wen-tau Yih, Dav Zimak","doi":"10.3115/1220355.1220552","DOIUrl":"https://doi.org/10.3115/1220355.1220552","url":null,"abstract":"We present a system for the semantic role labeling task. The system combines a machine learning technique with an inference procedure based on integer linear programming that supports the incorporation of linguistic and structural constraints into the decision process. The system is tested on the data provided in CoNLL-2004 shared task on semantic role labeling and achieves very competitive results.","PeriodicalId":330668,"journal":{"name":"Proceedings of the 20th international conference on Computational Linguistics - COLING '04","volume":"35 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115358454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 201
Generalizing Dimensionality in Combinatory Categorial Grammar
G. Kruijff, Jason Baldridge
{"title":"Generalizing Dimensionality in Combinatory Categorial Grammar","authors":"G. Kruijff, Jason Baldridge","doi":"10.3115/1220355.1220383","DOIUrl":"https://doi.org/10.3115/1220355.1220383","url":null,"abstract":"We extend Combinatory Categorial Grammar (CCG) with a generalized notion of multidimensional sign, inspired by the types of representations found in constraint-based frameworks like HPSG or LFG. The generalized sign allows multiple levels to share information, but only in a resource-bounded way through a very restricted indexation mechanism. This improves representational perspicuity without increasing parsing complexity, in contrast to full-blown unification used in HPSG and LFG. Well-formedness of a linguistic expression remains entirely determined by the CCG derivation. We show how the multidimensionality and perspicuity of the generalized signs lead to a simplification of previous CCG accounts of how word order and prosody can realize information structure.","PeriodicalId":330668,"journal":{"name":"Proceedings of the 20th international conference on Computational Linguistics - COLING '04","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116237366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Tagging with Hidden Markov Models Using Ambiguous Tags
Alexis Nasr, Frédéric Béchet, A. Volanschi
{"title":"Tagging with Hidden Markov Models Using Ambiguous Tags","authors":"Alexis Nasr, Frédéric Béchet, A. Volanschi","doi":"10.3115/1220355.1220437","DOIUrl":"https://doi.org/10.3115/1220355.1220437","url":null,"abstract":"Part of speech taggers based on Hidden Markov Models rely on a series of hypotheses which make certain errors inevitable. The idea developed in this paper consists in allowing a limited, controlled ambiguity in the output of the tagger in order to avoid a number of errors. The ambiguity takes the form of ambiguous tags which denote subsets of the tagset. These tags are used when the tagger hesitates between the different components of the ambiguous tags. They are introduced in an existing lexicon and 3-gram database. Their lexical and syntactic counts are computed on the basis of the lexical and syntactic counts of their constituents, using impurity functions. The tagging process itself, based on the Viterbi algorithm, is unchanged. Experiments conducted on the Brown corpus show a recall of 0.982, for an ambiguity rate of 1.233 which is to be compared with a baseline recall of 0.978 for an ambiguity rate of 1.414 using the same ambiguous tags and with a recall of 0.955 corresponding to the one best solution of standard tagging (without ambiguous tags).","PeriodicalId":330668,"journal":{"name":"Proceedings of the 20th international conference on Computational Linguistics - COLING '04","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129811858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Multilingual and cross-lingual news topic tracking
B. Pouliquen, R. Steinberger, C. Ignat, E. Käsper, Irina Temnikova
{"title":"Multilingual and cross-lingual news topic tracking","authors":"B. Pouliquen, R. Steinberger, C. Ignat, E. Käsper, Irina Temnikova","doi":"10.3115/1220355.1220493","DOIUrl":"https://doi.org/10.3115/1220355.1220493","url":null,"abstract":"We are presenting a working system for automated news analysis that ingests an average total of 7600 news articles per day in five languages. For each language, the system detects the major news stories of the day using a group-average unsupervised agglomerative clustering process. It also tracks, for each cluster, related groups of articles published over the previous seven days, using a cosine of weighted terms. The system furthermore tracks related news across languages, in all language pairs involved. The cross-lingual news cluster similarity is based on a linear combination of three types of input: (a) cognates, (b) automatically detected references to geographical place names and (c) the results of a mapping process onto a multilingual classification system. A manual evaluation showed that the system produces good results.","PeriodicalId":330668,"journal":{"name":"Proceedings of the 20th international conference on Computational Linguistics - COLING '04","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114869773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 69
Learning to Identify Single-Snippet Answers to Definition Questions
Spyridoula Miliaraki, Ion Androutsopoulos
{"title":"Learning to Identify Single-Snippet Answers to Definition Questions","authors":"Spyridoula Miliaraki, Ion Androutsopoulos","doi":"10.3115/1220355.1220554","DOIUrl":"https://doi.org/10.3115/1220355.1220554","url":null,"abstract":"We present a learning-based method to identify single-snippet answers to definition questions in question answering systems for document collections. Our method combines and extends two previous techniques that were based mostly on manually crafted lexical patterns and WordNet hypernyms. We train a Support Vector Machine (SVM) on vectors comprising the verdicts or attributes of the previous techniques, and additional phrasal attributes that we acquire automatically. The SVM is then used to identify and rank single 250-character snippets that contain answers to definition questions. Experimental results indicate that our method clearly outperforms the techniques it builds upon.","PeriodicalId":330668,"journal":{"name":"Proceedings of the 20th international conference on Computational Linguistics - COLING '04","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125114499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 47
Acquisition of Semantic Classes for Adjectives from Distributional Evidence
Gemma Boleda, Toni Badia, E. Batlle
{"title":"Acquisition of Semantic Classes for Adjectives from Distributional Evidence","authors":"Gemma Boleda, Toni Badia, E. Batlle","doi":"10.3115/1220355.1220516","DOIUrl":"https://doi.org/10.3115/1220355.1220516","url":null,"abstract":"In this paper, we present a clustering experiment directed at the acquisition of semantic classes for adjectives in Catalan, using only shallow distributional features. We define a broad-coverage classification for adjectives based on Ontological Semantics. We classify along two parameters (number of arguments and ontological kind of denotation), achieving reliable agreement results among human judges. The clustering procedure achieves a comparable agreement score for one of the parameters, and a little lower for the other.","PeriodicalId":330668,"journal":{"name":"Proceedings of the 20th international conference on Computational Linguistics - COLING '04","volume":"589 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123180940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
Analysis and Detection of Reading Miscues for Interactive Literacy Tutors
Katherine Lee, Andreas Hagen, Nicholas Romanyshyn, Sean Martin, B. Pellom
{"title":"Analysis and Detection of Reading Miscues for Interactive Literacy Tutors","authors":"Katherine Lee, Andreas Hagen, Nicholas Romanyshyn, Sean Martin, B. Pellom","doi":"10.3115/1220355.1220537","DOIUrl":"https://doi.org/10.3115/1220355.1220537","url":null,"abstract":"The Colorado Literacy Tutor (CLT) is a technology-based literacy program, designed on the basis of cognitive theory and scientifically motivated reading research, which aims to improve literacy and student achievement in public schools. One of the critical components of the CLT is a speech recognition system which is used to track the child's progress during oral reading and to provide sufficient information to detect reading miscues. In this paper, we extend prior work by examining a novel labeling of children's oral reading audio data in order to better understand the factors that contribute most significantly to speech recognition errors. While these events make up nearly 8% of the data, they are shown to account for approximately 30% of the word errors in a state-of-the-art speech recognizer. Next, we consider the problem of detecting miscues during oral reading. Using features derived from the speech recognizer, we demonstrate that 67% of reading miscues can be detected at a false alarm rate of 3%.","PeriodicalId":330668,"journal":{"name":"Proceedings of the 20th international conference on Computational Linguistics - COLING '04","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115346371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
Multi-level Bootstrapping For Extracting Parallel Sentences From a Quasi-Comparable Corpus
Pascale Fung, Percy Cheung
{"title":"Multi-level Bootstrapping For Extracting Parallel Sentences From a Quasi-Comparable Corpus","authors":"Pascale Fung, Percy Cheung","doi":"10.3115/1220355.1220506","DOIUrl":"https://doi.org/10.3115/1220355.1220506","url":null,"abstract":"We propose a completely unsupervised method for mining parallel sentences from quasi-comparable bilingual texts which have very different sizes, and which include both in-topic and off-topic documents. We discuss and analyze different bilingual corpora with various levels of comparability. We propose that while better document matching leads to better parallel sentence extraction, better sentence matching also leads to better document matching. Based on this, we use multi-level bootstrapping to improve the alignments between documents, sentences, and bilingual word pairs, iteratively. Our method is the first method that does not rely on any supervised training data, such as a sentence-aligned corpus, or temporal information, such as the publishing date of a news article. It is validated by experimental results that show a 23% improvement over a method without multilevel bootstrapping.","PeriodicalId":330668,"journal":{"name":"Proceedings of the 20th international conference on Computational Linguistics - COLING '04","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115307176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 86
Correcting Category Errors in Text Classification
Fumiyo Fukumoto, Yoshimi Suzuki
{"title":"Correcting Category Errors in Text Classification","authors":"Fumiyo Fukumoto, Yoshimi Suzuki","doi":"10.3115/1220355.1220480","DOIUrl":"https://doi.org/10.3115/1220355.1220480","url":null,"abstract":"We address the problem of dealing with category annotation errors, which deteriorate the overall performance of text classification. We use two techniques. The first is support vectors which are extracted from the training samples by a machine learning technique, Support Vector Machines (SVM). The second is a loss function which measures the degree of our disappointment in any differences between the true distribution over inputs and the learner's prediction. We apply it to the extracted support vectors, and correct annotation errors. Experimental results with the RWCP and the Reuters 1996 corpora show that our method achieves high precision in detecting and correcting annotation errors. Further, results on text classification show improved accuracy.","PeriodicalId":330668,"journal":{"name":"Proceedings of the 20th international conference on Computational Linguistics - COLING '04","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115312818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Computational Cognitive Linguistics
J. Feldman
{"title":"Computational Cognitive Linguistics","authors":"J. Feldman","doi":"10.3115/1220355.1220515","DOIUrl":"https://doi.org/10.3115/1220355.1220515","url":null,"abstract":"The talk will describe an ongoing project (modestly named the Neural Theory of Language) that is attempting to model language behavior in a way that is both neurally plausible and computationally practical. The cornerstone of the effort is a formalism called Embodied Construction Grammar (ECG). I will describe the formalism, a robust semantic parser based on it, and a variety of applications of moderate scale. These include a system for understanding the (probabilistic and metaphorical) implications of news stories, and the first cognitively plausible model of how children learn grammar.","PeriodicalId":330668,"journal":{"name":"Proceedings of the 20th international conference on Computational Linguistics - COLING '04","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131800513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8