Proceedings of the 19th international conference on Computational linguistics - Latest Papers

An Agent-based Approach to Chinese Named Entity Recognition
Proceedings of the 19th international conference on Computational linguistics - Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072308
Shiren Ye, Tat-Seng Chua, Jimin Liu
Abstract: Chinese NE (Named Entity) recognition is a difficult problem because of the uncertainty in word segmentation and flexibility in language structure. This paper proposes the use of a rationality model in a multi-agent framework to tackle this problem. We employ a greedy strategy and use the NE rationality model to evaluate and detect all possible NEs in the text. We then treat the process of selecting the best possible NEs as a multi-agent negotiation problem. The resulting system is robust and is able to handle different types of NE effectively. Our test on the MET-2 test corpus indicates that our system is able to achieve high F1 values of above 92% on all NE types.
Citations: 24
Morphological Analysis of the Spontaneous Speech Corpus
Proceedings of the 19th international conference on Computational linguistics - Pub Date : 2002-08-24 DOI: 10.3115/1071884.1071903
Kiyotaka Uchimoto, Chikashi Nobata, Atsushi Yamada, S. Sekine, H. Isahara
Abstract: This paper describes a project tagging a spontaneous speech corpus with morphological information such as word segmentation and parts-of-speech. We use a morphological analysis system based on a maximum entropy model, which is independent of the domain of corpora. In this paper we show the tagging accuracy achieved by using the model and discuss problems in tagging the spontaneous speech corpus. We also show that a dictionary developed for a corpus on a certain domain is helpful for improving accuracy in analyzing a corpus on another domain.
Citations: 15
A Robust Cross-Style Bilingual Sentences Alignment Model
Proceedings of the 19th international conference on Computational linguistics - Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072237
T. Kueng, Keh-Yih Su
Abstract: Most current sentence alignment approaches adopt sentence length and cognates as the alignment features, and they are mostly trained and tested on documents of the same style. Since the length distribution, alignment-type distribution (used by length-based approaches) and cognate frequency vary significantly across texts with different styles, the length-based approaches fail to achieve similar performance when tested on corpora of different styles. The experiments show that the performance in F-measure could drop from 98.2% to 85.6% when a length-based approach is trained on a technical manual and then tested on a general magazine. Since a large percentage of content words in the source text would be translated into the corresponding translation duals to preserve the meaning in the target text, transfer lexicons are usually regarded as more reliable cues for aligning sentences when the alignment task is performed by humans. To enhance the robustness, a statistical model based on both transfer lexicons and sentence lengths is proposed in this paper. After integrating the transfer lexicons into the model, a 60% F-measure error reduction (from 14.4% to 5.8%) is observed.
Citations: 6
Meta-evaluation of Summaries in a Cross-lingual Environment using Content-based Metrics
Proceedings of the 19th international conference on Computational linguistics - Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072301
Horacio Saggion, Dragomir R. Radev, Simone Teufel, Wai Lam
Abstract: We describe a framework for the evaluation of summaries in English and Chinese using similarity measures. The framework can be used to evaluate extractive, non-extractive, single and multi-document summarization. We focus on the resources developed that are made available for the research community.
Citations: 64
Structure Alignment Using Bilingual Chunking
Proceedings of the 19th international conference on Computational linguistics - Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072238
Wei Wang, M. Zhou, Jin-Xia Huang, C. Huang
Abstract: A new statistical method called "bilingual chunking" for structure alignment is proposed. Unlike existing approaches, which align hierarchical structures like sub-trees, our method conducts alignment on chunks. The alignment is performed through a simultaneous bilingual chunking algorithm. Using the constraints of chunk correspondence between source language (SL) and target language (TL), our algorithm can dramatically reduce the search space, support a time-synchronous DP algorithm, and lead to highly consistent chunking. Furthermore, by unifying POS tagging and chunking in the search process, our algorithm effectively alleviates the influence of POS tagging deficiencies on the chunking result. The experimental results with English-Chinese structure alignment show that our model can achieve 90% precision for chunking and 87% precision for chunk alignment.
Citations: 22
Looking for Candidate Translational Equivalents in Specialized, Comparable Corpora
Proceedings of the 19th international conference on Computational linguistics - Pub Date : 2002-08-24 DOI: 10.3115/1071884.1071904
Yun-Chuang Chiao, Pierre Zweigenbaum
Abstract: Previous attempts at identifying translational equivalents in comparable corpora have dealt with very large 'general language' corpora and words. We address this task in a specialized domain, medicine, starting from smaller non-parallel, comparable corpora and an initial bilingual medical lexicon. We compare the distributional contexts of source and target words, testing several weighting factors and similarity measures. On a test set of frequently occurring words, for the best combination (the Jaccard similarity measure with or without tf.idf weighting), the correct translation is ranked first for 20% of our test words, and is found in the top 10 candidates for 50% of them. An additional reverse-translation filtering step improves the precision of the top candidate translation up to 74%, with a 33% recall.
Citations: 146
Natural Language and Inference in a Computer Game
Proceedings of the 19th international conference on Computational linguistics - Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072341
Malte Gabsdil, Alexander Koller, Kristina Striegnitz
Abstract: We present an engine for text adventures - computer games with which the player interacts using natural language. The system employs current methods from computational linguistics and an efficient inference system for description logic to make the interaction more natural. The inference system is especially useful in the linguistic modules dealing with reference resolution and generation and we show how we use it to rank different readings in the case of referential and syntactic ambiguities. It turns out that the player's utterances are naturally restricted in the game scenario, which simplifies the language processing task.
Citations: 12
Learning Verb Argument Structure from Minimally Annotated Corpora
Proceedings of the 19th international conference on Computational linguistics - Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072268
Anoop Sarkar, Woottiporn Tripasai
Abstract: In this paper we investigate the task of automatically identifying the correct argument structure for a set of verbs. The argument structure of a verb allows us to predict the relationship between the syntactic arguments of a verb and their role in the underlying lexical semantics of the verb. Following the method described in (Merlo and Stevenson, 2001), we exploit the distributions of some selected features from the local context of a verb. These features were extracted from a 23M word WSJ corpus based on part-of-speech tags and phrasal chunks alone. We constructed several decision tree classifiers trained on this data. The best performing classifier achieved an error rate of 33.4%. This work shows that a subcategorization frame (SF) learning algorithm previously applied to Czech (Sarkar and Zeman, 2000) is used to extract SFs in English. The extracted SFs are evaluated by classifying verbs into verb alternation classes.
Citations: 18
Syntactic Features for High Precision Word Sense Disambiguation
Proceedings of the 19th international conference on Computational linguistics - Pub Date : 2002-08-24 DOI: 10.3115/1072228.1072340
David Martínez, Eneko Agirre, Lluís Màrquez i Villodre
Abstract: This paper explores the contribution of a broad range of syntactic features to WSD: grammatical relations coded as the presence of adjuncts/arguments in isolation or as subcategorization frames, and instantiated grammatical relations between words. We have tested the performance of syntactic features using two different ML algorithms (Decision Lists and AdaBoost) on the Senseval-2 data. Adding syntactic features to a basic set of traditional features improves performance, especially for AdaBoost. In addition, several methods to build arbitrarily high accuracy WSD systems are also tried, showing that syntactic features allow for a precision of 86% and a coverage of 26%, or 95% precision and 8% coverage.
Citations: 38