NER@ACL: Latest Publications

Multilingual Resources for Entity Extraction
NER@ACL Pub Date: 2003-07-12 DOI: 10.3115/1119384.1119391
S. Strassel, A. Mitchell
Abstract: Progress in human language technology requires increasing amounts of data and annotation in a growing variety of languages. Research in Named Entity extraction is no exception. Linguistic Data Consortium is creating annotated corpora to support information extraction in English, Chinese, Arabic, and other languages for a variety of US Government-sponsored programs. This paper covers the scope of annotation and research tasks within these programs, describes some of the challenges of multilingual corpus development for entity extraction, and concludes with a description of the corpora developed to support this research.
Citations: 18
Chinese Named Entity Recognition Combining Statistical Model with Human Knowledge
NER@ACL Pub Date: 2003-07-12 DOI: 10.3115/1119384.1119393
Youzheng Wu, Jun Zhao, Bo Xu
Abstract: Named Entity Recognition is one of the key techniques in the fields of natural language processing, information retrieval, question answering and so on. Unfortunately, Chinese Named Entity Recognition (NER) is more difficult for the lack of capitalization information and the uncertainty in word segmentation. In this paper, we present a hybrid algorithm which can combine a class-based statistical model with various types of human knowledge very well. In order to avoid data sparseness problem, we employ a back-off model and [Abstract contained text which could not be captured.], a Chinese thesaurus, to smooth the parameters in the model. The F-measure of person names, location names, and organization names on the newswire test data for the 1999 IEER evaluation in Mandarin is 86.84%, 84.40% and 76.22% respectively.
Citations: 53
Learning Formulation and Transformation Rules for Multilingual Named Entities
NER@ACL Pub Date: 2003-07-12 DOI: 10.3115/1119384.1119385
Hsin-Hsi Chen, Changhua Yang, Ying Lin
Abstract: This paper investigates three multilingual named entity corpora, including named people, named locations and named organizations. Frequency-based approaches with and without dictionary are proposed to extract formulation rules of named entities for individual languages, and transformation rules for mapping among languages. We consider the issues of abbreviation and compound keyword at a distance. Keywords specify not only the types of named entities, but also tell out which parts of a named entity should be meaning-translated and which part should be phoneme-transliterated. An application of the results on cross language information retrieval is also shown.
Citations: 38
NE Recognition Without Training Data on a Language You Don't Speak
NER@ACL Pub Date: 2003-07-12 DOI: 10.3115/1119384.1119389
D. Maynard, V. Tablan, H. Cunningham
Abstract: In this paper we describe an experiment to adapt a named entity recognition system from English to Cebuano as part of the TIDES surprise language program. With 4 person-days of effort, and with no previous knowledge of which language would be involved, no knowledge of the language in question once it was announced, and no training data available, we adapted the ANNIE system for Cebuano and achieved an F-measure of 77.5%.
Citations: 27
Construction and Analysis of Japanese-English Broadcast News Corpus with Named Entity Tags
NER@ACL Pub Date: 2003-07-12 DOI: 10.3115/1119384.1119387
T. Kumano, H. Kashioka, Hideki Tanaka, T. Fukusima
Abstract: We are aiming to acquire named entity (NE) translation knowledge from nonparallel, content-aligned corpora, by utilizing NE extraction techniques. For this research, we are constructing a Japanese-English broadcast news corpus with NE tags. The tags represent not only NE class information but also coreference information within the same monolingual document and between corresponding Japanese-English document pairs. Analysis of about 1,100 annotated article pairs has shown that if NE occurrence information, such as classes, number of occurrence and occurrence order, is given for each language, it may provide a good clue for corresponding NEs across languages.
Citations: 5
Low-cost Named Entity Classification for Catalan: Exploiting Multilingual Resources and Unlabeled Data
NER@ACL Pub Date: 2003-07-12 DOI: 10.3115/1119384.1119388
Lluís Màrquez i Villodre, A. Gispert, X. Carreras, Lluís Padró
Abstract: This work studies Named Entity Classification (NEC) for Catalan without making use of large annotated resources of this language. Two views are explored and compared, namely exploiting solely the Catalan resources, and a direct training of bilingual classification models (Spanish and Catalan), given that a large collection of annotated examples is available for Spanish. The empirical results obtained on real data point out that multilingual models clearly outperform monolingual ones, and that the resulting Catalan NEC models are easier to improve by bootstrapping on unlabelled data.
Citations: 6
Automatic Extraction of Named Entity Translingual Equivalence Based on Multi-Feature Cost Minimization
NER@ACL Pub Date: 2003-07-12 DOI: 10.3115/1119384.1119386
Fei Huang, S. Vogel, A. Waibel
Abstract: Translingual equivalence refers to the relationship between expressions of the same meaning from different languages. Identifying translingual equivalence of named entities (NE) can significantly contribute to multilingual natural language processing, such as crosslingual information retrieval, crosslingual information extraction and statistical machine translation. In this paper we present an integrated approach to extract NE translingual equivalence from a parallel Chinese-English corpus. Starting from a bilingual corpus where NEs are automatically tagged for each language, NE pairs are aligned in order to minimize the overall multi-feature alignment cost. An NE transliteration model is presented and iteratively trained using named entity pairs extracted from a bilingual dictionary. The transliteration cost, combined with the named entity tagging cost and word-based translation cost, constitute the multi-feature alignment cost. These features are derived from several information sources using unsupervised and partly supervised methods. A greedy search algorithm is applied to minimize the alignment cost. Experiments show that the proposed approach extracts NE translingual equivalence with 81% F-score and improves the translation score from 7.68 to 7.74.
Citations: 71
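The greedy cost-minimizing alignment mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the three costs are combined, so the weighted sum, the equal weights, the toy cost values, and the names `alignment_cost`, `greedy_align`, and `TOY_COSTS` are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical per-feature costs for each candidate NE pair (made-up numbers).
TOY_COSTS = {
    ("北京", "Beijing"): {"transliteration": 0.2, "translation": 0.1, "tagging": 0.1},
    ("北京", "United Nations"): {"transliteration": 2.0, "translation": 1.5, "tagging": 0.5},
    ("联合国", "Beijing"): {"transliteration": 1.8, "translation": 1.6, "tagging": 0.5},
    ("联合国", "United Nations"): {"transliteration": 1.5, "translation": 0.2, "tagging": 0.1},
}

# Assumption: features are combined as a weighted sum with equal weights.
WEIGHTS = {"transliteration": 1.0, "translation": 1.0, "tagging": 1.0}


def alignment_cost(features, weights):
    """Weighted sum of the per-feature costs for one candidate NE pair."""
    return sum(weights[name] * value for name, value in features.items())


def greedy_align(source_nes, target_nes, cost_fn):
    """Repeatedly accept the cheapest pair whose two NEs are both still unaligned."""
    candidates = sorted(
        (cost_fn(s, t), s, t) for s, t in product(source_nes, target_nes)
    )
    used_source, used_target, pairs = set(), set(), []
    for cost, s, t in candidates:
        if s in used_source or t in used_target:
            continue  # one side is already aligned; skip this candidate
        pairs.append((s, t, cost))
        used_source.add(s)
        used_target.add(t)
    return pairs


def toy_cost(s, t):
    """Look up the toy feature costs and combine them."""
    return alignment_cost(TOY_COSTS[(s, t)], WEIGHTS)


pairs = greedy_align(["北京", "联合国"], ["Beijing", "United Nations"], toy_cost)
for s, t, cost in pairs:
    print(f"{s} <-> {t} (cost {cost:.2f})")
```

Greedy selection is fast but does not guarantee the globally cheapest set of pairs. An exact assignment solver would be the alternative if optimality mattered more than speed.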
Transliteration of Proper Names in Cross-Lingual Information Retrieval
NER@ACL Pub Date: 2003-07-12 DOI: 10.3115/1119384.1119392
Paola Virga, S. Khudanpur
Abstract: We address the problem of transliterating English names using Chinese orthography in support of cross-lingual speech and text processing applications. We demonstrate the application of statistical machine translation techniques to "translate" the phonemic representation of an English name, obtained by using an automatic text-to-speech system, to a sequence of initials and finals, commonly used sub-word units of pronunciation for Chinese. We then use another statistical translation model to map the initial/final sequence to Chinese characters. We also present an evaluation of this module in retrieval of Mandarin spoken documents from the TDT corpus using English text queries.
Citations: 192
Multi-Language Named-Entity Recognition System based on HMM
NER@ACL Pub Date: 2003-07-12 DOI: 10.3115/1119384.1119390
Kuniko Saito, M. Nagata
Abstract: We introduce a multi-language named-entity recognition system based on HMM. Japanese, Chinese, Korean and English versions have already been implemented. In principle, it can analyze any other language if we have training data of the target language. This system has a common analytical engine and it can handle any language simply by changing the lexical analysis rules and statistical language model. In this paper, we describe the architecture and accuracy of the named-entity system, and report preliminary experiments on automatic bilingual named-entity dictionary construction using the Japanese and English named-entity recognizer.
Citations: 18
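To show what HMM-based named-entity tagging looks like in general, here is a minimal sketch of Viterbi decoding over a toy model. The abstract above does not describe the system's decoding procedure, state set, or parameters. Using Viterbi (the standard HMM decoder), the three tags, and every probability below are assumptions chosen only for illustration.

```python
import math

# Hypothetical toy model: three tags and hand-picked probabilities.
STATES = ["O", "PER", "LOC"]
START_P = {"O": 0.6, "PER": 0.3, "LOC": 0.1}
TRANS_P = {
    "O": {"O": 0.6, "PER": 0.2, "LOC": 0.2},
    "PER": {"O": 0.8, "PER": 0.1, "LOC": 0.1},
    "LOC": {"O": 0.8, "PER": 0.1, "LOC": 0.1},
}
EMIT_P = {
    "O": {"visited": 0.9},
    "PER": {"Alice": 0.9},
    "LOC": {"Tokyo": 0.9},
}
FLOOR = 1e-6  # assumed probability for words a state has never emitted


def viterbi(observations, states, start_p, trans_p, emit_p, floor=FLOOR):
    """Return the most probable state sequence for the observations (log space)."""
    def emit(state, word):
        return math.log(emit_p[state].get(word, floor))

    # Scores for the first word: start probability times emission probability.
    best = [{s: math.log(start_p[s]) + emit(s, observations[0]) for s in states}]
    back = [{}]
    for word in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this position.
            prev, score = max(
                ((p, best[-1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            scores[s] = score + emit(s, word)
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    return path[::-1]


tags = viterbi(["Alice", "visited", "Tokyo"], STATES, START_P, TRANS_P, EMIT_P)
print(tags)
```

In this framing, moving to another language would mean swapping the tokenizer and the probability tables while the decoder stays the same, which is one way to read the abstract's "common analytical engine".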