Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010): Latest Publications

A method for generating document summary using field association knowledge and subjectively information
Abdunabi Ubul, E. Atlam, K. Morita, M. Fuketa, J. Aoe
DOI: 10.1109/NLPKE.2010.5587853
Abstract: In recent years, with the expansion of the Internet, there has been tremendous growth in the volume of electronic text documents available on the Web, making it difficult for users to locate needed information efficiently. To facilitate efficient searching, research on summarizing the general outline of a text document is essential. Moreover, as information from bulletin boards, blogs, and other sources is used as consumer-generated media data, text summarization has become necessary. This paper presents a new method for document summarization that uses three kinds of attribute information: fields, associated terms, and attribute grammars, establishing a formal and efficient generation technique. Experiments on information from 400 blogs show summary accuracy, readability, and meaning integrity of 87.5%, 85%, and 86%, respectively.
Citations: 1
Multi-Document summarization based on improved features and clustering
Ying Xiong, Hongyan Liu, Lei Li
DOI: 10.1109/NLPKE.2010.5587834
Abstract: Multi-document summarization is an emerging technique for understanding the main purpose of many documents on the same topic. This paper proposes a new feature selection method to improve summarization results. When calculating similarity, we use a modified TFIDF formula that achieves better results, and we adopt two ways of extracting keywords precisely. Experimental results demonstrate that the improved method performs better than the traditional one.
Citations: 5
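The paper's modified TFIDF formula is not given in the abstract, so as a baseline the standard TFIDF-plus-cosine similarity used in clustering-based summarizers can be sketched; the tokenization and toy corpus are illustrative assumptions:

```python
# Minimal sketch of TFIDF sentence/document similarity, the building
# block for clustering in multi-document summarization. Standard TFIDF
# is used here as a stand-in for the paper's modified formula.
import math
from collections import Counter

def tfidf_vectors(docs):
    """One sparse TFIDF vector (term -> weight dict) per document."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

docs = ["apple banana apple", "banana cherry", "apple banana cherry"]
vecs = tfidf_vectors(docs)
print(round(cosine(vecs[0], vecs[2]), 3))  # → 0.707
```

Note that "banana" appears in every document, so its IDF weight is zero and it contributes nothing to similarity, which is exactly the behavior that makes TFIDF useful for clustering near-duplicate news documents.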
Designing effective web mining-based techniques for OOV translation
Haitao Yu, F. Ren, Degen Huang, Lishuang Li
DOI: 10.1109/NLPKE.2010.5587807
Abstract: Owing to the limited coverage of existing bilingual dictionaries, it is often difficult to translate out-of-vocabulary (OOV) terms in many natural language processing tasks. This paper proposes a general three-step cascade mining technique that leverages the OOV category to optimize the effectiveness of each step: a category-based expansion policy retrieves more relevant mixed-language documents, a category-based hybrid extraction approach performs robust extraction, and a more flexible category-based model combination is applied. Experiments evaluating each step and the overall mining technique show significant performance improvement over existing methods.
Citations: 1
Realization of a high performance bilingual OCR system for Thai-English printed documents
S. Tangwongsan, Buntida Suvacharakulton
DOI: 10.1109/NLPKE.2010.5587781
Abstract: This paper presents a high-performance bilingual OCR system for printed Thai and English text. Given the complex nature of both languages, the first stage identifies the language within different zones using geometric properties. The second stage is character recognition, comprising a feature extractor and a classifier: in feature extraction, the thinned character image is analyzed and categorized into groups; the classifier then recognizes in two steps, a coarse level followed by a fine level guided by decision trees. To obtain an even better result, the final stage applies dictionary look-up to improve overall accuracy. For verification, the system was tested in a series of experiments on 141 printed pages containing over 280,000 characters; it achieved average accuracies of 100% on Thai monolingual, 98.18% on English monolingual, and 99.85% on bilingual documents. With dictionary look-up in the final stage, accuracy on bilingual documents improved to 99.98%, as expected.
Citations: 2
Automatic filtration of multiword units
Y. Liu, Zheng Tie
DOI: 10.1109/NLPKE.2010.5587783
Abstract: This paper studies how to filter multiword units. Normalized expectation (NE) is used to extract multiword unit candidates from a patent corpus; the candidates are then filtered using stop words, frequency, first stop words, last stop words, and contextual entropy. Experimental results show that the precision of the extracted multiword units improves by 8.7% after filtering.
Citations: 0
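The contextual-entropy filter named in the abstract rests on a simple idea: a genuine multiword unit occurs in many different contexts (high entropy of its neighboring words), whereas a spurious candidate is glued to one fixed context (entropy near zero). A minimal sketch, with a toy corpus as an illustrative assumption:

```python
# Sketch of contextual-entropy filtering for multiword unit candidates.
import math
from collections import Counter

def contextual_entropy(neighbors):
    """Shannon entropy (bits) of a Counter of adjacent words."""
    total = sum(neighbors.values())
    return -sum((c / total) * math.log2(c / total)
                for c in neighbors.values())

def right_entropy(tokens, candidate):
    """Entropy of the words immediately following the candidate n-gram."""
    n = len(candidate)
    followers = Counter(
        tokens[i + n]
        for i in range(len(tokens) - n)
        if tuple(tokens[i:i + n]) == candidate
    )
    return contextual_entropy(followers)

corpus = ("machine translation system . machine translation quality . "
          "machine translation model . of course not .").split()
# "machine translation" is followed by three different words (high
# entropy); "of course" by only one (zero entropy) in this toy corpus.
print(right_entropy(corpus, ("machine", "translation")))
print(right_entropy(corpus, ("of", "course")))
```

A filter would keep candidates whose left and right entropies both exceed a threshold; the threshold itself would be tuned on held-out data.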
Document expansion using relevant web documents for spoken document retrieval
Ryo Masumura, A. Ito, Yu Uno, Masashi Ito, S. Makino
DOI: 10.1109/NLPKE.2010.5587854
Abstract: Recently, automatic indexing of spoken documents using a speech recognizer has attracted attention. However, index generation from automatic transcriptions is problematic because they contain many recognition errors and out-of-vocabulary (OOV) words. To solve this problem, we propose a document expansion method using Web documents. To recover important keywords that are present in the spoken document but lost through recognition errors, we acquire Web documents relevant to the spoken document; an index of the spoken document is then generated by combining an index built from the automatic transcription with one built from the Web documents. We propose a method for retrieving relevant documents, and experiments show that the retrieved Web documents contain many OOV words. We also propose a method for combining the recognized index and the Web index; the resulting document-expansion index was closer to an index built from the manual transcription than the index generated by the conventional method. Finally, in a spoken document retrieval experiment, the document-expansion-based index gave better retrieval precision than the conventional indexing method.
Citations: 3
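The combination step, merging the ASR-transcription index with the Web-document index so that OOV terms lost by the recognizer re-enter the index, can be sketched as a weighted merge of term-weight dictionaries. The linear interpolation and its weight are illustrative assumptions, not the paper's actual combination formula:

```python
# Hypothetical sketch of index combination for document expansion:
# linearly interpolate two {term: weight} indexes so that terms present
# only in the Web index (e.g. OOV words) still receive nonzero weight.

def combine_indexes(asr_index, web_index, alpha=0.7):
    """Linear interpolation of two sparse term-weight indexes."""
    terms = set(asr_index) | set(web_index)
    return {t: alpha * asr_index.get(t, 0.0)
               + (1 - alpha) * web_index.get(t, 0.0)
            for t in terms}

asr = {"speech": 0.9, "recognizer": 0.4}   # OOV word "prosody" missing
web = {"speech": 0.8, "prosody": 0.6}      # recovered from the Web
combined = combine_indexes(asr, web)
print(sorted(combined))
```

Here "prosody", absent from the recognizer's output, enters the combined index with weight (1 - alpha) * 0.6, which is the whole point of the expansion.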
Needs and challenges of care robots in nursing care setting: A literature review
Yuko Nagai, T. Tanioka, Shoko Fuji, Yuko Yasuhara, Sakiko Sakamaki, Narimi Taoka, R. Locsin, Fuji Ren, Kazuyuki Matsumoto
DOI: 10.1109/NLPKE.2010.5587815
Abstract: This study aims to identify the needs and challenges of care robots in nursing care settings through an extensive literature search. The results show a shortage of information about the outcomes of introducing care robots, the needs of recipients and care providers, and the relevant ethical problems. To advance this research and introduce care robots into practice, much remains to be done: applying natural language processing technology in collaboration with researchers in robotics, carrying out investigations, extracting needs, clarifying ethical problems and seeking solutions, and conducting on-site experimental studies.
Citations: 3
A new cascade algorithm based on CRFs for recognizing Chinese verb-object collocation
Guiping Zhang, Zhichao Liu, Qiaoli Zhou, Dongfeng Cai, Jiao Cheng
DOI: 10.1109/NLPKE.2010.5587828
Abstract: This paper proposes a new cascade algorithm based on conditional random fields, applied to the automatic recognition of Chinese verb-object collocations and combined with a new "ONIY" sequence labeling scheme. Experiments compare recognition results under two segmentation and part-of-speech tag sets. The comprehensive results show a best F-score of 90.65% on the Tsinghua Treebank and 82.00% under the Peking University segmentation and part-of-speech tagging scheme. The experiments also show that the proposed algorithm greatly improves recognition accuracy for multi-nested collocations and has a positive effect on long-distance collocations.
Citations: 0
Negation disambiguation using the maximum entropy model
Chunliang Zhang, Xiaoxu Fei, Jingbo Zhu
DOI: 10.1109/NLPKE.2010.5587857
Abstract: Handling negation is of great significance for sentiment analysis. Most previous studies adopted a simple heuristic rule for sentiment negation disambiguation within a fixed context window. This paper presents a supervised method to disambiguate which sentiment word is attached to a negator such as "not" in an opinionated sentence. Experimental results show that the method achieves better performance than traditional methods.
Citations: 0
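The fixed-window heuristic the abstract cites as the traditional approach, attaching the negator to the nearest sentiment word inside the window, is easy to sketch, along with the kind of (negator, candidate) features a maximum entropy classifier would score instead. The sentiment lexicon, window size, and feature set are illustrative assumptions:

```python
# Sketch of the fixed-window baseline for negation attachment, plus a
# feature extractor of the sort a supervised (maxent-style) model
# would use to score (negator, candidate) pairs.

SENTIMENT = {"good", "bad", "cheap", "expensive"}  # toy lexicon

def nearest_sentiment(tokens, neg_idx, window=3):
    """Heuristic baseline: nearest sentiment word within the window."""
    best, best_dist = None, window + 1
    lo = max(0, neg_idx - window)
    hi = min(len(tokens), neg_idx + window + 1)
    for i in range(lo, hi):
        if tokens[i] in SENTIMENT and abs(i - neg_idx) < best_dist:
            best, best_dist = tokens[i], abs(i - neg_idx)
    return best

def pair_features(tokens, neg_idx, cand_idx):
    """Features for one (negator, candidate) pair."""
    return {
        "distance": abs(cand_idx - neg_idx),
        "cand_after_neg": cand_idx > neg_idx,
        "cand_word": tokens[cand_idx],
    }

sent = "the phone is not bad but expensive".split()
print(nearest_sentiment(sent, sent.index("not")))  # → bad
```

The baseline correctly attaches "not" to "bad" here, but a classifier over features like these can also handle sentences where the nearest sentiment word is not the attached one.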
Distributed training for Conditional Random Fields
Xiaojun Lin, Liang Zhao, Dianhai Yu, Xihong Wu
DOI: 10.1109/NLPKE.2010.5587803
Abstract: This paper proposes a novel distributed training method for conditional random fields (CRFs) on clusters built from commodity computers. The method uses the Message Passing Interface (MPI) to handle large-scale data in two steps. First, the entire training set is divided into several small pieces, each of which can be handled by one node. Second, instead of having a root node collect all features, a new criterion splits the whole feature set into non-overlapping subsets, ensuring that each node maintains the global information for one feature subset. Experiments on large-scale Chinese word segmentation show significant reductions in both training time and memory while preserving performance.
Citations: 5
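The second step, partitioning the feature set so each node owns a disjoint subset and aggregates global statistics only for that subset, can be sketched without MPI by simulating what each rank would do. The hash-based ownership rule and the toy feature counts are illustrative assumptions, not the paper's partitioning criterion:

```python
# Pure-Python simulation of non-overlapping feature partitioning for
# distributed CRF training: data shards produce local feature counts,
# and each "node" reduces the global count for only the features it
# owns (ownership assigned here by a stable hash of the feature name).
from collections import Counter

def owner(feature, n_nodes):
    """Deterministically map a feature to its owning node."""
    return sum(feature.encode()) % n_nodes  # stable, unlike hash()

def partition_and_reduce(shard_counts, n_nodes):
    """Each node sums global counts for only the features it owns."""
    per_node = [Counter() for _ in range(n_nodes)]
    for counts in shard_counts:             # one Counter per data shard
        for feat, c in counts.items():
            per_node[owner(feat, n_nodes)][feat] += c
    return per_node

shards = [Counter({"U01:的": 3, "B": 1}),
          Counter({"U01:的": 2, "U02:了": 4})]
nodes = partition_and_reduce(shards, 2)
print([sorted(n) for n in nodes])
```

No node ever materializes the full feature set, which is where the memory saving comes from; in the real system the per-shard counts would be exchanged with MPI collectives rather than a Python loop.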