2009 International Conference on Asian Language Processing: Latest Papers

Research on Chinese Text Summarization Algorithm Based on Statistics and Rules
2009 International Conference on Asian Language Processing Pub Date : 2009-12-07 DOI: 10.1109/IALP.2009.56
Faguo Zhou, Fan Zhang, Bingru Yang
{"title":"Research on Chinese Text Summarization Algorithm Based on Statistics and Rules","authors":"Faguo Zhou, Fan Zhang, Bingru Yang","doi":"10.1109/IALP.2009.56","DOIUrl":"https://doi.org/10.1109/IALP.2009.56","url":null,"abstract":"Text summarization is a meaningful part of the research of natural language document understanding, and it is an important branch of natural language processing. In this paper, on the basis of the research status quo of the researchers and experts both home and abroad, two text summarization algorithms are proposed. And one algorithm is rule-based, and the other is based on statistics.","PeriodicalId":156840,"journal":{"name":"2009 International Conference on Asian Language Processing","volume":"48 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123554337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Segmenting Long Sentence Pairs for Statistical Machine Translation
2009 International Conference on Asian Language Processing Pub Date : 2009-12-07 DOI: 10.1109/IALP.2009.20
Biping Meng, Shujian Huang, Xinyu Dai, Jiajun Chen
{"title":"Segmenting Long Sentence Pairs for Statistical Machine Translation","authors":"Biping Meng, Shujian Huang, Xinyu Dai, Jiajun Chen","doi":"10.1109/IALP.2009.20","DOIUrl":"https://doi.org/10.1109/IALP.2009.20","url":null,"abstract":"In phrase-based statistical machine translation, the knowledge about phrase translation and phrase reordering is learned from the bilingual corpora. However, words may be poorly aligned in long sentence pairs in practice, which will then do harm to the following steps of the translation, such as phrase extraction, etc. A possible solution to this problem is segmenting long sentence pairs into shorter ones. In this paper, we present an effective approach to segmenting sentences based on the modified IBM Translation Model 1. We find that by taking into account the semantics of some words, as well as the length ratio of source and target sentences, the segmentation result is largely improved. We also discuss the effect of length factor to the segmentation result. Experiments show that our approach can improve the BLEU score of a phrase-based translation system by about 0.5 points.","PeriodicalId":156840,"journal":{"name":"2009 International Conference on Asian Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129133891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Advances in Acoustic Modeling for Vietnamese LVCSR
2009 International Conference on Asian Language Processing Pub Date : 2009-12-07 DOI: 10.1109/IALP.2009.66
Tuan-Nam Nguyen, Q. Vu
{"title":"Advances in Acoustic Modeling for Vietnamese LVCSR","authors":"Tuan-Nam Nguyen, Q. Vu","doi":"10.1109/IALP.2009.66","DOIUrl":"https://doi.org/10.1109/IALP.2009.66","url":null,"abstract":"In this paper, we present our experiments on the selection of basic phonetic units for the Vietnamese large vocabulary continuous speech recognition (LVCSR). Two acoustic models were compared. The first model has just used vowels or monophthongs as phonemes [2] while the second one, which was proposed in this paper, has explored the use of diphthongs and triphthongs as phonemes as well. The two models were trained and evaluated on a Broadcast News corpus containing 27 hours of acoustic training data and 1 hour of acoustic testing data. Moreover, an 146M-word corpus collection of newspaper was employed for building the language models. Experimental results indicate significant improvements in both word accuracy rate and time-execution. With the second acoustic model, the word accuracy rates reach 86.06% on the best case and the execution time is faster than the real-time.","PeriodicalId":156840,"journal":{"name":"2009 International Conference on Asian Language Processing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122179905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Learning Method for Extraction of Partial Correspondence from Parallel Corpus
2009 International Conference on Asian Language Processing Pub Date : 2009-12-07 DOI: 10.1109/IALP.2009.69
Ryo Terashima, Hiroshi Echizen-ya, K. Araki
{"title":"Learning Method for Extraction of Partial Correspondence from Parallel Corpus","authors":"Ryo Terashima, Hiroshi Echizen-ya, K. Araki","doi":"10.1109/IALP.2009.69","DOIUrl":"https://doi.org/10.1109/IALP.2009.69","url":null,"abstract":"For machine translations using a parallel corpus, it is effective to extract partial correspondences: pairs of phrases of the source language(SL) and target language(TL) in bilingual sentences. However, it is difficult to extract the partial correspondences correctly and efficiently in the data sparse corpus. In this paper, we propose a new learning method that extracts the partial correspondences solely from the parallel corpus without any analytical tools. In the proposed method, the extraction rules are automatically acquired from bilingual sentences using bi-gram statistics in each language sentence and the similarity based on Dice coefficient between SL words and TL words. The acquired extraction rules possess information about the first parts(e.g., \"a\", \"the\") or the last parts in phrases. Moreover, the partial correspondences are extracted from the bilingual sentences using the extraction rules correctly and efficiently. Evaluation experiments indicated that our proposed method can improve the translation quality of the learning-type machine translation by correctly and efficiently extracting the partial correspondences in bilingual sentences.","PeriodicalId":156840,"journal":{"name":"2009 International Conference on Asian Language Processing","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121500459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
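The abstract above mentions a similarity based on the Dice coefficient between SL words and TL words. As a rough illustration only, the sketch below computes a sentence-level co-occurrence Dice score, 2 * count(s, t) / (count(s) + count(t)). The function name `dice_scores`, the use of sentence-level co-occurrence, and the toy English-French pairs are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter
from itertools import product


def dice_scores(sentence_pairs):
    """Dice coefficient for every (source word, target word) pair.

    Counts are sentence-level: a word is counted once per sentence it
    appears in, and a pair is counted once per sentence pair in which
    both words appear. This counting scheme is an assumption.
    """
    src_count = Counter()
    tgt_count = Counter()
    pair_count = Counter()
    for src, tgt in sentence_pairs:
        src_words, tgt_words = set(src), set(tgt)
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update(product(src_words, tgt_words))
    return {
        (s, t): 2.0 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in pair_count.items()
    }


# Toy parallel data (hypothetical, for illustration only).
pairs = [
    (["the", "cat"], ["le", "chat"]),
    (["the", "dog"], ["le", "chien"]),
]
scores = dice_scores(pairs)
```

Words that always occur together score 1.0, and pairs that co-occur only some of the time score lower, which is what makes the measure usable as a crude alignment cue on small corpora.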
Dialog-Act Recognition Using Discourse and Sentence Structure Information
2009 International Conference on Asian Language Processing Pub Date : 2009-12-07 DOI: 10.1109/IALP.2009.12
Keyan Zhou, Chengqing Zong
{"title":"Dialog-Act Recognition Using Discourse and Sentence Structure Information","authors":"Keyan Zhou, Chengqing Zong","doi":"10.1109/IALP.2009.12","DOIUrl":"https://doi.org/10.1109/IALP.2009.12","url":null,"abstract":"Automatic recognition of Dialog-act (DA) is one of the most important processes in understanding spontaneous dialog. Most existing studies have been working on how to use various classifying methods in DA recognition; meanwhile, less attention has been paid to feature selection specifically. This paper introduces several textual features for DA recognizing, and proposes a novel usage for sentence structure features. Especially, this paper investigates the effect of discourse structure features in DA recognition, which are little studied before. The experimental results on both Chinese corpus and English Corpus show the selected features and feature combination rules significantly improve the overall performance. The accuracy of DA recognition rises from 77.05% to 88.21% on Chinese corpus, and from 59.08% to 64.92% as well on English corpus.","PeriodicalId":156840,"journal":{"name":"2009 International Conference on Asian Language Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117013382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Information Focus Synthesis Based on Question Answer Chain
2009 International Conference on Asian Language Processing Pub Date : 2009-12-07 DOI: 10.1109/IALP.2009.16
Jing Wan, Han Ren
{"title":"Information Focus Synthesis Based on Question Answer Chain","authors":"Jing Wan, Han Ren","doi":"10.1109/IALP.2009.16","DOIUrl":"https://doi.org/10.1109/IALP.2009.16","url":null,"abstract":"While speech synthesis technologies have come a long way in recent ten years, there is still room for improvement. This paper describes a technique called based on joint information structure, syntax and prosody method, which demonstrates noticeable improvements to existing speech synthesis system. As an important parameter for prosody proceedings in mandarin, information focus prosodic distribution features are typical for hearing natural, speech understanding and information acquisition. Because of the complex mapping relation between information structure, syntax and prosody, we present an efficient method for retrieval information focus to augment a naturalness speech synthesis. We use question answering chain to extract information focus and discover them how to move. Then, we adopt feature classification and prosody predictive modeling to deal with focus’s F0 and time period and obtain them features module. Based on the features module, should significantly increase the accuracy and naturalness of speech synthesis. The rest of this paper is organized as follows. Section 2 summarizes the previously proposed theory for information focus extraction, and derives a new method. Experiments are expressed in Section 3. And experimental results are shown in Section 4. Concluding remarks are presented in the final section.","PeriodicalId":156840,"journal":{"name":"2009 International Conference on Asian Language Processing","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121136159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Study on Semantic Role Labeling of Korean Sentence
2009 International Conference on Asian Language Processing Pub Date : 2009-12-07 DOI: 10.1109/IALP.2009.30
Yude Bi, Jing Chen
{"title":"A Study on Semantic Role Labeling of Korean Sentence","authors":"Yude Bi, Jing Chen","doi":"10.1109/IALP.2009.30","DOIUrl":"https://doi.org/10.1109/IALP.2009.30","url":null,"abstract":"The study of semantic role labeling is a hotspot in the field of Natural Language Processing. This paper, together with rationalism and empiricism, with the principle of pragmatism, from the perspective of semantic information processing, poses an approach to label the semantic role of Korean. The approach theoretically based on the level-framework of Korean verbs’ syntax and semantic, assisted with feature vector-based approach, combined with the classification marked database of category and concept, is used to test marked corpus for semantic role labeling study.","PeriodicalId":156840,"journal":{"name":"2009 International Conference on Asian Language Processing","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116836691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Chinese Named Entity Recognition Using a Morpheme-Based Chunking Tagger
2009 International Conference on Asian Language Processing Pub Date : 2009-12-07 DOI: 10.1109/IALP.2009.68
G. Fu
{"title":"Chinese Named Entity Recognition Using a Morpheme-Based Chunking Tagger","authors":"G. Fu","doi":"10.1109/IALP.2009.68","DOIUrl":"https://doi.org/10.1109/IALP.2009.68","url":null,"abstract":"Most previous studies formalize Chinese named entity recognition (NER) as a chunking task with either characters or lexicon words as the basic tokens for chunking. However, it is difficult under this formulation to explore lexical information for NER. Furthermore, traditional NER chunking systems usually employ an exhaustive strategy for entity candidate generation, obviously resulting in efficiency loss during entity decoding. In this paper we propose a morpheme-based chunking framework for Chinese NER and implement an efficient three-stage tagger using the pipeline strategy. To tackle the problem of out-of-vocabulary words and to more effectively explore lexical cues for NER as well, we distinguish named entities from common words and choose morphemes as the basic tokens for entity chunking. To reduce the space of entity candidates and improve the efficiency of entity decoding, we employ internal entity formation pattern rules during entity candidate generation. Our experiments on different datasets show that our system can greatly improve NER efficiency without much degradation of performance.","PeriodicalId":156840,"journal":{"name":"2009 International Conference on Asian Language Processing","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129483955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
An Experimental Study of Vietnamese Question Answering System
2009 International Conference on Asian Language Processing Pub Date : 2009-12-07 DOI: 10.1109/IALP.2009.39
V. Tran, V. Nguyen, Oanh T. K. Tran, Uyen Thu Thi Pham, Quang-Thuy Ha
{"title":"An Experimental Study of Vietnamese Question Answering System","authors":"V. Tran, V. Nguyen, Oanh T. K. Tran, Uyen Thu Thi Pham, Quang-Thuy Ha","doi":"10.1109/IALP.2009.39","DOIUrl":"https://doi.org/10.1109/IALP.2009.39","url":null,"abstract":"The development of World Wide Web calls for how to efficiently exploit the information. Mostly, current search engines return a set of related documents which contain keywords. However, users expect the exact and concrete answer for each question. Therefore, it is necessary to build an automatic question answering system (QA). In this paper, we focus on building a QA for Vietnamese. This task especially becomes more and more difficult because of the lack of available tools for processing Vietnamese text. Based on previous research for English, this paper proposed an implementation for Vietnamese question answering system by combining SnowBall system [1] and semantic relation extraction using search engines [4]. The experimental results on travelling domain proved that this proposed method is sufficient for Vietnamese question answering system. We achieved 89.7% precision and 91.4% ability to give the answers when testing on travelling domain","PeriodicalId":156840,"journal":{"name":"2009 International Conference on Asian Language Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129630495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Stability vs. Effectiveness: Improved Sentence-Level Combination of Machine Translation Based on Weighted MBR
2009 International Conference on Asian Language Processing Pub Date : 2009-12-07 DOI: 10.1109/IALP.2009.17
Bo Wang, T. Zhao, Muyun Yang, Hongfei Jiang, Sheng Li
{"title":"Stability vs. Effectiveness: Improved Sentence-Level Combination of Machine Translation Based on Weighted MBR","authors":"Bo Wang, T. Zhao, Muyun Yang, Hongfei Jiang, Sheng Li","doi":"10.1109/IALP.2009.17","DOIUrl":"https://doi.org/10.1109/IALP.2009.17","url":null,"abstract":"We describe an improved strategy to combine the outputs of machine translation on sentence-level balancing the stability and the effectiveness of the combination. The new method alternates the classical MBR-based sentence-level combination with weighted Minimum Bayes Risk (wMBR). During the calculation of the risk, we weight the hypotheses with the performance of the MT system, which is measured by the automatic evaluation metrics on the development data. In experiments, the wMBR-based method stably achieve better results than other sentence-level methods and get the best position in CWMT08 evaluation track outperforming the other word-level and sentence-level combination systems.","PeriodicalId":156840,"journal":{"name":"2009 International Conference on Asian Language Processing","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127476196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
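The abstract above describes weighting hypotheses by the performance of the MT system that produced them when computing Minimum Bayes Risk. A minimal sketch of that general idea follows; it is not the authors' implementation. The names `weighted_mbr` and `unigram_loss` are hypothetical, the weights stand in for per-system scores on development data, and the word-set overlap loss is a simple substitute for a real metric-based loss such as 1 - BLEU.

```python
def weighted_mbr(hypotheses, weights, loss):
    """Return the index of the hypothesis with the lowest weighted risk.

    hypotheses: one candidate translation per MT system.
    weights: one non-negative score per system (assumed to reflect its
             quality on development data); normalized here to sum to 1.
    loss: function (hyp, other) -> float, lower means more similar.
    """
    total = sum(weights)
    best_index, best_risk = None, float("inf")
    for i, hyp in enumerate(hypotheses):
        # Expected loss of choosing `hyp`, with each competing hypothesis
        # weighted by the normalized score of its originating system.
        risk = sum(
            (w / total) * loss(hyp, other)
            for other, w in zip(hypotheses, weights)
        )
        if risk < best_risk:
            best_index, best_risk = i, risk
    return best_index


def unigram_loss(a, b):
    """1 minus Jaccard overlap of word sets (stand-in for 1 - BLEU)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


# Toy system outputs and system weights (hypothetical values).
hyps = ["the cat sat", "the cat sat down", "a dog ran"]
weights = [0.3, 0.5, 0.2]
best = weighted_mbr(hyps, weights, unigram_loss)
```

With uniform weights this reduces to ordinary sentence-level MBR selection; non-uniform weights pull the choice toward hypotheses that agree with the stronger systems.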