{"title":"Maximum entropy based emotion classification of Chinese blog sentences","authors":"Cheng Wang, Changqin Quan, F. Ren","doi":"10.1109/NLPKE.2010.5587798","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587798","url":null,"abstract":"At present there are increasing studies on the classification of textual emotions. Especially with the rapid developments of Internet technology, classifying blog emotions has become a new research field. In this paper, we classified the sentence emotion using the machine learning method based on the maximum entropy model and the Chinese emotion corpus (Ren-CECps)*. Ren-CECps contains eight basic emotion categories (expect, joy, love, surprise, anxiety, sorrow, hate and anger), which presents us with the opportunity to systematically analyze the complex human emotions. Three features (keywords, POS and intensity) were considered for sentence emotion classification, and three aspect experiments have been carried out: 1) classification of any two emotions, 2) classification of eight emotions, and 3) classification of positive and negative emotions. The highest classification accuracies of the three aspect experiments were 90.62%, 35.66% and 73.96%, respectively.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126633942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph-based service quality evaluation through mining Web reviews","authors":"Suke Li, Jinmei Hao, Zhong Chen","doi":"10.1109/NLPKE.2010.5587817","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587817","url":null,"abstract":"This work tries to find a possible solution to the basic research problem: how to conduct service quality evaluation through mining Web reviews? To address this problem, this work proposes a novel approach to service quality evaluation which has two essential subtasks: 1) finding the most important service aspects, and 2) measuring service quality using ranked service aspects. We propose three graph-based ranking models to rank service aspects and a simple linear method of measuring service quality. Empirical experimental results show all our three methods outperform the approach of Noun Frequency. We also show the effectiveness of our service quality evaluation method by conducting intensive regression experiments.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"170 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116845825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An map based sentence ranking approach to automatic summarization","authors":"Xiaofeng Wu, Chengqing Zong","doi":"10.1109/NLPKE.2010.5587824","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587824","url":null,"abstract":"While the current main stream of automatic summarization is to extract sentences, that is, to use various machine learning methods to give each sentence of a document a score and get the highest sentences according to a ratio. This is quite similar to the current more and more active field —learning to rank. A few pair-wised learning to rank approaches have been tested for query summarization. In this paper we are the pioneers to use a new general summarization approach based on learning to rank approach, and adopt a list-wised optimizing object MAP to extract sentences from documents, which is a widely used evaluation measure in information retrieval (IR). Specifically, we use SVMMAP toolkit which can give global optimal solution to train and score each sentences. Our experiment results shows that our approach could outperform the stand-of-the-art pair-wised approach greatly by using the same features, and even slightly better then the reported best result which based on sequence labeling approach CRF.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129786677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimum edit distance-based text matching algorithm","authors":"Yu Zhao, Huixing Jiang, Xiaojie Wang","doi":"10.1109/NLPKE.2010.5587852","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587852","url":null,"abstract":"This paper proposes a measurement based on Minimum Edit Distance (MED) to the similarity between two sets of MultiWord Expressions (MWEs), which we use to calculate matching degree between two documents. We test the matching algorithm in the position searching system. Experiments show that the new measurement has higher performance than the cosine distance.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129939663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Passive and active contribution to multilingual lexical resources through online cultural activities","authors":"Mohammad Daoud, K. Kageura, C. Boitet, A. Kitamoto, Daoud M. Daoud","doi":"10.1109/NLPKE.2010.5587808","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587808","url":null,"abstract":"In this paper we are proposing a contribution scheme for multilingual (preterminology). Preterminology is a lexical resource of unconfirmed terminology. We are explaining the difficulties of building lexical resources collaboratively. And we suggest a scheme of active and passive contributions that will ease the process and satisfy the Contribution Factors. We experimented our passive and active approaches with the Digital Silk Road Project where we analyzed visitors' behavior to find interesting trends and terminology candidates. And we built a contribution gateway that interacts with the visitors in order to motivate them to contribute while they are doing their usual activities.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121296811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic classification of documents by formality","authors":"Fadi Abu Sheikha, D. Inkpen","doi":"10.1109/NLPKE.2010.5587767","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587767","url":null,"abstract":"This paper addresses the task of classifying documents into formal or informal style. We studied the main characteristics of each style in order to choose features that allowed us to train classifiers that can distinguish between the two styles. We built our data set by collecting documents for both styles, from different sources. We tested several classification algorithms, namely Decision Trees, Naïve Bayes, and Support Vector Machines, to choose the classifier that leads to the best classification results. We performed attribute selection in order to determine the contribution of each feature to our model.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126478324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on automatic recognition of Tibetan personal names based on multi-features","authors":"Yuan Sun, Xiaodong Yan, Xiaobing Zhao, Guosheng Yang","doi":"10.1109/NLPKE.2010.5587820","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587820","url":null,"abstract":"Tibetan name has strong religious and cultural connotations, and the construction is different from Chinese name. So far, there is limited research on Tibetan name recognition, and current method for recognition of Chinese names does not work on Tibetan names. Therefore, through the analysis of Tibetan names characteristics, this paper proposes an automatic recognition method of Tibetan name based on multi-features. This method uses the internal features of names, contextual features and boundary features of names, and establishes the dictionary and feature base of Tibetan names. Finally, an experiment is conducted, and the results prove the algorithm is effective.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"1148 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134353563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A lightweight Chinese semantic dependency parsing model based on sentence compression","authors":"Xin Wang, Weiwei Sun, Zhifang Sui","doi":"10.1109/NLPKE.2010.5587780","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587780","url":null,"abstract":"This paper is concerned with lightweight semantic dependency parsing for Chinese. We propose a novel sentence compression based model for semantic dependency parsing without using any syntactic dependency information. Our model divides semantic dependency parsing into two sequential sub-tasks: sentence compression and semantic dependency recognition. Sentence compression method is used to get backbone information of the sentence, conveying candidate heads of arguments to the next step. The bilexical semantic relations between words in the compressed sentence and predicates are then recognized in a pairwise way. We present encouraging results on the Chinese data set from CoNLL 2009 shared task. Without any syntactic information, our semantic dependency parsing model still outperforms the best reported system in the literature.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"185 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115594516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparative evaluation of two arabic speech corpora","authors":"Y. Alotaibi, A. Meftah","doi":"10.1109/NLPKE.2010.5587819","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587819","url":null,"abstract":"The aim of this paper is to conduct a constructive and comparative evaluation between two important Arabic corpora for two different Arabic dialects, namely, Saudi dialect corpus that was collected by King Abdulaziz City for Science and Technology (KACST), and a Levantine Arabic dialect corpus. Levantine dialect is spoken by ordinary Lebanese, Jordanian, Syrian, and Palestinian people. The later one was produced by the Linguistic Data Consortium (LDC). Advantages and disadvantages of these two corpora were presented and discussed. This discussion is aiming to help digital speech processing researchers to figure out the weakness and strength sides of these important corpora before considering them in their experiments. Moreover, this paper can motivate in designing, maintaining, distributing, and upgrading Arabic corpora to help Arabic language speech research communities.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115681821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Augmenting the automated extracted tree adjoining grammars by semantic representation","authors":"Heshaam Faili, A. Basirat","doi":"10.1109/NLPKE.2010.5587766","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587766","url":null,"abstract":"MICA [1] is a fast and accurate dependency parser for English that uses an automatically LTAG derived from Penn Treebank (PTB) using the Chen's approach [7]. However, there is no semantic representation related to its grammar. On the other hand, XTAG [20] grammar is a hand crafted LTAG that its elementary trees were enriched with the semantic representation by experts. The linguistic knowledge embedded in the XTAG grammar caused it to being used in wide variety of natural language applications. However, the current XTAG parser is not as fast and accurate as well as the MICA parser.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124822055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}