M. Jinnai, Y. Akashi, S. Nakaya, F. Ren, M. Fukumi
{"title":"Recognition of abnormal vibrational responses of signposts using the Two-dimensional Geometric Distance and Wilcoxon test","authors":"M. Jinnai, Y. Akashi, S. Nakaya, F. Ren, M. Fukumi","doi":"10.1109/NLPKE.2010.5587837","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587837","url":null,"abstract":"In expressway companies, workers have been impacting signposts using wooden hammers and estimating the degree of the corrosion by listening to the sound. In order to automate this, we have been developing software that recognizes an abnormal impact vibrational response due to corrosion. This software extracts sonograms from impact vibrational waves using the LPC spectrum analysis, and matches images of the sonogram between a standard and an input impact vibrations using the Two-dimensional Geometric Distance. Then, the software distinguishes the abnormality of the input impact vibration using Wilcoxon rank-sum test. We have measured the impact vibrations of five normal signposts and five abnormal signposts, and carried out the automatic recognition experiments. As a result, the software has recognized correctly in all cases. We have verified the effectiveness of the proposed method.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"191 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116781424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Marine literature categorization based on minimizing the labelled data","authors":"Wei Zhang, Qiuhong Wang, Yeheng Deng, R. Du","doi":"10.1109/NLPKE.2010.5587847","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587847","url":null,"abstract":"In marine literature categorization, supervised machine learning method will take a lot of time for labelling the samples by hand. So we utilize Co-training method to decrease the quantities of labelled samples needed for training the classifier. In this paper, we only select features from the text details and add attribute labels to them. It can greatly boost the efficiency of text processing. For building up two views, we split features into two parts, each of which can form an independent view. One view is made up of the feature set of abstract, and the other is made up of the feature sets of title, keywords, creator and department. In experiments, the F1 value and error rate of the categorization system could reach about 0.863 and 14.26%.They are close to the performance of supervised classifier (0.902 and 9.13%), which is trained by more than 1500 labelled samples, however, the labelled samples used by Co-training categorization method to train the original classifier are only one positive sample and one negative sample. In addition we consider joining the idea of the active-learning in Co-training method.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115182144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Anusaaraka: An expert system based machine translation system","authors":"Sriram Chaudhury, A. Rao, D. Sharma","doi":"10.1109/NLPKE.2010.5587789","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587789","url":null,"abstract":"Most research in Machine translation is about having the computers completely bear the load of translating one human language into another. This paper looks at the machine translation problem afresh and observes that there is a need to share the load between man and machine, distinguish reliable knowledge from the heuristics, provide a spectrum of outputs to serve different strata of people, and finally make use of existing resources instead of reinventing the wheel. This paper describes a unique approach to develop machine translation system based on the insights of information dynamics from Paninian Grammar Formalism. Anusaaraka is a Language Accessor cum Machine Translation system based on the fundamental premise of sharing the load producing good enough results according to the needs of the reader. The system promises to give faithful representation of the translated text, no loss of information while translating and graceful degradation (robustness) in case of failure. The layered output provides an access to all the stages of translation making the whole process transparent. Thus, Anusaaraka differs from the Machine Translation systems in two respects: (1) its commitment to faithfulness and thereby providing a layer of 100% faithful output so that a user with some training can “access the source text” faithfully. (2) The system is so designed that a user can contribute to it and participate in improving its quality. Further Anusaaraka provides an eclectic combination of the Apertium architecture with the forward chaining expert system, allowing use of both the deep parser and shallow parser outputs to analyze the SL text. Existing language resources (parsers, taggers, chunkers) available under GPL are used instead of rewriting it again. Language data and linguistic rules are independent from the core programme, making it easy for linguists to modify and experiment with different language phenomena to improve the system. Users can become contributors by contributing new word sense disambiguation (WSD) rules of the ambiguous words through a web-interface available over internet. The system uses forward chaining of expert system to infer new language facts from the existing language data. It helps to solve the complex behavior of language translation by applying specific knowledge rather than specific technique creating a vast language knowledge base in electronic form. Or in other words, the expert system facilitates the transformation of subject matter expert's (SME) knowledge available with humans into a computer processable knowledge base.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129896254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extracting opinion sentence by combination of SVM and syntactic templates","authors":"Bo Zhang, Yanquan Zhou, Yu Mao","doi":"10.1109/NLPKE.2010.5587835","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587835","url":null,"abstract":"This paper presents a combined method of syntactic structure, dependency relation and SVM classifier to extract opinion sentences. At first, we use the syntactic structure templates with high confidence summarized artificially and the dependency relation templates with high precision obtained by a dependency relation extraction algorithm to tag sentences as opinion sentence. Then we input the remaining test data to a trained SVM classifier which is created by a rigorous process of feature selection. Eventually the combined method performed well, achieving 92.6% recall with 85.5% precision.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"29 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128669541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving emotion recognition from text with fractionation training","authors":"Ye Wu, F. Ren","doi":"10.1109/NLPKE.2010.5587800","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587800","url":null,"abstract":"Previous approaches of emotion recognition from text were mostly implemented under keyword-based or learning-based frameworks. However, keyword-based systems are unable to recognize emotion from text with no emotional keywords, and constructing an emotion lexicon is a tough work because of ambiguity in defining all emotional keywords. Completely prior-knowledge-free supervised machine learning methods for emotion recognition also do not perform as well as on some traditional tasks. In this paper, a fractionation training approach is proposed, utilizing the emotion lexicon extracted from an annotated blog emotion corpus to train SVM classifiers. Experimental results show the effectiveness of the proposed approach, and the use of some other experimental design also improves the classification accuracy.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128574637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection of users suspected of using multiple user accounts and manipulating evaluations in a community site","authors":"Naoki Ishikawa, Kenji Umemoto, Yasuhiko Watanabe, Yoshihiro Okada, Ryo Nishimura, M. Murata","doi":"10.1109/NLPKE.2010.5587765","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587765","url":null,"abstract":"Some users in a community site abuse the anonymity and attempt to manipulate communications in a community site. These users and their submissions discourage other users, keep them from retrieving good communication records, and decrease the credibility of the communication site. To solve this problem, we conducted an experimental study to detect users suspected of using multiple user accounts and manipulating evaluations in a community site. In this study, we used messages in the data of Yahoo! chiebukuro for data training and examination.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"243 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129773764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shiqi Li, T. Zhao, Hanjing Li, Shui Liu, Pengyuan Liu
{"title":"Using cognitive model to automatically analyze Chinese predicate","authors":"Shiqi Li, T. Zhao, Hanjing Li, Shui Liu, Pengyuan Liu","doi":"10.1109/NLPKE.2010.5587843","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587843","url":null,"abstract":"This paper presents an cognitive approach to semantic role labeling in Chinese based on an extension of Construction-Integration (CI) model. The method can implicitly integrate more contextual and general knowledge into the calculating process in contrast with the machine learning methods. First, we define a proposition representation as the basic unit for semantic role labeling using CI model. Then the contextually appropriate propositions will be strengthened and inappropriate ones will be inhibited by simulating the spreading activation of human mind. Finally, experimental results show an encouraging performance on Chinese PropBank (CPB) and other two datasets.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117129179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generating english-persian parallel corpus using an automatic anchor finding sentence aligner","authors":"Meisam Vosoughpour Yazdchi, Heshaam Faili","doi":"10.1109/NLPKE.2010.5587769","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587769","url":null,"abstract":"The more we can enlarge a parallel bilingual corpus, the more we have made it effective and powerful. Providing such corpora demands special efforts both in seeking for as much already translated texts as possible and also in designing appropriate sentence alignment algorithms with as less time complexity as possible. In this paper, we propose algorithms for sentence aligning of two Persian-English texts in linear time complexity and with a surprisingly high accuracy. This linear time-complexity is achieved through our new language-independent anchor finding algorithm which enables us to align as a big parallel text as a whole book in a single attempt and with a high accuracy. As far as we know, this project is the first automatic construction of an English-Persian parallel sentence-level corpus.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122993030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Shimura, Fumiaki Monma, S. Mitsuyoshi, M. Shuzo, Taishi Yamamoto, I. Yamada
{"title":"Descriptive analysis of emotion and feeling in voice","authors":"M. Shimura, Fumiaki Monma, S. Mitsuyoshi, M. Shuzo, Taishi Yamamoto, I. Yamada","doi":"10.1109/NLPKE.2010.5587794","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587794","url":null,"abstract":"Recognition of human “emotions” or “feelings” from voice is important to research on human communications. Although there has been much research on emotions or feelings in voice, definitions of these terms have been inconsistent. We reviewed previous papers in linguistics, brain science, information science, etc. and developed specific definitions for these term. In our paper, “emotion” is defined as an involuntary reaction in the human brain; it has two states: pleasure and displeasure. “Feeling” (e.g., anger, enjoyment, sadness, fear, and distress) is defined as a state voluntarily resulting from an emotion. Here, we should notice that the pleasure-displeasure direction does not always correspond to the feeling. So, our objective is to obtain sufficient amount of voice data and to analyze the relationship between emotions and feelings. In voice recording experiments, the voice database from about 100 participants with various natural feelings was constructed. A result of descriptive analysis showed that pleasure-displeasure direction did not correspond to the each feeling in 5% of voice data. This result suggested that, if an experimental situation is constructed that tends to arouse various feelings, data with less variability can be obtained. Further analysis of the characteristics of the data obtained to identify situations in which the pleasure-displeasure direction does not necessarily correspond to the basic feeling should lead to improved accuracy of voice emotion recognition.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124413153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
C. Boitet, H. Blanchon, Mark Seligman, Valérie Bellynck
{"title":"MT on and for the Web","authors":"C. Boitet, H. Blanchon, Mark Seligman, Valérie Bellynck","doi":"10.1109/NLPKE.2010.5587865","DOIUrl":"https://doi.org/10.1109/NLPKE.2010.5587865","url":null,"abstract":"A Systran MT server became available on the minitel network in 1984, and on Internet in 1994. Since then we have come to a better understanding of the nature of MT systems by separately analyzing their linguistic, computational, and operational architectures. Also, thanks to the CxAxQ metatheorem, the systems' inherent limits have been clarified, and design choices can now be made in an informed manner according to the translation situations. MT evaluation has also matured: tools based on reference translations are useful for measuring progress; those based on subjective judgments for estimating future usage quality; and task-related objective measures (such as post-editing distances) for measuring operational quality. Moreover, the same technological advances that have led to “Web 2.0” have brought several futuristic predictions to fruition. Free Web MT services have democratized assimilation MT beyond belief. Speech translation research has given rise to usable systems for restricted tasks running on PDAs or on mobile phones connected to servers. New man-machine interface techniques have made interactive disambiguation usable in large-coverage multimodal MT. Increases in computing power have made statistical methods workable, and have led to the possibility of building low-linguistic-quality but still useful MT systems by machine learning from aligned bilingual corpora (SMT, EBMT). In parallel, progress has been made in developing interlingua-based MT systems, using hybrid methods. Unfortunately, many misconceptions about MT have spread among the public, and even among MT researchers, because of ignorance of the past and present of MT R&D. A compensating factor is the willingness of end users to freely contribute to building essential parts of the linguistic knowledge needed to construct MT systems, whether corpus-related or lexical. Finally, some developments we anticipated fifteen years ago have not yet materialized, such as online writing tools equipped with interactive disambiguation, and as a corollary the possibility of transforming source documents into self-explaining documents (SEDs) and of producing corresponding SEDs fully automatically in several target languages. These visions should now be realized, thanks to the evolution of Web programming and multilingual NLP techniques, leading towards a true Semantic Web, “Web 3.0”, which will support ubilingual (ubiquitous multilingual) computing.","PeriodicalId":259975,"journal":{"name":"Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120994756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}