{"title":"A Systematic Investigation of Neural Models for Chinese Implicit Discourse Relationship Recognition","authors":"Dejian Li, Man Lan, Yuanbin Wu","doi":"10.1109/IALP48816.2019.9037686","DOIUrl":"https://doi.org/10.1109/IALP48816.2019.9037686","url":null,"abstract":"The Chinese implicit discourse relationship recognition is more challenging than English due to the lack of discourse connectives and high frequency in the text. So far, there is no systematical investigation into the neural components for Chinese implicit discourse relationship. To fill this gap, in this work we present a component-based neural framework to systematically study the Chinese implicit discourse relationship. Experimental results showed that our proposed neural Chinese implicit discourse parser achieves the SOTA performance in CoNLL-2016 corpus.","PeriodicalId":208066,"journal":{"name":"2019 International Conference on Asian Language Processing (IALP)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127896435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic answer ranking based on sememe vector in KBQA","authors":"Yadi Li, Lingling Mu, Hao Li, Hongying Zan","doi":"10.1109/IALP48816.2019.9037712","DOIUrl":"https://doi.org/10.1109/IALP48816.2019.9037712","url":null,"abstract":"This paper proposes an answer ranking method used in Knowledge Base Question Answering (KBQA) system. This method first extracts the features of predicate sequence similarity based on sememe vector, predicates’ edit distances, predicates’ word co-occurrences and classification. Then the above features are used as inputs of the ranking learning algorithm Ranking SVM to rank the candidate answers. In this paper, the experimental results on the data set of KBQA system evaluation task in the 2016 Natural Language Processing & Chinese Computing (NLPCC 2016) show that, the method of word similarity calculation based on sememe vector has better results than the method based on word2vec. Its accuracy, recall rate and average F1 value respectively are 73.88%, 82.29% and 75.88%. The above results show that the word representation with knowledge has import effect on natural language processing.","PeriodicalId":208066,"journal":{"name":"2019 International Conference on Asian Language Processing (IALP)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123498382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Japanese grammatical simplification with simplified corpus","authors":"Yumeto Inaoka, Kazuhide Yamamoto","doi":"10.1109/IALP48816.2019.9037675","DOIUrl":"https://doi.org/10.1109/IALP48816.2019.9037675","url":null,"abstract":"We construct a Japanese grammatical simplification corpus and established automatic simplification methods. We compare the conventional machine translation approach, our proposed method, and a hybrid method by automatic and manual evaluation. The results of the automatic evaluation show that the proposed method exhibits a lower score than the machine translation approach; however, the hybrid method garners the highest score. According to those results, the machine translation approach and proposed method present different sentences that can be simplified, while the hybrid version is effective in grammatical simplification.","PeriodicalId":208066,"journal":{"name":"2019 International Conference on Asian Language Processing (IALP)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127064645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Chinese word segment model for energy literature based on Neural Networks with Electricity User Dictionary","authors":"Bochuan Song, Bo Chai, Qiang Zhang, Quanye Jia","doi":"10.1109/IALP48816.2019.9037728","DOIUrl":"https://doi.org/10.1109/IALP48816.2019.9037728","url":null,"abstract":"Traditional Chinese word segmentation (CWS) methods are based on supervised machine learning such as Condtional Random Fields(CRFs), Maximum Entropy(ME), whose features are mostly manual features. These manual features are often derived from local contexts. Currently, most state-of-art methods for Chinese word segmentation are based on neural networks. However these neural networks rarely introduct the user dictionary. We propose a LSTMbased Chinese word segmentation which can take advantage of the user dictionary. The experiments show that our model performs better than a popular segment tool in electricity domain. It is noticed that it achieves a better performance when transfered to a new domain using the user dictionary.","PeriodicalId":208066,"journal":{"name":"2019 International Conference on Asian Language Processing (IALP)","volume":"105 23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127456072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classified Description and Application of Chinese Constitutive Role","authors":"Mengxiang Wang, Cuiyan Ma","doi":"10.1109/IALP48816.2019.9037730","DOIUrl":"https://doi.org/10.1109/IALP48816.2019.9037730","url":null,"abstract":"Constitutive role is one of the 4 qualia roles, which expresses a kind of constitutive relationship between nouns. According to the original definition and description characteristics, this paper divides the constitutive roles into two categories: materials and components. At the same time, combined with the previous methods of extracting the role automatically, this paper optimizes the method of extracting the role automatically. Relying on auxiliary grammatical constructions, we extract noun-noun pairs from large-scale corpus to extract descriptive features of constitutive roles, and then classifies these descriptive knowledge by manual double-blind proofreading. Finally, the author discusses the application of Chinese constitutive roles in word-formational analysis, syntactic analysis and synonym discrimination.","PeriodicalId":208066,"journal":{"name":"2019 International Conference on Asian Language Processing (IALP)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126439107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Construction of Quantitative Index System of Vocabulary Difficulty in Chinese Grade Reading","authors":"Huiping Wang, Lijiao Yang, Huimin Xiao","doi":"10.1109/IALP48816.2019.9037664","DOIUrl":"https://doi.org/10.1109/IALP48816.2019.9037664","url":null,"abstract":"Chinese grade reading for children has a broad application prospect. In this paper, Chinese textbooks for grade 1 to 6 of primary schools published by People’s Education Press are taken as data sets, and the texts are divided into 12 difficulty levels successively. The effective lexical indexes to measure the readability of texts are discussed, and a regression model to effectively measure the lexical difficulty of Chinese texts is established. The study firstly collected 30 indexes at the text lexical level from the three dimensions of lexical richness, semantic transparency and contextual dependence, selected the 7 indexes with the highest relevance to the text difficulty through Person correlation coefficient, and finally constructed a Regression to predict the text difficulty based on Lasso Regression, ElasticNet, Ridge Regression and other algorithms. 
The regression results show that the model fits well, and the predicted value could explain 89.3% of the total variation of text difficulty, which proves that the quantitative index of vocabulary difficulty of Chinese text constructed in this paper is effective, and can be applied to Chinese grade reading and computer automatic grading of Chinese text difficulty.","PeriodicalId":208066,"journal":{"name":"2019 International Conference on Asian Language Processing (IALP)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127626679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prosodic Realization of Focus in Changchun Mandarin and Nanjing Mandarin","authors":"Ying Chen, Jiajing Zhang, Bingying Ye, Chenfang Zhou","doi":"10.1109/IALP48816.2019.9037655","DOIUrl":"https://doi.org/10.1109/IALP48816.2019.9037655","url":null,"abstract":"This study was designed to explore the prosodic patterns of focus in two dialects of Mandarin. One is Changchun Mandarin and the other is Nanjing Mandarin. The current paper compares the acoustics of their prosodic realization of focus in a production experiment. Similar to standard Mandarin, which uses in-focus expansion and concomitantly post-focus compression (PFC) to code focus, results in the current study indicate that both Changchun and Nanjing speakers produced significant in-focus expansion of pitch, intensity and duration and PFC of pitch and intensity in their Mandarin dialects. Meanwhile, the results show no significant difference of prosodic changes between Changchun and Nanjing Mandarin productions. These results reveal that PFC not only exists in standard Mandarin but also in Mandarin dialects.","PeriodicalId":208066,"journal":{"name":"2019 International Conference on Asian Language Processing (IALP)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127673259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Method of Tonal Determination for Chinese Dialects","authors":"Yan Li, Zhiyi Wu","doi":"10.1109/IALP48816.2019.9037711","DOIUrl":"https://doi.org/10.1109/IALP48816.2019.9037711","url":null,"abstract":"Values of the basic tones are the key to do research on dialects in China. The traditional method of determining tones by ear and the more popular method used in experimental phonetics are either inaccurate to some degree or difficult to learn. The method provided and discussed in this paper is simple and reliable, requiring the use of only Praat and fundamental frequency value. More examples are given to prove this method’s effectiveness.","PeriodicalId":208066,"journal":{"name":"2019 International Conference on Asian Language Processing (IALP)","volume":"103 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123525775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Context’s Diversity to Improve Neural Language Model","authors":"Yanchun Zhang, Xingyuan Chen, Peng Jin, Yajun Du","doi":"10.1109/IALP48816.2019.9037662","DOIUrl":"https://doi.org/10.1109/IALP48816.2019.9037662","url":null,"abstract":"The neural language models (NLMs), such as long short term memery networks (LSTMs), have achieved great success over the years. However the NLMs usually only minimize a loss between the prediction results and the target words. In fact, the context has natural diversity, i.e. there are few words that could occur more than once in a certain length of word sequence. We report the natural diversity as context’s diversity in this paper. The context’s diversity, in our model, means there is a high probability that the target words predicted by any two contexts are different given a fixed input sequence. Namely the softmax results of any two contexts should be diverse. Based on this observation, we propose a new cross-entropy loss function which is used to calculate the cross-entropy loss of the softmax outputs for any two different given contexts. Adding the new cross-entropy loss, our approach could explicitly consider the context’s diversity, therefore improving the model’s sensitivity of prediction for every context. 
Based on two typical LSTM models, one is regularized by dropout while the other is not, the results of our experiment show its effectiveness on the benchmark dataset.","PeriodicalId":208066,"journal":{"name":"2019 International Conference on Asian Language Processing (IALP)","volume":"191 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124245898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Letter’s Differences between Partial Indonesian Branch Language and English","authors":"Nankai Lin, Sihui Fu, Jiawen Huang, Sheng-yi Jiang","doi":"10.1109/IALP48816.2019.9037715","DOIUrl":"https://doi.org/10.1109/IALP48816.2019.9037715","url":null,"abstract":"Differences of letter usage are the most basic differences between different languages, which can reflect the most essential diversity. Many linguists study the letter differences between common languages, but seldom research those between non-common languages. This paper selects three representative languages from the Indonesian branch of the Austronesian language family, namely Malay, Indonesian and Filipino. To study the letter differences between these three languages and English, we concentrate on word length distribution, letter frequency distribution, commonly used letter pairs, commonly used letter trigrams, and ranked letter frequency distribution. The results show that great differences do exist between three Indonesian-branch languages and English, and the differences between Malay and Indonesian are the smallest.","PeriodicalId":208066,"journal":{"name":"2019 International Conference on Asian Language Processing (IALP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129712179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}