Proceedings of the 2021 5th International Conference on Natural Language Processing and Information Retrieval: Latest Publications

Feature Extraction Technique Based on Conv1D and Conv2D Network for Thai Speech Emotion Recognition
Naris Prombut, S. Waijanya, Nuttachot Promrit
DOI: 10.1145/3508230.3508238 (published 2021-12-17)
Abstract: Speech emotion recognition is one of the challenges in the Natural Language Processing (NLP) area. Many factors are used to identify emotions in speech, such as pitch, intensity, frequency, duration, and the speaker's nationality. This paper implements a speech emotion recognition model specifically for the Thai language, classifying speech into five emotions: angry, frustrated, neutral, sad, and happy. The research uses a dataset of 21,562 sounds (scripts) from the VISTEC-depa AI Research Institute of Thailand, divided into 70% training data and 30% test data. We use the Mel spectrogram and Mel-frequency Cepstral Coefficients (MFCC) for feature extraction, and a 1D Convolutional Neural Network (Conv1D) together with a 2D Convolutional Neural Network (Conv2D) to classify emotions. MFCC with Conv2D provides the highest accuracy at 80.59%, higher than the 71.35% reported in the baseline study.
Citations: 4
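The paper feeds the same MFCC features to both network types. A minimal numpy sketch of how one MFCC matrix is shaped for each (the sizes here are illustrative assumptions, not the authors' configuration):

```python
import numpy as np

# Hypothetical MFCC matrix for one utterance: 40 coefficients x 200 frames.
# (The paper uses Mel-spectrogram/MFCC features; exact sizes are assumptions.)
n_mfcc, n_frames = 40, 200
mfcc = np.random.randn(n_mfcc, n_frames)

# Conv1D view: time is the sequence axis, the 40 coefficients are input
# channels -> shape (batch, channels, length).
conv1d_input = mfcc[np.newaxis, :, :]              # (1, 40, 200)

# Conv2D view: the whole matrix is a one-channel "image"
# -> shape (batch, channels, height, width).
conv2d_input = mfcc[np.newaxis, np.newaxis, :, :]  # (1, 1, 40, 200)

print(conv1d_input.shape, conv2d_input.shape)
```

The 2D view lets the network learn joint time-frequency patterns, which is consistent with the paper's finding that MFCC with Conv2D scored highest.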
Automated Intention Mining with Comparatively Fine-tuning BERT
Xuan Sun, Luqun Li, F. Mercaldo, Yichen Yang, A. Santone, F. Martinelli
DOI: 10.1145/3508230.3508254 (published 2021-12-17)
Abstract: In the field of software engineering, intention mining is an interesting but challenging task: the goal is to understand user-generated texts well enough to capture requirements that are useful for software maintenance and evolution. Recently, BERT and its variants have achieved state-of-the-art performance on various natural language processing tasks such as machine translation, machine reading comprehension, and natural language inference. However, few studies have investigated the efficacy of pre-trained language models on this task. In this paper, we present a new baseline with a fine-tuned BERT model. Our method achieves state-of-the-art results on three benchmark datasets, outperforming baselines by a substantial margin. We also investigate the efficacy of the pre-trained BERT model at shallower network depths through a simple layer-selection strategy.
Citations: 0
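The layer-selection idea, using only the first k encoder layers of a deep pre-trained model, can be illustrated with a toy stand-in (random linear maps rather than real transformer blocks; this only sketches the strategy, not BERT itself):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a 12-layer encoder: each "layer" is a fixed random
# linear map followed by tanh. Real BERT layers are transformer blocks;
# this only illustrates keeping a prefix of the layer stack.
hidden = 16
layers = [rng.standard_normal((hidden, hidden)) / np.sqrt(hidden) for _ in range(12)]

def encode(x, k):
    """Run only the first k layers and return the result as the
    sentence representation (analogous to truncating BERT's depth)."""
    h = x
    for w in layers[:k]:
        h = np.tanh(h @ w)
    return h

x = rng.standard_normal(hidden)
shallow = encode(x, 4)   # 4-layer representation
deep = encode(x, 12)     # full-depth representation
print(shallow.shape, deep.shape)
```

In the real setting, each truncated depth would get its own classification head and be fine-tuned, and the depths compared on the intention-mining benchmarks.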
CBCP: A Method of Causality Extraction from Unstructured Financial Text
Lang Cao, Shihuangzhai Zhang, Juxing Chen
DOI: 10.1145/3508230.3508250 (published 2021-12-17)
Abstract: Extracting causality information from unstructured natural language text is a challenging problem in natural language processing, and no mature, dedicated causality extraction systems exist. Most work applies basic sequence labeling methods, such as the BERT-CRF model, to extract causal elements from unstructured text, and the results are usually unsatisfactory. At the same time, the finance domain contains a large number of causal event relations; extracting financial causality at scale would help us better understand the relationships between financial events and build related event evolutionary graphs in the future. In this paper, we propose a causality extraction method named CBCP (Center word-based BERT-CRF with Pattern extraction), which directly extracts cause and effect elements from unstructured text. Compared to the BERT-CRF model, our model incorporates center-word information as a prior condition and achieves better entity extraction performance. Combined with pattern extraction, our method further improves causality extraction. We evaluate our method against basic sequence labeling methods and show that it performs better on causality extraction tasks in the finance domain. Finally, we summarize our work and discuss future directions.
Citations: 1
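The pattern-extraction component can be sketched with cue-phrase rules. These regexes and sentences are illustrative assumptions; the paper's actual pattern set is not given in the abstract:

```python
import re

# Illustrative cue-phrase patterns for financial causality; hypothetical,
# not the paper's pattern inventory.
PATTERNS = [
    re.compile(r"(?P<cause>.+?)\s+(?:led to|resulted in|caused)\s+(?P<effect>.+)", re.I),
    re.compile(r"(?:due to|owing to)\s+(?P<cause>.+?),\s*(?P<effect>.+)", re.I),
]

def extract_causality(sentence):
    """Return (cause, effect) if a cue pattern matches, else None."""
    for pat in PATTERNS:
        m = pat.search(sentence)
        if m:
            return m.group("cause").strip(), m.group("effect").strip()
    return None

pair = extract_causality("Rising interest rates led to a decline in bond prices")
print(pair)  # → ('Rising interest rates', 'a decline in bond prices')
```

In CBCP such patterns complement the neural tagger; rules catch explicit cue phrases cheaply, while the center-word-conditioned BERT-CRF handles implicit cases.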
Improved Bi-GRU Model for Imbalanced English Toxic Comments Dataset
Zhongguo Wang, Bao Zhang
DOI: 10.1145/3508230.3508234 (published 2021-12-17)
Abstract: Deep learning is widely used in the study of English toxic comment classification, but most existing studies fail to consider data imbalance. For the imbalanced English Toxic Comments Dataset, we propose an improved Bi-gated recurrent unit (Bi-GRU) model that combines oversampling with a cost-sensitive method. The improved model uses random oversampling to reduce the data imbalance, introduces a cost-sensitive method, and proposes a new loss function for the Bi-GRU model. Experimental results show that the improved Bi-GRU model achieves significantly better classification performance on the imbalanced English Toxic Comments Dataset.
Citations: 1
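The two ingredients, random oversampling and cost-sensitive class weights, can be sketched in a few lines. The weighting scheme below (inverse class frequency) is one common choice and an assumption; the paper defines its own loss function:

```python
import random

random.seed(0)

# Toy imbalanced dataset: (features, label); label 1 is the minority
# "toxic" class. Sizes are illustrative.
data = [([0.1], 0)] * 90 + [([0.9], 1)] * 10

# Random oversampling: duplicate minority samples until classes balance.
minority = [d for d in data if d[1] == 1]
majority = [d for d in data if d[1] == 0]
oversampled = majority + [random.choice(minority) for _ in range(len(majority))]

# Cost-sensitive weighting: weight each class inversely to its frequency
# in the ORIGINAL data, so minority errors cost more in the loss.
n = len(data)
counts = {0: len(majority), 1: len(minority)}
class_weight = {c: n / (2 * counts[c]) for c in counts}

print(len(oversampled), class_weight)
```

These weights would multiply the per-sample loss terms during Bi-GRU training; the paper combines both mechanisms rather than choosing one.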
Scored and Error-annotated Essay Dataset of Chinese EFL/ESL Learners
Kai Jin, Wuying Liu
DOI: 10.1145/3508230.3508245 (published 2021-12-17)
Abstract: A finely annotated essay dataset of EFL/ESL (English as a foreign or second language) learners at a certain scale is not only an important language resource for language research and teaching, but also contributes materials to language-related computing. Unfortunately, such data open on the Internet are small in quantity and uneven in quality, especially for Chinese learners. We collected 147 essays by Chinese EFL/ESL learners, had four teachers score them under the same criteria and one teacher annotate major errors, and also had them scored by the Pigai scoring system. We then structured the score file, the error-annotated files, and the essay files together with context information to build the Scored and Error-annotated Essay Dataset of Chinese EFL/ESL Learners (SeedCel), which is open on the Internet and will be incrementally updated. This paper explains how SeedCel is constructed, what its details are, and where it can be used.
Citations: 0
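A dataset combining scores, error annotations, and context might be structured per essay roughly as below. SeedCel's real schema and field names are not given in the abstract, so everything here is a hypothetical illustration:

```python
import json

# Hypothetical record for one essay; all field names are assumptions,
# not SeedCel's published schema.
record = {
    "essay_id": "001",
    "context": {"learner_level": "EFL", "prompt": "My hometown"},
    "scores": {"teacher_scores": [78, 80, 75, 82], "pigai_score": 79.5},
    "errors": [
        {"span": [12, 14], "type": "verb tense", "correction": "went"},
    ],
    "text": "Last summer I go to my hometown ...",
}

serialized = json.dumps(record, ensure_ascii=False)
rec2 = json.loads(serialized)
print(len(rec2["errors"]))
```

Keeping scores, error spans, and context in one serializable record makes the dataset directly usable for automated-scoring and grammatical-error-correction research.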
Topic Segmentation for Interview Dialogue System
Taiga Kirihara, Kazuyuki Matsumoto, M. Sasayama, Minoru Yoshida, K. Kita
DOI: 10.1145/3508230.3508237 (published 2021-12-17)
Abstract: In this study, topic segmentation was performed on an interview dialogue corpus. Utterance intention tags were added to the existing corpus, and uttered sentences were vectorized using BERT, Sentence-BERT, and DistilBERT. Topic classification was then performed using the utterance intention tags and the features of the preceding and following uttered sentences. The greatest accuracy was achieved when the utterance intention tags were used with DistilBERT.
Citations: 1
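One simple way vectorized utterances support segmentation is that similarity between consecutive utterance vectors drops at a topic change. A minimal sketch with hand-made vectors standing in for BERT-family embeddings (the paper's actual classifier also uses intention tags and neighboring-sentence features):

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy utterance vectors; in the paper these come from BERT, Sentence-BERT,
# or DistilBERT. Real vectors are noisy, but the idea is the same: a drop
# in similarity between consecutive utterances signals a topic boundary.
topic_a = np.array([1.0, 1.0, 1.0, 1.0])
topic_b = np.array([1.0, -1.0, 1.0, -1.0])  # orthogonal to topic_a
utterances = [topic_a, topic_a, topic_a, topic_b, topic_b, topic_b]

sims = [cosine(utterances[i], utterances[i + 1]) for i in range(len(utterances) - 1)]
boundary = int(np.argmin(sims)) + 1  # index of the first utterance of the new topic
print(boundary)
```

The paper goes further than this similarity heuristic by training a classifier over the embeddings plus utterance intention tags.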
Research on judgment reasoning using natural language inference in Chinese medical texts
Xin Li, Wenping Kong
DOI: 10.1145/3508230.3508248 (published 2021-12-17)
Abstract: Machine reading comprehension (MRC) tests the degree to which a machine understands natural language by asking it to answer questions about a given context. Judgment reasoning is an MRC task in which, given a context and questions, the machine must answer true or false; for some real-world data there is a third option, unknown. Given the current state of research, this paper applies natural language inference (NLI) models, which judge the semantic relationship between two sentences, to this judgment reasoning task. We first explain how the NLI task can be used to train universal sentence encoding models for judgment reasoning, and then describe the architectures used in the NLI task, covering a suitable range of sentence encoders currently in use; as an example, we explain a bi-directional long short-term memory (Bi-LSTM) model with max-pooling over the hidden representations. Comparative experiments verify that our NLI models are an effective strategy for improving judgment reasoning performance on Chinese medical texts, yielding clear accuracy gains.
Citations: 0
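The max-pooling step the abstract names is easy to show concretely: given the per-timestep hidden states of a (here hypothetical) Bi-LSTM, the sentence vector keeps the strongest activation per dimension, so any sentence length maps to a fixed-size vector:

```python
import numpy as np

# Toy hidden states from a hypothetical BiLSTM: T time steps x d dims.
# (A real Bi-LSTM would produce these; the numbers here are made up.)
T, d = 5, 4
hidden_states = np.array([
    [0.1, -0.2, 0.3, 0.0],
    [0.5, 0.1, -0.1, 0.2],
    [-0.3, 0.4, 0.2, 0.9],
    [0.0, 0.0, 0.7, -0.5],
    [0.2, -0.6, 0.1, 0.3],
])

# Max-pooling over the time axis: per dimension, keep the maximum
# activation across all time steps.
sentence_vec = hidden_states.max(axis=0)  # shape (d,)
print(sentence_vec)
```

For NLI, the two sentence vectors (premise and hypothesis) are then typically combined, e.g. by concatenation with their difference and product, before the classification layer.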
Low-Resource NMT: A Case Study on the Written and Spoken Languages in Hong Kong
Hei Yi Mak, Tan Lee
DOI: 10.1145/3508230.3508242 (published 2021-12-17)
Abstract: The majority of inhabitants of Hong Kong are able to read and write standard Chinese but use Cantonese as the primary spoken language in daily life. Spoken Cantonese can be transcribed into Chinese characters, which constitute so-called written Cantonese. Written Cantonese exhibits significant lexical and grammatical differences from standard written Chinese, and its rise is increasingly evident in the cyber world. The growing interaction between Mandarin speakers and Cantonese speakers is creating a clear demand for automatic translation between Chinese and Cantonese. This paper describes a transformer-based neural machine translation (NMT) system for written-Chinese-to-written-Cantonese translation. Given that parallel text data for Chinese and Cantonese are extremely scarce, a major focus of this study is preparing a good amount of training data for NMT. In addition to collecting 28K parallel sentences from previous linguistic studies and scattered internet resources, we devise an effective approach that obtains 72K parallel sentences by automatically extracting pairs of semantically similar sentences from parallel articles on Chinese Wikipedia and Cantonese Wikipedia. Leveraging highly similar sentence pairs mined from Wikipedia improves translation performance on all test sets. Our system outperforms Baidu Fanyi's Chinese-to-Cantonese translation on 6 out of 8 test sets in BLEU score, and translation examples show that it captures important linguistic transformations between standard Chinese and spoken Cantonese.
Citations: 1
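The mining step, keeping cross-article sentence pairs whose similarity clears a threshold, can be sketched with a deliberately cheap similarity function. Character-set Jaccard is used here only as a stand-in (Chinese and Cantonese share many characters); the paper's actual similarity measure is not specified in the abstract:

```python
def char_jaccard(s1, s2):
    """Character-set Jaccard similarity: a cheap stand-in for the
    semantic similarity computed between candidate sentence pairs."""
    a, b = set(s1), set(s2)
    return len(a & b) / len(a | b)

def mine_pairs(sents_a, sents_b, threshold=0.5):
    """Keep cross-article sentence pairs whose similarity clears the bar."""
    return [
        (x, y)
        for x in sents_a
        for y in sents_b
        if char_jaccard(x, y) >= threshold
    ]

pairs = mine_pairs(["the cat sat"], ["the cat sat down", "unrelated text"])
print(len(pairs))
```

Applied over aligned Chinese/Cantonese Wikipedia articles, thresholded mining of this kind is how the study grows 28K seed sentences into a 100K-pair training set.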
Natural Language Processing Applied on Large Scale Data Extraction from Scientific Papers in Fuel Cells
Feifan Yang
DOI: 10.1145/3508230.3508256 (published 2021-12-17)
Abstract: Natural language processing (NLP) has great potential to help scientists automatically extract information from large-scale text datasets. In this paper, we apply an NLP pipeline, including text acquisition, text preprocessing, word embedding training, and named entity recognition, to 106,181 abstracts of fuel cell papers. We then evaluate the trained model's analogy ability, use the model to analyze research trends in fuel cell materials, and predict new materials. To the best of our knowledge, this is the first time NLP has been applied in the field of fuel cells. This data-driven technique is demonstrated to have the potential to promote the discovery of new fuel cell materials.
Citations: 0
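The analogy evaluation mentioned above is the classic vector-offset test over trained word embeddings. A self-contained sketch with toy vectors (a real run would use word2vec/FastText embeddings trained on the fuel-cell abstracts; the words and vectors below are placeholders, not results from the paper):

```python
import numpy as np

# Toy embedding table constructed so that king - man + woman ~= queen;
# domain terms would replace these words in the actual study.
emb = {
    "king":  np.array([1.0, 1.0, 0.0]),
    "man":   np.array([1.0, 0.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0]),
    "queen": np.array([0.0, 1.0, 1.0]),
    "metal": np.array([0.3, -0.5, 0.2]),
}

def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' by nearest cosine neighbor
    to the offset vector b - a + c, excluding the query words."""
    target = emb[b] - emb[a] + emb[c]
    best, best_sim = None, -2.0
    for w, v in emb.items():
        if w in (a, b, c):
            continue
        sim = float(target @ v / (np.linalg.norm(target) * np.linalg.norm(v)))
        if sim > best_sim:
            best, best_sim = w, sim
    return best

print(analogy("man", "king", "woman"))  # → queen
```

The same query mechanism, run over domain embeddings, is what lets such models surface candidate materials related to known fuel-cell compounds.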
Examination of the quality of Conceptnet relations for PubMed abstracts
Rajeswaran Viswanathan, S. Priya
DOI: 10.1145/3508230.3508243 (published 2021-12-17)
Abstract: ConceptNet is a crowd-sourced knowledge graph used to find relationships between words and concepts, and PubMed is the largest source of documents in the biomedical domain. Stop words are removed from PubMed abstracts, and the remaining words are used as seed words. For each seed word, nearest-neighbor words are identified as candidate words using three popular word vector (WV) models: Word2Vec, GloVe, and FastText. Similarity is calculated for these words for each stratum of relationship, and a bootstrap estimator in a random effects model (REM) is used to study the relationships via the similarity scores. The analysis shows heterogeneity among the relationships regardless of the WV model used as the base.
Citations: 0