Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval: Latest Papers

IME-Spell: Chinese Spelling Check based on Input Method
Qingbiao Zhao, Xingfa Shen, Jian Yao
{"title":"IME-Spell: Chinese Spelling Check based on Input Method","authors":"Qingbiao Zhao, Xingfa Shen, Jian Yao","doi":"10.1145/3443279.3443297","DOIUrl":"https://doi.org/10.1145/3443279.3443297","url":null,"abstract":"Intended for reducing manual inspection costs and semantic misunderstandings, Chinese Spelling Check (CSC) has been investigated extensively in natural language processing. However, little work has yet been done on input-method-based CSC in which CSC can make use of Pinyin information to improve spelling correction efficiency. This paper proposes a novel CSC architecture, IME-Spell, based on pre-trained context vectors for input methods, which consists of two parts as follows. The Chinese spelling detection part of the architecture adopts the fusion vectors of character-based pre-trained context vectors and Pinyin vectors, and uses the method of sequence labeling to detect the error characters. The Chinese spelling correction part of the architecture adopts Masked Language Model (MLM) to generate a candidate set of erroneous characters, and uses XGBoost and Pinyin-to-Character conversion models to filter correct characters and correct the error characters for users. IME-Spell has a significant improvement over the benchmark models on the SIGHAN dataset, whose maximum difference of F1 in the spelling detection and correction subtasks reach 48.9% and 27.8%, respectively.","PeriodicalId":414366,"journal":{"name":"Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122745752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Automatic Summarization of Stock Market News Articles
J. Logeesan, Y. Rishoban, H. A. Caldera
{"title":"Automatic Summarization of Stock Market News Articles","authors":"J. Logeesan, Y. Rishoban, H. A. Caldera","doi":"10.1145/3443279.3443289","DOIUrl":"https://doi.org/10.1145/3443279.3443289","url":null,"abstract":"Stock market news articles published by leading companies are read by every trader to carry out their trading activities as they provide real time and reliable information about the organization. These news articles help in analyzing and identifying essential facts in trading. If these facts can be quickly captured from the articles could lead to look for more articles for better accuracy on their decision making. This research focuses on the single document based abstractive summarization of stock market investment news articles for traders. A summarization tool to extract the salient sentences from stock market investment news article on trading is developed in this research. In methodology, A keyword based weighting in extracting the sentences are used to enrich the domain relevancy. Domain is one of the deterministic factors in summarization which helps to correctly interpret the words. Finally an efficient graph algorithm is used to obtain the fluent summary. Then these summaries were compared with the domain expert summary to identify how far the summarization is useful for the traders.","PeriodicalId":414366,"journal":{"name":"Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122530398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Machine Learning-based Automated Essay Scoring System for Chinese Proficiency Test (HSK)
Rui Xiao, W. Guo, Yunchun Zhang, Xiaoyan Ma, Jiaqi Jiang
{"title":"Machine Learning-based Automated Essay Scoring System for Chinese Proficiency Test (HSK)","authors":"Rui Xiao, W. Guo, Yunchun Zhang, Xiaoyan Ma, Jiaqi Jiang","doi":"10.1145/3443279.3443299","DOIUrl":"https://doi.org/10.1145/3443279.3443299","url":null,"abstract":"Automated essay scoring (AES) gains momentum recently in English-based environment. However, the development of Chinese AES system is slow and fruitless. Many foreign students participate in the Chinese Proficiency Test (HSK) so a HSK automated essay scoring system (HSK AES) is in high demand. To develop an effective and reliable HSK AES system, this paper proposes three machine learning and deep learning models that take HSK essays as input. We apply Word2vec and TF-IDF (term frequency-inverse document frequency) methods to extract important features from the original essays. Three machine learning models, including XGBoost, one deep neural network with flatten and dense layer and another deep neural network with LSTM (long short-term memory) and dense layer, are trained. The experimental results show that XGBoost with TF-IDF outperforms the other two models with the lowest MAE (mean absolute error) as 6.7%. We also prove that deep neural networks either with LSTM (long short-term memory) or with flatten perform unsatisfactory on HSK AES.","PeriodicalId":414366,"journal":{"name":"Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123904917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Myanmar Text-to-Speech Synthesis Using End-to-End Model
Qinglai Qin, Jian Yang, Peiying Li
{"title":"Myanmar Text-to-Speech Synthesis Using End-to-End Model","authors":"Qinglai Qin, Jian Yang, Peiying Li","doi":"10.1145/3443279.3443295","DOIUrl":"https://doi.org/10.1145/3443279.3443295","url":null,"abstract":"In this paper, we propose a Myanmar speech synthesis system based on an End-to-End neural network model, which integrates the Myanmar phone model into the Tacotron2 End-to-End model. Based on the Seq2seq model architecture, we use phone-level embedding to form a feature prediction network from phone sequences to Mel spectrum, and combine with a semi-supervised speech generation network to generate high-quality Myanmar synthesized speech. In addition, we introduced the BERT pre-training decoder module to assist the phone feature extraction, which reduces the system's dependence on the phone feature extraction network and improve the text feature richness. Compared with other Myanmar speech synthesis systems, this method effectively improves the naturalness and accuracy of synthesized speech under low resource conditions.","PeriodicalId":414366,"journal":{"name":"Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval","volume":"304 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124283630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A BERT-Based Semantic Matching Ranker for Open-Domain Question Answering
Shiyi Xu, Feng Liu, Zhen Huang, Yuxing Peng, Dongsheng Li
{"title":"A BERT-Based Semantic Matching Ranker for Open-Domain Question Answering","authors":"Shiyi Xu, Feng Liu, Zhen Huang, Yuxing Peng, Dongsheng Li","doi":"10.1145/3443279.3443301","DOIUrl":"https://doi.org/10.1145/3443279.3443301","url":null,"abstract":"Open-domain question answering (QA) is a hot topic in recent years. Previous work has shown that an effective ranker can improve the overall QA performance by denoising irrelevant context. There are also some recent works leveraged BERT pre-trained model to tackle with open-domain QA tasks, and achieved significant improvements. Nevertheless, these BERT-based models simply concatenates a paragraph with a question, ignoring the semantic similarity of them. In this paper, we propose a simple but effective BERT-based semantic matching ranker to compute the semantic similarity between the paragraph and given question, in which three different representation aggregation functions are explored. To validate the generalized performance of our ranker, we conduct a series of experiments on two public open-domain QA datasets. Experimental results demonstrate that the proposed ranker contributes significant improvements on both the ranking and the final QA performances.","PeriodicalId":414366,"journal":{"name":"Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123087225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
An Approach to the Exact Packed String Matching Problem
Michael Olson, Daniel Davis, Jae Woong Lee
{"title":"An Approach to the Exact Packed String Matching Problem","authors":"Michael Olson, Daniel Davis, Jae Woong Lee","doi":"10.1145/3443279.3443296","DOIUrl":"https://doi.org/10.1145/3443279.3443296","url":null,"abstract":"Searching data is a natural behavior of humankind and is also a fundamental operation in both industrial and academic areas. There has been research into developing sophisticated methods and techniques for string searching. The common method is to make use of prefixes and suffixes to move through the text while searching a string query in a text. It suffers from high computational complexity since it has to repeatedly search each character in the string query sequentially. In this paper, we address the difficulty and propose a novel approach that enables a parallel search of the text without indexing the text or the query which is needed for sequential search. The proposed approach utilizes an XNOR operation in conjunction with the shift method to find the instance of a query. We validated it through experiments finding improvement against other methods.","PeriodicalId":414366,"journal":{"name":"Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval","volume":"122 24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128345554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Ranking Hotel Reviews Based on User's Aspects Importance and Opinions
Diego Bonesso, Karin Becker, François Portet, C. Labbé
{"title":"Ranking Hotel Reviews Based on User's Aspects Importance and Opinions","authors":"Diego Bonesso, Karin Becker, François Portet, C. Labbé","doi":"10.1145/3443279.3443280","DOIUrl":"https://doi.org/10.1145/3443279.3443280","url":null,"abstract":"Online product reviews have become fundamental to users' purchasing decisions. Many websites provide rating-based ranking of entities, but analyzing the set of textual reviews is still time-consuming. Indeed, each user (reader) must build his/her own judgment from the set of reviews of the other users (writers), who might not have the same expectations and needs. To speed up this process, work have proposed more personalized rankings, which are restricted to the writer's perspective. In this work, we present an approach to rank reviews of an entity of interest, a hotel, based on the reader's profile. The method extracts a profile from free-text reviews and uses it to assess the degree of relevance of each review to rank according to the user's interests. The results obtained in the experiment exhibit a Mean Reciprocal Rank (MRR) of 0.72%, which is higher than comparable approaches of the literature. This paper also emphasizes the lack of available material to undertake such research, and sketches a methodology for evaluation.","PeriodicalId":414366,"journal":{"name":"Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121032903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An Adaptive Filtering Technique for Segmentation of Tuberculosis in Microscopic Images
Z. Khan, Waseem Ullah, Amin Ullah, Seungmin Rho, Mi Young Lee, S. Baik
{"title":"An Adaptive Filtering Technique for Segmentation of Tuberculosis in Microscopic Images","authors":"Z. Khan, Waseem Ullah, Amin Ullah, Seungmin Rho, Mi Young Lee, S. Baik","doi":"10.1145/3443279.3443283","DOIUrl":"https://doi.org/10.1145/3443279.3443283","url":null,"abstract":"Tuberculosis disease is one of the most leading cause of fatality worldwide. however, it can be reduced if diagnosed and treated on time. Normally the method name Ziehl-Neelsen is used to diagnose Tuberculosis and a human specialist analyzes it using an optical microscope to find tuberculosis bacilli. Since this process is time-consuming, an automatic bacilli recognition system allows the diagnosis process faster. In this work, an automatic tuberculosis bacilli segmentation system is developed. Initially, the input image is preprocessed by applying adaptive mean filter (AMD) to remove impulse noise and power law transformation to enhance the image then transform the color space from RGB to HSV. The HSV color space is more suitable for image processing because each element is isolated in it. Next, we employed the multi-level thresholding algorithm to correctly segment each bacillus in the input sample and improved 2.13% accuracy when compared to state-of-the-art techniques.","PeriodicalId":414366,"journal":{"name":"Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128589779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
A Classification on Different Aspects of User Modelling in Personalized Web Search
Sara Abri, Rayan Abri, S. Cetin
{"title":"A Classification on Different Aspects of User Modelling in Personalized Web Search","authors":"Sara Abri, Rayan Abri, S. Cetin","doi":"10.1145/3443279.3443291","DOIUrl":"https://doi.org/10.1145/3443279.3443291","url":null,"abstract":"In the context of personalization has recently been doing a lot of researches and applications. A common component of all research in the field of personalization is user modeling that also called user profiling. The main work of the user modeling in the field of personalization in the first step is capturing information about users and in the next step is to identify the user's preferences and interests and efficient use this information for increasing the retrieval performance. How to collect information about the user, user model structures and the used techniques to create a user model is different in each of personalized applications. In the previous studies, there was not a complete classification on the major dimensions of user models. In this research, we present an appropriate classification on the major dimensions of user models. We aim to present a survey on applications and techniques of user modeling and make a classification of user modeling by considering the existing literature and research and we hope can help to researchers in better-developing on the area.","PeriodicalId":414366,"journal":{"name":"Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122035567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Identifying the Motives of Using Weibo from Digital Traces
Bi Li, Boyu Chen, Yan Wu, Juan Wang, Xueming Yan, Yahui Yang
{"title":"Identifying the Motives of Using Weibo from Digital Traces","authors":"Bi Li, Boyu Chen, Yan Wu, Juan Wang, Xueming Yan, Yahui Yang","doi":"10.1145/3443279.3443294","DOIUrl":"https://doi.org/10.1145/3443279.3443294","url":null,"abstract":"Billions of users around the world are using social networking sites (SNS) to express everyday thoughts and feelings. Investigating motives of using SNS is attracting scholarly attention. The common way to assess users' motives is analyzing data from self-report questionnaires. The current research aims to identifying undergraduate students' motives of using Weibo from digital traces, in an effort to alleviate the distortion in self-report data. The term frequency-inverse document frequency (Tf-idf) was employed to obtain key terms and their weights in digital traces crawled from Weibo. Top frequent terms, based on Tf-idf, indicate that entertainment, information seeking and sharing, and alleviating life stress are among the major motives of using Weibo. This study underscores the feasibility and importance of directly detecting motives of using SNS from digital traces.","PeriodicalId":414366,"journal":{"name":"Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131604837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0