Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval: Latest Articles

A Novel Feature Hashing With Efficient Collision Resolution for Bag-of-Words Representation of Text Data
Bobby A. Eclarin, Arnel C. Fajardo, Ruji P. Medina
{"title":"A Novel Feature Hashing With Efficient Collision Resolution for Bag-of-Words Representation of Text Data","authors":"Bobby A. Eclarin, Arnel C. Fajardo, Ruji P. Medina","doi":"10.1145/3278293.3278301","DOIUrl":"https://doi.org/10.1145/3278293.3278301","url":null,"abstract":"Text Mining is widely used in many areas transforming unstructured text data from all sources such as patients' record, social media network, insurance data, and news, among others into an invaluable source of information. The Bag Of Words (BoW) representation is a means of extracting features from text data for use in modeling. In text classification, a word in a document is assigned a weight according to its frequency and frequency between different documents; therefore, words together with their weights form the BoW. One way to solve the issue of voluminous data is to use the feature hashing method or hashing trick. However, collision is inevitable and might change the result of the whole process of feature generation and selection. Using the vector data structure, the lookup performance is improved while resolving collision and the memory usage is also efficient.","PeriodicalId":183745,"journal":{"name":"Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116790913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Multi-Attention Network for Sentiment Analysis
Tingting Du, Yunyin Huang, X. Wu, Huiyou Chang
{"title":"Multi-Attention Network for Sentiment Analysis","authors":"Tingting Du, Yunyin Huang, X. Wu, Huiyou Chang","doi":"10.1145/3278293.3278295","DOIUrl":"https://doi.org/10.1145/3278293.3278295","url":null,"abstract":"Sentiment analysis is an active research area in natural language processing. However, most existing methods use extra data such as pre-specified syntactic structure or user preference information. In this work, we propose a multiple attention network (MAN) that learns both word- and phrase-level features for sentiment analysis. MAN uses vector representation of the input sequence as target in the first attention layer to locate the words that contribute to the sentence sentiment. However, although an isolated word may indicate subjectivity, there may be insufficient context to determine sentiment orientation. We argue that the sentence sentiment often requires multiple steps of reasoning. Thus, we apply the second attention layer to explore the phrase information around the keyword. We experiment our method on three benchmark datasets and the results show that our model achieves state-of-the-art performance without any extra data. The visualization of the attention layers illustrates the effectiveness of our attention based model.","PeriodicalId":183745,"journal":{"name":"Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121731537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Automatic Recovery of Broken Links Using Information Retrieval Techniques
Shoaib Hayat, Yue Li, Muhammad Riaz
{"title":"Automatic Recovery of Broken Links Using Information Retrieval Techniques","authors":"Shoaib Hayat, Yue Li, Muhammad Riaz","doi":"10.1145/3278293.3278296","DOIUrl":"https://doi.org/10.1145/3278293.3278296","url":null,"abstract":"World Wide Web is very dynamic in its nature and we experienced changes in web pages every day. Web pages are updated, deleted, created or moved from one domain to another. Due to its dynamic nature often the web users experience broken links. Internet has been suffering from broken links problem despite of its contemporary services. Broken links are frequent problem occurring in web domain. Sometimes the page which was pointing from another page has been disappeared forever or moved to some other location. There are numerous reasons behind broken links. Some of these are permanently deleted Web pages, or modification made in Web pages causes broken links or the link of target page has some errors in code of source page. Researchers proposed several techniques in order to recover the broken links or at least retrieve some relevant pages. Number of sources have been used in research community for broken links recover like URL of target page, Anchor text, surround text near to anchor text and text in the source pages. All these sources of information are useful for retrieving the candidate pages relevant to broken links. System returns a ranked list of highly relevant candidate pages on submitting a query which has been extracted from different sources listed above. Previous work relies on TF (Term Frequency) or DF (Document Frequency) weights for extracting term from anchor text and full text of page containing missing links but not showed good results which cause the problem of retrieving similar pages for multiple broken links. In this paper we investigate the use of term proximity (position) relationship between the terms of anchor text and full text in order to extract relevant (good and bad) terms through classification model. This solves the problem by providing different query terms for multiple broken links and also increases the effectiveness as the terms that are proximity close to each other reveal more relevance.","PeriodicalId":183745,"journal":{"name":"Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129837616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Automated Teller Machines Location's Information Retrieval Search Engine Using Suffix Tree Clustering Technique
Gil L. Fabon, Arnel C. Fajardo, Ruji P. Medina
{"title":"Automated Teller Machines Location's Information Retrieval Search Engine Using Suffix Tree Clustering Technique","authors":"Gil L. Fabon, Arnel C. Fajardo, Ruji P. Medina","doi":"10.1145/3278293.3278298","DOIUrl":"https://doi.org/10.1145/3278293.3278298","url":null,"abstract":"In this paper, the researcher presented the Automated Teller Machines Location's Information Retrieval Search Engine using Suffix Tree Clustering Technique. This new offering is very helpful to the day to day evolving demands of money transactions of the bank customers, especially during unexpected Automated Teller Machines failures. With an application of Suffix Tree Clustering Technique, the proposed Automated Teller Machines Location Information Retrieval Search Engine is not only limited to produce more efficient, accurate and precise Automated Teller Machines location search results than the current bank existing system. It also provides easier access focused control in information dissemination to provides 24/7 access to the list of ATM location booth with the corresponding ATM information's according to the areas of familiarity of the bank customers. It's also conveying innovation to the bank online services to exploit the provisions of Online and Offline ATM status transparency to the bank customers and avoid the bank customers to other banks high transactions fees. This claim is reinforced by 100% average effectiveness of precision, recall and F-measure experimental results on bank ATM locations data set. In spite of the rapid growth of improving and modernizing the Automated Teller Machines services there is still a lot ideas in fetching new offerings to modified Automated Teller Machines Location's Information Retrieval Search Engine in the country to dig up more accessible ATM location with a timely manner as contribution to the fast-paced era of modernization and technology.","PeriodicalId":183745,"journal":{"name":"Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval","volume":"58 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120809378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Novel Learning Rate Decay Function of Kohonen Self-Organizing Maps Using the Exponential Decay Average Rate of Change for Image Clustering
Edwin F. Galutira, Arnel C. Fajardo, Ruji P. Medina
{"title":"A Novel Learning Rate Decay Function of Kohonen Self-Organizing Maps Using the Exponential Decay Average Rate of Change for Image Clustering","authors":"Edwin F. Galutira, Arnel C. Fajardo, Ruji P. Medina","doi":"10.1145/3278293.3278299","DOIUrl":"https://doi.org/10.1145/3278293.3278299","url":null,"abstract":"Clustering requires efficient selection of similarities among sample vectors and true clustering capability of the algorithm. The Kohonen Self-Organizing Maps is the most preferred unsupervised Artificial Neural Network clustering algorithm for high-dimensional or multi-dimensional data. This study introduces a new way of improving the clustering capability of the algorithm by enhancing its learning rate decay function to decrease its learning rate gradually as the training goes on through the use of the Exponential Decay Average Rate of Change. The new function allows the Enhanced Kohonen Self-Organizing Maps algorithm to converge to the minimum producing a more robust clustered datasets. The enhanced algorithm and the conventional algorithm were applied for image clustering, and the EKSOM remarkably outperformed the clustering capability of the KSOM. The introduction of EDARC function paves the way to explore the clustering and classification capability of KSOM further.","PeriodicalId":183745,"journal":{"name":"Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132151975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A Combination of Text Mining Techniques for Relevant Literature Search and Extractive Summarization
Thiptanawat Phongwattana, Jonathan H. Chan
{"title":"A Combination of Text Mining Techniques for Relevant Literature Search and Extractive Summarization","authors":"Thiptanawat Phongwattana, Jonathan H. Chan","doi":"10.1145/3278293.3278300","DOIUrl":"https://doi.org/10.1145/3278293.3278300","url":null,"abstract":"Over the past few years, the amount of research papers published has dramatically increased. Consequently, researchers spend a lot of time reviewing relevant literature in order to better understand their domain of interest and keep up with new developments. After doing literature reviews in the area of text mining, we found many works proposing the means of sentence representation in machine learning for finding sentence similarity. These include average bag of words, weight average word vectors, bag of n-grams, and matrix-vector operations. However, these techniques are limited in word ordering and semantic analysis. This paper proposes a framework that combines two text mining techniques, paragraph vectors and TextRank, for the selection of relevant research paper and extractive summarization, respectively. Our training corpus includes over 20 million research papers. The aim of this work is to build a supplementary research tool that assists researchers in saving time conducting literature reviews. As the result, we can rank all relevant research papers potentially within the corpus, and utilize the outputs in our literature reviews. Moreover, the tool can extract all potential keywords in a single task as well.","PeriodicalId":183745,"journal":{"name":"Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122890646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Domain-Specific Ontology Concept Extraction and Hierarchy Extension
Grace Zhao, Xiaowen Zhang
{"title":"Domain-Specific Ontology Concept Extraction and Hierarchy Extension","authors":"Grace Zhao, Xiaowen Zhang","doi":"10.1145/3278293.3278302","DOIUrl":"https://doi.org/10.1145/3278293.3278302","url":null,"abstract":"The domain-specific vernaculars and notations have been a hurdle to automatic ontology building and augmentation, since most of the ontology learning methods are essentially based on the natural language studies and lexicosyntactic pattern explorations. This paper proposes two robust approaches to ontology hierarchical enhancement, in particular, adding new terms to the ontology graph. We designed our learning models from a computational vantage point, examining the inter-relationship between documents, ontology dictionary terms, and the graph structure of the seed ontology. We then take advantage of late studies of neural networks and machine learning to perform classification over the inter-related data, and insert the new term at the most desirable nodal place on the domain ontology graph.","PeriodicalId":183745,"journal":{"name":"Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114503143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Classification of Emoji Categories from Tweet Based on Deep Neural Networks
Kazuyuki Matsumoto, Minoru Yoshida, K. Kita
{"title":"Classification of Emoji Categories from Tweet Based on Deep Neural Networks","authors":"Kazuyuki Matsumoto, Minoru Yoshida, K. Kita","doi":"10.1145/3278293.3278306","DOIUrl":"https://doi.org/10.1145/3278293.3278306","url":null,"abstract":"In this paper, we describe the sentiment analysis method from tweets based on emoji's category. Many of existing study about sentiment analysis focused on the emotional expressions included in sentence. However, because there are various kinds of emotional expressions, such as Internet slang, it cannot be constructed that the fixed emotional expression dictionary. The most of the methods based on corpus and machine learning, its performance is quite depended on the quality of annotation. Therefore, we attempt to use categories which are expressed by emoji as sentiment label instead of manually annotated labels. Our proposed method uses automatically annotated category label by emoji which is annotated to sentence, and train word embedding feature by deep neural networks. As the result of the experiment, our proposed method overcome the simple word feature based method.","PeriodicalId":183745,"journal":{"name":"Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131494300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Computational Pragmatics: A Survey in China and the World
Xianbo Li, Zhixin Ma
{"title":"Computational Pragmatics: A Survey in China and the World","authors":"Xianbo Li, Zhixin Ma","doi":"10.1145/3278293.3278304","DOIUrl":"https://doi.org/10.1145/3278293.3278304","url":null,"abstract":"The definition and scope of computational linguistics are reconstructed to distinguish it with other interdisciplinary disciplines like mathematical pragmatics, formal pragmatics, corpus pragmatics, etc. Meanwhile, the position of computational pragmatics and its relationship with other disciplines are displayed in this paper. Then, we reviewed the research status of computational pragmatics in China and the world, finding out the current study of computational pragmatics is still in its infancy, awaiting further research in four aspects: discipline construction, philosophy and methodology, fundamental theory, and application. Literatures indicate that computational pragmatics plays a vital role in both theory for a lot of disciplines and applications in our life.","PeriodicalId":183745,"journal":{"name":"Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133602869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Implementation of GA-Based Feature Selection in the Classification and Mapping of Disaster-Related Tweets
Ian P. Benitez, Ariel M. Sison, Ruji P. Medina
{"title":"Implementation of GA-Based Feature Selection in the Classification and Mapping of Disaster-Related Tweets","authors":"Ian P. Benitez, Ariel M. Sison, Ruji P. Medina","doi":"10.1145/3278293.3278297","DOIUrl":"https://doi.org/10.1145/3278293.3278297","url":null,"abstract":"The extracted features from Twitter messages were transformed into feature vector matrix for which feature selection using an improved Genetic Algorithm was applied. The features selected were used to train and test the classifiers. The evaluation showed the effectiveness of the implemented feature selection method in the dimensionality reduction of the feature space and in increasing the accuracy of Multinomial Naive Bayes. Moreover, a web-based prototype utilizing the model was developed and was used to analyze tweet data pertaining to natural disasters in the Philippines. The prototype exhibited potential to harness the capability of social media as a tool in helping the affected community in times of natural crisis. This work may spark ideas for a more advanced development of IT-based disaster management applications.","PeriodicalId":183745,"journal":{"name":"Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124562392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6