2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2...
Latest articles

A Study of Deep Learning for Factoid Question Answering System
Min-Yuh Day, Yu-Ling Kuo
Abstract: End-to-end question answering systems have attracted considerable attention in the artificial intelligence research community in recent years. In this paper, we propose an integrated deep learning model for a factoid question answering system. This study uses the Delta Reading Comprehension Dataset (DRCD) to build a factoid question answering system and combines question and answer classification, evaluated with exact match (EM) and F1 score. The study examines whether this combination can increase the EM ratio and whether the expected answer type can effectively increase answer accuracy. To refine the approach, a question answering system based on the pre-trained BERT model is applied to the DRCD dataset together with expected answer type analysis and comparison. The contribution of this paper is a system architecture for factoid question answering (QA) using BERT with question expected answer type (Q-EAT) and answer type classification (AT) models. The findings confirm that classifying questions and answers can improve the EM ratio: when the expected answer type of the question and the classified answer type are the same, the EM prediction accuracy of the question answering system improves.
DOI: 10.1109/IRI49571.2020.00070 (https://doi.org/10.1109/IRI49571.2020.00070)
Pages: 419-424. Published: 2020-08-01.
Citations: 6
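The exact match (EM) and F1 metrics named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' evaluation script: the function names are hypothetical, and whitespace tokenization is an assumption (a Chinese dataset such as DRCD would more plausibly be scored per character).

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> int:
    """Return 1 if the two answers are identical after trimming and lowercasing."""
    return int(prediction.strip().lower() == reference.strip().lower())


def f1_score(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted answer and a reference answer."""
    # Assumption: tokens are whitespace-separated words.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty answers count as a match; one empty answer is a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

EM rewards only answers that match the reference exactly, while F1 gives partial credit for overlapping tokens, which is why the two are usually reported together.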
The Democratization of Machine Learning Features
Jayesh Patel
Abstract: In the Machine Age, machine learning (ML) has become a secret sauce for business success. ML applications are not limited to autonomous cars or robotics but are widely used in almost all sectors, including finance, healthcare, entertainment, government systems, telecommunications, and many others. Lacking an enterprise ML strategy, many enterprises still repeat tedious steps and spend most of their time massaging the required data. Big data lakes and data democratization have made it easier to access a variety of data. Despite this and decent advances in ML, engineers still spend significant time on data cleansing and feature engineering, and most of these steps are repeated from one project to the next. The result is near-identical features with small variations, which lead to inconsistent results when training and testing ML applications. This often stretches the time to go-live and increases the number of iterations needed to ship a final ML application. Sharing best practices and the best features not only saves time but also helps to jumpstart ML application development. The democratization of ML features is a powerful way to share useful features, to reduce the time to go-live, and to enable rapid ML application development. It is one of the emerging trends in enterprise ML application development, and this paper presents details of one way to achieve ML feature democratization.
DOI: 10.1109/IRI49571.2020.00027 (https://doi.org/10.1109/IRI49571.2020.00027)
Pages: 136-141. Published: 2020-08-01.
Citations: 3
Addressing Imbalanced Data Problem with Generative Adversarial Network For Intrusion Detection
Ibrahim Yilmaz, Rahat Masum, Ambareen Siraj
Abstract: Machine learning techniques help to understand underlying patterns in datasets in order to develop defense mechanisms against cyber attacks. The Multilayer Perceptron (MLP) is a machine learning technique used to distinguish attack data from benign data. However, it is difficult to construct an effective model when imbalances in the dataset prevent proper classification of attack samples. In this research, we first conduct data wrangling on the UGR’16 dataset. This step helps to prepare a test set from the original dataset so that the neural network model can be trained effectively. We experimented with a series of inputs of varying sizes (i.e., 10,000, 50,000, and 1 million) to observe how the accuracy of the MLP neural network model varies with the distribution of features. We then use a Generative Adversarial Network (GAN) model that produces samples of different attack labels (e.g., blacklist, anomaly spam, SSH scan) to balance the dataset. These samples are generated from data in the UGR’16 dataset. Further experiments with the MLP neural network model show that a balanced attack sample dataset, made possible by the GAN, produces more accurate results than an imbalanced one.
DOI: 10.1109/IRI49571.2020.00012 (https://doi.org/10.1109/IRI49571.2020.00012)
Pages: 25-30. Published: 2020-08-01.
Citations: 28
Automated Filtering of Eye Gaze Metrics from Dynamic Areas of Interest
Gavindya Jayawardena, S. Jayarathna
Abstract: Eye-tracking experiments usually involve areas of interest (AOIs) for the analysis of eye gaze data, as they can reveal potential cognitive load and attentional patterns, yielding interesting results about participants. While there are tools to define AOIs and extract eye movement data for the analysis of gaze measurements, they may require users to manually draw AOI boundaries on eye-tracking stimuli or to use markers to define AOIs in space in order to generate AOI-mapped gaze locations. In this paper, we introduce a novel method to dynamically filter eye movement data from AOIs for the analysis of advanced eye gaze metrics. We incorporate pre-trained object detectors for offline detection of dynamic AOIs in dynamic eye-tracking stimuli such as video streams. We present our implementation and evaluation of object detectors to find the best one to integrate into a real-time eye movement analysis pipeline that filters eye movement data falling within the polygonal boundaries of detected dynamic AOIs. We demonstrate the utility of our method by applying it to a publicly available dataset.
DOI: 10.1109/IRI49571.2020.00018 (https://doi.org/10.1109/IRI49571.2020.00018)
Pages: 67-74. Published: 2020-08-01.
Citations: 4
Using a Deep Learning Model, Content Features, and Author Metadata to Recommend Research Papers
Si-Hong Lam, Eric Brewer, Yiu-Kai Ng
Abstract: According to Canadian Science Publishing, approximately 2.5 million scientific papers are published each year. This huge volume of publications can be attributed to a substantial increase in the total number of academic journals, including a growing number of predatory or fake scientific journals, which yield high volumes of poor-quality research work. The effect is a jungle of journals to flip through when searching for high-quality and relevant references, whether a researcher simply needs citations or wants the latest developments and knowledge in a specific scientific area of study. Querying existing web search engines and research paper archive websites does not solve the problem, since they are ill-equipped to suggest high-quality publications that meet users’ information needs. To solve this problem, we propose an elegant research paper recommender that is unique compared with existing ones: besides considering the topics and contents of related publications, it also examines the authority and popularity of each publication to ensure its quality. An empirical study shows that our recommender outperforms existing research paper recommenders and contributes to the design of search for relevant publications.
DOI: 10.1109/IRI49571.2020.00045 (https://doi.org/10.1109/IRI49571.2020.00045)
Pages: 265-270. Published: 2020-08-01.
Citations: 1
DataOps for Societal Intelligence: a Data Pipeline for Labor Market Skills Extraction and Matching
D. Tamburri, W. Heuvel, Martin Garriga
Abstract: Big Data analytics supported by AI algorithms enables skills localization and retrieval in the context of a labor market intelligence problem. We formulate and solve this problem through specific DataOps models, bringing data sources from administrative and technical partners in several countries into cooperation and creating shared knowledge to support policy and decision-making. We then focus on the critical task of extracting skills from resumes and vacancies using state-of-the-art machine learning models. We showcase preliminary results of applied machine learning on real data from the employment agencies of the Netherlands and the Flemish region of Belgium. The final goal is to match these skills to standard ontologies of skills, jobs, and occupations.
DOI: 10.1109/IRI49571.2020.00063 (https://doi.org/10.1109/IRI49571.2020.00063)
Pages: 391-394. Published: 2020-08-01.
Citations: 16
Studying the impact of streetlights on street crime rate using geo-statistics
Srikanth Vadlamani, M. Hashemi
Abstract: A lack of adequate streetlights likely affects public safety, particularly in neighborhoods with higher crime rates. Several researchers have studied the influence of streetlights on crime. However, those studies compare the crime rate during the day and not at night, or explore crime patterns in socially disorganized communities. This study focuses on detecting patterns of nighttime street crime near broken or due-for-repair streetlights. Historical crime data and data on city streetlight service requests are studied in this project. The analytical approaches include a least squares linear regression model, applied to determine the relationship between streetlight and crime data, and Ripley’s K function, used to detect crime clusters near broken streetlights. Moran’s I index is used to measure the spatial correlation between broken streetlights and crime rates. Optimized hotspot analysis is used to predict crime locations. This study found that broken streetlights cause increasing trends of crime near them. The large positive value of Moran’s I index underscored the statistically significant clustering of street crimes around broken streetlights.
DOI: 10.1109/IRI49571.2020.00040 (https://doi.org/10.1109/IRI49571.2020.00040)
Pages: 231-236. Published: 2020-08-01.
Citations: 1
Approximate Matching of Spatiotemporal RDF Data by Path
Jiajia Lu, Xiaofeng Di, Luyi Bai
Abstract: With an ever-increasing amount of RDF data carrying time and space features, efficiently querying spatiotemporal RDF data over RDF datasets has become an important task. In this paper, spatiotemporal RDF data contains time features, space features, and text features, which are processed separately to facilitate querying. A graph decomposition algorithm and a query path combination algorithm are designed. The query graph with spatiotemporal features is split into multiple paths, and each path in the query graph is then used to search for the best matching path among the path sets contained in the data graph. Because exact matches do not always exist, approximate matching is performed according to an evaluation function to find the best matching path. Finally, all the best paths are combined to generate a matching result graph. Our approach is evaluated in terms of approximation performance and query performance. The experimental results show the effectiveness and efficiency of our method.
DOI: 10.1109/IRI49571.2020.00032 (https://doi.org/10.1109/IRI49571.2020.00032)
Pages: 172-179. Published: 2020-08-01.
Citations: 0
IRI 2020 Committees
Abdulhamid A. Adebayo
Committee members: Abdulhamid Adebayo, IBM T.J. Watson Research Center, USA; Abdulrhman M Alshareef, King AbdulAziz University, Saudi Arabia; Anna Squicciarini, Pennsylvania State University, USA; Arun Thapa, Tuskegee University, USA; Balaji Palanisamy, University of Pittsburgh, USA; Bharat Rawal, Pennsylvania State University, USA; Caojin Zhang, Wayne State University, USA; Chin-Wan Chung, Korea Advanced Institute of Science and Technology, South Korea; Chongyang Shi, Beijing Institute of Technology, China; Da Yan, University of Alabama at Birmingham, USA; Dalei Wu, University of Tennessee at Chattanooga, USA; Du Zhang, California State University, USA; Elisa Bertino, Purdue University, USA; Fei Zhao, University of Alabama at Birmingham, USA; Feifei Zhang, Institute of Automation, Chinese Academy of Sciences, China; Haiman Tian, Florida International University, USA; Hao Wang, Louisiana State University, USA; Hemanth Gudaparthi, University of Cincinnati, USA; Hung T Nguyen, Carnegie Mellon University, USA; Kayhan Ghafoor, Salahaddin University-Erbil, Iraq; Kouichi Sakurai, Kyushu University, Japan; Lidan Shou, Zhejiang University, China; Ling Zhou, Jiangsu University, China; Lixiao Huang, Arizona State University, USA; Maria Presa-Reyes, Florida International University, USA; Mei-Ling Shyu, University of Miami, USA; Mengjun Xie, University of Tennessee at Chattanooga, USA; Mohan Baruwal, Swinburne University of Technology, Australia; Mortada Al-Banna, University of New South Wales, Australia; Mounifah Alenazi, University of Cincinnati, USA; Mukesh Saini, Indian Institute of Technology Ropar, India; Nathalie Baracaldo, IBM Almaden Research Center, USA; Nuray Baltaci, University of Pittsburgh, USA; Omair Shafiq, Carleton University, Canada; Orhun Vural, University of Alabama at Birmingham, USA; Raj Gaire, CSIRO, Australia; Ronald Doku, Howard University, USA; Saad Sadiq, University of Miami, USA; Sachin S Shetty, Old Dominion University, USA; Samira Pouyanfar, Microsoft, USA; Sandeep Reddivari, University of North Florida, USA; Shihong Huang, Florida Atlantic University, USA; Soumyanil Banerjee, Wayne State University, USA; Taghi M. Khoshgoftaar, Florida Atlantic University, USA; Tanmay Bhowmik, Mississippi State University, USA; Tanvir Ahmed, Oracle, USA.
DOI: 10.1109/iri49571.2020.00008 (https://doi.org/10.1109/iri49571.2020.00008)
Published: 2020-08-01.
Citations: 0
Development of Sentiment Lexicon in Bengali utilizing Corpus and Cross-lingual Resources
Salim Sazzed
Abstract: Bengali, one of the most widely spoken languages, lacks tools and resources for sentiment analysis. To date, the Bengali language does not have a sentiment lexicon of its own; only translated versions of English lexica are available. In this work, we therefore focus on developing a Bengali sentiment lexicon from a large Bengali review corpus using a cross-lingual approach. To build the sentiment dictionary, we first created a Bengali corpus of around 42,000 drama reviews, of which we manually annotated around 12,000. Using a machine translation system, labeled and unlabeled Bengali review corpora, English sentiment lexica, pointwise mutual information (PMI), and supervised machine learning (ML) classifiers in different phases, we develop a Bengali sentiment lexicon of around 1,000 sentiment words. We compare the coverage of our lexicon with that of the translated English lexica on two evaluation datasets. The proposed lexicon achieves 70%-74% coverage at the document level and around 65% coverage at the word level, which is approximately a 30%-100% improvement over the translated lexica at the word level and 30%-50% at the document level. The results demonstrate that our lexicon is highly effective in recognizing sentiment in Bengali text.
DOI: 10.1109/IRI49571.2020.00041 (https://doi.org/10.1109/IRI49571.2020.00041)
Pages: 237-244. Published: 2020-08-01.
Citations: 12
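One common way to use pointwise mutual information (PMI), mentioned in the abstract above, for lexicon building is to score each word by the difference between its PMI with the positive class and its PMI with the negative class. The sketch below illustrates that general idea only; the function name, the document-level counting, and the add-one smoothing are assumptions and are not taken from the paper.

```python
import math
from collections import Counter


def pmi_polarity(labeled_reviews):
    """Score words by PMI(word, pos) - PMI(word, neg).

    labeled_reviews: iterable of (tokens, label) pairs, label in {"pos", "neg"}.
    Counts are per review (document frequency), with add-one smoothing.
    """
    class_docs = Counter()                      # reviews per class
    word_docs = {"pos": Counter(), "neg": Counter()}
    for tokens, label in labeled_reviews:
        class_docs[label] += 1
        # set() so that a word is counted once per review.
        word_docs[label].update(set(tokens))
    if class_docs["pos"] == 0 or class_docs["neg"] == 0:
        raise ValueError("need at least one review of each class")

    vocabulary = set(word_docs["pos"]) | set(word_docs["neg"])
    scores = {}
    for word in vocabulary:
        pos = word_docs["pos"][word] + 1        # add-one smoothing
        neg = word_docs["neg"][word] + 1
        # The P(word) terms of the two PMI values cancel in the difference,
        # leaving a ratio of class-conditional frequencies.
        scores[word] = math.log2(
            (pos * class_docs["neg"]) / (neg * class_docs["pos"])
        )
    return scores
```

Words with strongly positive or negative scores are candidate sentiment words; words near zero occur about equally often in both classes and would typically be left out of a lexicon.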