2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2... Latest Publications
{"title":"Modeling of Clinical Mammography Recognition","authors":"Kuo-Chung Chu, Po-Yao Tsai, Tien-Yu Chang, Yu-Shu Wu","doi":"10.1109/IRI49571.2020.00068","DOIUrl":"https://doi.org/10.1109/IRI49571.2020.00068","url":null,"abstract":"Breast cancer screening enables early detection and treatment, and mammography is one of the most popular screening methods. Interpretation of mammography images depends on the radiologist, but human interpretation has its limitations. Recently, in support of precision medicine, deep learning has been applied to medical images to reduce the risk in interpreting breast lesion types (BIRADS, Breast Imaging Reporting and Data System, divided into categories 0 to 6). This study proposes a mammography recognition model based on a deep learning method to support clinical diagnosis of breast cancer. The model aims to improve medical quality.","PeriodicalId":93159,"journal":{"name":"2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2...","volume":"123 1","pages":"409-411"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75103865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tyler Westland, Nan Niu, R. Jha, David Kapp, T. Kebede
{"title":"Relating the Empirical Foundations of Attack Generation and Vulnerability Discovery","authors":"Tyler Westland, Nan Niu, R. Jha, David Kapp, T. Kebede","doi":"10.1109/IRI49571.2020.00014","DOIUrl":"https://doi.org/10.1109/IRI49571.2020.00014","url":null,"abstract":"Automatically generating exploits for attacks receives much attention in security testing and auditing. However, little is known about the continuous effect of automatic attack generation and detection. In this paper, we develop an analytic model to understand the cost-benefit tradeoffs in light of the process of vulnerability discovery. We develop a three-phased model, suggesting that the cumulative malware detection has a productive period before the rate of gain flattens. As the detection mechanisms co-evolve, the gain will likely increase. We evaluate our analytic model by using an anti-virus tool to detect the thousands of Trojans automatically created. The anti-virus scanning results over five months show the validity of the model and point out future research directions.","PeriodicalId":93159,"journal":{"name":"2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2...","volume":"87 1","pages":"37-44"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72636894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lydia Bouzar-Benlabiod, S. Rubin, Kahina Belaidi, Nour ElHouda Haddar
{"title":"RNN-VED for Reducing False Positive Alerts in Host-based Anomaly Detection Systems","authors":"Lydia Bouzar-Benlabiod, S. Rubin, Kahina Belaidi, Nour ElHouda Haddar","doi":"10.1109/IRI49571.2020.00011","DOIUrl":"https://doi.org/10.1109/IRI49571.2020.00011","url":null,"abstract":"Host-based Intrusion Detection Systems (HIDS) are often based on anomaly detection. Several studies deal with anomaly detection by analyzing the system-call traces and get good detection rates but also a high rate of false positives. In this paper, we propose a new anomaly detection approach applied on the system-call traces. The normal behavior learning is done using a Sequence to sequence model based on a Variational Encoder-Decoder (VED) architecture that integrates Recurrent Neural Networks (RNN) cells. We exploit the semantics behind the invoking order of system-calls that are then seen as sentences. A preprocessing phase is added to structure and optimize the model input-data representation. After the learning step, a one-class classification is run to categorize the sequences as normal or abnormal. The architecture may be used for predicting abnormal behaviors. The tests are achieved on the ADFA-LD dataset.","PeriodicalId":93159,"journal":{"name":"2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2...","volume":"26 1","pages":"17-24"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76781284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yaohua Chang, Jin Chen, Tyler Franklin, Lei Zhang, Arber Ruci, Hao Tang, Zhigang Zhu
{"title":"Multimodal Information Integration for Indoor Navigation Using a Smartphone","authors":"Yaohua Chang, Jin Chen, Tyler Franklin, Lei Zhang, Arber Ruci, Hao Tang, Zhigang Zhu","doi":"10.1109/IRI49571.2020.00017","DOIUrl":"https://doi.org/10.1109/IRI49571.2020.00017","url":null,"abstract":"We propose an accessible indoor navigation application. The solution integrates information of floor plans, Bluetooth beacons, Wi-Fi/cellular data connectivity, 2D/3D visual models, and user preferences. Hybrid models of interiors are created in a modeling stage with Wi-Fi/cellular data connectivity, beacon signal strength, and a 3D spatial model. This data is collected, as the modeler walks through the building, and is mapped to the floor plan. Client-server architecture allows scaling to large areas by lazy-loading models according to beacon signals and/or adjacent region proximity. During the navigation stage, a user with the designed mobile app is localized within the floor plan, using visual, connectivity, and user preference data, along an optimal route to their destination. User interfaces for both modeling and navigation use visual, audio, and haptic feedback for targeted users. While the current pandemic event precludes our user study, we describe its design and preliminary results.","PeriodicalId":93159,"journal":{"name":"2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2...","volume":"68 1","pages":"59-66"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78328923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wrapping a NoSQL Datastore for Stream Analytics","authors":"Khalid Mahmood, Kjell Orsborn, T. Risch","doi":"10.1109/IRI49571.2020.00050","DOIUrl":"https://doi.org/10.1109/IRI49571.2020.00050","url":null,"abstract":"With the advent of the Industrial Internet of Things (IIoT) and Industrial Analytics, numerous application scenarios emerge, where business and mission-critical decisions depend upon large scale analytics of sensor streams. However, very large volumes of data from data streams generated at a high rate pose substantial challenges in providing scalable analytics from existing Database Management Systems (DBMS). While scalability can be provided by high-performance distributed datastores, due to the simple query operations, access to high-level query-based data analytics is usually limited. This work combines high-level query-based data analytics capabilities with high-performance distributed scalability by applying a wrapper-mediator approach. The Amos II extensible main-memory DBMS provides online query processing data analytics engine in front of the MongoDB distributed NoSQL datastore to support large-scale distributed data analytics over persisted data streams. Thus, the implemented system enables query-based online data stream analytics over persisted data streams stored/logged in distributed NoSQL datastores.","PeriodicalId":93159,"journal":{"name":"2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2...","volume":"40 1","pages":"301-305"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79886938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kimberley Hemmings-Jarrett, Terryann Barnett, Julian Jarrett, M. Blake, Denise E. Agosto
{"title":"Quality not Quantity! A Qualitative Evaluation and Proposal for Understanding the Depth of Audience “Knowledge” Post Data Extraction","authors":"Kimberley Hemmings-Jarrett, Terryann Barnett, Julian Jarrett, M. Blake, Denise E. Agosto","doi":"10.1109/IRI49571.2020.00031","DOIUrl":"https://doi.org/10.1109/IRI49571.2020.00031","url":null,"abstract":"Knowledge is defined as…the result of machine extracted patterns; humans making sense of their environment; information generated and aggregated from software services or as the lowest form of human cognition. Different perspectives, different domains, but one concept. Information scientists are often concerned with retrieving knowledge from data sources and sharing that knowledge with concerned stakeholders; with such differing views on what qualifies as knowledge a cross-domain approach might prove beneficial. This work is a qualitative assessment of the layers of knowledge intended to bridge the gap between the analyst and their intended or unintended audiences. It examines the benefit of abstracting concepts used in the education discipline to justify including a post-evaluation stage to the Knowledge Discovered through Databases (KDD) framework. It also intends to promote awareness of the various human cognitive capacities and provide a useful approach for communicating and evaluating machine-extracted knowledge that supports higher order thinking.","PeriodicalId":93159,"journal":{"name":"2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2...","volume":"175 1","pages":"164-171"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76965553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic Data Understanding with Character Level Learning","authors":"Michael J. Mior, K. Pu","doi":"10.1109/IRI49571.2020.00043","DOIUrl":"https://doi.org/10.1109/IRI49571.2020.00043","url":null,"abstract":"Databases are growing in size and complexity. With the emergence of data lakes, databases have become open, fast evolving and highly heterogeneous. Understanding the complex relationships among different entity types in such scenarios is both challenging and necessary to data scientists. We propose an approach that utilizes a convolutional neural network to learn patterns associated with each entity type in the database at the character level. We demonstrate that the learned character-level patterns can capture sufficient semantic information for many useful applications including data lake schema exploration, and interactive data cleaning.","PeriodicalId":93159,"journal":{"name":"2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2...","volume":"48 1","pages":"253-258"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76290397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IRI 2020 Commentary","authors":"","doi":"10.1109/iri49571.2020.00001","DOIUrl":"https://doi.org/10.1109/iri49571.2020.00001","url":null,"abstract":"","PeriodicalId":93159,"journal":{"name":"2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2...","volume":"86 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90052049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IRI 2020 Index","authors":"","doi":"10.1109/iri49571.2020.00077","DOIUrl":"https://doi.org/10.1109/iri49571.2020.00077","url":null,"abstract":"","PeriodicalId":93159,"journal":{"name":"2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2...","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88869542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lacramioara Mazilu, N. Paton, Nikolaos Konstantinou, A. Fernandes
{"title":"Fairness in Data Wrangling","authors":"Lacramioara Mazilu, N. Paton, Nikolaos Konstantinou, A. Fernandes","doi":"10.1109/IRI49571.2020.00056","DOIUrl":"https://doi.org/10.1109/IRI49571.2020.00056","url":null,"abstract":"At the core of many data analysis processes lies the challenge of properly gathering and transforming data. This problem is known as data wrangling, and it can become even more challenging if the data sources that need to be transformed are heterogeneous and autonomous, i.e., have different origins, and if the output is meant to be used as a training dataset, thus, making it paramount for the dataset to be fair. Given the rise in usage of artificial intelligence (AI) systems for a variety of domains, it is necessary to take into account fairness issues while building these systems. In this paper, we aim to bridge the gap between gathering the data and making the datasets fair by proposing a method for performing data wrangling while considering fairness. To this end, our method comprises a data wrangling pipeline whose behaviour can be adjusted through a set of parameters. Based on the fairness metrics run on the output datasets, the system plans a set of data wrangling interventions with the aim of lowering the bias in the output dataset. The system uses Tabu Search to explore the space of candidate interventions. In this paper we consider two potential sources of dataset bias: those arising from unequal representation of sensitive groups and those arising from hidden biases through proxies for sensitive attributes. The approach is evaluated empirically.","PeriodicalId":93159,"journal":{"name":"2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2...","volume":"23-24 1","pages":"341-348"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89368482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}