{"title":"SymphonyDB: A Polyglot Model for Knowledge Graph Query Processing","authors":"M. Salehpour, Joseph G. Davis","doi":"10.1109/TransAI51903.2021.00013","DOIUrl":"https://doi.org/10.1109/TransAI51903.2021.00013","url":null,"abstract":"Unlocking the full potential of Knowledge Graphs (KGs) to enable or enhance various semantic and other applications requires Data Management Systems (DMSs) to efficiently store and process the content of KGs. However, the increases in the size and variety of KG datasets as well as the growing diversity of KG queries pose efficiency challenges for the current generation of DMSs to the extent that the performance of representative DMSs tends to vary significantly across diverse query types and no single platform dominates performance. We present our extensible prototype, SymphonyDB, as an approach to addressing this problem based on a polyglot model of query processing as part of a multi-database system supported by a unified access layer that can analyze/translate individual queries just-in-time and match each to the likely best-performing DMS among Virtuoso, Blazegraph, RDF-3X, and MongoDB as representative DMSs that are included in our prototype at this time. 
The results of our experiments with the prototype over well-known KG benchmark datasets and queries point to the efficiency and consistency of its performance across different query types and datasets.","PeriodicalId":426766,"journal":{"name":"2021 Third International Conference on Transdisciplinary AI (TransAI)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123913953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of Ransomware families by Analyzing Network Traffic Using Machine Learning Techniques","authors":"May Almousa, Janet Osawere, Mohd Anwar","doi":"10.1109/TransAI51903.2021.00012","DOIUrl":"https://doi.org/10.1109/TransAI51903.2021.00012","url":null,"abstract":"The number of prominent ransomware attacks has increased recently. In this research, we detect ransomware by analyzing network traffic by using machine learning algorithms and comparing their detection performances. We have developed multi-class classification models to detect families of ransomware by using the selected network traffic features, which focus on the Transmission Control Protocol (TCP). Our experiment showed that decision trees performed best for classifying ransomware families with 99.83% accuracy, which is slightly better than the random forest algorithm with 99.61% accuracy. The experimental result without feature selection classified six ransomware families with high accuracy. On the other hand, classifiers with feature selection gave nearly the same result as those without feature selection. However, using feature selection gives the advantage of lower memory usage and reduced processing time, thereby increasing speed. 
We discovered the following ten important features for detecting ransomware: time delta, frame length, IP length, IP destination, IP source, TCP length, TCP sequence, TCP next sequence, TCP header length, and TCP initial round trip.","PeriodicalId":426766,"journal":{"name":"2021 Third International Conference on Transdisciplinary AI (TransAI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129657480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward autonomous detection of anomalous GNSS data via applied unsupervised artificial intelligence","authors":"Michael P. Dye, D. S. Stamps, Myles Mason, E. Saria","doi":"10.1109/TransAI51903.2021.00023","DOIUrl":"https://doi.org/10.1109/TransAI51903.2021.00023","url":null,"abstract":"Artificial intelligence applications within the geosciences are becoming increasingly common, yet there are still many challenges involved in adapting established techniques to geoscience data sets. Applications in the realm of volcanic hazards assessment show great promise for addressing such challenges. Here, we describe a Jupyter Notebook we developed that ingests real-time GNSS data streams from the EarthCube CHORDS (Cloud-Hosted Real-time Data Services for the geosciences) portal TZVOLCANO, applies unsupervised learning algorithms to perform automated data quality control (\"noise reduction\"), and explores autonomous detection of unusual volcanic activity using a neural network. The TZVOLCANO CHORDS portal streams real-time Global Navigation Satellite System (GNSS) positioning data in 1-second intervals from the TZVOLCANO network, which monitors the active volcano Ol Doinyo Lengai in Tanzania, through UNAVCO’s real-time GNSS data services. UNAVCO’s real-time data services provide near-real-time positions processed by the Trimble Pivot system. The positioning data (latitude, longitude, and height) are imported into this Jupyter Notebook in user-defined time spans. The positioning data are then collected in sets by the Jupyter Notebook and processed to extract a useful calculated variable in preparation for the machine learning algorithms, of which we choose the vector magnitude. Unsupervised K-means and Gaussian Mixture machine learning algorithms are then utilized to locate and remove data points (\"filter\") that are likely caused by noise and unrelated to volcanic signals. 
We find that both the K-means and Gaussian Mixture machine learning algorithms perform well at identifying regions of high noise within tested GNSS data sets, but the Gaussian Mixtures approach performs better. The filtered data are then used to train an artificial intelligence neural network that predicts volcanic deformation. Our Jupyter Notebook has the potential to be used for detecting potentially hazardous volcanic activity in the form of rapid vertical or horizontal displacement of the Earth’s surface.","PeriodicalId":426766,"journal":{"name":"2021 Third International Conference on Transdisciplinary AI (TransAI)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128762884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automating the Process of Distinguishing Marketable Apples","authors":"M. Endo, P. Kawamoto","doi":"10.1109/TransAI51903.2021.00034","DOIUrl":"https://doi.org/10.1109/TransAI51903.2021.00034","url":null,"abstract":"The effective transfer of information accumulated from years of experience to following generations is a frequent challenge for farmers across the world. This report describes a work in progress which aims to preserve and apply such knowledge using machine learning techniques to help local apple farmers who face the same problem in fruit sorting processes, known as \"senka\" in Japanese. The process of identifying scratches, bruising, or other signs of illness in apples is typically carried out manually by only a few experienced farmers who reached their levels of expertise after many years of training. By allowing a deep learning software model to study a sufficient sample of images of the fruit sorted by veteran farmers, we aim to develop an automatic process for distinguishing marketable and non-marketable apples automatically and report the results of preliminary experiments which reached approximately 80% classification accuracy.","PeriodicalId":426766,"journal":{"name":"2021 Third International Conference on Transdisciplinary AI (TransAI)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129299923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proteins-Based Circuits in an Intelligent Internet of Bio-Nano Things Network for Molecular Diagnostic of Renal Damage","authors":"H. Nieto-Chaupis","doi":"10.1109/TransAI51903.2021.00020","DOIUrl":"https://doi.org/10.1109/TransAI51903.2021.00020","url":null,"abstract":"It is shown that the accumulation of albumin proteins around the locations of podocytes is rather similar to a R-C (Resistance-Capacitor) circuit. While the electric shielding is not enough to detain the pass of albumin, more than a diffusion phenomenon, it is a problem that is entirely treated as one belonging to the classical electrodynamics. In this manner it was identified that the diffusion constant plays a role as the electrical parameters. The permanent aggregation of albumin proteins creates a capacitance. Therefore the expended power by the R-C circuit is interpreted as the loss of energy of renal glomerulus with implications on the performance and homeostasis of kidney. Thus, the identification of electric unbalance is translated as a signal of Kidney disease. The fact of having a physics-based scenario demands us to propose schemes inside the framework of the Internet of Bio-Nano Things.","PeriodicalId":426766,"journal":{"name":"2021 Third International Conference on Transdisciplinary AI (TransAI)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129772555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explaining Black-box Predictions by Generating Local Meaningful Perturbations","authors":"Tejaswani Verma, Christoph Lingenfelder, D. Klakow","doi":"10.1109/TransAI51903.2021.00030","DOIUrl":"https://doi.org/10.1109/TransAI51903.2021.00030","url":null,"abstract":"Generating explanations of predictions made by machine learning models is a difficult task, especially for black-box models. One possible way to explain an individual decision or recommendation for a given instance is to build an interpretable local surrogate for the underlying black-box model in the vicinity of the given instance. This approach has been adopted by many algorithms, for example LIME and LEAFAGE. These algorithms suffer from shortcomings, strict assumptions and prerequisites, which not only limit their applicability but also affect black-box fidelity of their local approximations. We present ways to overcome their shortcomings including the definition of neighborhood, removal of prerequisites and assumption of linearity in local model. The main contribution of this paper is a novel algorithm (LEMP) which provides explanation for the given instance by building a surrogate model using generated perturbations in the neighborhood of the given instance as training data. 
Experiments show that our approach is more widely applicable and generates interpretable models with better fidelity to the underlying black-box model than previous algorithms.","PeriodicalId":426766,"journal":{"name":"2021 Third International Conference on Transdisciplinary AI (TransAI)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115099392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comparative Analysis of Knowledge Graph Query Performance","authors":"M. Salehpour, Joseph G. Davis","doi":"10.1109/TransAI51903.2021.00014","DOIUrl":"https://doi.org/10.1109/TransAI51903.2021.00014","url":null,"abstract":"Knowledge Graphs (KGs) continue to gain widespread momentum for use in different domains. A variety of Data Management Systems (DMSs) have accordingly been developed in response to this growing deployment for storing KGs and querying their content. The performance of services offered by DMSs is crucial to unlocking the full potential of KGs for different purposes ranging from semantic search to reasoning and data integration. However, the efficiency of representative DMS types in supporting archetypal KG queries has not received adequate research attention. In this paper, we aim to provide a fine-grained, comparative analysis of four major DMS types, namely, row-, column-, graph-, and document-stores, against major query types, namely, subject-subject, subject-object, treelike, and optional joins. In particular, we analyze the performance of row-store Virtuoso, column-store Virtuoso, Blazegraph, and MongoDB using well-known benchmark datasets and queries. Our experimental results yield insight into the performance of the selected DMSs when executing different query types. 
The results highlight, however, that no single DMS proves superior in all benchmark scenarios, suggesting that a DMS should be selected and tailored to the query types being executed.","PeriodicalId":426766,"journal":{"name":"2021 Third International Conference on Transdisciplinary AI (TransAI)","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128178290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}