{"title":"Big Data Quality Assessment Model for Unstructured Data","authors":"Ikbal Taleb, M. Serhani, R. Dssouli","doi":"10.1109/INNOVATIONS.2018.8605945","DOIUrl":"https://doi.org/10.1109/INNOVATIONS.2018.8605945","url":null,"abstract":"Big Data has gained enormous momentum over the past few years because of the tremendous volume of data generated and processed in diverse application domains. Nowadays, it is estimated that 80% of all generated data is unstructured. Evaluating the quality of Big Data has been identified as essential to guarantee data quality dimensions including, for example, completeness and accuracy. Current initiatives for unstructured data quality evaluation are still under investigation. In this paper, we propose a quality evaluation model to handle the quality of Unstructured Big Data (UBD). The model first captures and discovers key properties of unstructured big data and its characteristics, and provides comprehensive mechanisms to sample and profile the UBD dataset and to extract features and characteristics from heterogeneous data types in different formats. A Data Quality repository manages relationships between data quality dimensions, quality metrics, feature extraction methods, mining methodologies, data types and data domains. An analysis of the samples provides a data profile of UBD. This profile is extended to a quality profile that contains the quality mapping with selected features for quality assessment. We developed a UBD quality assessment model that handles all the processes from UBD profiling exploration to the quality report. The model provides an initial blueprint for quality estimation of unstructured Big Data. It also states a set of quality characteristics and indicators that can be used to outline an initial data quality schema of UBD.","PeriodicalId":319472,"journal":{"name":"2018 International Conference on Innovations in Information Technology (IIT)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126426444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"STIDes Revisited - Tackling Global Time Shifts and Scaling","authors":"Marc Haßler, André Pomp, Christian Kohlschein, Tobias Meisen","doi":"10.1109/INNOVATIONS.2018.8605951","DOIUrl":"https://doi.org/10.1109/INNOVATIONS.2018.8605951","url":null,"abstract":"In times where large amounts of time-dependent data are generated, the importance of time interval data sets in general, and of similarity analyses on them in particular, continues to increase. In this context, various approaches regarding the comparability of two time interval data sets have been developed in recent years. The STIDes approach, as a bottom-up approach, offers on the one hand the possibility to focus on individual properties of the intervals; on the other hand, it allows time delays or scaling to be taken into consideration. In this paper, we take a closer look at the management of time delays and different scales and show that a similarity analysis using STIDes can be completed in polynomial time. Furthermore, we improve the handling of cardinality differences in the data sets to be compared.","PeriodicalId":319472,"journal":{"name":"2018 International Conference on Innovations in Information Technology (IIT)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134602178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trust-Based Data Controller for Personal Information Management","authors":"Upul Jayasinghe, G. Lee, Áine MacDermott","doi":"10.1109/INNOVATIONS.2018.8605979","DOIUrl":"https://doi.org/10.1109/INNOVATIONS.2018.8605979","url":null,"abstract":"In today's data-driven digital economy, user-related information works as oil to fuel state-of-the-art applications and services. Consumers who use these services provide personal information to service providers, intentionally or unintentionally, and often without considering their trustworthiness. However, this personal information often reveals one's identity and may lead users to face unexpected outcomes, ranging from uninvited advertisements to identity theft. To regulate such issues, the new General Data Protection Regulation (GDPR) act was introduced by the European Union in May 2018. As defined by the act, the data controller plays an important role in determining the purposes, conditions and means of processing data without compromising user identities for malicious intentions. Therefore, in this paper, we propose a trust-based data controller in which an intermediate authority named the trust manager recommends preferable actions to the data controller for preserving the privacy of the users in accordance with the GDPR act.","PeriodicalId":319472,"journal":{"name":"2018 International Conference on Innovations in Information Technology (IIT)","volume":"54 Suppl 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134519631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IoT Applications: From Mobile Agents to Microservices Architecture","authors":"Tasneem Salah, M. Zemerly, C. Yeun, M. Al-Qutayri, Yousof Al-Hammadi","doi":"10.1109/INNOVATIONS.2018.8605967","DOIUrl":"https://doi.org/10.1109/INNOVATIONS.2018.8605967","url":null,"abstract":"The Internet of Things (IoT) is still grabbing the attention of researchers, developers, and organizations. This is due to the rapid increase in connected devices and the major advances seen in information and communication technologies every day. IoT refers to the network of interactive physical and virtual devices connected globally, which includes smartphones, sensors, and robots. IoT devices require software adaptation as they are in continuous transition. Mobile agents can offer adaptable composition for IoT systems. Mobile agents can be used to enable interoperability and global intelligence with smart objects in the Internet of Things. However, mobile agents come with many security concerns, for which security protocols can be relatively heavy for IoT devices to handle. As a response, microservice architecture emerged and quickly became a widely used solution. The aim of this architecture is to break the application into a set of smaller independent services. It also allows developers and organizations to update their services frequently. Studies in the last year showed a massive interest in microservice architectures in the context of IoT and cloud computing solutions. This paper offers a review of how microservices can replace mobile agents and act as agents in IoT systems, and highlights the benefits that can be obtained from this solution.","PeriodicalId":319472,"journal":{"name":"2018 International Conference on Innovations in Information Technology (IIT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130222240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IIT 2018 Copyright Page","authors":"","doi":"10.1109/innovations.2018.8605962","DOIUrl":"https://doi.org/10.1109/innovations.2018.8605962","url":null,"abstract":"","PeriodicalId":319472,"journal":{"name":"2018 International Conference on Innovations in Information Technology (IIT)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133760799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clone-Resistant Entities for Vehicular Security","authors":"Ayoub Mars, W. Adi","doi":"10.1109/INNOVATIONS.2018.8606035","DOIUrl":"https://doi.org/10.1109/INNOVATIONS.2018.8606035","url":null,"abstract":"The vehicular environment is exposed to many security threats such as illegal copying of software IP, counterfeiting of electronic and mechatronic components, and illegal tampering with digital data inside electronic control units (ECUs). The main reason is that ECUs are often easy to clone. Physical Unclonable Functions (PUFs) were proposed for use in many applications such as secure memory-less key storage and device identification. However, their usage for automotive security is still very limited due to their inconsistency and high implementation cost and complexity. This paper presents a novel VANET security architecture embedding a new consistent digital clone-resistant technology called Secret Unknown Cipher (SUC) as a physical security anchor. The paper addresses two sample use cases in software update and V2X link protocols fulfilling VANET security architecture and requirements. A secure Over-The-Air software update and secure V2X communication demonstrate the efficiency of the proposed SUC technology. Since the SUC concept uses pure consistent digital structures, it fits perfectly as a low-cost and robust technology in the future vehicular IoT environment over the long lifetime of vehicular entities.","PeriodicalId":319472,"journal":{"name":"2018 International Conference on Innovations in Information Technology (IIT)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124483234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Migrating from SQL to NOSQL Database: Practices and Analysis","authors":"Fatima Yassine, M. Awad","doi":"10.1109/INNOVATIONS.2018.8606019","DOIUrl":"https://doi.org/10.1109/INNOVATIONS.2018.8606019","url":null,"abstract":"Big data and data analytics require migrating from relational databases (SQL) to NoSQL data structures to represent the data. Such a transformation is challenging because of the lack of an automatic transformation process and the requirement of guaranteeing both performance and accurate representation. In this paper, we analyze and implement commonly used mappings from SQL to NoSQL structures. We compare these mappings in terms of retrieval time in an attempt to identify the best mapping. We use MySQL as the DBMS for the SQL structure and MongoDB for the NoSQL structures. Our experiments showed promising results when using a mix of a one-level embedded document with a reference relationship to another document.","PeriodicalId":319472,"journal":{"name":"2018 International Conference on Innovations in Information Technology (IIT)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128147962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correlation Analysis of Popularity and Interoperability in Open Source Projects","authors":"F. Shamsi, Sarah Bamatraf, Talal Rahwan, Z. Aung, D. Svetinovic","doi":"10.1109/INNOVATIONS.2018.8605970","DOIUrl":"https://doi.org/10.1109/INNOVATIONS.2018.8605970","url":null,"abstract":"In order to examine the success of open source projects, we analyzed 252,008 projects in the SourceForge database. We restricted our study to projects that are written in the top ranked programming languages. We measured the correlation between popularity, interoperability, productivity and the success rate of each project. The developers' contribution to the project was also examined in terms of the developer’s team size, and the success in terms of the total number of committers (contributors). Our results indicate that using multiple programming languages requires additional team members. Also, there is a significant increase in the functionality of the developed project. The results also demonstrated that there is a positive correlation between the development team size, and the total number of committers.","PeriodicalId":319472,"journal":{"name":"2018 International Conference on Innovations in Information Technology (IIT)","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129320081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Employee Attrition using Machine Learning","authors":"Sarah S. Alduayj, K. Rajpoot","doi":"10.1109/INNOVATIONS.2018.8605976","DOIUrl":"https://doi.org/10.1109/INNOVATIONS.2018.8605976","url":null,"abstract":"The growing interest in machine learning among business leaders and decision makers demands that researchers explore its use within business organisations. One of the major issues facing business leaders within companies is the loss of talented employees. This research studies employee attrition using machine learning models. Using a synthetic dataset created by IBM Watson, three main experiments were conducted to predict employee attrition. The first experiment involved training on the original class-imbalanced dataset with the following machine learning models: support vector machine (SVM) with several kernel functions, random forest and K-nearest neighbour (KNN). The second experiment focused on using the adaptive synthetic (ADASYN) approach to overcome class imbalance, then retraining on the new dataset using the abovementioned machine learning models. The third experiment involved manual undersampling of the data to balance the classes. As a result, training KNN (K = 3) on an ADASYN-balanced dataset achieved the highest performance, with an F1-score of 0.93. Finally, by using feature selection and random forest, an F1-score of 0.909 was achieved using 12 features out of a total of 29 features.","PeriodicalId":319472,"journal":{"name":"2018 International Conference on Innovations in Information Technology (IIT)","volume":"148 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124355478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IIT 2018 Author Index","authors":"","doi":"10.1109/innovations.2018.8605983","DOIUrl":"https://doi.org/10.1109/innovations.2018.8605983","url":null,"abstract":"","PeriodicalId":319472,"journal":{"name":"2018 International Conference on Innovations in Information Technology (IIT)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129317190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}