{"title":"A critical review of density-based data stream clustering techniques","authors":"Affan Ahmad Toor, M. Usman, W. Ahmed","doi":"10.1109/ICDIM.2016.7829786","DOIUrl":"https://doi.org/10.1109/ICDIM.2016.7829786","url":null,"abstract":"Data stream is relatively new and emerging domain in the current era of Internet advancement. Clustering data streams is equally important and difficult because of the numerous hurdles attached to it. A number of algorithms have been proposed to offer solutions for efficient clustering. Grid-based clustering approach was adopted few years ago to overcome the limitations of conventional partition-based algorithms for data stream clustering. Data points are mapped to the grid-cells to form micro-clusters which later are used for clustering. Using density in the clustering process is proved to be a remarkable success and in recent years many researchers have used density to find arbitrary shaped & density clusters and identify outliers. Concept of density-based clustering is to use grid-based clustering at core and create a distinction between dense and sparse grids using density threshold values and use dense grids to yield clustering results; which provide more cluster purity and accuracy. In this paper, we reviewed grid-based data stream clustering algorithms which utilize density. We evaluated their functionalities and identified their limitations. 
In the end, we critically evaluated different aspects of the algorithms and suggested the one that is better in terms of performance and accuracy.","PeriodicalId":146662,"journal":{"name":"2016 Eleventh International Conference on Digital Information Management (ICDIM)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123973250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prominent voices and prevalent discourses: A corporate social responsibility application","authors":"Carlos M. Parra, M. Tremblay, A. Castellanos","doi":"10.1109/ICDIM.2016.7829780","DOIUrl":"https://doi.org/10.1109/ICDIM.2016.7829780","url":null,"abstract":"In this study we develop a simplified technique for identifying prominent voices (and characterizing prevalent discourses) using Text Data Mining around Corporate Social Responsibility (CSR) issues or topics. We do this by analyzing a corpus of CSR reports produced by 7 US firms (Citi, Coca-Cola, Exxon-Mobil, General Motors, Intel, McDonald's and Microsoft) in 2004, 2008 and 2012, and focusing on a reduced set of vectors — or Singular Vector Decompositions (SVDs)-derived from these CSR reports while exploring term associations (Text Topics or Term Clusters). Specifically, we use centroid clustering on these SVDs to identify centroid-guiding-CSR-report-components (or firms with prominent voices and prevalent discourses around a CSR topic). The analysis is performed by year in order to discern the way in which prominent voices and prevalent discourses (around CSR topics) have evolved through time. 
Results indicate that it is difficult for firms to maintain a prominent voice around CSR issues through time, and that when they manage to do so it is because the prevalent discourse has direct business implications.","PeriodicalId":146662,"journal":{"name":"2016 Eleventh International Conference on Digital Information Management (ICDIM)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122032634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Systematic mapping for big data stream processing frameworks","authors":"Mohammed Alayyoub, A. Yazici, Z. Karakaya","doi":"10.1109/ICDIM.2016.7829760","DOIUrl":"https://doi.org/10.1109/ICDIM.2016.7829760","url":null,"abstract":"There has been lots of discussions about the choice of a stream processing framework (SPF) for Big Data. Each of the SPFs has different cutting edge technologies in their steps of processing the data in motion that gives them a better advantage over the others. Even though, the cutting edge technologies used in each stream processing framework might better them, it is still hard to say which framework bests the rest under different scenarios and conditions. In this study, we aim to show trends and differences about several SPFs for Big Data by using the Systematic Mapping (SM) approach. To achieve our objectives, we raise 6 research questions (RQs), in which 91 studies that conducted between 2010 and 2015 were evaluated. We present the trends by classifying the research on SPFs with respect to the proposed RQs which can help researchers to obtain an overview of the field.","PeriodicalId":146662,"journal":{"name":"2016 Eleventh International Conference on Digital Information Management (ICDIM)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122615643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Defining requirements for color-coding text software in teaching of Arabic","authors":"Hend Suliman Al-Khalifa, Muna A. Muhaureq","doi":"10.1109/ICDIM.2016.7829759","DOIUrl":"https://doi.org/10.1109/ICDIM.2016.7829759","url":null,"abstract":"Founding proper reading and comprehension abilities of the Arabic written text is of great significance for learners of the language since this is a means for extracting the linguistic and cultural knowledge. This process is complex in Arabic since the script is interwoven and multiple segments can be fused to create a single word which in return complicates identifying word units for new learners and accordingly delays proper acquisition. Proper acquisition is defined here as the ability to fluently read the text as well as manage to decode word parts formed by the agglutination of affixes. This paper introduces the requirements for software that simplifies instruction on word decoding and comprehension through utilizing color-coding on Arabic text.","PeriodicalId":146662,"journal":{"name":"2016 Eleventh International Conference on Digital Information Management (ICDIM)","volume":"403 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127593860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the structural properties of eBay's network","authors":"C. M. França, Antonio A. Rocha, P. B. Velloso","doi":"10.1109/ICDIM.2016.7829771","DOIUrl":"https://doi.org/10.1109/ICDIM.2016.7829771","url":null,"abstract":"The OSN's (On-line Social Networks) have reached an incredible popularity in modern Internet. Those systems have been present in the daily lives of countless people helping them to share personal experiences, expectations and opinions. So high popularity has made of such networks complex systems. To understand the operation and phenomena that occur in such networks, there are metrics and models that capture aspects of their structures. The purpose of this work is to understand the complex reality of eBay e-commerce network, their connections and the dynamics of its users. Data were collected using a Web crawler developed in this work, and it resulted in a database of approximately 87 million transactions and 15 million different dealer users. From these data, the characterization was made estimating network metrics, like dealer users' degree distribution, that gave us key insights about the eBay negotiation network. We found that there are users who bought/sold for more than 100.000 different persons. We also found that a user A interacted over 4.000 times with another user B in just 3 months. 
These and other interesting results, such as average distance and feedback ratings, were obtained, analyzed and discussed in this work.","PeriodicalId":146662,"journal":{"name":"2016 Eleventh International Conference on Digital Information Management (ICDIM)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127651924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey revealing path towards service life cycle management in COBIT 5","authors":"Umara Noor, A. Ghazanfar","doi":"10.1109/ICDIM.2016.7829754","DOIUrl":"https://doi.org/10.1109/ICDIM.2016.7829754","url":null,"abstract":"Information technology has become an indispensable unit of an enterprise life in current era. Its inception has changed the ways businesses are done today in a competitive environment. In order to get value from significant investments done on complex IT infrastructure, it should be efficiently governed and managed. IT governance and management is a part of overall corporate governance and management and plays a vital role in aligning IT with business strategies. Among the several frameworks proposed for IT governance, COBIT is the most comprehensive and diverse framework providing support for both governance and management at all levels in multiple business domains. COBIT provides a toolset to bridge the gap between control requirements, technical issues and business risks. In this study we provide a survey of a few implementations of COBIT framework in multiple domains. Based on the survey we state our findings and recommend its adoption to comprehend similar issues. Further we identified certain limitations of COBIT framework and addressed the integration of service life cycle management into the original framework. We added seven new processes to the process structure of COBIT 5 along with their high level objectives. Also we added a few control objectives to the existing processes of COBIT framework. The survey provides a clear understanding of each COBIT implementation and the elements of the framework addressed in each implementation. Our study serves as a guide for all COBIT implementers and helps them teach how to deal with different kinds of governance or management matters. 
It also shows how the framework can be enhanced to provide service life cycle management.","PeriodicalId":146662,"journal":{"name":"2016 Eleventh International Conference on Digital Information Management (ICDIM)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129974203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A quality metric for BPEL process under evolution","authors":"N. Parimala, R. Kohar","doi":"10.1109/ICDIM.2016.7829777","DOIUrl":"https://doi.org/10.1109/ICDIM.2016.7829777","url":null,"abstract":"In Service-Oriented Architecture (SOA), behaviour of a business process is specified using Business Process Execution Language (BPEL) which is a XML based language. In today's competitive market, enterprises change their business processes frequently. Changes in BPEL process may affect the quality of BPEL process for the consumer. It is desirable to measure and evaluate the BPEL process quality when changes occur. Metrics are vastly used to provide a quantitative measure for the quality. In this paper, BPEL Process Usefulness Metric under Evolution (BUME) is proposed to measure quality of a BPEL process when it evolves. The applicability of the metric is demonstrated using simulated data for different versions of a BPEL process.","PeriodicalId":146662,"journal":{"name":"2016 Eleventh International Conference on Digital Information Management (ICDIM)","volume":"183 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116336868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Normalizing digital news-stories for preservation","authors":"Muzammil Khan, Arif Ur Rahman, M. D. Awan, Syed Mehtab Alam","doi":"10.1109/ICDIM.2016.7829785","DOIUrl":"https://doi.org/10.1109/ICDIM.2016.7829785","url":null,"abstract":"Preserving news stories may be important because of various reasons like they provide detailed information about events and they may be used for research purposes in the long term. However, the news stories published online are in danger because of reasons like constant change in the technologies used to publish information and the formats for publication. Certain institutions or individuals may be interested in preserving news stories related to a particular event or topic. The stories should be collected from various online newspapers and preserved for the long term. The major issue in the preservation process is that newspapers use different formats for online publication of the stories. The paper presents a tool which is developed to address the issue. The tool facilitates users in the extraction of news stories from various online newspapers and migration to a normalized format.","PeriodicalId":146662,"journal":{"name":"2016 Eleventh International Conference on Digital Information Management (ICDIM)","volume":"23 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123502924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extracting keyword and keyphrase from online privacy policies","authors":"Dhiren A. Audich, R. Dara, B. Nonnecke","doi":"10.1109/ICDIM.2016.7829792","DOIUrl":"https://doi.org/10.1109/ICDIM.2016.7829792","url":null,"abstract":"One of the key components of constructing an ontology is a taxonomy. Creating a comprehensive taxonomy involves extracting keywords and keyphrases from the domain corpus. It is a time consuming endeavour that involves domain expertise and syntactic and structural knowledge of the corpus in question. In this paper we explore different keyword and keyphrase extraction algorithms for the domain of online privacy policies. To do this we used a variety of well-known techniques such as TF-IDF, RAKE, TextRank, and AlchemyAPI, benchmarked against manual annotation. We then further evaluated the performances of various algorithms over a large corpus of 631 privacy policies. Due to the inconsistent language of privacy policies algorithms evaluating single documents (RAKE, TextRank, AlchemyAPI) outperformed the one evaluating the entire corpus (TF-IDF).","PeriodicalId":146662,"journal":{"name":"2016 Eleventh International Conference on Digital Information Management (ICDIM)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123751169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Methods for supporting the understanding of differences between search intentions and actual browsing situations in collaborative exploration","authors":"H. Nakayama, R. Onuma, Hayato Takagi, H. Kaminaga, Y. Miyadera, Shoichi Nakamura","doi":"10.1109/ICDIM.2016.7829772","DOIUrl":"https://doi.org/10.1109/ICDIM.2016.7829772","url":null,"abstract":"Collaborative exploration is one of the essential factors in advanced intellectual activities such as group work in project-based learning (PBL) and research work. Skillful sharing of the intentions of search and their results is quite important to enable collaborative exploration to be smoothly conducted. However, such sharing is usually difficult for members since they often face difficulties in expressing search intentions into queries and suffer from the troublesome activity of page selection. Such problems become more serious for novices. In particular, it is important but difficult to sufficiently understand the differences between search intentions and actual browsing situations. Moreover, there is often insufficient mutual understanding of differences in search policies between members of collaborative exploration since they tend to superficially confirm the search results. This research was aimed at developing novel support to cultivate the consideration skill of search strategy focusing on the novices' understanding of search contexts. This paper mainly describes the framework of support methods and provides a system overview. 
This paper also discusses the basic effectiveness and characteristics of our methods based on the results obtained from an experiment.","PeriodicalId":146662,"journal":{"name":"2016 Eleventh International Conference on Digital Information Management (ICDIM)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125149453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}