{"title":"Adaptation of Apriori to MapReduce to Build a Warehouse of Relations between Named Entities across the Web","authors":"Jean-Daniel Cryans, S. Ratté, R. Champagne","doi":"10.1109/DBKDA.2010.34","DOIUrl":"https://doi.org/10.1109/DBKDA.2010.34","url":null,"abstract":"The Semantic Web has made possible the use of the Internet to extract useful content, a task that could necessitate an infrastructure across the Web. With Hadoop, a free implementation of the MapReduce programming paradigm created by Google, we can treat these data reliably over hundreds of servers. This article describes how the Apriori algorithm was adapted to MapReduce in the search for relations between entities to deal with thousands of Web pages coming from RSS feeds daily. First, every feed is looked up five times per day and each entry is registered in a database with MapReduce. Second, the entries are read and their content sent to the Web service OpenCalais for the detection of named entities. For each Web page, the set of all itemsets found is generated and stored in the database. Third, all generated sets, from first to last, are counted and their support is registered. Finally, various analytical tasks are executed to present the relationships found. Our tests show that the third step, executed over 3,000,000 sets, was 4.5 times faster using five servers than using a single machine. This approach allows us to easily and automatically distribute treatments on as many machines as are available, and be able to process datasets that one server, even a very powerful one, would not be able to manage alone. We believe that this work is a step forward in processing semantic Web data efficiently and effectively.","PeriodicalId":273177,"journal":{"name":"2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122411687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance Evaluation of an Optimistic Concurrency Control Algorithm for Temporal Databases","authors":"Achraf Makni, R. Bouaziz","doi":"10.1109/DBKDA.2010.41","DOIUrl":"https://doi.org/10.1109/DBKDA.2010.41","url":null,"abstract":"We propose in this paper a performance study of an access concurrency control algorithm for temporal databases. This algorithm is based on the optimistic approach, which is, in our opinion, more suitable for temporal databases than the pessimistic methods. Indeed, our optimistic algorithm, in contrast to the pessimistic ones, can exploit the temporal specifications to reduce the granule size and thus minimize the degree of conflict. Moreover, it can detect all conflict cases as soon as possible. By using the end-of-transaction marker technique, it has the merit of minimizing the period during which resources are locked in the validation phase. By carrying out a formal verification, based first on the serialization theory and next on the SPIN model checker, we have ensured that our algorithm operates correctly. Now, we proceed to its experimental evaluation vis-à-vis other well-known concurrency control mechanisms based on optimistic and pessimistic approaches.","PeriodicalId":273177,"journal":{"name":"2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133431795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Topological Relations between Uncertain Spatial Regions in Geo-spatial Databases: Uncertain Intersection and Difference Topological Model","authors":"A. Alboody, F. Sèdes, J. Inglada","doi":"10.1109/DBKDA.2010.28","DOIUrl":"https://doi.org/10.1109/DBKDA.2010.28","url":null,"abstract":"Topological relations have played important roles in spatial query, analysis and reasoning in Geographic Information Systems (GIS) and geospatial databases. The topological relations between crisp, uncertain and fuzzy spatial regions based upon the 9-intersections model have been identified. The research issue of topological relations, particularly between spatial regions with uncertainties, has gained a lot of attention during the past two decades. However, the formal representation and calculation of the topological relations between uncertain regions is still an open issue and needs to be further developed. The paper provides a theoretical framework for modeling topological relations between uncertain spatial regions based upon a new uncertain topological model called the Uncertain Intersection and Difference (UID) Model. In order to derive all topological relations between two spatial regions with uncertainties, the spatial object of type Region (A) is decomposed into four components: the Interior, the Interior’s Boundary, the Object’s Boundary, and the Exterior’s Boundary of A. By use of this definition of spatial region with uncertainties, new 4*4-Intersection and Uncertain Intersection and Difference (UID) models are proposed as a qualitative model for the identification of all topological relations between two spatial regions with uncertainties. These two new models are compared with other models studied in the literature. 152 binary topological relations can be identified by these two models. Then, the topological complexity and distance of the 152 relations are studied in detail using the UID model. Based upon this study of topological complexity and distance, a conceptual neighborhood graph for the 152 relations can be obtained. Examples are provided to illustrate the utility of these two models presented in this paper with results which can be applied for modeling GIS, geospatial databases and satellite image processing.","PeriodicalId":273177,"journal":{"name":"2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128598909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of the Quality of Life after an Endoscopic Thoracic Sympathectomy: A Business Intelligence Approach","authors":"D. Goncalves, M. Y. Santos, Jorge Cruz","doi":"10.1109/DBKDA.2010.12","DOIUrl":"https://doi.org/10.1109/DBKDA.2010.12","url":null,"abstract":"Primary hyperhidrosis, a disorder characterized by excessive sweating, has been treated by endoscopic thoracic sympathectomy. As a consequence of the surgery, patients improved their overall quality of life. Their day-by-day activities are not affected, or are less affected, by this disorder, and their emotional state shows a significant improvement, from a situation of shame and self-punishing to what we could call a normal life. This paper presents the analysis of the quality of life of 227 patients that were treated by an endoscopic thoracic sympathectomy. The study was based on the use of business intelligence technologies, which allowed the storage, the analysis and the reporting of all the relevant findings. In technological terms, this paper illustrates the database and data analysis developments needed in a specific healthcare application domain. For data storage, a data mart was designed addressing the relevant attributes. For data analysis, on-line analytical processing and data mining technologies were used to show the evolution of the patients’ health condition and the incidence of complications or side effects as a consequence of the surgery.","PeriodicalId":273177,"journal":{"name":"2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130807428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a Discovering Knowledge Comprehensible and Exploitable by the End-User","authors":"A. Touzi","doi":"10.1109/DBKDA.2010.36","DOIUrl":"https://doi.org/10.1109/DBKDA.2010.36","url":null,"abstract":"The main goal of extracting knowledge from databases is to help the user give semantics to data and to optimize information retrieval. Unfortunately, this fundamental constraint is not taken into account by almost all approaches to knowledge discovery. Indeed, these approaches generate a large number of rules that are not easily assimilated by the human brain. In this paper, we propose a new approach for Knowledge Discovery in Databases through the fusion of conceptual clustering, fuzzy logic, and formal concept analysis. Relying on the hierarchical structure offered by the lattices, we discover knowledge in a hierarchical way. Thus, according to the degree of detail required by the user, this approach proposes a level of knowledge and different views of this knowledge, so the user can easily exploit all knowledge generated. Moreover, this solution is extensible: the user is able to choose the fuzzy method of classification according to the domain of his data and his needs.","PeriodicalId":273177,"journal":{"name":"2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications","volume":"189 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121204995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Failure-Tolerant Transaction Routing at Large Scale","authors":"Idrissa Sarr, Hubert Naacke, Stéphane Gançarski","doi":"10.1109/DBKDA.2010.9","DOIUrl":"https://doi.org/10.1109/DBKDA.2010.9","url":null,"abstract":"Emerging Web2.0 applications such as virtual worlds or social networking websites strongly differ from usual OLTP applications. First, the transactions are encapsulated in an API such that it is possible to know which data a transaction will access, before processing it. Second, the simultaneous transactions are very often commutative since they access distinct data. Anticipating that the workload of such applications will quickly reach thousands of transactions per second, we envision a novel solution that would allow these applications to scale up without the need to buy expensive resources at a data center. To this end, databases are replicated over a P2P infrastructure for achieving high availability and fast transaction processing thanks to parallelism. However, achieving both fast and consistent data access on such architectures is challenging at many points. In particular, centralized control is prohibited because of its vulnerability and lack of efficiency at large scale. Moreover, the dynamic behavior of nodes, which can join and leave the system frequently and at any time, can compromise mutual consistency. In this article, we propose a failure-tolerant solution for the distributed control of transaction routing in a large scale network. We leverage a fully distributed approach relying on a DHT to handle routing metadata, with a suitable failure management mechanism that handles node dynamicity and node failures. Moreover, we demonstrate the feasibility of our transaction routing implementation through experimentation and the effectiveness of our failure management approach through simulation.","PeriodicalId":273177,"journal":{"name":"2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122726644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Situational Resource Rating System","authors":"Raphaël Thollot, Marie-Aude Aufaure","doi":"10.1109/DBKDA.2010.31","DOIUrl":"https://doi.org/10.1109/DBKDA.2010.31","url":null,"abstract":"Recommendation technologies are considered a major technological trend in both industrial and academic environments. This growing interest was highlighted by, e.g., the Netflix prize which generated an intense competition. Recommender systems are crucial to support users and help them by suggesting resources relevant at a given instant. On the other hand, these systems are a core piece of e-commerce web sites, since they aim at generating more sales by encouraging users to buy more items. However, recommender systems are often designed to work with very specific types of resources, and they hardly take into account the current user’s situation. In this paper, we present our approach to augment an existing recommender system with a situation model. On top of this model, we define a situational interest measure to estimate a user’s interest for a resource, which we demonstrate with a prototypical implementation.","PeriodicalId":273177,"journal":{"name":"2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131417335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Maintenance of k-Dominant Skyline for Frequently Updated Database","authors":"M. A. Siddique, Y. Morimoto","doi":"10.1109/DBKDA.2010.16","DOIUrl":"https://doi.org/10.1109/DBKDA.2010.16","url":null,"abstract":"Skyline queries retrieve a set of skyline objects so that the user can choose promising objects from them and make further inquiries. However, a skyline query often retrieves too many objects to analyze intensively. To solve the problem, k-dominant skyline queries have been introduced, which can reduce the number of retrieved objects by relaxing the definition of dominance. Although this reduces the number of retrieved objects, the k-dominant skyline objects are difficult to maintain if the database is updated. This paper addresses the problem of maintaining k-dominant skyline objects in a frequently updated database. We propose an algorithm for maintaining k-dominant skyline objects. Intensive experiments using real and synthetic datasets demonstrated that our method is efficient and scalable.","PeriodicalId":273177,"journal":{"name":"2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114750171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intelligent Network Communications for Distributed Database Systems","authors":"I. Hababeh","doi":"10.1109/DBKDA.2010.11","DOIUrl":"https://doi.org/10.1109/DBKDA.2010.11","url":null,"abstract":"Customizing network sites has become an increasingly important issue in distributed database systems. This will improve the network system performance by reducing the number of communications required for query processing in terms of retrieval and update transactions. This paper presents an intelligent clustering method for distributed database systems that provides a structure for organizing a large number of network sites into a set of useful clusters to minimize transaction processing communications. It has been designed to divide the database network sites into a set of disjoint clusters based on a high performance clustering technique. This can reduce the amount of redundant data to be accessed and transferred among different sites, definitely increase the transaction performance, significantly improve database system response time, and result in better distributed network decision support. Experimental validations on real database applications under different network connectivity conditions are performed and the results demonstrate that the proposed method leads to precise solutions for the problems of data communication, allocation, and redundancy.","PeriodicalId":273177,"journal":{"name":"2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133886197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Social Network Extraction Using a Graph Database","authors":"Rania Soussi, Marie-Aude Aufaure, Hajer Baazaoui Zghal","doi":"10.1109/DBKDA.2010.19","DOIUrl":"https://doi.org/10.1109/DBKDA.2010.19","url":null,"abstract":"In the enterprise context, a large amount of information is stored in relational databases. Therefore, relational databases can be a rich source from which to extract social networks. However, they are not well suited to representing and storing a social network. On the other hand, a graph database can model data in a natural way and facilitates the querying of data using graph operations. We therefore propose an approach for extracting social networks from relational databases, and present mechanisms for transforming relational databases into graph databases.","PeriodicalId":273177,"journal":{"name":"2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132345580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}