{"title":"Large Scale Graph Mining with MapReduce: Counting Triangles in Large Real Networks","authors":"Charalampos E. Tsourakakis","doi":"10.4018/978-1-61350-053-8.ch013","DOIUrl":"https://doi.org/10.4018/978-1-61350-053-8.ch013","url":null,"abstract":"In recent years, a considerable amount of research has focused on the study of graph structures arising from technological, biological and sociological systems. Graphs are the tool of choice in modeling such systems since they are typically described as sets of pairwise interactions. Important examples of such datasets are the Internet, the Web, social networks, and large-scale information networks which reach the planetary scale, e.g., Facebook and LinkedIn. The necessity to process large datasets, including graphs, has led to a major shift towards distributed computing and parallel applications, especially in recent years. MapReduce was developed by Google, one of the largest users of multiple processor computing in the world, for facilitating the development of scalable and fault tolerant applications. MapReduce has become the de facto standard for processing large scale datasets both in industry and academia. In this chapter, we present state of the art work on large scale graph mining using MapReduce. We survey research work on an important graph mining problem, counting the number of triangles in large real-world networks. We present the most important applications related to the count of triangles and two families of algorithms, a spectral and a combinatorial one, which solve the problem efficiently.","PeriodicalId":227251,"journal":{"name":"Graph Data Management","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114593100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data, Storage and Index Models for Graph Databases","authors":"S. Srinivasa","doi":"10.4018/978-1-61350-053-8.ch003","DOIUrl":"https://doi.org/10.4018/978-1-61350-053-8.ch003","url":null,"abstract":"Management of graph structured data has important applications in several areas. Queries on such data sets are based on structural properties of the graphs, in addition to values of attributes. Answering such queries poses significant challenges, as reasoning about structural properties across graphs is typically intractable. This chapter provides an overview of the challenges in designing databases over graph datasets. Different application areas that use graph databases pose their own unique set of challenges, making the task of designing a generic graph-oriented DBMS still an elusive goal. The purpose of this chapter is to survey some of the piecemeal solutions that have been proposed to address specific challenges in graph data management and suggest an overall structure in which these different solutions can be meaningfully placed.","PeriodicalId":227251,"journal":{"name":"Graph Data Management","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128323574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Querying RDF Data","authors":"Faisal Alkhateeb, J. Euzenat","doi":"10.4018/978-1-61350-053-8.ch015","DOIUrl":"https://doi.org/10.4018/978-1-61350-053-8.ch015","url":null,"abstract":"This chapter introduces the RDF language, surveys the languages that can be used for querying RDF graphs, and provides a comparison between these query languages.","PeriodicalId":227251,"journal":{"name":"Graph Data Management","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129984468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TEDI: Efficient Shortest Path Query Answering on Graphs","authors":"Fang Wei-Kleiner","doi":"10.4018/978-1-61350-053-8.ch009","DOIUrl":"https://doi.org/10.4018/978-1-61350-053-8.ch009","url":null,"abstract":"Efficient shortest path query answering in large graphs is enjoying a growing number of applications, such as ranked keyword search in databases, social networks, ontology reasoning and bioinformatics. A shortest path query on a graph finds the shortest path for the given source and target vertices in the graph. Current techniques for efficient evaluation of such queries are based on the pre-computation of compressed Breadth First Search trees of the graph. However, they suffer from scalability drawbacks. To address these problems, we propose TEDI, an indexing and query processing scheme for shortest path query answering. TEDI is based on the tree decomposition methodology. The graph is first decomposed into a tree in which a node (a.k.a. bag) contains more than one vertex from the graph. The shortest paths are stored in such bags and these local paths together with the tree are the components of the index of the graph. Based on this index, a bottom-up operation can be executed to find the shortest path for any given source and target vertices. Our experimental results show that TEDI offers orders-of-magnitude performance improvement over existing approaches on index construction time, index size and query answering.","PeriodicalId":227251,"journal":{"name":"Graph Data Management","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114911612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clustering Vertices in Weighted Graphs","authors":"D. Wijaya, S. Bressan","doi":"10.4018/978-1-61350-053-8.ch012","DOIUrl":"https://doi.org/10.4018/978-1-61350-053-8.ch012","url":null,"abstract":"Clustering is the unsupervised process of discovering natural clusters so that objects within the same cluster are similar and objects from different clusters are dissimilar. In clustering, if similarity relations between objects are represented as a simple, weighted graph where objects are vertices and similarities between objects are weights of edges, clustering reduces to the problem of graph clustering. A natural notion of graph clustering is the separation of sparsely connected dense subgraphs from each other based on the notion of intra-cluster density vs. inter-cluster sparseness. In this chapter, we overview existing graph algorithms for clustering vertices in weighted graphs: Minimum Spanning Tree (MST) clustering, Markov clustering, and Star clustering. This includes the variants of Star clustering, MST clustering and Ricochet.","PeriodicalId":227251,"journal":{"name":"Graph Data Management","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126901707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applications of Flexible Querying to Graph Data","authors":"A. Poulovassilis","doi":"10.1007/978-3-319-96193-4_4","DOIUrl":"https://doi.org/10.1007/978-3-319-96193-4_4","url":null,"abstract":"","PeriodicalId":227251,"journal":{"name":"Graph Data Management","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114789788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Business Process Graphs: Similarity Search and Matching","authors":"R. Dijkman, M. Dumas, L. García-Bañuelos","doi":"10.4018/978-1-61350-053-8.ch018","DOIUrl":"https://doi.org/10.4018/978-1-61350-053-8.ch018","url":null,"abstract":"Organizations create collections of hundreds or even thousands of business process models to describe their operations. This chapter explains how graphs can be used as the underlying formalism to develop techniques for managing such collections. To this end it defines the business process graph formalism. Based on this formalism, it defines techniques for determining similarity of business process graphs. Such techniques can be used to quickly search through a collection of business process graphs to find the graph that is most relevant to a given query. These techniques can be used by tool builders that develop tools for managing large collections of business process models. The aim of the chapter is to provide an overview of the research area of using graphs to do similarity search and matching of business processes.","PeriodicalId":227251,"journal":{"name":"Graph Data Management","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124343059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph Mining Techniques: Focusing on discriminating between real and synthetic graphs","authors":"A. P. Appel, C. Faloutsos, C. Traina","doi":"10.4018/978-1-61350-053-8.ch010","DOIUrl":"https://doi.org/10.4018/978-1-61350-053-8.ch010","url":null,"abstract":"Graphs appear in several settings, like social networks, recommendation systems, computer communication networks, gene/protein biological networks, among others. A large number of graph patterns, as well as graph generator models that mimic such patterns, have been proposed in recent years. However, a deep and recurring question still remains: “What is a good pattern?” The answer is related to finding a pattern or a tool able to help distinguish between actual real-world and fake graphs. Here we explore the ability of ShatterPlots, a simple and powerful algorithm to tease out patterns of real graphs, helping us to spot fake/masked graphs. The idea is to force a graph to reach a critical (“Shattering”) point, randomly deleting edges, and study its properties at that point.","PeriodicalId":227251,"journal":{"name":"Graph Data Management","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115033865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shortest Path in Transportation Network and Weighted Subdivisions","authors":"Radwa El Shawi, Joachim Gudmundsson","doi":"10.4018/978-1-61350-053-8.ch020","DOIUrl":"https://doi.org/10.4018/978-1-61350-053-8.ch020","url":null,"abstract":"The shortest path problem asks for a path between two given points such that the sum of the weights of its edges is minimized. The problem has a rich history and has been studied extensively since the 1950s in many areas of computer science, among them network optimization, graph theory and computational geometry. In this chapter we consider two versions of the problem: the shortest path in a transportation network and the shortest path in a weighted subdivision, sometimes called a terrain.","PeriodicalId":227251,"journal":{"name":"Graph Data Management","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128840184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey of Relational Approaches for Graph Pattern Matching over Large Graphs","authors":"Jiefeng Cheng, J. Yu","doi":"10.4018/978-1-61350-053-8.ch006","DOIUrl":"https://doi.org/10.4018/978-1-61350-053-8.ch006","url":null,"abstract":"Due to the rapid growth of the Internet and new scientific/technological advances, there exist many new applications that model data as graphs, because graphs have sufficient expressiveness to model complicated structures. The dominance of graphs in real-world applications demands new graph processing techniques to access and analyze large graph datasets effectively and efficiently. Among those techniques, graph pattern matching, which finds all patterns in a large data graph that match a user-given graph pattern, is receiving increasing attention. In this survey, we review approaches to process such graph pattern queries with a framework of multi joins, which can be easily implemented in relational databases and requires no specialized native storage for graphs. We also discuss the top-k graph pattern matching problem.","PeriodicalId":227251,"journal":{"name":"Graph Data Management","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130592320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}