A Simulation Framework for P2P Queries Routing for E-Business
Anis Ismail, Aziz Barbar
International Journal of E-Entrepreneurship and Innovation, 3(2), 29-50, April-June 2012. DOI: 10.4018/jeei.2012040103. Copyright © 2012, IGI Global.
Citations: 7
Abstract
On-line business transaction processing systems have so far been based on centralized or client-server architectures. The growing interest in Peer-to-Peer systems, whether centralized or decentralized, has inspired numerous research activities. In a schema-based Peer-to-Peer (P2P) system, locating the Peers (services) relevant to a given query is a basic problem, for which different query routing strategies have been proposed. This paper proposes an architecture based on (Super-) Peers, with a special focus on query routing. For efficient query routing, (Super-) Peers having similar interests are grouped together, and the groups are called Super-Super-Peers (SSP). Super-Peers submit queries that are often processed by members of this group. An SSP is a specific Super-Peer that holds knowledge about 1) its Super-Peers and 2) the other SSPs. Using data mining techniques, this knowledge is extracted by processing the queries of Peers that transit the network. The advantage of this distributed knowledge is that it avoids computing semantic mappings between the heterogeneous data sources owned by (Super-) Peers each time the system decides to route a query to other (Super-) Peers.

[...] server and the relevant business data, decrease the risk of a centralized server becoming a single point of failure, and diminish the risk of shutting down the centralized server for unfinished business transactions. Second, P2P architectures provide scalable environments and are able to deal with transient users. The Peer-to-Peer computing paradigm is viewed as a novel approach for people to share resources such as files and computing cycles, or to support collaborative tasks.
During the past few years, the Internet has been gradually shifting toward a distributed system that supports more than a single client-server application. Peer-to-Peer (P2P) systems are distributed systems in which nodes of equal roles and capabilities exchange information and services directly with each other, which has made them increasingly popular. The design of P2P systems, including efficient techniques for search, query routing, and data retrieval, allows users to share huge volumes of data. However, the major problem in such networks is query routing, i.e., deciding to which other (Super-) Peers a query has to be sent for high efficiency and effectiveness. Traditional P2P systems offer support for richer queries, such as keyword search with regular expressions, in addition to search by identifier. Search techniques for these systems must therefore operate under a different set of constraints than the techniques developed for persistent storage utilities. However, broadcasting all queries to all Peers suffers from limited efficiency and scalability.

In hybrid P2P systems (Ioannidis et al., 2008; Annapureddy et al., 2007) composed of (Super-) Peers, a Peer that submits a query becomes the source of that query. The query is then transmitted to its Super-Peer (SP). The routing policy uses semantic mappings between the schemas of (Super-) Peers to quickly determine which neighboring SPs are relevant and should receive the query. A query received by an SP is processed over its local collection of data sources belonging to different Peers. Once results are found, the SP sends a single response message back to the query source. The time the user must wait for the results to arrive is an important factor, and it is affected by the mediation process, which remains difficult to carry out in such a context as the number of (Super-) Peers increases.
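The hybrid query flow described above (a Peer submits a query, its Super-Peer evaluates it over the local data sources, and one response message goes back to the source) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the classes `Peer` and `SuperPeer`, the function `submit`, the plain keyword match standing in for a real schema-based query, and the dictionary used as the response message. It does not reproduce the authors' simulation framework, and it leaves out forwarding to neighboring SPs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Peer:
    """Hypothetical Peer holding a small local data source of text records."""
    name: str
    records: List[str]

    def search(self, keyword: str) -> List[str]:
        # Case-insensitive keyword match, a stand-in for a schema-based query.
        needle = keyword.lower()
        return [r for r in self.records if needle in r.lower()]


@dataclass
class SuperPeer:
    """Hypothetical Super-Peer (SP) responsible for a set of Peers."""
    name: str
    peers: List[Peer] = field(default_factory=list)

    def process(self, keyword: str) -> Dict[str, List[str]]:
        # Evaluate the query over the local collection of Peer data sources,
        # keeping only the Peers that actually produced hits.
        results: Dict[str, List[str]] = {}
        for peer in self.peers:
            hits = peer.search(keyword)
            if hits:
                results[peer.name] = hits
        return results


def submit(source: Peer, super_peer: SuperPeer, keyword: str) -> dict:
    # The submitting Peer becomes the query source; the SP bundles all hits
    # into a single response message addressed back to that source.
    return {
        "source": source.name,
        "query": keyword,
        "results": super_peer.process(keyword),
    }


if __name__ == "__main__":
    buyer = Peer("buyer", ["wish list"])
    shop = Peer("shop", ["Invoice 17", "catalog"])
    sp = SuperPeer("sp1", [buyer, shop])
    print(submit(buyer, sp, "invoice"))
```

Returning one bundled message per SP, rather than one message per Peer, mirrors the single-response behavior described in the text.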
Several factors affect response time, such as the time it takes for the query to travel through several SPs in the network and the time an SP spends looking for connections (i.e., mappings) in order to route the query. For these reasons, response times tend to be slow in hybrid P2P networks. Satisfaction time is simply the time elapsed between the submission of the query by the user and the moment the user receives the overall results.

Recently, data mining has gained in popularity due to the emergence of vast quantities of data. This paper discusses a practical issue concerning data mining in P2P networks. The motivations behind P2P data mining include the optimal usage of available computational resources, privacy, and dependability through the elimination of critical points of service. The paper presents the effect of data mining on P2P query routing. The proposed method focuses on routing a query to the relevant Peers with minimum query processing at the SP level, using data mining techniques to improve query answering time. An important advantage of the suggested approach is scalability. The approach groups together (Super-) Peers that have similar themes for efficient query routing. Each resulting group, called a Super-Super-Peer (SSP), contains domains composed of Super-Peers (responsible for the domains) and their corresponding Peers (the members); the former submit the queries that are often processed by the members of this group.

Each SSP operates with an index that is obtained by applying decision tree algorithms and that keeps track of the locations of content relevant to a query. When an SSP receives a query from a Super-Peer in its group, it directly consults its index (without making any mappings) in order to determine 1) in its group, all Super-Peers (or domains) that are able to answer this query; and 2) in other groups (i.e., other SSPs), all Super-Peers that are relevant to this query.
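The index lookup performed by an SSP can be sketched as follows. This is only an illustration under stated assumptions: the index is taken to be already built (the paper obtains it with decision tree algorithms, which are not reproduced here), a query is reduced to a single theme string, and the names `IndexEntry`, `SSPIndex`, and `route` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical index row: a Super-Peer and the SSP group it belongs to."""
    super_peer: str
    group: str


class SSPIndex:
    """Hypothetical pre-built SSP index mapping a query theme to Super-Peers."""

    def __init__(self, group: str, entries: Dict[str, List[IndexEntry]]):
        self.group = group      # the group managed by this SSP
        self.entries = entries  # theme -> Super-Peers known to be relevant

    def route(self, theme: str) -> Tuple[List[str], List[str]]:
        # Direct lookup: no semantic mapping between schemas is computed here.
        in_group: List[str] = []
        other_groups: List[str] = []
        for entry in self.entries.get(theme, []):
            if entry.group == self.group:
                in_group.append(entry.super_peer)      # 1) same group
            else:
                other_groups.append(entry.super_peer)  # 2) other SSP groups
        return in_group, other_groups


if __name__ == "__main__":
    index = SSPIndex(
        "ssp-A",
        {"travel": [IndexEntry("sp1", "ssp-A"), IndexEntry("sp7", "ssp-B")]},
    )
    print(index.route("travel"))
```

The two returned lists correspond to the two cases in the text: Super-Peers in the SSP's own group that can answer the query, and relevant Super-Peers in other groups.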