Structural Analyses of Malaysian Web and Host Graphs

Jason Yong-Jin Tee, Lay-Ki Soon, Choo-Yee Ting
2015 3rd International Conference on Future Internet of Things and Cloud. Published 2015-08-24. DOI: 10.1109/FiCloud.2015.11 (https://doi.org/10.1109/FiCloud.2015.11)

Abstract

In this paper, we present our study on building a Web graph for the Malaysian Web by crawling Malaysian Web sites. From the constructed Web graph, we study characteristics such as the in-degree and out-degree distributions, their fit to a power law, the bow-tie structure, and the strongly connected components (SCCs). Moreover, more important insights can be obtained by analyzing the hyperlinks among hosts. Like a Web graph, a host graph is an informative source for enhancing crawling and searching methodologies, predicting Web growth, and studying Web sociology. To date, Web or host graphs have been built for the global Web and for other nations. Because the Malaysian Web is a subset of the global Web, it is also interesting to compare the characteristics of the Malaysian host graph with those of the global Web and of other nations. This research produces a Malaysian Web graph, a Malaysian host graph, and studies of the characteristics of these graphs. The graphs show characteristics consistent with the graphs created for other nations. The graphs also contain irregularities, which are discussed at length to account for the anomalies.
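As a rough illustration of the quantities named in the abstract, the following Python sketch collapses page-level hyperlinks into a host graph, tabulates in-degree and out-degree histograms, and splits the nodes into the usual bow-tie parts (largest SCC as the core, IN, OUT, and everything else). It is not the authors' pipeline: the function names, the sample URLs, and the choice to identify a host by its URL hostname are all assumptions made for the example, and the SCC search is a naive reachability intersection that would be too slow for a real crawl.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def host_graph(page_edges):
    """Collapse page-level links into host-level edges (assumption: host = URL hostname)."""
    edges = set()
    for src, dst in page_edges:
        h_src, h_dst = urlsplit(src).hostname, urlsplit(dst).hostname
        if h_src and h_dst and h_src != h_dst:  # drop links within one host
            edges.add((h_src, h_dst))
    return edges


def degree_histograms(edges):
    """Return (in-degree histogram, out-degree histogram) over all nodes."""
    indeg, outdeg, nodes = Counter(), Counter(), set()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
        nodes.update((u, v))
    return (Counter(indeg[n] for n in nodes),
            Counter(outdeg[n] for n in nodes))


def _reachable(adj, start):
    """All nodes reachable from start by iterative depth-first search."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def bow_tie(edges):
    """Split nodes into core (largest SCC), IN, OUT and other."""
    fwd, rev, nodes = defaultdict(set), defaultdict(set), set()
    for u, v in edges:
        fwd[u].add(v)
        rev[v].add(u)
        nodes.update((u, v))
    core, assigned = set(), set()
    for n in nodes:  # naive SCC: forward reach intersected with backward reach
        if n not in assigned:
            scc = _reachable(fwd, n) & _reachable(rev, n)
            assigned |= scc
            if len(scc) > len(core):
                core = scc
    if not core:
        return {"core": set(), "in": set(), "out": set(), "other": set()}
    seed = next(iter(core))
    in_part = _reachable(rev, seed) - core   # can reach the core
    out_part = _reachable(fwd, seed) - core  # reachable from the core
    return {"core": core, "in": in_part, "out": out_part,
            "other": nodes - core - in_part - out_part}


# Hypothetical page-level links, invented for illustration only.
SAMPLE_LINKS = [
    ("http://a.example.my/1", "http://b.example.my/1"),
    ("http://b.example.my/1", "http://c.example.my/"),
    ("http://c.example.my/", "http://a.example.my/2"),
    ("http://a.example.my/1", "http://a.example.my/2"),
    ("http://in.example.my/", "http://a.example.my/1"),
    ("http://c.example.my/", "http://out.example.my/"),
    ("http://island.example.my/", "http://other.example.my/"),
]

if __name__ == "__main__":
    hosts = host_graph(SAMPLE_LINKS)
    print(degree_histograms(hosts))
    print({part: sorted(members) for part, members in bow_tie(hosts).items()})
```

On a real crawl, a linear-time SCC algorithm such as Tarjan's or Kosaraju's would replace the reachability intersection. The degree histograms are the input one would plot on log-log axes, where a power law appears as an approximately straight line.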