Classification Algorithm for Link Prediction Based on Generated Features of Local Similarity-Based Method

Sistemasi Jurnal Sistem Informasi. Pub Date: 2022-05-21. DOI: 10.32520/stmsi.v11i2.1641

Siti Apryanti Koni’ah, H. Yuliansyah


Citations: 0


Classification Algorithm for Link Prediction Based on Generated Features of Local Similarity-Based Method

A social network is a social structure consisting of nodes and edges (links) that describes activity on a social media platform. Link prediction is a technique for predicting new relationships in a future network based on information extracted from the current network topology. Several local similarity-based methods use topological information to predict links. However, the performance of these methods varies and depends on the network topology. This study proposes using machine learning classification algorithms to predict future links. Four classification algorithms, k-Nearest Neighbors (KNN), Naive Bayes, Decision Tree, and Random Forest, are compared on six social network datasets, using features generated from local similarity-based methods. The research was conducted in three stages: preprocessing, classification comparison, and performance evaluation. The findings show that the Random Forest algorithm outperforms the other algorithms in testing accuracy, precision, and F1-score. For recall, however, Random Forest outperformed the other benchmark algorithms on only four datasets: soc-karate, soc-dolphin, soc-highschool M, and Soc-sparrowlyon-flock-season 03. On the remaining two datasets, soc-tribes and soc-aves-weaver-social-05, the Decision Tree algorithm outperformed the other benchmark algorithms.
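To make the idea of "features generated from local similarity-based methods" concrete, here is a minimal Python sketch of how a preprocessing stage might turn unlinked node pairs into labeled feature rows. The abstract does not name the similarity indices used, so the five below (common neighbors, Jaccard, Adamic-Adar, resource allocation, preferential attachment) are an assumption drawn from the standard link prediction literature. The function names and the toy graph are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_similarity_features(adj, u, v):
    """Compute five common local similarity indices for the pair (u, v).

    Assumes a simple graph (no self-loops), so every common neighbor
    of two distinct nodes has degree >= 2 and log(degree) > 0.
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        "adamic_adar": sum(1.0 / math.log(len(adj[z])) for z in common),
        "resource_allocation": sum(1.0 / len(adj[z]) for z in common),
        "preferential_attachment": len(nu) * len(nv),
    }


def make_dataset(observed_edges, future_edges):
    """Build (pair, features, label) rows for every unlinked pair.

    The label is 1 if the pair appears in future_edges, otherwise 0.
    """
    adj = build_adjacency(observed_edges)
    future = {frozenset(e) for e in future_edges}
    rows = []
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked in the observed network
        feats = local_similarity_features(adj, u, v)
        rows.append(((u, v), feats, int(frozenset((u, v)) in future)))
    return rows


# Hypothetical toy network: observed links now, one link that appears later.
observed = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
future = [("b", "d")]
dataset = make_dataset(observed, future)

for pair, feats, label in dataset:
    print(pair, label, feats)
```

Each row pairs a feature dictionary with a binary label, which is the shape a classifier such as KNN, Naive Bayes, Decision Tree, or Random Forest expects once the dictionaries are converted to numeric vectors. How the paper actually samples positive and negative pairs is not stated in the abstract, so the "all unlinked pairs" choice here is only one plausible option.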


Source journal: Sistemasi Jurnal Sistem Informasi

Self-citation rate: 0.00%

Review time: 43 weeks