SKEM++: SEMANTIC KEYWORD EXTRACTION MODEL USING COLLECTIVE CENTRALITY MEASURE ON BIG SOCIAL DATA

IF 1.2 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Malaysian Journal of Computer Science Pub Date : 2022-03-31 DOI:10.22452/mjcs.sp2022no1.1

D. R, S. V.

Citations: 0

Abstract

In recent times, Online Social Networks (OSNs) have accumulated a massive volume of user-generated data in an unstructured format, consisting of user ideas, responses, and opinions on various topics. Extracting essential keywords from OSN data supports many applications, such as information recommendation and viral marketing. This paper emphasizes the importance of semantic graph-based methods for extracting vital keywords and evaluates them experimentally using a novel method, SKEM++, which extracts keywords from OSNs based on centrality measures. SKEM++ uses a distributed computing approach to calculate the network Collective Centrality Measure (CCM) for each node and to improve the semantics of keywords. The distributed approach is more scalable and computationally efficient than the conventional system, making it better suited to large-scale real-time data sets such as those from OSNs. Experimental results on a real-time Twitter data set, obtained using semantic analysis, indicate that the proposed CCM method outperforms contemporary schemes in terms of F-score (81%), recall (80%), and precision (80%).
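
The abstract does not define how the CCM combines individual centralities, so the following is only a minimal single-machine sketch of the general idea: build a word co-occurrence graph from tokenized posts, score each node with more than one centrality, and rank the words. The plain average of degree and closeness centrality used here is an assumption made for illustration, not the paper's CCM, and the function names, the toy posts, and the gold keyword set are all hypothetical. The distributed computation described in the abstract is not reproduced.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_cooccurrence_graph(docs):
    """Undirected graph: words are nodes, an edge links words sharing a post."""
    graph = defaultdict(set)
    for tokens in docs:
        unique = set(tokens)
        for word in unique:
            graph[word]  # touch the key so isolated words still become nodes
        for a, b in combinations(sorted(unique), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {v: (len(nbrs) / (n - 1) if n > 1 else 0.0) for v, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable-node count over total distance, found by breadth-first search."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def combined_score(graph):
    """ASSUMPTION: plain average of two centralities, not the paper's CCM."""
    deg = degree_centrality(graph)
    clo = closeness_centrality(graph)
    return {v: (deg[v] + clo[v]) / 2 for v in graph}


def top_keywords(docs, k):
    scores = combined_score(build_cooccurrence_graph(docs))
    # Sort by score descending, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F-score against reference keywords."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Toy, made-up posts (already tokenized); not data from the paper.
posts = [
    ["graph", "keyword", "centrality"],
    ["keyword", "twitter"],
    ["keyword", "extraction", "graph"],
]
keywords = top_keywords(posts, 2)
# Hypothetical gold set, only to show how the three metrics are computed.
scores = precision_recall_f1(keywords, ["keyword", "twitter"])
```

Averaging is only one of many ways to aggregate centralities; a weighted sum or rank aggregation would fit the same structure by changing `combined_score` alone.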

Source journal

Malaysian Journal of Computer Science COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-COMPUTER SCIENCE, THEORY & METHODS

CiteScore

2.20

Self-citation rate

33.30%

Number of publications

Review time

7.5 months

About the journal: The Malaysian Journal of Computer Science (ISSN 0127-9084) has been published four times a year (January, April, July, and October) by the Faculty of Computer Science and Information Technology, University of Malaya, since 1985. Over the years, the journal has gained popularity and the number of paper submissions has increased steadily. Rigorous reviews from the referees have helped to maintain the high standard of the journal. Its objectives are to promote the exchange of information and knowledge on research work and new inventions and developments in Computer Science, and on the use of Information Technology towards the structuring of an information-rich society, and to assist academic staff from local and foreign universities, business and industrial sectors, government departments, and academic institutions in publishing research results and studies in Computer Science and Information Technology through a scholarly publication. The journal is indexed and abstracted by Clarivate Analytics' Web of Science and Elsevier's Scopus.