Analysis of Trends in Online Romanian News Using Semantic Models
A. Simion, M. Dascalu, Stefan Trausan-Matu
2019 22nd International Conference on Control Systems and Computer Science (CSCS), published 2019-05-01
DOI: https://doi.org/10.1109/CSCS.2019.00075
Citation count: 0
Abstract
Online news is currently the most frequently accessed source of information. Moreover, online media is bound to grow even further in significance and popularity due to the increasing use of smart devices. As such, studying and analyzing trends in online media is of great interest. Given the sheer number of articles published every day, manually generating meaningful statistics for large-scale media networks is infeasible. In this paper, we analyze some of the most visited Romanian online news websites using various Natural Language Processing techniques. Our analysis has two main objectives: extracting trending topics and concepts from the news, and ranking publications according to their relative influence within our generated network. Unlike other systems, which assign influence based on direct links or citations, we propose a novel ranking method based on intertextuality links identified using document embeddings.
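
The abstract does not say how intertextuality links are derived from the embeddings or how they are turned into a ranking, so the following is only a minimal sketch under stated assumptions: a link is assumed whenever the cosine similarity between a later article and an earlier article from a different publication reaches a threshold, the link is assumed to point from the later publication to the earlier one, and influence is assumed to be scored with a PageRank-style iteration. The function names, the threshold, and the toy embeddings are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def intertextual_links(articles, threshold=0.9):
    """Build weighted publication-to-publication links.

    articles: list of (publication, timestamp, embedding) tuples.
    Assumption: a later article that is very similar to an earlier article
    from another publication counts as a link toward that earlier publication.
    """
    weights = defaultdict(float)
    for pub_a, t_a, emb_a in articles:
        for pub_b, t_b, emb_b in articles:
            if pub_a != pub_b and t_a > t_b:
                sim = cosine(emb_a, emb_b)
                if sim >= threshold:
                    weights[(pub_a, pub_b)] += sim
    return dict(weights)


def pagerank(nodes, weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration (assumed scoring scheme)."""
    n = len(nodes)
    rank = {p: 1.0 / n for p in nodes}
    out_total = defaultdict(float)
    for (src, _), w in weights.items():
        out_total[src] += w
    for _ in range(iterations):
        # Rank held by publications with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in nodes if out_total[p] == 0.0)
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in nodes}
        for (src, dst), w in weights.items():
            new_rank[dst] += damping * rank[src] * w / out_total[src]
        rank = new_rank
    return rank


# Toy data: hypothetical publications with hand-made 3-dimensional "embeddings".
articles = [
    ("site_a", 1, [1.0, 0.0, 0.1]),
    ("site_b", 2, [0.9, 0.1, 0.1]),
    ("site_c", 3, [1.0, 0.05, 0.1]),
    ("site_b", 4, [0.0, 1.0, 0.0]),
]
publications = sorted({pub for pub, _, _ in articles})
links = intertextual_links(articles)
ranks = pagerank(publications, links)
ranking = sorted(publications, key=ranks.get, reverse=True)
print(ranking)
```

In this toy setup, the publication whose articles are echoed by later, similar articles elsewhere accumulates the most rank. A real pipeline would replace the hand-made vectors with embeddings produced by a trained document embedding model and would need a threshold tuned on actual data.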