Milos Krstajic, Florian Mansmann, A. Stoffel, M. Atkinson, D. Keim
Title: Processing online news streams for large-scale semantic analysis
Published in: 2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010)
Publication date: 2010-03-01
DOI: 10.1109/ICDEW.2010.5452710 (https://doi.org/10.1109/ICDEW.2010.5452710)
Citations: 34
Abstract
While the Internet has enabled us to access a vast number of online news articles originating from thousands of different sources, the human capacity to read all these articles has stayed rather constant. Usually, the publishing industry takes over the role of filtering this enormous amount of information and presenting it in an appropriate way to its subscribers. In this paper, we discuss the semantic analysis of such news streams by introducing a system that streams online news collected by the Europe Media Monitor to our proposed semantic news analysis system. We describe in detail the emerging challenges and the corresponding engineering solutions for processing incoming articles in close to real time. To demonstrate the use of our system, the case studies show a) temporal analysis of entities, such as institutions or persons, and b) their co-occurrence in news articles.
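The two case-study analyses named in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the input format (a date string plus a set of already-extracted entity names per article), the sample data, and the function name `analyze` are all assumptions made for illustration. It counts how often each entity is mentioned per day (temporal analysis) and how often each pair of entities appears in the same article (co-occurrence).

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: one (ISO date, set of entity names) tuple per article.
# Entity extraction itself is assumed to have happened upstream.
articles = [
    ("2010-03-01", {"European Commission", "Angela Merkel"}),
    ("2010-03-01", {"European Commission", "IMF"}),
    ("2010-03-02", {"European Commission", "Angela Merkel", "IMF"}),
]


def analyze(stream):
    """Consume articles one at a time and update two running tallies."""
    daily = Counter()  # (date, entity) -> number of articles mentioning it
    pairs = Counter()  # (entity_a, entity_b) -> number of shared articles
    for date, entities in stream:
        for entity in entities:
            daily[(date, entity)] += 1
        # Sort so each unordered pair always maps to the same key.
        for a, b in combinations(sorted(entities), 2):
            pairs[(a, b)] += 1
    return daily, pairs


daily_counts, pair_counts = analyze(articles)
```

Because the tallies are updated per article, the same loop works on an unbounded iterator of incoming articles, which is the property a near-real-time pipeline would need; a production system would additionally have to bound memory, for example by windowing over time.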