Title: News story clustering from both what and how aspects: using bag of word model and affinity propagation
Authors: W. Chu, Chao-Chin Huang, Wen-Fang Cheng
Published in: Automated Information Extraction in Media Production, 2011-12-01
DOI: 10.1145/2072552.2072555 (https://doi.org/10.1145/2072552.2072555)
Citations: 4
Abstract
24-hour TV news channels repeat the same stories again and again. In this paper, we cluster the hundreds of news stories broadcast in a day into dozens of topic-based clusters, thereby facilitating efficient browsing and summarization. The proposed system automatically removes commercial breaks, detects anchorpersons, and then determines the boundaries of news stories. Semantic concepts, the bag-of-visual-words model, and the bag-of-trajectories model are used to describe what objects appear in news stories and how they appear. After the similarity between stories is measured with the earth mover's distance, the affinity propagation algorithm clusters stories on the same topic together. Experimental results show that the proposed methods can effectively cluster complex news stories.
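
The last two steps of the abstract (earth mover's distance between stories, then affinity propagation) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Several things here are assumptions made only for illustration: each story is reduced to a single toy histogram, the bins are treated as ordered with unit spacing so that the earth mover's distance has a simple one-dimensional closed form, similarity is taken as the negative distance, and the preference is set to the median similarity (a common default). The names `emd_1d`, `affinity_propagation`, and `stories` are hypothetical.

```python
from statistics import median


def emd_1d(p, q):
    """Earth mover's distance between two histograms over the same ordered
    bins, assuming unit spacing between bins (a simplifying assumption)."""
    sp, sq = sum(p), sum(q)
    carried = total = 0.0
    for a, b in zip(p, q):
        carried += a / sp - b / sq  # normalized mass still to be moved onward
        total += abs(carried)
    return total


def affinity_propagation(S, damping=0.5, iters=200):
    """Cluster items from a similarity matrix S (list of lists).
    Returns, for each item, the index of its exemplar."""
    n = len(S)
    # Assumption: use the median off-diagonal similarity as the preference.
    pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    S = [[pref if i == k else S[i][k] for k in range(n)] for i in range(n)]
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        # Responsibility: how well k suits i compared with i's best alternative.
        for i in range(n):
            for k in range(n):
                best = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best)
        # Availability: how much support k has gathered from the other items.
        for k in range(n):
            pos = [max(0.0, R[j][k]) for j in range(n)]
            support = sum(pos) - pos[k]  # positive support from items other than k
            for i in range(n):
                if i == k:
                    new = support
                else:
                    new = min(0.0, R[k][k] + support - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:  # fallback: keep the single strongest candidate
        exemplars = [max(range(n), key=lambda k: R[k][k] + A[k][k])]
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


# Toy data (made up): six stories as 4-bin histograms, two loose groups.
stories = [
    [6, 2, 0, 0], [4, 4, 0, 0], [2, 6, 0, 0],
    [0, 0, 6, 2], [0, 0, 4, 4], [0, 0, 2, 6],
]
S = [[-emd_1d(p, q) for q in stories] for p in stories]
labels = affinity_propagation(S)
print(labels)  # exemplar index assigned to each story
```

On this toy input, the first three and the last three histograms should end up with different exemplars. Real bag-of-visual-words bins are not naturally ordered, so a faithful version would need a ground distance between visual words and a general transport solver rather than the one-dimensional shortcut used here.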