Rendra Budi Hutama, Ali Ridho Barakbah, Afrida Helen
2017 International Electronics Symposium on Knowledge Creation and Intelligent Computing (IES-KCIC), published 2017-09-01. DOI: 10.1109/KCIC.2017.8228596 (https://doi.org/10.1109/KCIC.2017.8228596)
Indonesian news auto summarization in infrastructure development topic using 5W+1H consideration
At an average reading speed of 200–500 words per minute, a person needs at least 2 to 3 minutes to read and understand a single news article in online media. An online news outlet can publish many updates within a few minutes, so reading the full content of every article is time-consuming. Reading a summary that represents the main idea of each article saves time. This study uses the 5W+1H elements to generate news summaries because these elements are central to a news article. A single article is retrieved from an online media page by a scanning and grabbing process and then sanitized; segmentation and tokenization then break it into sentences and words. Each sentence is classified with multiple labels, according to which 5W+1H elements it contains (What, Who, Where, When, Why, and/or How) or none of them, using a purpose-built training data set. Sentences that contain 5W+1H elements are selected as summary sentences. Testing of the resulting summaries shows an average precision of 91%, recall of 67%, and F-measure of 76%.
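The pipeline described in the abstract (segmentation, tokenization, multi-label 5W+1H classification, sentence selection, then precision/recall/F-measure scoring) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which classifier or features were used, so the trained model is replaced here by a hypothetical cue-word lookup (`CUES`), and all function names are assumptions made for the example.

```python
import re

# HYPOTHETICAL cue words per 5W+1H label. The paper classifies sentences with
# a model built from training data; this lookup table is only a stand-in.
CUES = {
    "what": {"proyek", "pembangunan", "membangun"},
    "who": {"menteri", "presiden", "gubernur"},
    "where": {"di", "jakarta", "surabaya"},
    "when": {"senin", "kemarin", "besok"},
    "why": {"karena", "sebab"},
    "how": {"dengan", "melalui"},
}


def split_sentences(text):
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def classify(sentence):
    """Return the set of 5W+1H labels whose cue words occur in the sentence.

    An empty set corresponds to the 'nothing else' case in the abstract.
    """
    tokens = set(tokenize(sentence))
    return {label for label, cues in CUES.items() if tokens & cues}


def summarize(text):
    """Keep, in original order, the sentences that carry at least one label."""
    return [s for s in split_sentences(text) if classify(s)]


def precision_recall_f(selected, reference):
    """Score selected sentences against a reference summary (sets of sentences)."""
    selected, reference = set(selected), set(reference)
    true_positives = len(selected & reference)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

Calling `summarize` on a short article returns only the sentences that matched at least one label, and `precision_recall_f` compares that selection with a human-chosen reference summary. A real system would replace `classify` with a multi-label classifier trained on annotated Indonesian news sentences.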