News auto-tagging using Wikipedia

2013 9th International Conference on Innovations in Information Technology (IIT)
Pub Date: 2013-03-17
DOI: 10.1109/INNOVATIONS.2013.6544411

Shaimaa Shams Eldin, S. El-Beltagy


Citations: 4

Abstract

News auto-tagging using Wikipedia

This paper presents an efficient method for automatically annotating Arabic news stories with tags using Wikipedia. The system uses Wikipedia article names, properties, and redirects to build a pool of meaningful tags. Sophisticated and efficient matching methods are then used to detect text fragments in input news stories that correspond to entries in the constructed tag pool. The generated tags represent real-life entities or concepts, such as popular places, well-known organizations, and celebrities. A news site can use these tags indirectly for indexing, clustering, classification, and statistics generation, or directly to give a reader an overview of a story's contents. An evaluation of the system showed that the tags it generates are better than those generated by MSN Arabic news.
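The abstract does not describe the matching procedure in detail, so the following is only a minimal sketch of the general idea: build a tag pool from article titles and redirects, then scan a story for the longest token sequences that appear in the pool. The function names (`build_tag_pool`, `tag_story`), the `max_len` parameter, the greedy longest-match strategy, and the English toy data are all assumptions made for illustration, not the authors' method. A real Arabic pipeline would also need script-specific normalization (for example, diacritic and letter-variant handling), which is omitted here.

```python
import re
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def build_tag_pool(titles: List[str], redirects: Dict[str, str]) -> Dict[str, str]:
    """Map each normalized surface form to its canonical tag.

    `titles` stands in for Wikipedia article names; `redirects` maps an
    alias to the article title it redirects to (both hypothetical inputs).
    """
    pool = {" ".join(tokenize(title)): title for title in titles}
    for alias, target in redirects.items():
        pool[" ".join(tokenize(alias))] = target
    return pool


def tag_story(text: str, pool: Dict[str, str], max_len: int = 4) -> List[str]:
    """Greedy longest-match scan: at each position, try the longest
    n-gram first and emit the canonical tag when it is in the pool."""
    tokens = tokenize(text)
    tags: List[str] = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in pool:
                tag = pool[candidate]
                if tag not in tags:  # report each tag once, in order of appearance
                    tags.append(tag)
                i += n
                break
        else:
            i += 1  # no n-gram starting here matched; move on one token
    return tags


pool = build_tag_pool(
    titles=["United Nations", "Cairo"],
    redirects={"UN": "United Nations"},
)
story = "The UN met in Cairo, and the United Nations chief spoke."
print(tag_story(story, pool))
```

Because redirects resolve to their target title, the alias "UN" and the full name "United Nations" collapse into a single tag, which is one plausible reason to include redirects in the pool.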
