Extracting Named Entities at Web Scale for Competitive Intelligence

François Pouilloux
Published in: 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology
Publication date: 2011-08-22
DOI: 10.1109/WI-IAT.2011.284 (https://doi.org/10.1109/WI-IAT.2011.284)
Citations: 1

Abstract

Businesses of all sizes have now realized that the Web is an invaluable resource for competitive intelligence, and consequently for business decision making. But many have trouble collecting targeted and useful information, and are often further overwhelmed by the time required for analysis and monitoring. On the other hand, text mining techniques have become widely used for information analysis in the scientific community in general, and are now ubiquitous in most Web Intelligence fields. With the availability of services such as Google Prediction API, or mature open source software such as GATE, RapidMiner or NLTK, one can expect a much wider adoption of text mining and associated machine learning techniques by expert developers. But how can these techniques benefit the daily life of a wider business audience? As competitive intelligence often focuses on products, people, customers and competitors, there is added value in systems that provide analytics on these entities. Recognizing such entities is fundamental to text mining and semantic analysis, and consequently is still under active scientific investigation. In this talk we will tour some of the specific requirements and options for building an efficient Web-based competitive intelligence system with named entity analytics. We will see how some savvy simplifications can help to overcome common issues such as Web scale and Web content noise, and finally deliver acceptable usability and value for non-specialist business users.