Extracting Named Entities at Web Scale for Competitive Intelligence
François Pouilloux
2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology, 2011-08-22
DOI: 10.1109/WI-IAT.2011.284 (https://doi.org/10.1109/WI-IAT.2011.284)
Businesses of all sizes have now realized that the Web is an invaluable resource for competitive intelligence, and consequently for business decision making. But many have trouble collecting targeted and useful information, and are often further overwhelmed by the time required for analysis and monitoring. On the other hand, text mining techniques have become widely used for information analysis in the scientific community in general, and are now ubiquitous in most Web Intelligence fields. With the availability of services such as Google Prediction API, and of mature open source software such as GATE, RapidMiner or NLTK, one can expect a much wider adoption of text mining and associated machine learning techniques by expert developers. But how can these techniques benefit the daily life of a wider business audience? As competitive intelligence is often focused on products, people, customers and competitors, there is added value in systems that provide analytics on these entities. Recognizing such entities is fundamental to text mining and semantic analysis, and is consequently still under active scientific investigation. In this talk we will tour some of the specific requirements and options for building an efficient Web-based competitive intelligence system with named entity analytics. We will see how some savvy simplifications can help to overcome common issues such as Web scale and Web content noise, and finally deliver acceptable usability and value for non-specialist business users.