An enhanced pre-processing technique for web log mining by removing web robots
P. Nithya, P. Sumathi
2012 IEEE International Conference on Computational Intelligence and Computing Research, 2012-12-01
DOI: 10.1109/ICCIC.2012.6510325 (https://doi.org/10.1109/ICCIC.2012.6510325)
The internet has become a useful source of information in day-to-day life, and the World Wide Web has grown enormously in traffic volume and in the size and complexity of its websites. Web Usage Mining (WUM) applies data mining, artificial intelligence and related techniques to web data in order to forecast users' visiting behavior and infer their interests from the observed usage samples. WUM is directly involved in a wide range of applications, such as e-commerce, e-learning, web analytics and information retrieval. Web log data is one of the major sources for this work: it records the links users visited, their browsing patterns and the time spent on a particular page or link, and this information can be used in applications such as adaptive websites, customized services, customer summaries, pre-fetching and the design of attractive websites. Existing web usage mining approaches have several problems; in particular, existing algorithms are difficult to apply in practice. New research is therefore needed to predict the future behavior of web users accurately and with short execution time. WUM consists of preprocessing, pattern discovery and pattern analysis. Log data is characteristically noisy and unclear, so preprocessing is essential for effective mining. This paper proposes a novel preprocessing technique that removes local and global noise and web robots. The Anonymous Microsoft Web Dataset and the MSNBC.com Anonymous Web Dataset are used to evaluate the proposed preprocessing technique.
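The abstract does not describe how noise and robots are identified, so the following is only a minimal sketch of a generic log-cleaning step, not the authors' technique. It assumes server logs in the Apache combined log format and uses common stand-in heuristics: a host is treated as a robot if it requests `/robots.txt` or if its user agent matches a keyword pattern, and requests for embedded resources or failed requests are treated as noise. The names `ROBOT_AGENT`, `NOISE_SUFFIXES`, `parse`, `clean` and the sample log lines are all hypothetical.

```python
import re

# Assumed heuristics for illustration; none of these come from the paper.
ROBOT_AGENT = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)
NOISE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")

# One entry in the Apache combined log format.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse(line):
    """Return the fields of one log line as a dict, or None if it is malformed."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def clean(lines):
    """Drop malformed entries, robot traffic, embedded resources and failed requests."""
    entries = [e for e in map(parse, lines) if e is not None]

    # First pass: flag every host that behaves like a robot anywhere in the log.
    robot_hosts = {
        e["host"]
        for e in entries
        if e["path"] == "/robots.txt" or ROBOT_AGENT.search(e["agent"])
    }

    kept = []
    for e in entries:
        if e["host"] in robot_hosts:
            continue  # all requests from a flagged host are discarded
        path = e["path"].split("?", 1)[0].lower()
        if path.endswith(NOISE_SUFFIXES):
            continue  # images, stylesheets and scripts loaded by a page
        if e["method"] != "GET" or not e["status"].startswith("2"):
            continue  # keep only successful page requests
        kept.append(e)
    return kept


# Made-up sample entries for demonstration only.
SAMPLE = [
    '10.0.0.1 - - [01/Dec/2012:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [01/Dec/2012:10:00:01 +0000] "GET /logo.png HTTP/1.1" 200 2048 "/index.html" "Mozilla/5.0"',
    '10.0.0.2 - - [01/Dec/2012:10:00:02 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" "ExampleCrawler/1.0"',
    '10.0.0.2 - - [01/Dec/2012:10:00:03 +0000] "GET /products.html HTTP/1.1" 200 900 "-" "ExampleCrawler/1.0"',
    '10.0.0.3 - - [01/Dec/2012:10:00:04 +0000] "GET /missing.html HTTP/1.1" 404 0 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for entry in clean(SAMPLE):
        print(entry["host"], entry["path"])
```

On the sample above, only the `/index.html` request from `10.0.0.1` survives. Flagging hosts in a first pass means a robot's page requests are removed even when an individual request looks like ordinary browsing; a real preprocessing pipeline would likely need stronger robot detection than keyword matching.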