{"title":"An Approach to Instantly Detecting Fake Plates Based on Large-Scale ANPR Data","authors":"Yue Li, Chen Liu","doi":"10.1109/WISA.2015.53","DOIUrl":"https://doi.org/10.1109/WISA.2015.53","url":null,"abstract":"Traditional methods of detecting fake plates are mostly inefficient. They usually require lots of investments in advance. These methods cannot fully play potentials of ANPR (Automatic Number Plate Recognition) data and utilize them to detect fake plates quickly. In this paper, we propose a method, called as FP-Detector, to instantly detect fake plates through parallel analyzing the historical large-scale ANPR data with MapReduce. The main contributions include: we design a partition strategy, which can fully use the features of ANPR and maintain balances among different nodes. In addition, we also give a criterion of judging fake plates through analyzing spatio-temporal contradiction of plate information. Finally, we apply our method on a real large-scale data set and compare the performance of our method with default blocking strategy of MapReduce. The experiment results show the effectiveness of our method.","PeriodicalId":198938,"journal":{"name":"2015 12th Web Information System and Application Conference (WISA)","volume":"22 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133169914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Challenges and Issues in Trajectory Streams Clustering upon a Sliding-Window Model","authors":"Jiali Mao, Cheqing Jin, Xiaoling Wang, Aoying Zhou","doi":"10.1109/WISA.2015.42","DOIUrl":"https://doi.org/10.1109/WISA.2015.42","url":null,"abstract":"The proliferation of location-acquisition devices and thriving development of social Web sites enable analyzing users' movement behaviors and detecting social events in dynamic trajectory streams. In this paper, we firstly analyze the challenges in trajectory stream clustering, and then depict a three-part framework to deal with this issue, that includes (i) trajectory data pre-processing for higher quality, (ii) online micro-clustering to summarize a large number of microclusters, and (iii) offline macro-clustering to form the resulting clusters. Particularly, we present the in-cluster maintenance strategy for online clustering evolving trajectory streams over sliding windows. It can eliminate the obsolete data while adaptively maintaining the summary statistics for continuously arriving location data, and thus avoid performance degradation with minimal harm to result quality.","PeriodicalId":198938,"journal":{"name":"2015 12th Web Information System and Application Conference (WISA)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127674466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Popular Topic Detection in Chinese Micro-Blog Based on the Modified LDA Model","authors":"Yuzhong Chen, Wanhua Li, Wenzhong Guo, Kun Guo","doi":"10.1109/WISA.2015.58","DOIUrl":"https://doi.org/10.1109/WISA.2015.58","url":null,"abstract":"Micro-blog has become a symbol of the novel social media, and because of its rapid development in such a short time, many researchers are full of enthusiasm about it. We make use of the Latent Dirichlet Allocation (LDA) model, which has excellent dimension reduction capability and can excavate latent semantics from texts, to discover popular topics. We improve the original LDA model to the FSC-LDA model by combining text clustering methods and feature selection methods, which can identify the number of topics adaptively. The FSC-LDA model can better keep the features of short micro-blog texts and make the result more stable. The result of the experiments on a real Chinese microblog text dataset shows that the FSC-LDA model can perform well on the custom evaluation and find more accurate popular topics.","PeriodicalId":198938,"journal":{"name":"2015 12th Web Information System and Application Conference (WISA)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117131881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining Event Associations Using Structured Data and Classifiers","authors":"Jinxin Zhao, Xinjun Wang, Zhongmin Yan, Song Wei","doi":"10.1109/WISA.2015.37","DOIUrl":"https://doi.org/10.1109/WISA.2015.37","url":null,"abstract":"Event is a widely used concept these years. Many areas such as Natural Language Processing and Information Retrieval have used the event as the basic information unit in their research. So, the mining of event associations is very necessary for our research. It also plays an important role in business intelligence and in research on relations between events. Usually events are associated with others when they often occur in the vicinity of others or co-occur in the same context. However, there are some implicit associations we cannot mine only from sequence or context. In this paper, we aim to find associations of events under the background of Data Integration Systems. By using the structured information of a data integration system, the background information of entities can be extracted to classify events. So we classify the events into different categories, which makes it possible to mine the statistical information from event sequences. Furthermore, we generalize the association between event entities to predict the implicit association in our algorithm. We validate our method with experiments, and the results show useful information in the area of business intelligence.","PeriodicalId":198938,"journal":{"name":"2015 12th Web Information System and Application Conference (WISA)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126387579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Friend Recommendation Algorithm Based on Multiple Factors in LBSNs","authors":"Tiancheng Zhang, Wei Wang, D. Yue, Ge Yu","doi":"10.1109/WISA.2015.35","DOIUrl":"https://doi.org/10.1109/WISA.2015.35","url":null,"abstract":"In location-based social networks, the current friend recommendation algorithms just take a relatively single factor into account without comprehensive evaluations. To solve this problem, we design a framework - Multiple Heterogeneous Social Network (MHSN) according to users' profiles, check-in records and interests. Based on this framework, we propose a friend recommendation model which consider multiple factors, including 1) a detecting model based on interest similarity by using users' check-in records, 2) a social distance calculation method based on users' social relationship, 3) a clustering method based on users' check-in location information to measure the similarity among clusters. The top-k friends who satisfy the above conditions will be recommended to the target users. We evaluated our method using Foursquare data-sets and the results showed that our friend recommendation algorithm is more feasible and effective.","PeriodicalId":198938,"journal":{"name":"2015 12th Web Information System and Application Conference (WISA)","volume":"225 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114987573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SLOF: Identify Density-Based Local Outliers in Big Data","authors":"Haowen Guan, Qingzhong Li, Zhongmin Yan, Wei Wei","doi":"10.1109/WISA.2015.40","DOIUrl":"https://doi.org/10.1109/WISA.2015.40","url":null,"abstract":"With the rapid progress in data mining and outlier detection, outlier detection methods have been widely used in various domains. The density based LOF method is the commonly used outlier detection method. In big data, the size and dimensions of data is very large, and the data is sparse. Those features make the LOF not suitable for big data. According to the features of big data, we propose a novel SLOF method. We use vectors to denote the complex high dimensional objects in dataset. We compute the distances between objects based on the concept of vector similarity. We introduce the idea of feature bagging approach, to make the SLOF method robust and accurate. We compare the performance of SLOF, LOF and the PINN methods. The experimental results show that SLOF scores' distribution is more stable, the recall rate and precision of SLOF is much better than LOF and PINN methods.","PeriodicalId":198938,"journal":{"name":"2015 12th Web Information System and Application Conference (WISA)","volume":"98 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134563943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Algorithm for URL Routing Based on Trie Structure","authors":"Yijun Zhang, Lizhen Xu","doi":"10.1109/WISA.2015.62","DOIUrl":"https://doi.org/10.1109/WISA.2015.62","url":null,"abstract":"In this paper, a new algorithm based on trie structure is applied to URL routing systems like web MVC system and enterprise service bus, where the routing rules not stored as a table containing regular expressions but a trie. As a result, the table traversing in routing process is replaced by a depth-first searching for trie. Because the static pattern segments are stored as a hash table, the time complexity of matching static pattern are O(1), so that the whole time taken in routing process is reduced. The experiment shows that, compared to ASP.NET routing module which use the route table, the routing module with this algorithm have excellent performance when there are plenty of routing rules in system. It just cost about 10% time of the ASP.NET routing module.","PeriodicalId":198938,"journal":{"name":"2015 12th Web Information System and Application Conference (WISA)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124908028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RABBIC: Rank-Based BIClustering Algorithm","authors":"Lingling Huang, Qing Liu, Nan Yang, Yaping Li, Lin Xiao","doi":"10.1109/WISA.2015.50","DOIUrl":"https://doi.org/10.1109/WISA.2015.50","url":null,"abstract":"Biclustering performs simultaneous clustering on the row and column dimensions of the data matrix, it could discover data modules in the data matrix. Gene module is an important concept in systems biology. In this paper, gene modules are specifically defined as a set of genes whose expression levels share the same linear order on each member of a subset of samples. In order to discover such modules, a novel algorithm, the Rank-Based BIClustering algorithm (RABBIC), is designed and developed. RABBIC, when applied to the real ovarian cancer gene expression data, identifies 93 modules, and 25 are biologically significant according to the gene set functional enrichment analysis. This paper deals with the gene expression data from the aspect of rank, which is helpful in reducing the noise of the data. It provides new thoughts for the researches of gene module identification.","PeriodicalId":198938,"journal":{"name":"2015 12th Web Information System and Application Conference (WISA)","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123232813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-source Emotion Tagging for Online News","authors":"Li Yu, Zhifan Yang, Peng Nie, Xue Zhao, Y. Zhang","doi":"10.1109/WISA.2015.24","DOIUrl":"https://doi.org/10.1109/WISA.2015.24","url":null,"abstract":"With the rapid growth of social media and online news services, users nowadays can actively respond to online news by rating subjective emotions such as happiness, surprise or anger. Once the number of user ratings exceeds a certain range, it begins to show a tendency of what most people think and feel, which can help us understand the preferences and perspectives of most users, and help news providers to provide users with more positive news. Thus automatic emotion tagging has become a fruitful research problem. This paper tackles the task of emotion tagging for online news with multiple sources, including news articles and comments, as emotion is not only tagged after reading a news article, but can also be expressed in comments describing what readers feel. In this paper, a novel classification model is proposed with two-layer logistic regression. The new approach gets outputs from basic classifiers and combines them in a new classifier, making a more accurate prediction when compared with a single-source method. An extensive set of experimental results on a real dataset from a popular online news service demonstrates the effectiveness of the proposed approach.","PeriodicalId":198938,"journal":{"name":"2015 12th Web Information System and Application Conference (WISA)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126410642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emerging Rumor Identification for Social Media with Hot Topic Detection","authors":"Zhifan Yang, Chao Wang, Fan Zhang, Y. Zhang, Haiwei Zhang","doi":"10.1109/WISA.2015.19","DOIUrl":"https://doi.org/10.1109/WISA.2015.19","url":null,"abstract":"A rumor is commonly defined as a statement whose truth value is unverifiable. As rumors can spread misinformation among people, causing social problems such as panic, and the rapid growth of online social media has made it possible for rumors to spread more quickly, it is important to automatically identify rumors for social media. Existing methods of rumor detection always concentrate on telling rumor from truth with handcrafted regular expressions, dealing with out-of-date rumor-related messages. To solve this problem, we introduce a novel hot topic detection method combining bursty term identification and multi-dimension sentence modeling to automatically detect emerging hot topics for rumor identification. We conduct a comprehensive set of experiments on two data sets from real-world social media. Experiment results show that our emerging rumor identification for social media with hot topic detection works well on both the news data set and the Twitter data set, and that combining hot topic detection with rumor detection makes real-time rumor identification possible. We believe our method to automatically detect rumors will open new dimensions in analyzing online misinformation and other aspects of social media mining.","PeriodicalId":198938,"journal":{"name":"2015 12th Web Information System and Application Conference (WISA)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126276716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}