2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW): Latest Articles

Semi-Supervised Similarity Preserving Co-Selection
2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) Pub Date : 2016-12-12 DOI: 10.1109/ICDMW.2016.0111
Raywat Makkhongkaew, K. Benabdeslem
{"title":"Semi-Supervised Similarity Preserving Co-Selection","authors":"Raywat Makkhongkaew, K. Benabdeslem","doi":"10.1109/ICDMW.2016.0111","DOIUrl":"https://doi.org/10.1109/ICDMW.2016.0111","url":null,"abstract":"Semi-supervised learning is the required paradigm when data are partially labeled. It is better suited to large domain applications where labels are hard and costly to obtain. In addition, when data are large, feature selection and instance selection are two important dual operations for removing irrelevant information. To address these challenges together, we propose a unified framework, called sCOs, for simultaneous semi-supervised co-selection of features and instances. In particular, we propose a novel cost function based on l2,1-norm regularization and similarity preserving selection of both features and instances. Experimental results on some known benchmark datasets are provided for validating sCOs and comparing it with some representative state-of-the-art methods.","PeriodicalId":373866,"journal":{"name":"2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130762047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
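The abstract of this entry mentions l2,1-norm regularization without defining it. As a minimal sketch (the function name `l21_norm` and the example matrix are illustrative assumptions, not taken from the paper), the l2,1-norm of a matrix is the sum of the Euclidean norms of its rows, which is why penalizing it tends to drive whole rows to zero at once:

```python
import math

def l21_norm(matrix):
    """Sum of the Euclidean (l2) norms of the rows of a matrix given as a list of lists."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)

# Hypothetical weight matrix; in a selection setting each row could stand for one feature.
W = [
    [3.0, 4.0],  # row norm 5.0
    [0.0, 0.0],  # an all-zero row contributes nothing to the norm
    [6.0, 8.0],  # row norm 10.0
]
print(l21_norm(W))  # 15.0
```

Rows whose norm is zero can be read as "unselected"; how sCOs actually builds and optimizes its cost function is not described in the abstract and is not reproduced here.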
Link Prediction in the Twitter Mention Network: Impacts of Local Structure and Similarity of Interest
2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) Pub Date : 2016-12-12 DOI: 10.1109/ICDMW.2016.0071
Hadrien Hours, E. Fleury, M. Karsai
{"title":"Link Prediction in the Twitter Mention Network: Impacts of Local Structure and Similarity of Interest","authors":"Hadrien Hours, E. Fleury, M. Karsai","doi":"10.1109/ICDMW.2016.0071","DOIUrl":"https://doi.org/10.1109/ICDMW.2016.0071","url":null,"abstract":"The creation of social ties is driven by several factors which can arguably be related to individual preferences and to the common social environment of individuals. Effects of homophily and triadic closure mechanisms are claimed to be important in terms of initiating new social interactions and in turn to shape the global social structure. This way they eventually provide some potential to predict the creation of social ties between disconnected people sharing common friends or common subjects of interest. In this paper we analyze a large Twitter data corpus and quantify similarities between people by considering the set of their common friends and the set of their commonly shared hashtags in order to predict mention links among them. We show that these similarity measures are correlated among connected people and that the combination of contextual and local structural features provides better predictions as compared to cases where they are considered separately. These results help us to better understand the evolution of egocentric and global social networks and provide advances in the design of better recommendation systems and resource allocation plans.","PeriodicalId":373866,"journal":{"name":"2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)","volume":"236 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116094752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
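The abstract of this entry describes scoring user pairs by their common friends and their commonly shared hashtags. The sketch below illustrates that idea on toy data. The Jaccard normalization, the linear blend, the weight `alpha` and all names are assumptions made for illustration; the abstract does not say how the authors combine the two kinds of features.

```python
def common_neighbors(friends, u, v):
    """Number of friends shared by users u and v (a local structural feature)."""
    return len(friends[u] & friends[v])

def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def combined_score(friends, hashtags, u, v, alpha=0.5):
    """Blend structural and contextual similarity; alpha is a hypothetical weight."""
    structural = jaccard(friends[u], friends[v])
    contextual = jaccard(hashtags[u], hashtags[v])
    return alpha * structural + (1.0 - alpha) * contextual

# Toy data: who follows whom, and which hashtags each user has used.
friends = {"a": {"c", "d"}, "b": {"c", "d", "e"}}
hashtags = {"a": {"#icdm", "#ml"}, "b": {"#ml"}}

print(common_neighbors(friends, "a", "b"))
print(combined_score(friends, hashtags, "a", "b"))
```

A higher combined score would rank the pair as a more likely future mention link; in practice such features would be fed to a classifier rather than blended with a fixed weight.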
Overlapping Community Detection by Local Decentralised Vertex-Centred Process
2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) Pub Date : 2016-12-12 DOI: 10.1109/ICDMW.2016.0019
M. Canu, Marie-Jeanne Lesot, Adrien Revault d'Allonnes
{"title":"Overlapping Community Detection by Local Decentralised Vertex-Centred Process","authors":"M. Canu, Marie-Jeanne Lesot, Adrien Revault d'Allonnes","doi":"10.1109/ICDMW.2016.0019","DOIUrl":"https://doi.org/10.1109/ICDMW.2016.0019","url":null,"abstract":"This paper focuses on the identification of overlapping communities, allowing nodes to simultaneously belong to several communities, in a decentralised way. To that aim it proposes LOCNeSs, an algorithm specially designed to run in a decentralised environment and to limit propagation, two essential characteristics to be applied in mobile networks. It is based on the exploitation of the preferential attachment mechanism in networks. Experimental results show that LOCNeSs is stable and achieves good overlapping vertex identification.","PeriodicalId":373866,"journal":{"name":"2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)","volume":"1022 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123122350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
A Probabilistic View of Neighborhood-Based Recommendation Methods
2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) Pub Date : 2016-12-12 DOI: 10.1109/ICDMW.2016.0011
Jun Wang, Qiang Tang
{"title":"A Probabilistic View of Neighborhood-Based Recommendation Methods","authors":"Jun Wang, Qiang Tang","doi":"10.1109/ICDMW.2016.0011","DOIUrl":"https://doi.org/10.1109/ICDMW.2016.0011","url":null,"abstract":"The probabilistic graphical model is an elegant framework to compactly present complex real-world observations by modeling uncertainty and logical flow (conditionally independent factors). In this paper, we present a probabilistic framework of neighborhood-based recommendation methods (PNBM) in which similarity is regarded as an unobserved factor. Thus, PNBM reduces the estimation of user preference to maximizing a posterior over similarity. We further introduce a novel multi-layer similarity descriptor which models and learns the joint influence of various features under PNBM, and name the new framework MPNBM. Empirical results on real-world datasets show that MPNBM allows very accurate estimation of user preferences.","PeriodicalId":373866,"journal":{"name":"2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124904013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
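For readers unfamiliar with the neighborhood-based methods this entry builds on, the following is a minimal sketch of the classical, non-probabilistic baseline: a rating is predicted as the similarity-weighted average of the ratings given by the most similar users. It is not PNBM or MPNBM, which treat similarity as an unobserved factor; the cosine similarity, the function names and the toy ratings are all assumptions chosen for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two rating dicts, computed over co-rated items."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def predict(ratings, user, item, k=2):
    """Similarity-weighted average of the k most similar users who rated the item."""
    neighbors = [
        (cosine(ratings[user], r), r[item])
        for other, r in ratings.items()
        if other != user and item in r
    ]
    top = sorted(neighbors, reverse=True)[:k]
    total = sum(sim for sim, _ in top)
    return sum(sim * rating for sim, rating in top) / total if total else None

# Toy ratings: user -> {item: rating}.
ratings = {
    "u1": {"x": 5, "y": 3},
    "u2": {"x": 5, "y": 3, "z": 4},
    "u3": {"x": 1, "y": 5, "z": 2},
}
print(predict(ratings, "u1", "z"))
```

Here the similarity is a fixed, hand-picked function; the entry above instead estimates it as part of a probabilistic model.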
Risk-Aware Dynamic Reserve Prices of Programmatic Guarantee in Display Advertising
2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) Pub Date : 2016-12-12 DOI: 10.1109/ICDMW.2016.0079
Bowei Chen
{"title":"Risk-Aware Dynamic Reserve Prices of Programmatic Guarantee in Display Advertising","authors":"Bowei Chen","doi":"10.1109/ICDMW.2016.0079","DOIUrl":"https://doi.org/10.1109/ICDMW.2016.0079","url":null,"abstract":"Display advertising is an important type of online advertising in which banner advertisements (ads for short) on websites are usually measured by how many times they are viewed by online users. There are two major channels to sell ad views. They can be auctioned off in real time or be directly sold through guaranteed contracts in advance. The former is also known as real-time bidding (RTB), in which media buyers come to a common marketplace to compete for a single ad view and this inventory will be allocated to a buyer in milliseconds by an auction model. Unlike RTB, buying and selling guaranteed contracts is usually not programmatic but done through private negotiations, as advertisers would like to customise their requests and purchase ad views in bulk. In this paper, we propose a simple model that facilitates the automation of direct sales. In our model, a media seller puts future ad views on sale and receives buy requests sequentially over time until the future delivery period. The seller maintains a hidden yet dynamically changing reserve price in order to decide whether to accept a buy request or not. The future supply and demand are assumed to be well estimated and static, and the model's revenue management uses inventory control theory, where each computed reserve price is based on the updated supply and demand, and the unsold future ad views will be auctioned off in RTB to meet the unfulfilled demand. The model has several desirable properties. First, it is not limited to the demand arrival assumption. Second, it will not affect the current equilibrium between RTB and direct sales as there are no posted guaranteed prices. Third, the model uses the expected revenue from RTB as a lower bound for inventory control and we show that a publisher can receive expected total revenue greater than or equal to that from RTB alone if she uses the computed dynamic reserve prices for direct sales.","PeriodicalId":373866,"journal":{"name":"2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134258215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
A Probabilistic Address Parser Using Conditional Random Fields and Stochastic Regular Grammar
2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) Pub Date : 2016-12-12 DOI: 10.1109/ICDMW.2016.0039
Minlue Wang, Valeriia Haberland, Amos Yeo, Andrew O. Martin, J. Howroyd, J. M. Bishop
{"title":"A Probabilistic Address Parser Using Conditional Random Fields and Stochastic Regular Grammar","authors":"Minlue Wang, Valeriia Haberland, Amos Yeo, Andrew O. Martin, J. Howroyd, J. M. Bishop","doi":"10.1109/ICDMW.2016.0039","DOIUrl":"https://doi.org/10.1109/ICDMW.2016.0039","url":null,"abstract":"Automatic semantic annotation of data from databases or the web is an important pre-process for data cleansing and record linkage. It can be used to resolve the problem of imperfect field alignment in a database or identify comparable fields for matching records from multiple sources. The annotation process is not trivial because data values may be noisy, containing abbreviations, variations or misspellings. In particular, overlapping features usually exist in a lexicon-based approach. In this work, we present a probabilistic address parser based on linear-chain conditional random fields (CRFs), which allow more expressive token-level features compared to hidden Markov models (HMMs). In addition, we propose two general enhancement techniques to improve the performance. One is taking the original semi-structure of the data into account. The other is post-processing of the output sequences of the parser by combining its conditional probability and a score function, which is based on a learned stochastic regular grammar (SRG) that captures segment-level dependencies. Experiments were conducted by comparing the CRF parser to an HMM parser and a semi-Markov CRF parser on two real-world datasets. The CRF parser outperformed the HMM parser and the semi-Markov CRF on both datasets in terms of classification accuracy. Leveraging the structure of the data and combining the linear-chain CRF with the SRG further improved the parser to achieve an accuracy of 97% on a postal dataset and 96% on a company dataset.","PeriodicalId":373866,"journal":{"name":"2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129503288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
Online Outlier Detection of Energy Data Streams Using Incremental and Kernel PCA Algorithms
2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) Pub Date : 2016-12-01 DOI: 10.1109/ICDMW.2016.0062
Jeremiah D. Deng
{"title":"Online Outlier Detection of Energy Data Streams Using Incremental and Kernel PCA Algorithms","authors":"Jeremiah D. Deng","doi":"10.1109/ICDMW.2016.0062","DOIUrl":"https://doi.org/10.1109/ICDMW.2016.0062","url":null,"abstract":"Outlier detection or anomaly detection is an important and challenging issue in data mining, even more so in the domain of energy data mining where data are often collected in large amounts but with little labeled information. This paper presents a couple of online outlier detection algorithms based on principal component analysis. Novel algorithmic treatments are introduced to build incremental PCA and kernel PCA algorithms with online learning abilities. Preliminary experimental results on a real-world household consumption dataset show promising performance for the proposed algorithms.","PeriodicalId":373866,"journal":{"name":"2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)","volume":"204 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115725479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Fraud Detection in Voice-Based Identity Authentication Applications and Services
2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) Pub Date : 2016-12-01 DOI: 10.1109/ICDMW.2016.0155
Saeid Safavi, Hock C. Gan, I. Mporas, R. Sotudeh
{"title":"Fraud Detection in Voice-Based Identity Authentication Applications and Services","authors":"Saeid Safavi, Hock C. Gan, I. Mporas, R. Sotudeh","doi":"10.1109/ICDMW.2016.0155","DOIUrl":"https://doi.org/10.1109/ICDMW.2016.0155","url":null,"abstract":"Keeping track of the multiple passwords, PINs, memorable dates and other authentication details needed to gain remote access to accounts is one of modern life's less appealing challenges. The employment of voice-based verification as a biometric technology for both children and adults could be a good replacement for the old-fashioned memory-dependent procedure. Using voice for authentication could be beneficial in several application areas, including security, protection, education, call-based and web-based services. Voice-based biometric applications are subject to different types of spoofing attacks. The most accessible and affordable type of spoofing for a voice-based biometrics system is a replay attack. Replay, which is to play back a pre-recorded speech sample, presents a genuine risk to automatic speaker verification technology. This work presents two architectures for detecting frauds caused by replay attacks in voice-based biometric authentication systems. Experimental results confirmed that the performance obtained from both methods could be further improved by applying a machine learning algorithm to perform fusion at the score level. The performance of both methods further improved by fusion using independent sources of scores in different architectures.","PeriodicalId":373866,"journal":{"name":"2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116755785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
Real-Time Top-View People Counting Based on a Kinect and NVIDIA Jetson TK1 Integrated Platform
2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) Pub Date : 2016-12-01 DOI: 10.1109/ICDMW.2016.0073
Guangqin Li, Peng Ren, Xinrong Lyu, He Zhang
{"title":"Real-Time Top-View People Counting Based on a Kinect and NVIDIA Jetson TK1 Integrated Platform","authors":"Guangqin Li, Peng Ren, Xinrong Lyu, He Zhang","doi":"10.1109/ICDMW.2016.0073","DOIUrl":"https://doi.org/10.1109/ICDMW.2016.0073","url":null,"abstract":"In this paper, we describe how to establish an embedded framework for real-time top-view people counting. The development of our system consists of two parts, i.e. establishing an embedded signal processing platform and designing a people counting algorithm for the embedded system. For the hardware platform construction, we use Kinect as the camera and exploit the NVIDIA Jetson TK1 board as the embedded processing platform. We describe how to build a channel to make Kinect for Windows version 2.0 communicate with Jetson TK1. Based on the embedded system, we adapt a water filling based scheme for top-view people counting, which integrates head detection based on water drop, people tracking and counting. A Gaussian Mixture Model is used to construct and update the background model. The moving people in each video frame are extracted using a background subtraction method. Additionally, the water filling algorithm is used to segment the head area as the Region of Interest (ROI). Tracking and counting people are performed by calculating the distance between ROI center points in consecutive frames. The whole framework is flexible and practical for real-time application.","PeriodicalId":373866,"journal":{"name":"2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127265382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
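The last step in this entry's abstract, tracking and counting from the distance between ROI center points across frames, can be sketched as nearest-centroid matching followed by a line-crossing test. This is only an illustration under stated assumptions: the greedy matching, the `max_distance` threshold, the counting line `line_y` and every function name are hypothetical and are not taken from the paper.

```python
import math

def match_centroids(previous, current, max_distance=50.0):
    """Greedily pair each current ROI centre with the nearest unused previous centre."""
    pairs, used = [], set()
    for j, c in enumerate(current):
        best, best_d = None, max_distance
        for i, p in enumerate(previous):
            d = math.hypot(p[0] - c[0], p[1] - c[1])
            if i not in used and d <= best_d:
                best, best_d = i, d
        if best is not None:
            used.add(best)
            pairs.append((best, j))
    return pairs

def count_crossings(previous, current, line_y, max_distance=50.0):
    """Count matched centres that moved from above line_y to on or below it."""
    return sum(
        1
        for i, j in match_centroids(previous, current, max_distance)
        if previous[i][1] < line_y <= current[j][1]
    )

# Toy head centres (x, y) in two consecutive frames, with a counting line at y = 100.
frame_a = [(10, 90), (200, 40)]
frame_b = [(12, 105), (205, 45)]
print(count_crossings(frame_a, frame_b, line_y=100))  # 1
```

Greedy matching is the simplest choice; with crowded scenes an optimal assignment would be more robust, at higher cost on an embedded board.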
Improving the Prediction Cost of Drift Handling Algorithms by Abstaining
2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) Pub Date : 2016-12-01 DOI: 10.1109/ICDMW.2016.0175
P. Loeffel, V. Lemaire, C. Marsala, Marcin Detyniecki
{"title":"Improving the Prediction Cost of Drift Handling Algorithms by Abstaining","authors":"P. Loeffel, V. Lemaire, C. Marsala, Marcin Detyniecki","doi":"10.1109/ICDMW.2016.0175","DOIUrl":"https://doi.org/10.1109/ICDMW.2016.0175","url":null,"abstract":"The problem considered in this paper is regression with a constraint on the precision of each prediction in the framework of data streams subject to concept drifts (when the hidden distribution which generates the observations can change over time). Concept drifts can diminish the reliability of the predictions over time and it might not be possible to output a prediction which satisfies the constraints on the precision. In this case, we claim that if the costs associated with a good and with a bad prediction are known beforehand, the overall prediction cost can be improved by allowing the regressor to abstain. To this end, we propose a generic method, compatible with any regressor, which uses an ensemble of reliability estimators to estimate whether the constraints on the precision of a given prediction can be met or not. In the latter case, the regressor is allowed to abstain. Empirical results on 30 datasets including different types of drifts back our claim.","PeriodicalId":373866,"journal":{"name":"2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125941708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
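The cost argument in this entry's abstract can be made concrete with a small decision rule: predict only when the expected cost of predicting is no higher than the cost of abstaining. The sketch below is an assumption-laden illustration, not the authors' method. Averaging the ensemble's reliability estimates, the explicit `cost_abstain` parameter and the function name are all hypothetical.

```python
def should_abstain(reliability_estimates, cost_good, cost_bad, cost_abstain):
    """Return True when abstaining is expected to be cheaper than predicting.

    reliability_estimates: one estimate per ensemble member of the probability
    that the prediction meets the precision constraint (values in [0, 1]).
    """
    p_ok = sum(reliability_estimates) / len(reliability_estimates)
    expected_cost = p_ok * cost_good + (1.0 - p_ok) * cost_bad
    return expected_cost > cost_abstain

# Hypothetical costs: a good prediction is free, a bad one costs 10, abstaining costs 3.
print(should_abstain([0.9, 0.8, 0.85], cost_good=0.0, cost_bad=10.0, cost_abstain=3.0))  # False
print(should_abstain([0.3, 0.4, 0.2], cost_good=0.0, cost_bad=10.0, cost_abstain=3.0))   # True
```

Under concept drift the reliability estimates would themselves be updated over time, which this sketch leaves out.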