2018 IEEE International Conference on Big Knowledge (ICBK): Latest Papers

[Copyright notice]
2018 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2018-11-01 DOI: 10.1109/icbk.2018.00003
{"title":"[Copyright notice]","authors":"","doi":"10.1109/icbk.2018.00003","DOIUrl":"https://doi.org/10.1109/icbk.2018.00003","url":null,"abstract":"","PeriodicalId":144958,"journal":{"name":"2018 IEEE International Conference on Big Knowledge (ICBK)","volume":"21 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115321916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Matrix Profile XIII: Time Series Snippets: A New Primitive for Time Series Data Mining
2018 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2018-11-01 DOI: 10.1109/ICBK.2018.00058
Shima Imani, Frank Madrid, W. Ding, S. Crouter, Eamonn J. Keogh
{"title":"Matrix Profile XIII: Time Series Snippets: A New Primitive for Time Series Data Mining","authors":"Shima Imani, Frank Madrid, W. Ding, S. Crouter, Eamonn J. Keogh","doi":"10.1109/ICBK.2018.00058","DOIUrl":"https://doi.org/10.1109/ICBK.2018.00058","url":null,"abstract":"Perhaps the most basic query made by a data analyst confronting a new data source is \"Show me some representative/typical data.\" Answering this question is trivial in many domains, but surprisingly, it is very difficult in large time series datasets. The major difficulty is not time or space complexity, but defining what it means to be representative data in this domain. In this work, we show that the obvious candidate definitions: motifs, shapelets, cluster centers, random samples etc., are all poor choices. Thus motivated, we introduce time series snippets, a novel representation of typical time series subsequences. Beyond their utility for visualizing and summarizing massive time series collections, we show that time series snippets have utility for high-level comparison of large time series collections.","PeriodicalId":144958,"journal":{"name":"2018 IEEE International Conference on Big Knowledge (ICBK)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124534019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 36
Stochastic Optimization for Market Return Prediction Using Financial Knowledge Graph
2018 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2018-11-01 DOI: 10.1109/ICBK.2018.00012
Xiaoyi Fu, Xinqi Ren, O. Mengshoel, Xindong Wu
{"title":"Stochastic Optimization for Market Return Prediction Using Financial Knowledge Graph","authors":"Xiaoyi Fu, Xinqi Ren, O. Mengshoel, Xindong Wu","doi":"10.1109/ICBK.2018.00012","DOIUrl":"https://doi.org/10.1109/ICBK.2018.00012","url":null,"abstract":"Interactive prediction of financial instrument returns is important. It is needed for asset managers to generate trading strategies as well as for stock exchange regulators to discover pricing anomalies. In this paper, we introduce an integrated stochastic optimization technique, namely genetic programming (GP) with generalized crowding (GC), GP+GC, as an integrated approach for a market return prediction system, using a financial knowledge graph (KG). On the one hand, using time-series data for twenty-nine component stocks of the Dow Jones industrial average, we show that our stochastic local search method can give a better prediction performance by providing a comparison of its return performances with two traditional benchmarks, namely a Buy & Hold strategy and the Moving Average Convergence Divergence (MACD) technical indicator. On the other hand, we use features extracted from a time-evolving knowledge graph constructed from fifty component stocks of the SSE50 Index. These features are used to a GP variant and then incorporate the knowledge extracted from the expression learnt from GP into a KG. 
Overall, this work demonstrates how to integrate GP+GC with KGs in a powerful manner.","PeriodicalId":144958,"journal":{"name":"2018 IEEE International Conference on Big Knowledge (ICBK)","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124248946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
Don't Do Imputation: Dealing with Informative Missing Values in EHR Data Analysis
2018 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2018-11-01 DOI: 10.1109/ICBK.2018.00062
Jia Li, Mengdie Wang, M. Steinbach, Vipin Kumar, György J. Simon
{"title":"Don't Do Imputation: Dealing with Informative Missing Values in EHR Data Analysis","authors":"Jia Li, Mengdie Wang, M. Steinbach, Vipin Kumar, György J. Simon","doi":"10.1109/ICBK.2018.00062","DOIUrl":"https://doi.org/10.1109/ICBK.2018.00062","url":null,"abstract":"Missing values pose a significant challenge in data analytic, especially in clinical studies, data is typically missing-not-at-random (MNAR). Applying techniques (e.g. imputations) that were designed for missing-at-random (MAR) to MNAR data, can lead to biases. In this work, we propose pattern-wise analysis, a collection of methods for building predictive models in the presence of MNAR missing values. On a per-pattern basis, this methodology constructs an individual model for each missingness pattern. We show that even the simplest pattern-wise method, Per-Pattern Modeling (PPM) outperforms models built on data sets completed by the most popular imputation methods. PPM faces difficulty when the number of missingness patterns is too high or when the missingness patterns have too few observations. We developed variants of PPM to overcome these challenges from three complementary perspectives: (i) from a model selection perspective, where PPM can select patterns to build models; (ii) a distributional perspective, where the training data set is expanded in a distribution-preserving fashion; and (iii) from a causal perspective, where a causal structure for the MNAR mechanism is assumed and exploited to convert the problem from MNAR to MAR. 
Evaluation of the proposed methods on both synthetic MNAR data and a real-world clinical data set of sepsis patients shows notable improvement over traditional approaches.","PeriodicalId":144958,"journal":{"name":"2018 IEEE International Conference on Big Knowledge (ICBK)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123142370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Opponent Resource Prediction in StarCraft Using Imperfect Information
2018 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2018-11-01 DOI: 10.1109/ICBK.2018.00056
W. Hamilton, M. Shafiq
{"title":"Opponent Resource Prediction in StarCraft Using Imperfect Information","authors":"W. Hamilton, M. Shafiq","doi":"10.1109/ICBK.2018.00056","DOIUrl":"https://doi.org/10.1109/ICBK.2018.00056","url":null,"abstract":"The real-time strategy (RTS) game StarCraft has recently become a focus of research on game AI. A major challenge in RTS gameplay is making decisions using imperfect information about the opponent's state and actions. One approach that has proven rewarding is to apply machine learning techniques to replays of games between skilled human players. We consider the problem of estimating the number of resources gathered by the opponent during a StarCraft match. We introduce and evaluate two techniques for opponent resource prediction using supervised learning on match replays. Our first method uses multiple linear regression on observable features of the game state. Our second method uses naïve Bayes classification to form imprecise but accurate predictions.","PeriodicalId":144958,"journal":{"name":"2018 IEEE International Conference on Big Knowledge (ICBK)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116729048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
LINKSOCIAL: Linking User Profiles Across Multiple Social Media Platforms
2018 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2018-11-01 DOI: 10.1109/ICBK.2018.00042
V. Sharma, C. Dyreson
{"title":"LINKSOCIAL: Linking User Profiles Across Multiple Social Media Platforms","authors":"V. Sharma, C. Dyreson","doi":"10.1109/ICBK.2018.00042","DOIUrl":"https://doi.org/10.1109/ICBK.2018.00042","url":null,"abstract":"Social media connects individuals to on-line communities through a variety of platforms, which are partially funded by commercial marketing and product advertisements. A recent study reported that 92% of businesses rated social media marketing as very important. Accurately linking the identity of users across various social media platforms has several applications viz. marketing strategy, friend suggestions, multi platform user behavior, information verification etc. We propose LINKSOCIAL, a large-scale, scalable, and efficient system to link social media profiles. Unlike most previous research that focuses mostly on pair-wise linking (e.g., Facebook profiles paired to Twitter profiles), we focus on linking across multiple social media platforms. LINKSOCIAL has three steps: (1) extract features from user profiles and build a cost function, (2) use Stochastic Gradient Descent to calculate feature weights, and (3) perform pair-wise and multi-platform linking of user profiles. To reduce the cost of computation, LINKSOCIAL uses clustering to perform candidate pair selection. Our experiments show that LINKSOCIAL predicts with 92% accuracy on pair-wise and 74% on multi-platform linking of three well-known social media platforms. 
Data used in our approach will be available at http://vishalshar.github.io/data/.","PeriodicalId":144958,"journal":{"name":"2018 IEEE International Conference on Big Knowledge (ICBK)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115757614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
Principal Sample Analysis for Data Reduction
2018 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2018-11-01 DOI: 10.1109/ICBK.2018.00054
Benyamin Ghojogh, Mark Crowley
{"title":"Principal Sample Analysis for Data Reduction","authors":"Benyamin Ghojogh, Mark Crowley","doi":"10.1109/ICBK.2018.00054","DOIUrl":"https://doi.org/10.1109/ICBK.2018.00054","url":null,"abstract":"Data reduction is an essential technique used for purifying data, training discriminative models more efficiently, encouraging generalizability, and for using less storage space for memory-limited systems. The literature on data reduction focuses mostly on dimensionality reduction, however, data sample reduction (i.e. removal of data points from a dataset) has its own benefits and is no less important given growing sizes of datasets and the growing need for usable data analysis methods on the network edge. This paper proposes a new data sample reduction method, Principal Sample Analysis (PSA), which reduces the number (population) of data samples as a preprocessing step for classification. PSA ranks the samples of each class considering how well they represent it and enables better discriminative learning by using the sparsity and similarity of samples at the same time. Data sample reduction then occurs by cutting off the lowest ranked samples. The PSA method can work alongside any other data reduction/expansion and classification method. 
Experiments are carried out on three datasets (WDBC, AT&T, and MNIST) with contrasting characteristics and show the state-of-the-art effectiveness of the proposed method.","PeriodicalId":144958,"journal":{"name":"2018 IEEE International Conference on Big Knowledge (ICBK)","volume":"116 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127238212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Confidence-Aware Negative Sampling Method for Noisy Knowledge Graph Embedding
2018 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2018-11-01 DOI: 10.1109/ICBK.2018.00013
Yingchun Shan, Chenyang Bu, Xiaojian Liu, Shengwei Ji, Lei Li
{"title":"Confidence-Aware Negative Sampling Method for Noisy Knowledge Graph Embedding","authors":"Yingchun Shan, Chenyang Bu, Xiaojian Liu, Shengwei Ji, Lei Li","doi":"10.1109/ICBK.2018.00013","DOIUrl":"https://doi.org/10.1109/ICBK.2018.00013","url":null,"abstract":"Knowledge graph embedding (KGE) can benefit a variety of downstream tasks, such as link prediction and relation extraction, and has therefore quickly gained much attention. However, most conventional embedding models assume that all triple facts share the same confidence without any noise, which is inappropriate. In fact, many noises and conflicts can be brought into a knowledge graph (KG) because of both the automatic construction process and data quality problems. Fortunately, the novel confidence-aware knowledge representation learning (CKRL) framework was proposed, to incorporate triple confidence into translation-based models for KGE. Though effective at detecting noises, with uniform negative sampling methods, and a harsh triple quality function, CKRL could easily cause zero loss problems and false detection issues. To address these problems, we introduce the concept of negative triple confidence and propose a confidence-aware negative sampling method to support the training of CKRL in noisy KGs. We evaluate our model on the knowledge graph completion task. 
Experimental results demonstrate that the idea of introducing negative triple confidence can greatly facilitate performance improvement in this task, which confirms the capability of our model in noisy knowledge representation learning (NKRL).","PeriodicalId":144958,"journal":{"name":"2018 IEEE International Conference on Big Knowledge (ICBK)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133009800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
Semi-Supervised Representation Learning: Transfer Learning with Manifold Regularized Auto-Encoders
2018 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2018-11-01 DOI: 10.1109/ICBK.2018.00019
Yi Zhu, Xuegang Hu, Yuhong Zhang, Peipei Li
{"title":"Semi-Supervised Representation Learning: Transfer Learning with Manifold Regularized Auto-Encoders","authors":"Yi Zhu, Xuegang Hu, Yuhong Zhang, Peipei Li","doi":"10.1109/ICBK.2018.00019","DOIUrl":"https://doi.org/10.1109/ICBK.2018.00019","url":null,"abstract":"The excellent performance of transfer learning has emerged in the past few years. How to find feature representations which minimizes the distance between source and target domain is the crucial problem in transfer learning. Recently, deep learning methods have been proposed to learn higher level and robust representation. However, in traditional methods, label information in source domain is not designed to optimize both feature representations and parameters of the learning model. Additionally, data redundance may incur performance degradation on transfer learning. To address these problems, we propose a novel semi-supervised representation learning framework for transfer learning. To obtain this framework, manifold regularization is integrated for the parameters optimization, and the label information is encoded using a softmax regression model in auto-encoders. Meanwhile, whitening layer is introduced to reduce data redundance before auto-encoders. Extensive experiments demonstrate the effectiveness of our proposed framework compared to other competing state-of-the-art baseline methods.","PeriodicalId":144958,"journal":{"name":"2018 IEEE International Conference on Big Knowledge (ICBK)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115565532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Fast Approximate Hubness Reduction for Large High-Dimensional Data
2018 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2018-11-01 DOI: 10.1109/ICBK.2018.00055
Roman Feldbauer, Maximilian Leodolter, C. Plant, A. Flexer
{"title":"Fast Approximate Hubness Reduction for Large High-Dimensional Data","authors":"Roman Feldbauer, Maximilian Leodolter, C. Plant, A. Flexer","doi":"10.1109/ICBK.2018.00055","DOIUrl":"https://doi.org/10.1109/ICBK.2018.00055","url":null,"abstract":"High-dimensional data mining is challenging due to the \"curse of dimensionality\". Hubness reduction counters one particular aspect of the dimensionality curse, but suffers from quadratic algorithmic complexity. We present approximate hubness reduction methods with linear complexity in time and space, thus enabling hubness reduction for large data for the first time. Furthermore, we introduce a new hubness measure especially suited for large data, which is, in addition, readily interpretable. Experiments on synthetic and real-world data show that the approximations come at virtually no cost in accuracy in comparison with full hubness reduction. Finally, we demonstrate improved transport mode detection in massive mobility data collected with mobile devices as concrete research application. All methods are made publicly available in a free open source software package.","PeriodicalId":144958,"journal":{"name":"2018 IEEE International Conference on Big Knowledge (ICBK)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116204379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10