2019 IEEE International Conference on Big Knowledge (ICBK): Latest Papers, Page 3

FASE: Feature-Based Similarity Search on ECG Data

2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2019-11-01 DOI: 10.1109/ICBK.2019.00044

Meng Wu, Lei Li, Hongyan Li

Citations: 3

Inductive Multi-view Semi-Supervised Anomaly Detection via Probabilistic Modeling

2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2019-11-01 DOI: 10.1109/ICBK.2019.00042

Zhen Wang, Maohong Fan, S. Muknahallipatna, Chao Lan

Abstract: This paper considers anomaly detection with multi-view data. Unlike traditional detection on single-view data, which identifies anomalies based on inconsistency between instances, multi-view anomaly detection identifies anomalies based on view inconsistency within each instance. Current multi-view detection approaches are mostly unsupervised and transductive. This may limit performance in the many applications that have labeled normal data and prefer efficient detection on new data. In this paper, we propose an inductive semi-supervised multi-view anomaly detection approach. We design a probabilistic generative model for normal data, which assumes different views of a normal instance are generated from a shared latent factor, conditioned on which the views become independent. We estimate the model by maximizing its likelihood on normal data using the EM algorithm. Then, we apply the model to detect anomalies, which are instances generated with small probabilities. We evaluate our approach on nine public data sets under different multi-view anomaly settings, and show it outperforms several state-of-the-art multi-view detection methods.

Citations: 4

Vector-Degree: A General Similarity Measure for Co-location Patterns

2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2019-11-01 DOI: 10.1109/ICBK.2019.00045

Pingping Wu, Lizhen Wang, Muquan Zou

Abstract: Co-location pattern mining is one of the hot issues in spatial pattern mining. Similarity measures between co-location patterns can be used to solve problems such as pattern compression, pattern summarization, pattern selection and pattern ordering. Many researchers have focused on this issue recently and provided more concise sets of co-location patterns based on these measures. Unfortunately, these measures suffer from various weaknesses; e.g., some can only calculate the similarity between a super-pattern and a sub-pattern, while others require additional domain knowledge. In this paper, we propose a general similarity measure for any two co-location patterns. Firstly, we study the characteristics of the co-location pattern and present a novel representation model based on maximal cliques. Then, two materializations of the relationship between maximal cliques and patterns, the 0-1 vector and the key-value vector, are proposed and discussed. Moreover, based on the materialization methods, the similarity measure, Vector-Degree, is defined by applying the cosine similarity. Finally, the similarity is used to group the patterns with a hierarchical clustering algorithm. The experimental results on both synthetic and real-world data sets show the efficiency and effectiveness of our proposed method.

Citations: 1
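
The Vector-Degree abstract above mentions representing each pattern as a 0-1 vector over maximal cliques and comparing patterns with cosine similarity. The following is a minimal sketch of that general idea only, not the authors' method: the rule used to fill the vector (1 when the pattern's features are all contained in a clique) and the names `zero_one_vector` and `cosine_similarity` are assumptions made for illustration.

```python
import math


def zero_one_vector(pattern, cliques):
    """Build a 0-1 vector with one entry per maximal clique.

    Assumption for illustration: an entry is 1 when every feature of the
    pattern appears in that clique, otherwise 0.
    """
    return [1 if pattern <= clique else 0 for clique in cliques]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical maximal cliques over spatial feature types A..D.
cliques = [{"A", "B", "C"}, {"A", "B", "D"}, {"C", "D"}]
vec_ab = zero_one_vector({"A", "B"}, cliques)  # [1, 1, 0]
vec_ac = zero_one_vector({"A", "C"}, cliques)  # [1, 0, 0]
similarity = cosine_similarity(vec_ab, vec_ac)
```

Because the measure only needs two vectors of the same length, it applies to any pair of patterns, not just a super-pattern and its sub-pattern, which is the generality the abstract emphasizes.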

Machine Learning Models for Paraphrase Identification and its Applications on Plagiarism Detection

2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2019-11-01 DOI: 10.1109/ICBK.2019.00021

E. Hunt, Binay Dahal, J. Zhan, L. Gewali, Paul Y. Oh, Ritvik Janamsetty, Chanana Kinares, Chanel Koh, Alexis Sanchez, Felix Zhan, Murat Özdemir, Shabnam Waseem, Osman Yolcu

Abstract: Paraphrase Identification, or Natural Language Sentence Matching (NLSM), is one of the important and challenging tasks in Natural Language Processing, where the task is to identify whether a sentence is a paraphrase of another sentence in a given pair of sentences. A paraphrase of a sentence conveys the same meaning, but its structure and the sequence of words vary. It is a challenging task, as it is difficult to infer the proper context of a sentence given its short length. Also, coming up with similarity metrics for the inferred context of a pair of sentences is not straightforward. Nevertheless, its applications are numerous. This work explores various machine learning algorithms to model the task and also applies different input encoding schemes. Specifically, we created the models using Logistic Regression, Support Vector Machines, and different architectures of Neural Networks. Among the compared models, as expected, the Recurrent Neural Network (RNN) is best suited for our paraphrase identification task. Also, we propose that plagiarism detection is one of the areas where Paraphrase Identification can be effectively applied.

Citations: 28

Complicacy-Guided Parameter Space Sampling for Knowledge Discovery with Limited Simulation Budgets

2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2019-11-01 DOI: 10.1109/ICBK.2019.00015

Xilun Chen, L. Mathesen, Giulia Pedrielli, K. Candan

Citations: 0

Highly Parallel Seedless Random Number Generation from Arbitrary Thread Schedule Reconstruction

2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2019-11-01 DOI: 10.1109/ICBK.2019.00009

Eryn Aguilar, Benjamin Lowe, J. Zhan, L. Gewali, Paul Y. Oh, Jevis Dancel, Deysaree Mamaud, Dorothy Pirosch, Farin Tavacoli, Felix Zhan, Robbie Pearce, Margaret Novack, Hokunani Keehu

Abstract: Security is a universal concern across a multitude of sectors involved in the transfer and storage of computerized data. In the realm of cryptography, random number generators (RNGs) are integral to the creation of encryption keys that protect private data, and the production of uniform probability outcomes is a revenue source for certain enterprises (most notably the casino industry). Arbitrary thread schedule reconstruction of compare-and-swap operations is used to generate input traces for the Blum-Elias algorithm as a method for constructing random sequences, provided the compare-and-swap operations avoid cache locality. Threads accessing shared memory at the memory controller are a true random source, which can be polled indirectly through our algorithm with unlimited parallelism. Theoretical and experimental analyses of the observation and reconstruction algorithm are presented. The quality of the random number generator is experimentally analyzed using two standard test suites, DieHarder and ENT, on three data sets.

Citations: 9

Adversarial Graph Attention Network for Multi-modal Cross-Modal Retrieval

2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2019-11-01 DOI: 10.1109/ICBK.2019.00043

Hongchang Wu, Ziyu Guan, Tao Zhi, Wei Zhao, Cai Xu, Hong Han, Yaming Yang

Abstract: Existing cross-modal retrieval methods are mainly constrained to the bimodal case. When applied to the multi-modal case, we need to train O(K^2) (K: number of modalities) separate models, which is inefficient and unable to exploit common information among multiple modalities. Though some studies focused on learning a common space of multiple modalities for retrieval, they assumed data to be i.i.d. and failed to learn the underlying semantic structure which could be important for retrieval. To tackle this issue, we propose an extensive Adversarial Graph Attention Network for Multi-modal Cross-modal Retrieval (AGAT). AGAT synthesizes a self-attention network (SAT), a graph attention network (GAT) and a multi-modal generative adversarial network (MGAN). The SAT generates high-level embeddings for data items from different modalities, with self-attention capturing feature-level correlations in each modality. The GAT then uses attention to aggregate embeddings of matched items from different modalities to build a common embedding space. The MGAN aims to "cluster" matched embeddings of different modalities in the common space by forcing them to be similar to the aggregation. Finally, we train the common space so that it captures the semantic structure by constraining within-class/between-class distances. Experiments on three datasets show the effectiveness of AGAT.

Citations: 1

Unsupervised Keyword Extraction Method Based on Chinese Patent Clustering

2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2019-11-01 DOI: 10.1109/ICBK.2019.00048

Yuxin Xie, Xuegang Hu, Yuhong Zhang, Shi Li

Citations: 2

Research on Incentive Algorithm of Participatory Sensing System Based on Location

2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2019-11-01 DOI: 10.1109/ICBK.2019.00035

Ziyi Qi, Mingxin Liu, Yanju Liang, Jing Chen

Abstract: In participatory sensing systems, users are the main contributors, and participation costs them time, energy, and other resources. Giving reasonable feedback and incentives for participation can therefore effectively improve users' initiative and data quality. Combining data quantity, data distribution, and budget constraints, this paper proposes an improved reverse-auction incentive mechanism based on the structure of a participatory sensing system. First, with maximizing the coverage rate and the number of samples as the optimization goal, a model incorporating a dynamic reverse-auction incentive strategy is designed under the task provider's limited budget. Second, building on the optimized sample-screening results, an improved incentive mechanism based on location information, the KDA algorithm, is proposed. The algorithm follows the greedy idea of progressively decomposing the problem into subproblems and optimizing each one, so that the result stays close to the final goal. Finally, experimental results show that the proposed algorithm improves the number of samples and the coverage under a limited budget, and improves the quality of the best sample set.

Citations: 0
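
The incentive-algorithm abstract above combines a budget limit, a coverage goal, and a greedy step-by-step selection. The sketch below shows a generic budgeted greedy selection over location cells; it is not the paper's KDA algorithm, and it does not model auction payments or sample counts. The bid format `(user, cost, cells)` and the function name `greedy_select` are hypothetical.

```python
def greedy_select(bids, budget):
    """Pick bids greedily by newly covered cells per unit cost.

    bids: list of (user, cost, cells) with cost > 0 and cells a set of
    location cells the user can sample (hypothetical format).
    Returns (chosen users, covered cells, remaining budget).
    """
    covered = set()
    chosen = []
    remaining = budget
    candidates = list(bids)
    while True:
        best = None
        best_ratio = 0.0
        for bid in candidates:
            _user, cost, cells = bid
            if cost > remaining:
                continue  # this bid no longer fits the budget
            ratio = len(cells - covered) / cost
            if ratio > best_ratio:
                best, best_ratio = bid, ratio
        if best is None:
            break  # nothing affordable adds new coverage
        chosen.append(best[0])
        covered |= best[2]
        remaining -= best[1]
        candidates.remove(best)
    return chosen, covered, remaining


# Hypothetical bids: (user, asking price, cells the user can cover).
bids = [
    ("u1", 2.0, {1, 2, 3}),
    ("u2", 1.0, {3, 4}),
    ("u3", 3.0, {5}),
    ("u4", 1.0, {1}),
]
chosen, covered, remaining = greedy_select(bids, budget=3.5)
```

Each round solves a small subproblem (the single best affordable bid given what is already covered), which mirrors the decomposition the abstract describes in general terms.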

Matrix Profile XX: Finding and Visualizing Time Series Motifs of All Lengths using the Matrix Profile

2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date : 2019-11-01 DOI: 10.1109/ICBK.2019.00031

Frank Madrid, Shima Imani, Ryan Mercer, Zachary Schall-Zimmerman, N. S. Senobari, Eamonn J. Keogh

Abstract: Many time series analytic tasks can be reduced to discovering and then reasoning about conserved structures, or time series motifs. Recently, the Matrix Profile has emerged as the state-of-the-art for finding time series motifs, allowing the community to efficiently find time series motifs in large datasets. The matrix profile reduced time series motif discovery to a process requiring a single parameter, the length of time series motifs we expect (or wish) to find. In many cases this is a reasonable limitation as the user may utilize out-of-band information or domain knowledge to set this parameter. However, in truly exploratory data mining, a poor choice of this parameter can result in failing to find unexpected and exploitable regularities in the data. In this work, we introduce the Pan Matrix Profile, a new data structure which contains the nearest neighbor information for all subsequences of all lengths. This data structure allows the first truly parameter-free motif discovery algorithm in the literature. The sheer volume of information produced by our representation may be overwhelming; thus, we also introduce a novel visualization tool called the motif-heatmap which allows the users to discover and reason about repeated structures at a glance. We demonstrate our ideas on a diverse set of domains including seismology, bioinformatics, transportation and biology.

Citations: 31
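
The Matrix Profile XX abstract above describes a structure holding nearest-neighbor information for subsequences of every length. As a rough illustration, the sketch below computes a brute-force matrix profile (z-normalized Euclidean distance to the nearest non-overlapping-index neighbor) for each length in a range. It is far slower than the algorithms in the Matrix Profile literature, the exclusion-zone width is an assumption, and the function names are hypothetical. Distances at different lengths are not rescaled here, so they should not be compared across lengths directly.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def matrix_profile(series, m):
    """Brute-force matrix profile for subsequence length m."""
    subs = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    exclusion = max(1, m // 2)  # assumed width for skipping trivial matches
    profile = []
    for i, a in enumerate(subs):
        best = math.inf
        for j, b in enumerate(subs):
            if abs(i - j) < exclusion:
                continue  # skip the subsequence itself and near-duplicates
            dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
            best = min(best, dist)
        profile.append(best)
    return profile


def pan_matrix_profile(series, min_len, max_len):
    """Map each length in [min_len, max_len] to its matrix profile."""
    return {m: matrix_profile(series, m) for m in range(min_len, max_len + 1)}


series = [0, 1, 0, 5, 2, 7, 0, 1, 0, 3]
pan = pan_matrix_profile(series, 3, 4)
```

In this toy series the shape `0, 1, 0` occurs at indices 0 and 6, so the length-3 profile is zero at both positions, which is how a repeated motif shows up in a matrix profile.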