{"title":"Meta-path Enhanced Knowledge Graph Convolutional Network for Recommender Systems","authors":"Ru Wang, Meng Wu, Shengwei Ji","doi":"10.1109/ICKG52313.2021.00024","DOIUrl":"https://doi.org/10.1109/ICKG52313.2021.00024","url":null,"abstract":"Knowledge Graph (KG) is a directed heterogeneous information network that contains a large number of entities and relations, which is widely used as effective side information in recommender systems. Moreover, in recommender systems, the Graph Convolutional Network (GCN) model is introduced to mine the relatedness between entities in a KG because of its efficiency in extracting spatial features on topological graphs. The Knowledge Graph Convolutional Network (KGCN) model updates the embedding of a currently positioned entity by aggregating the information of adjacent entities selected randomly. Nevertheless, it has two limitations: 1) the information of neighbors selected randomly cannot accurately represent the current entity in the KG; 2) the model is hard to converge as graph features (i.e., the spatial relation features and semantic information features of entities in the KG) grow. To solve these limitations, in this paper, a meta-path (i.e., a sequence of artificially constructed relationships) is introduced into the selection of neighbors in the KGCN model to enhance the representation of each entity. Furthermore, two construction methods of the meta-path - constructing a meta-path based on the same relation (KGCN-SP) and the characteristics of KG (KGCN-MP) - are proposed. The experiments based on three real-world datasets demonstrate that the neighbor selection based on the meta-path is able to collect more accurate information from a KG and improve the recommendation performance effectively.","PeriodicalId":174126,"journal":{"name":"2021 IEEE International Conference on Big Knowledge (ICBK)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128725296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Ensemble Latent Factor Model for Highly Accurate Web Service QoS Prediction","authors":"Peng Zhang, Yi He, Di Wu","doi":"10.1109/ICKG52313.2021.00055","DOIUrl":"https://doi.org/10.1109/ICKG52313.2021.00055","url":null,"abstract":"How to accurately predict quality of service (QoS) data is a great challenge in Web service selection or recommendation. To date, a latent factor (LF)-based QoS predictor is one of the most successful and popular approaches to address this challenge owing to its high efficiency and scalability. However, current LF-based QoS predictors are mostly developed on inner product space with an L2 norm-oriented loss function only; they therefore cannot comprehensively represent target QoS data's characteristics to make accurate predictions, as inner product space and the L2 norm have their respective limitations. To address this issue, this study proposes an ensemble LF (ELF) model. Its idea is three-fold: 1) two kinds of LF models are developed as QoS predictors on inner product space and distance space, respectively, 2) both of these QoS predictors adopt an L1-and-L2-norm-oriented loss function, and 3) an ensemble of these two QoS predictors is built by a weighting strategy. By doing so, ELF integrates the merits of inner product space, distance space, the L1 norm, and the L2 norm, allowing it to achieve highly accurate and robust QoS prediction. Experiments on a real-world QoS dataset demonstrate that the proposed ELF model outperforms state-of-the-art QoS predictors in predicting the missing QoS data.","PeriodicalId":174126,"journal":{"name":"2021 IEEE International Conference on Big Knowledge (ICBK)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116865137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implicit Business Competitor Inference Using Heterogeneous Knowledge Graph","authors":"Wei Qin, Xiangfeng Luo, Hao Wang","doi":"10.1109/ICKG52313.2021.00035","DOIUrl":"https://doi.org/10.1109/ICKG52313.2021.00035","url":null,"abstract":"Competitor inference is the task of identifying current or potential competitors given their primary markets and Business Scope. Previous methods have achieved remarkable success on explicit competitor inference using state-of-the-art natural language processing (NLP) techniques, mainly relying on comparative expressions. However, those methods lack interpretability and cannot identify implicit competitors without explicit mentions of competitive relationships in the text. To remedy these problems, in this paper, we propose a probabilistic graphical model which leverages a heterogeneous enterprise knowledge graph containing both structured information, e.g., Product Analysis, Sales Territory, and unstructured information, e.g., Business Scope. The model is defined with first-order logic rules using the declarative language of Probabilistic Soft Logic (PSL). As a result, our model enables predicting implicit competitors while providing pieces of interpretable evidence. Experimental results show that our approach is significantly superior to previous methods.","PeriodicalId":174126,"journal":{"name":"2021 IEEE International Conference on Big Knowledge (ICBK)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123530516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Knowledge Distillation via Weighted Ensemble of Teaching Assistants","authors":"Durga Prasad Ganta, Himel Das Gupta, Victor S. Sheng","doi":"10.1109/ICKG52313.2021.00014","DOIUrl":"https://doi.org/10.1109/ICKG52313.2021.00014","url":null,"abstract":"Knowledge distillation in machine learning is the process of transferring knowledge from a large model called the teacher to a smaller model called the student. Knowledge distillation is one of the techniques to compress a large network (teacher) into a smaller network (student) that can be deployed on small devices such as mobile phones. When the network size gap between the teacher and student increases, the performance of the student network decreases. To solve this problem, an intermediate model, known as the teaching assistant model, is employed between the teacher model and the student model, which in turn bridges the gap between the teacher and the student. In this research, we show that the student model (the smaller model) can be further improved by using multiple teaching assistant models. We combine these multiple teaching assistant models using weighted ensemble learning, where a differential evaluation optimization algorithm is used to generate the weight values.","PeriodicalId":174126,"journal":{"name":"2021 IEEE International Conference on Big Knowledge (ICBK)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122162794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating Learning Bayesian Network Structures by Reducing Redundant CI Tests","authors":"Wentao Hu, Shuai Yang, Xianjie Guo, Kui Yu","doi":"10.1109/ICKG52313.2021.00016","DOIUrl":"https://doi.org/10.1109/ICKG52313.2021.00016","url":null,"abstract":"Constraint-based methods are one of the most important approaches to learning Bayesian network (BN) structures from observational data with conditional independence (CI) tests. In this paper, we find that existing constraint-based methods often perform many redundant CI tests, which significantly reduces the learning efficiency of those algorithms. To tackle this issue, we propose a novel framework to accelerate BN structure learning by reducing redundant CI tests without sacrificing accuracy. Specifically, we first design a CI test cache table to store CI tests. If a CI test has been computed before, the result of the CI test is obtained from the table instead of computing the CI test again. If not, the CI test is computed and stored in the table. Then, based on the table, we propose two CI test cache table based PC (CTPC) learning frameworks for reducing redundant CI tests for BN structure learning. Finally, we instantiate the proposed frameworks with existing well-established local and global BN structure learning algorithms. Extensive experiments on twelve benchmark BNs demonstrate that the proposed frameworks can significantly accelerate existing BN structure learning algorithms without sacrificing accuracy.","PeriodicalId":174126,"journal":{"name":"2021 IEEE International Conference on Big Knowledge (ICBK)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122182656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ICBK 2021 Programme Committee","authors":"","doi":"10.1109/ickg52313.2021.00007","DOIUrl":"https://doi.org/10.1109/ickg52313.2021.00007","url":null,"abstract":"","PeriodicalId":174126,"journal":{"name":"2021 IEEE International Conference on Big Knowledge (ICBK)","volume":"187 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131746002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on Crowdsourcing Truth Inference Method Based on Graph Embedding","authors":"Liangzhu Zhou, Xingrui Zhuo, Gongqing Wu, Zan Zhang, Xianyu Bao","doi":"10.1109/ICKG52313.2021.00036","DOIUrl":"https://doi.org/10.1109/ICKG52313.2021.00036","url":null,"abstract":"Crowdsourcing is a cheap and popular method to solve problems that are difficult for computers to handle. Due to the differences in ability among workers on crowdsourcing platforms, existing studies use aggregation strategies to deal with the labels of different workers to improve the utility of crowdsourcing data. However, most of these studies are based on probabilistic graphical models, which have problems such as difficulty in setting initial parameters. This paper proposes a novel crowdsourcing method, Truth Inference based on Graph Embedding (TIGE), for single-choice questions. The method draws on the idea of the graph autoencoder: it constructs feature vectors for each crowdsourcing task, embeds the relationship between crowdsourcing tasks and workers in graphs, and then uses graph neural networks to convert crowdsourcing problems into graph node prediction problems. The feature vectors are continuously optimized in the convolutional layer to obtain the final result. Compared with six state-of-the-art algorithms on real-world datasets, our method has significant advantages in accuracy and F1-score.","PeriodicalId":174126,"journal":{"name":"2021 IEEE International Conference on Big Knowledge (ICBK)","volume":"239 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124627431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Homophily-aware Correction Approach for Crowdsourced Labels Using Information Entropy","authors":"Kang Yan, Jian Lu, Qingren Wang, Wei Li","doi":"10.1109/ICKG52313.2021.00013","DOIUrl":"https://doi.org/10.1109/ICKG52313.2021.00013","url":null,"abstract":"Crowdsourcing provides a cost-effective and convenient way for label collection. However, it fails to guarantee the quality of crowdsourced labels. Inspired by homophily in social networks, which denotes the tendency of individuals with similar characteristics to be friends with each other, in this paper we propose a novel Homophily-aware Correction Approach for crowdsourced labels using Information Entropy (namely HaCAIE), to further achieve quality improvement of crowdsourced labels. Specifically, our HaCAIE can be decomposed into three phases: i) seeking full semantic relations among entities, where HaCAIE models multiple explicit and implicit semantic relations among labelers, tasks and categories, based on homogeneous information network and related techniques; ii) calculating homophily, where HaCAIE utilizes adjacent relation matrices of labelers and tasks to calculate homophily among labelers; and iii) correcting labels, where for each task, HaCAIE employs information entropy and constructs a corresponding star homophily network to perform label correction. Our experimental results on six real-world datasets not only show that our HaCAIE performs well, but also demonstrate that HaCAIE can collaborate well with different inference algorithms in the field of crowdsourcing.","PeriodicalId":174126,"journal":{"name":"2021 IEEE International Conference on Big Knowledge (ICBK)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130740341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Influence Maximization Using User Connectivity Guarantee in Social Networks","authors":"Xiyu Qiao, Yuliang Ma, Yelie Yuan, Xiangmin Zhou","doi":"10.1109/ICKG52313.2021.00056","DOIUrl":"https://doi.org/10.1109/ICKG52313.2021.00056","url":null,"abstract":"With the rapid development of social networks, the influence maximization problem has attracted more and more attention from academia and industry. Its aim is to find a set of nodes as seeds to spread the influence as widely as possible. However, most existing studies neglect the connectivity of seeds, which affects the process of information diffusion. In this paper, we propose a novel problem, connectivity guaranteed influence maximization, which suggests a fixed number of new links to the seed set with the aim of maximizing the influence of seed nodes while guaranteeing the connectivity of the induced subgraphs consisting of active nodes. To tackle this problem, we propose a Connectivity Guaranteed Influence Maximization (CGIM) algorithm based on user connectivity and link recommendation. Specifically, the Jaccard coefficient is first used to calculate the influence between users. Then a Connectivity Guarantee based Link Addition (CGLA) algorithm is proposed to keep the connectivity of the induced subgraphs formed by all active nodes after influence propagation. Following that, an improved approximate influence maximization algorithm is proposed to maximize the influence by recommending a number of new links to the seed set. Experimental results on real social network datasets show that the proposed CGIM algorithm can maximize the influence of seed nodes while guaranteeing user connectivity, and has good performance and scalability.","PeriodicalId":174126,"journal":{"name":"2021 IEEE International Conference on Big Knowledge (ICBK)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129450183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Genetic Algorithm for Residual Static Correction","authors":"Miao Wu, Shulin Pan, Fan Min","doi":"10.1109/ICKG52313.2021.00069","DOIUrl":"https://doi.org/10.1109/ICKG52313.2021.00069","url":null,"abstract":"Residual static correction is a necessary step to improve the resolution in the seismic exploration process. It is a challenging task because a large number of parameters need to be adjusted. Some machine learning methods have been proposed to deal with this problem, but the results should be further strengthened. In this paper, we propose the genetic-based residual static correction (GBRS) algorithm with three techniques. First, the original encodings are generated by performing floating encoding on the offset of each point. Second, new encodings are constructed through paired crossover on the original ones. Third, the fitness function is used to select new original encodings to promote the evolution of the population. Experimental data with 50 shots and 50 receivers are generated using a simulation model. Results show that our algorithm usually converges to the optimal solution in fewer than 100 iterations.","PeriodicalId":174126,"journal":{"name":"2021 IEEE International Conference on Big Knowledge (ICBK)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114379648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}