{"title":"SGKD: A Scalable and Effective Knowledge Distillation Framework for Graph Representation Learning","authors":"Yufei He, Yao Ma","doi":"10.1109/ICDMW58026.2022.00091","DOIUrl":"https://doi.org/10.1109/ICDMW58026.2022.00091","url":null,"abstract":"As Graph Neural Networks (GNNs) are widely used in various fields, there is a growing demand for improving their efficiency and scalability. Knowledge Distillation (KD), a classical method for model compression and acceleration, has been gradually introduced into the field of graph learning. More recently, it has been shown that, through knowledge distillation, the predictive capability of a well-trained GNN model can be transferred to lightweight and easy-to-deploy MLP models. Such distilled MLPs are able to achieve performance comparable to that of their corresponding GNN teachers while being significantly more efficient in terms of both space and time. However, research on KD for graph learning is still in its early stage, and the existing KD framework has several limitations. The major issues are that distilled MLPs lack useful information about the graph structure and that the teacher's logits are not always reliable. In this paper, we propose a Scalable and effective graph neural network Knowledge Distillation framework (SGKD) to address these issues. Specifically, to include the graph, we use feature propagation as preprocessing to provide MLPs with graph structure-aware features in the original feature space; to address the teacher's unreliable logits, we introduce simple yet effective training strategies such as masking and temperature. With these innovations, our framework is able to be more effective while remaining scalable and efficient in training and inference. We conducted comprehensive experiments on eight datasets of different sizes - up to 100 million nodes - under various settings. 
The results demonstrated that SGKD is able to significantly outperform existing KD methods and even achieve performance comparable to that of their state-of-the-art GNN teachers.","PeriodicalId":146687,"journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125519595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data-Driven Usage Profiling and Anomaly Detection in Support of Sustainable Machining Processes","authors":"Fabian Fingerhut, Chaitra Harsha, Amirmohammad Eghbalian, Tom Jacobs, Mahdi Tabassian, R. Verbeke, E. Tsiporkova","doi":"10.1109/ICDMW58026.2022.00026","DOIUrl":"https://doi.org/10.1109/ICDMW58026.2022.00026","url":null,"abstract":"There is a lot of room for improvement towards more sustainability in manufacturing companies. During machining operations, replacement of the cutting tools is not done in an optimal way, resulting in sub-optimal usage of resources and inefficiencies during the production process. Using data-driven approaches to extend the usage of tools can greatly improve on this shortcoming by optimizing the replacement process of these tools. This study therefore seeks to investigate the value of several data-driven approaches, applied to an industrial dataset, to achieve this goal. Although the examined data-driven methods were applied to a dataset which has been generated under a wide variety of machining conditions and lacks reliable ground truth, the obtained experimental results confirm that these methods are indeed capable of extracting informative profiles from the tool usages and can identify anomalous patterns and signs in the time-series datasets collected during different machining processes.","PeriodicalId":146687,"journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127700004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stress Identification in Online Social Networks","authors":"Ashok Kumar, T. Trueman, E. Cambria","doi":"10.1109/ICDMW58026.2022.00063","DOIUrl":"https://doi.org/10.1109/ICDMW58026.2022.00063","url":null,"abstract":"Online social networks have become one of the primary means of communication for individuals. They rapidly generate a large volume of textual and non-textual data such as images, audio, and videos. In particular, textual data plays a vital role in detecting mental health-related problems such as stress, depression, anxiety, and emotional and behavioral disorders. In this paper, we identify the mental stress of online users in social networks using a transformer-based RoBERTa model and an autoregressive model, also called XLNet. We implement this model in both a constrained system and an unconstrained system. The constrained system maintains the gold standard datasets such as training, validation, and testing. On the other hand, the unconstrained system divides the given dataset into user-specific training, validation, and test sets. Our results indicate that the proposed transformer-based RoBERTa model achieves a better result in both constrained and unconstrained systems than the state-of-the-art models.","PeriodicalId":146687,"journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127725724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Compression Methods for Transformers in Multidomain Sentiment Analysis","authors":"Wojciech Korczynski, Jan Kocoń","doi":"10.1109/ICDMW58026.2022.00062","DOIUrl":"https://doi.org/10.1109/ICDMW58026.2022.00062","url":null,"abstract":"Transformer models like BERT have significantly improved performance on many NLP tasks, e.g., sentiment analysis. However, their large number of parameters makes real-world applications difficult because of computational costs and latency. Many compression methods have been proposed to solve this problem using quantization, weight pruning, and knowledge distillation. In this work, we explore some of these task-specific and task-agnostic methods by comparing their effectiveness and quality on the MultiEmo sentiment analysis dataset. Additionally, we analyze their ability to generalize and capture sentiment features by conducting domain-sentiment experiments. The results show that the compression methods reduce the model size by 8.6 times and the inference time by 6.9 times compared to the original model while maintaining unimpaired quality. Smaller models perform better on tasks with fewer data and retain more remarkable generalization ability after fine-tuning because they are less prone to overfitting. The best trade-off is obtained using the task-agnostic XtremeDistil model.","PeriodicalId":146687,"journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128068324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Disentangling the Information Flood on OSNs: Finding Notable Posts and Topics","authors":"P. Caso, Martino Trevisan, L. Vassio","doi":"10.1109/ICDMW58026.2022.00152","DOIUrl":"https://doi.org/10.1109/ICDMW58026.2022.00152","url":null,"abstract":"Online Social Networks (OSNs) are an integral part of modern life for sharing thoughts, stories, and news. An ecosystem of influencers generates a flood of content in the form of posts, some of which have an unusually high level of engagement with the influencer's fan base. These posts relate to blossoming topics of discussion that generate particular interest among users: The COVID-19 pandemic is a prominent example. Studying these phenomena provides an understanding of the OSN landscape and requires appropriate methods. This paper presents a methodology to discover notable posts and group them according to their related topic. By combining anomaly detection, graph modelling and community detection techniques, we pinpoint salient events automatically, with the ability to tune their number. We showcase our approach using a large Instagram dataset and extract some notable weekly topics that gained momentum from 1.4 million posts. We then illustrate some use cases ranging from the COVID-19 outbreak to sporting events.","PeriodicalId":146687,"journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129588418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FastFlow: AI for Fast Urban Wind Velocity Prediction","authors":"Shi Jer Low, V. Raghavan, H. Gopalan, Jian Cheng Wong, J. Yeoh, C. Ooi","doi":"10.1109/ICDMW58026.2022.00028","DOIUrl":"https://doi.org/10.1109/ICDMW58026.2022.00028","url":null,"abstract":"Data-driven approaches, including deep learning, have shown great promise as surrogate models across many domains, including computer vision and natural language processing. These extend to various areas in sustainability, including for satellite image analysis to obtain information such as land usage and extent of development. An interesting direction for which data-driven methods have not been applied much yet is in the quick quantitative evaluation of urban layouts for planning and design. In particular, urban designs typically involve complex trade-offs between multiple objectives, including limits on urban build-up and/or consideration of urban heat island effect. Hence, it can be beneficial to urban planners to have a fast surrogate model to predict urban characteristics of a hypothetical layout, e.g. pedestrian-level wind velocity, without having to run computationally expensive and time-consuming high-fidelity numerical simulations each time. This fast surrogate can then be potentially integrated into other design optimization frameworks, including generative models or other gradient-based methods. Here we present an investigation into the use of convolutional neural networks as a surrogate for urban layout characterization that is typically done via high-fidelity numerical simulation. We then further apply this model towards a first demonstration of its utility for data-driven pedestrian-level wind velocity prediction. The data set in this work comprises results from high-fidelity numerical simulations of wind velocities for a diverse set of realistic urban layouts, based on randomized samples from a real-world, highly built-up urban city. 
We then provide prediction results obtained from the neural network trained on this data-set, demonstrating test errors of under 0.1 m/s for previously unseen novel urban layouts. We further illustrate how this can be useful for purposes such as rapid evaluation of pedestrian wind velocity for a potential new layout. In addition, it is hoped that this data set will further inspire, facilitate and accelerate research in data-driven urban AI, even as our baseline model facilitates quantitative comparison to future, more innovative methods.","PeriodicalId":146687,"journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115061443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Students Temporal Profiling and e-Learning Resources Recommendation Based on NLP's Terms Extraction","authors":"André Picado, A. Finamore, Ana Moura Santos, C. Antunes","doi":"10.1109/ICDMW58026.2022.00044","DOIUrl":"https://doi.org/10.1109/ICDMW58026.2022.00044","url":null,"abstract":"Online education has gained significant relevance over the last few years, and the pandemic situation has brought evidence that it plays a fundamental role nowadays. However, even with the increasing number of students enrolled in online courses, these still do not allow for enough personalization, often leading students to become demotivated and drop out. The goal of better adapting online courses to students aims to support them in an inclusive and equitable way, since the learners are often students from quite diverse backgrounds. The continuous demand for online learning, and the need to customize it according to the students' profile has led to a succession of attempts at recommendation systems. Nevertheless, many of them were entirely based on collaborative filtering, almost ignoring profiling requirements. In this paper, we propose a recommendation system to be integrated into MOOCs (Massive Open Online Courses), following a hybrid architecture. In our proposal, learning resources are described by a set of terms, extracted directly from the supporting texts in the MOOC. From these terms, those which are included in the exercises will be used to specify the important skills learners must acquire, and the results achieved by each learner in them are used to characterize the particular student's state, at a given moment. Those states are then used to make the recommendation collaboratively, allowing for different recommendations for each particular student over time. 
The system is validated across several MOOCs.","PeriodicalId":146687,"journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123211161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Streaming Traffic Flow Prediction Based on Continuous Reinforcement Learning","authors":"Yanan Xiao, Minyu Liu, Zichen Zhang, Lu Jiang, Minghao Yin, Jianan Wang","doi":"10.1109/ICDMW58026.2022.00011","DOIUrl":"https://doi.org/10.1109/ICDMW58026.2022.00011","url":null,"abstract":"Traffic flow prediction is an important part of smart transportation. The goal is to predict future traffic conditions based on historical data recorded by sensors and the traffic network. As the city continues to build, parts of the transportation network will be added or modified. How to accurately predict expanding and evolving long-term streaming networks is of great significance. To this end, we propose a new simulation-based criterion that considers teaching autonomous agents to mimic sensor patterns, planning their next visit based on the sensor's profile (e.g., traffic, speed, occupancy). The data recorded by the sensor is most accurate when the agent can perfectly simulate the sensor's activity pattern. We propose to formulate the problem as a continuous reinforcement learning task, where the agent is the next flow value predictor, the action is the next time-series flow value in the sensor, and the environment state is a dynamically fused representation of the sensor and transportation network. Actions taken by the agent change the environment, which in turn forces the agent's mode to update, while the agent further explores changes in the dynamic traffic network, which helps the agent predict its next visit more accurately. Therefore, we develop a strategy in which sensors and traffic networks update each other and incorporate temporal context to quantify state representations evolving over time. 
Along these lines, we propose streaming traffic flow prediction based on continuous reinforcement learning model (ST-CRL), a kind of predictive model based on reinforcement learning and continuous learning, and an analytical algorithm based on KL divergence that cleverly incorporates long-term novel patterns into model induction. Second, we introduce a prioritized experience replay strategy to consolidate and aggregate previously learned core knowledge into the model. The proposed model is able to continuously learn and predict as the traffic flow network expands and evolves over time. Extensive experiments show that the algorithm has great potential in predicting long-term streaming media networks, while achieving data privacy protection to a certain extent.","PeriodicalId":146687,"journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123908499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Knowledge Distillation-enabled Multi-stage Incremental Learning for Online Process Monitoring in Advanced Manufacturing","authors":"Zhangyue Shi, Yuxuan Li, Chenang Liu","doi":"10.1109/ICDMW58026.2022.00154","DOIUrl":"https://doi.org/10.1109/ICDMW58026.2022.00154","url":null,"abstract":"In advanced manufacturing, the incorporation of online sensing technologies has created great potential for effective in-situ process monitoring via machine learning-based approaches. In manufacturing practice, the online sensor data are usually collected in a progressive manner, and the stream data collected at later stages may also contain informative knowledge for process monitoring. Therefore, it is highly valuable to make the machine learning-based monitoring model learn incrementally in manufacturing. To achieve this goal, this paper develops a multi-stage incremental learning approach enabled by knowledge distillation, which distills representative information from the machine learning model trained at the early/offline stage and then enhances the monitoring performance at the later stages. To validate its effectiveness, a real-world case study in additive manufacturing, which is an emerging advanced manufacturing technology, is conducted. 
The experimental results show that the developed knowledge distillation-enabled multi-stage incremental learning is very promising to improve the online monitoring performance in advanced manufacturing.","PeriodicalId":146687,"journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122634163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Genetic Programming to Identify Probability Distribution behind Data: A Preliminary Trial","authors":"Yang Syu, Chien-Min Wang","doi":"10.1109/ICDMW58026.2022.00056","DOIUrl":"https://doi.org/10.1109/ICDMW58026.2022.00056","url":null,"abstract":"Before conducting any further applications or performing more advanced processing, analyzing and realizing the probability distribution of data is a crucial task. Traditionally, statistical methods are being developed for this procedure. In recent years, researchers in computer science have proposed and applied different machine learning-based techniques to address the abovementioned problem. However, the existing solutions remain problematic and inconvenient, such as the need for human intervention and the complexity of the resulting models. Thus, in this paper, without causing deficiency and inconvenience, a genetic programming-based approach for the identification of probability functions is proposed, implemented, and tested. Based on our empirical trials, in an immense search space of mathematical expressions, the proposed and developed approach can effectively recognize (retrieve) the probability distribution function behind data.","PeriodicalId":146687,"journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122815245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}