Role of Mixup in Topological Persistence-Based Knowledge Distillation for Wearable Sensor Data

IF 4.3 · Zone 2 (multidisciplinary journals) · Q1 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Sensors Journal Pub Date : 2024-12-20 DOI:10.1109/JSEN.2024.3517653

Eun Som Jeon;Hongjun Choi;Matthew P. Buman;Pavan Turaga

IEEE Sensors Journal, vol. 25, no. 3, pp. 5853-5865. Journal Article. https://ieeexplore.ieee.org/document/10811792/

Citations: 0

Abstract

The analysis of wearable sensor data has enabled successes in many applications. To represent high-sampling-rate time series in sufficient detail, topological data analysis (TDA) has been considered, and TDA has been found to complement other time-series features. However, extracting topological features through TDA is time-consuming and computationally demanding, which makes it difficult to deploy topological knowledge in machine learning and other applications. Knowledge distillation (KD) can address this problem: it is a technique for model compression and transfer learning that produces a smaller model by transferring knowledge from a larger network. By leveraging multiple teachers in KD, both time-series and topological features can be transferred, yielding a superior student that uses only time-series data. Separately, mixup is a popular and robust data augmentation technique for improving model performance during training. Mixup and KD employ similar learning strategies. In KD, the student model learns from the smoothed distribution generated by the teacher model, while mixup creates smoothed labels by blending two labels. This shared smoothness is the link between the two methods. Although the interplay between mixup and KD has been widely studied, most studies focus on image-based analysis only, and it remains unclear how mixup behaves in KD that incorporates multimodal data, such as time-series and topological knowledge from wearable sensor data. In this article, we analyze the role of mixup in KD with time series as well as topological persistence, employing multiple teachers. We present a comprehensive analysis of various KD and mixup methods, supported by empirical results on wearable sensor data. We observe that applying mixup when training a student in KD improves performance. We suggest a general set of recommendations for obtaining an enhanced student.
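The shared "smoothness" mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the equal weighting of teachers, the temperature of 4.0, the Beta(0.4, 0.4) draw for the mixing weight, and the temperature-squared scaling of the distillation term are all assumptions taken from common KD and mixup practice. Teacher logits are passed in directly here; in the setting the abstract describes, one teacher would be trained on time series and another on topological features.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a smoother distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mixup(x_a, y_a, x_b, y_b, lam):
    """Blend two samples and their one-hot labels with weight lam in [0, 1]."""
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix


def cross_entropy(target, probs, eps=1e-12):
    """Cross-entropy between a (possibly soft) target and predicted probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def multi_teacher_kd_loss(student_logits, teacher_logits_list, soft_label,
                          temperature=4.0, alpha=0.5):
    """Hypothetical loss: label term plus the average KL to each teacher's softened output."""
    student_soft = softmax(student_logits, temperature)
    kd_terms = [kl_divergence(softmax(t, temperature), student_soft)
                for t in teacher_logits_list]
    kd = sum(kd_terms) / len(kd_terms)  # assumption: teachers weighted equally
    ce = cross_entropy(soft_label, softmax(student_logits))
    # Assumption: temperature**2 scaling of the distillation term, as is common in KD.
    return (1.0 - alpha) * ce + alpha * temperature ** 2 * kd


# Illustrative values only; none of these numbers come from the paper.
rng = random.Random(0)
lam = rng.betavariate(0.4, 0.4)  # mixing weight drawn from a Beta distribution
x_mix, y_mix = mixup([0.1, 0.5, 0.9], [1.0, 0.0], [0.7, 0.3, 0.2], [0.0, 1.0], lam)
loss = multi_teacher_kd_loss(
    student_logits=[0.2, -0.1],
    teacher_logits_list=[[1.5, -0.5], [0.8, 0.1]],  # e.g. time-series and topological teachers
    soft_label=y_mix,
)
print(f"lambda={lam:.3f}, mixed label={y_mix}, loss={loss:.4f}")
```

Both ingredients soften the training target: mixup does so by interpolating labels, and KD does so through the temperature-scaled teacher distribution. The sketch simply combines the two in a single student loss.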

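Source journal

IEEE Sensors Journal (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 7.70
Self-citation rate: 14.00%
Articles published: 2058
Review time: 5.2 months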

Journal description: The fields of interest of the IEEE Sensors Journal are the theory, design, fabrication, manufacturing and applications of devices for sensing and transducing physical, chemical and biological phenomena, with emphasis on the electronics and physics aspect of sensors and integrated sensors-actuators. IEEE Sensors Journal deals with the following:
- Sensor Phenomenology, Modelling, and Evaluation
- Sensor Materials, Processing, and Fabrication
- Chemical and Gas Sensors
- Microfluidics and Biosensors
- Optical Sensors
- Physical Sensors: Temperature, Mechanical, Magnetic, and others
- Acoustic and Ultrasonic Sensors
- Sensor Packaging
- Sensor Networks
- Sensor Applications
- Sensor Systems: Signals, Processing, and Interfaces
- Actuators and Sensor Power Systems
- Sensor Signal Processing for high precision and stability (amplification, filtering, linearization, modulation/demodulation) and under harsh conditions (EMC, radiation, humidity, temperature); energy consumption/harvesting
- Sensor Data Processing (soft computing with sensor data, e.g., pattern recognition, machine learning, evolutionary computation; sensor data fusion; processing of wave, e.g., electromagnetic and acoustic, and non-wave, e.g., chemical, gravity, particle, thermal, radiative and non-radiative sensor data; detection, estimation and classification based on sensor data)
- Sensors in Industrial Practice