Time-Series Physiological Data Balancing for Regression
Hiroki Yoshikawa, A. Uchiyama, T. Higashino
2021 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), 2021-06-28
DOI: 10.1109/ICAICA52286.2021.9498128 (https://doi.org/10.1109/ICAICA52286.2021.9498128)
Citation count: 1
Abstract
Many studies have shown that machine learning can effectively estimate psychological or physiological states from physiological data. However, collecting a large amount of unbiased data in an uncontrolled environment is ethically and physically difficult. In particular, rare cases yield far fewer samples than common ones, and the resulting distribution bias may cause overfitting. In this paper, we propose a method based on SMOTE (Synthetic Minority Over-sampling Technique) that alleviates this distribution bias through data augmentation for regression problems on datasets containing time-series physiological data. We confirmed the effectiveness of the proposed method on datasets of thermal sensation and core body temperature collected in uncontrolled environments. The results show that our method improves the performance of regression models on rare cases at the cost of a slight degradation in the mean absolute error.
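The abstract does not spell out the algorithm, so the following is only a minimal sketch of generic SMOTE-style oversampling adapted to regression, not the authors' method. It assumes that each sample is a fixed-length time-series window, that "rare" means a sparsely populated bin of the target value, and that a synthetic sample is made by interpolating both the window and its target with the same random weight. The function name `smote_regression`, the parameters `bin_width` and `k`, and the toy data are all hypothetical.

```python
import math
import random
from collections import Counter


def smote_regression(windows, targets, bin_width=1.0, k=3, seed=0):
    """Oversample sparsely populated target ranges by SMOTE-style interpolation.

    windows: list of equal-length lists of floats (one time-series window each)
    targets: list of floats (the regression label of each window)
    Returns (new_windows, new_targets) holding only the synthetic samples.
    """
    rng = random.Random(seed)
    # Assumption: rarity is judged by how many samples fall in each target bin.
    bins = [math.floor(t / bin_width) for t in targets]
    counts = Counter(bins)
    majority = max(counts.values())

    new_windows, new_targets = [], []
    for b, n in counts.items():
        members = [i for i, bi in enumerate(bins) if bi == b]
        if n == majority or len(members) < 2:
            continue  # already balanced, or no neighbour to interpolate with
        for _ in range(majority - n):
            i = rng.choice(members)
            # Up to k nearest neighbours of sample i within the same bin.
            neighbours = sorted(
                (j for j in members if j != i),
                key=lambda j: math.dist(windows[i], windows[j]),
            )[:k]
            j = rng.choice(neighbours)
            lam = rng.random()  # interpolation weight in [0, 1)
            # Interpolate every time step and the target with the same weight.
            new_windows.append(
                [a + lam * (c - a) for a, c in zip(windows[i], windows[j])]
            )
            new_targets.append(targets[i] + lam * (targets[j] - targets[i]))
    return new_windows, new_targets


# Toy data: four common windows (targets near 0) and two rare ones (targets near 2.5).
windows = [
    [36.5, 36.6, 36.7], [36.6, 36.6, 36.8], [36.4, 36.5, 36.6],
    [36.7, 36.7, 36.8], [38.0, 38.2, 38.4], [38.1, 38.4, 38.6],
]
targets = [0.1, 0.2, 0.0, 0.3, 2.5, 2.8]
new_w, new_t = smote_regression(windows, targets)
```

With this toy data the rare bin holds two samples and the common bin four, so two synthetic windows are generated, each lying between the two rare originals. Restricting neighbours to the same target bin is a simplification chosen for brevity; the rarity criterion and the neighbour search would need tuning for real physiological data.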