{"title":"从未标记的视频估计心率","authors":"John Gideon, Simon Stent","doi":"10.1109/ICCVW54120.2021.00307","DOIUrl":null,"url":null,"abstract":"We describe our entry for the ICCV 2021 Vision4Vitals Workshop [6] heart rate challenge, in which the goal is to estimate the heart rate of human subjects from facial video. While the challenge dataset contains extensive training data with ground truth blood pressure and heart rate signals, and therefore affords supervised learning, we pursue a different approach. We disregard the available ground truth blood pressure data entirely and instead seek to learn the photoplethysomgraphy (PPG) signal visible in subjects’ faces via a self-supervised contrastive learning technique. Since this approach does not require ground truth data, and since the challenge competition rules allow it, we therefore can train directly on test set videos. To boost performance further, we learn a supervised heart rate estimator on top of our \"dis-covered\" PPG signal, which more explicitly tries to match the ground truth heart rate. Our final approach ranked first on the competition test set, achieving a mean absolute error of 9.22 beats per minute.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"81 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Estimating Heart Rate from Unlabelled Video\",\"authors\":\"John Gideon, Simon Stent\",\"doi\":\"10.1109/ICCVW54120.2021.00307\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We describe our entry for the ICCV 2021 Vision4Vitals Workshop [6] heart rate challenge, in which the goal is to estimate the heart rate of human subjects from facial video. While the challenge dataset contains extensive training data with ground truth blood pressure and heart rate signals, and therefore affords supervised learning, we pursue a different approach. We disregard the available ground truth blood pressure data entirely and instead seek to learn the photoplethysomgraphy (PPG) signal visible in subjects’ faces via a self-supervised contrastive learning technique. Since this approach does not require ground truth data, and since the challenge competition rules allow it, we therefore can train directly on test set videos. To boost performance further, we learn a supervised heart rate estimator on top of our \\\"dis-covered\\\" PPG signal, which more explicitly tries to match the ground truth heart rate. 
Our final approach ranked first on the competition test set, achieving a mean absolute error of 9.22 beats per minute.\",\"PeriodicalId\":226794,\"journal\":{\"name\":\"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)\",\"volume\":\"81 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCVW54120.2021.00307\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCVW54120.2021.00307","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract: We describe our entry for the ICCV 2021 Vision4Vitals Workshop [6] heart rate challenge, in which the goal is to estimate the heart rate of human subjects from facial video. Although the challenge dataset contains extensive training data with ground truth blood pressure and heart rate signals, and therefore supports supervised learning, we pursue a different approach. We disregard the available ground truth blood pressure data entirely and instead seek to learn the photoplethysmography (PPG) signal visible in subjects' faces via a self-supervised contrastive learning technique. Because this approach requires no ground truth data, and because the competition rules permit it, we can train directly on the test set videos. To boost performance further, we learn a supervised heart rate estimator on top of our "discovered" PPG signal, which more explicitly tries to match the ground truth heart rate. Our final approach ranked first on the competition test set, achieving a mean absolute error of 9.22 beats per minute.
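The self-supervised step exploits a simple physical fact: temporally resampling a video rescales the apparent heart rate by a known factor, which gives free positive pairs for contrastive learning. The PyTorch sketch below is a minimal illustration of this idea under stated assumptions, not the authors' implementation: `model` is a hypothetical network mapping a video clip to a PPG waveform, the speed factor and the MSE attraction in power-spectrum space are illustrative choices, and the repelling (negative-pair) terms of a full contrastive loss are only noted in a comment.

```python
import torch
import torch.nn.functional as F

def power_spectrum(ppg: torch.Tensor) -> torch.Tensor:
    """Normalised power spectral density of a batch of waveforms (B, T)."""
    ppg = ppg - ppg.mean(dim=-1, keepdim=True)
    psd = torch.fft.rfft(ppg, dim=-1).abs() ** 2
    return psd / (psd.sum(dim=-1, keepdim=True) + 1e-8)

def resampling_consistency_loss(model, clip: torch.Tensor,
                                speed: float = 1.25) -> torch.Tensor:
    """Positive-pair term of a frequency-resampling contrastive objective.

    `model` (hypothetical) maps a clip (B, C, T, H, W) to a PPG waveform
    (B, T'). Speeding the clip up by `speed` multiplies the apparent heart
    rate by the same factor, so stretching the prediction for the resampled
    clip back to the original timeline should recover the original prediction.
    """
    B, C, T, H, W = clip.shape
    T_fast = int(round(T / speed))
    # Temporally resample the clip so its content plays `speed` times faster.
    fast = F.interpolate(clip, size=(T_fast, H, W),
                         mode='trilinear', align_corners=False)
    y = model(clip)        # (B, T)  prediction for the original clip
    y_fast = model(fast)   # (B, T_fast) prediction for the sped-up clip
    # Stretch the sped-up prediction back onto the original timeline.
    y_back = F.interpolate(y_fast.unsqueeze(1), size=y.shape[-1],
                           mode='linear', align_corners=False).squeeze(1)
    # Attract the pair in power-spectrum space; a full contrastive loss
    # would also repel predictions made at other speeds or for other clips.
    return F.mse_loss(power_spectrum(y_back), power_spectrum(y))
```

No heart rate labels appear anywhere in this objective, which is what lets it be applied directly to the unlabelled test videos.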
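Once a plausible PPG waveform has been recovered, heart rate follows from its dominant frequency. The paper trains a supervised estimator for this final step; the snippet below instead shows the classic non-learned baseline, a spectral-peak readout restricted to the physiological band, purely to make the PPG-to-heart-rate link concrete. The function name and band limits are illustrative assumptions.

```python
import numpy as np

def heart_rate_bpm(ppg: np.ndarray, fs: float,
                   lo: float = 40.0, hi: float = 180.0) -> float:
    """Read heart rate off a 1-D PPG trace as its dominant spectral peak.

    ppg: waveform sampled at `fs` Hz; the search is restricted to the
    physiologically plausible band [lo, hi] beats per minute.
    """
    ppg = ppg - ppg.mean()
    freqs = np.fft.rfftfreq(len(ppg), d=1.0 / fs)   # frequency axis in Hz
    power = np.abs(np.fft.rfft(ppg)) ** 2
    band = (freqs >= lo / 60.0) & (freqs <= hi / 60.0)
    return 60.0 * freqs[band][np.argmax(power[band])]
```

A learned estimator on top of the discovered PPG signal can outperform this fixed readout because it is trained to match the ground truth heart rate directly, absorbing systematic biases that a raw spectral peak cannot.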