Embracing Contact: Detecting Parent-Infant Interactions
Metehan Doyran, Ronald Poppe, Albert Ali Salah
Companion Publication of the 2020 International Conference on Multimodal Interaction. Published 2023-10-09.
DOI: 10.1145/3577190.3614147 (https://doi.org/10.1145/3577190.3614147)
We focus on a largely overlooked but crucial modality for parent-child interaction analysis: physical contact. In this paper, we present a feasibility study on automatically detecting contact between a parent and child in videos. Our multimodal CNN model combines 2D pose heatmaps, body part heatmaps, and cropped images. We use two datasets (FlickrCI3D and YOUth PCI) to explore how well the model generalizes across different contact scenarios. Our experiments demonstrate that combining 2D pose heatmaps and body part heatmaps yields the best contact classification performance when the model is trained from scratch on parent-infant interactions. We further investigate the influence of proximity on classification performance. Our results indicate that parent-infant contact classification poses unique challenges. Finally, we show that contact rates obtained by aggregating frame-level predictions provide decent approximations of the true contact rates, suggesting that they can serve as an automated proxy for measuring the quality of parent-child interactions. By releasing the annotations for the YOUth PCI dataset and our code, we encourage further research to deepen our understanding of parent-infant interactions and their implications for attachment and development.
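The aggregation of frame-level predictions into a contact rate can be illustrated with a minimal sketch. It is not the authors' released code. It assumes, beyond what the abstract states, that the classifier emits one contact probability per frame and that the contact rate is the fraction of frames classified as contact at a 0.5 threshold. The function name `contact_rate`, the threshold, and all sample values are hypothetical.

```python
from typing import Sequence


def contact_rate(frame_probs: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of frames whose predicted contact probability meets the threshold.

    Assumption: one probability per frame; the 0.5 default is illustrative only.
    """
    if not frame_probs:
        raise ValueError("need at least one frame-level prediction")
    contact_frames = sum(1 for p in frame_probs if p >= threshold)
    return contact_frames / len(frame_probs)


# Hypothetical per-frame contact probabilities for one short clip.
predicted = [0.9, 0.8, 0.2, 0.6, 0.1, 0.7, 0.3, 0.95]
# Hypothetical ground-truth frame labels (1 = contact, 0 = no contact).
annotated = [1, 1, 0, 1, 0, 0, 0, 1]

estimated_rate = contact_rate(predicted)        # share of frames predicted as contact
true_rate = sum(annotated) / len(annotated)     # share of frames annotated as contact
abs_error = abs(estimated_rate - true_rate)     # gap between estimated and true rate
```

Comparing `estimated_rate` with `true_rate` per video is one simple way to check how closely the aggregated predictions track the annotated contact rate. The abstract does not specify the metric used for this comparison, so the absolute difference here is only a placeholder.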