{"title":"三元协同问题解决中语速协调与协同调节模式的多模态建模","authors":"Angela E. B. Stewart, Z. Keirn, S. D’Mello","doi":"10.1145/3242969.3242989","DOIUrl":null,"url":null,"abstract":"We model coordination and coregulation patterns in 33 triads engaged in collaboratively solving a challenging computer programming task for approximately 20 minutes. Our goal is to prospectively model speech rate (words/sec) - an important signal of turn taking and active participation - of one teammate (A or B or C) from time lagged nonverbal signals (speech rate and acoustic-prosodic features) of the other two (i.e., A + B → C; A + C → B; B + C → A) and task-related context features. We trained feed-forward neural networks (FFNNs) and long short-term memory recurrent neural networks (LSTMs) using group-level nested cross-validation. LSTMs outperformed FFNNs and a chance baseline and could predict speech rate up to 6s into the future. A multimodal combination of speech rate, acoustic-prosodic, and task context features outperformed unimodal and bimodal signals. The extent to which the models could predict an individual's speech rate was positively related to that individual's scores on a subsequent posttest, suggesting a link between coordination/coregulation and collaborative learning outcomes. We discuss applications of the models for real-time systems that monitor the collaborative process and intervene to promote positive collaborative outcomes.","PeriodicalId":308751,"journal":{"name":"Proceedings of the 20th ACM International Conference on Multimodal Interaction","volume":"51 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"21","resultStr":"{\"title\":\"Multimodal Modeling of Coordination and Coregulation Patterns in Speech Rate during Triadic Collaborative Problem Solving\",\"authors\":\"Angela E. B. Stewart, Z. Keirn, S. D’Mello\",\"doi\":\"10.1145/3242969.3242989\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We model coordination and coregulation patterns in 33 triads engaged in collaboratively solving a challenging computer programming task for approximately 20 minutes. Our goal is to prospectively model speech rate (words/sec) - an important signal of turn taking and active participation - of one teammate (A or B or C) from time lagged nonverbal signals (speech rate and acoustic-prosodic features) of the other two (i.e., A + B → C; A + C → B; B + C → A) and task-related context features. We trained feed-forward neural networks (FFNNs) and long short-term memory recurrent neural networks (LSTMs) using group-level nested cross-validation. LSTMs outperformed FFNNs and a chance baseline and could predict speech rate up to 6s into the future. A multimodal combination of speech rate, acoustic-prosodic, and task context features outperformed unimodal and bimodal signals. The extent to which the models could predict an individual's speech rate was positively related to that individual's scores on a subsequent posttest, suggesting a link between coordination/coregulation and collaborative learning outcomes. 
We discuss applications of the models for real-time systems that monitor the collaborative process and intervene to promote positive collaborative outcomes.\",\"PeriodicalId\":308751,\"journal\":{\"name\":\"Proceedings of the 20th ACM International Conference on Multimodal Interaction\",\"volume\":\"51 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-10-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"21\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 20th ACM International Conference on Multimodal Interaction\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3242969.3242989\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 20th ACM International Conference on Multimodal Interaction","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3242969.3242989","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
We model coordination and coregulation patterns in 33 triads collaboratively solving a challenging computer programming task for approximately 20 minutes. Our goal is to prospectively model the speech rate (words/sec) of one teammate (A, B, or C), an important signal of turn-taking and active participation, from the time-lagged nonverbal signals (speech rate and acoustic-prosodic features) of the other two (i.e., A + B → C; A + C → B; B + C → A) together with task-related context features. We trained feed-forward neural networks (FFNNs) and long short-term memory recurrent neural networks (LSTMs) using group-level nested cross-validation. LSTMs outperformed both FFNNs and a chance baseline and could predict speech rate up to 6 s into the future. A multimodal combination of speech rate, acoustic-prosodic, and task-context features outperformed unimodal and bimodal signals. The extent to which the models could predict an individual's speech rate was positively related to that individual's scores on a subsequent posttest, suggesting a link between coordination/coregulation and collaborative learning outcomes. We discuss applications of the models to real-time systems that monitor the collaborative process and intervene to promote positive collaborative outcomes.
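To make the modeling setup concrete, the sketch below trains an LSTM regressor to predict one teammate's future speech rate from time-lagged signals of the other two plus task context, evaluated with group-level cross-validation so that windows from the same triad never appear in both training and test folds. This is a minimal reconstruction under stated assumptions, not the authors' implementation: the feature dimensionality, the 6-step lag window, the network size, and the synthetic data are all illustrative, and the paper's nested cross-validation (which also tunes hyperparameters) is simplified here to a single GroupKFold level.

    # Minimal sketch (not the authors' code) of predicting teammate A's
    # speech rate from time-lagged B + C signals plus task context,
    # with group-level cross-validation over triads.
    import numpy as np
    import torch
    import torch.nn as nn
    from sklearn.model_selection import GroupKFold

    N_TRIADS = 33        # triads in the study
    SEQ_LEN = 6          # lagged time steps per input window (assumed)
    N_FEATURES = 10      # speech rate + acoustic-prosodic + context (assumed)

    class SpeechRateLSTM(nn.Module):
        """LSTM regressor: lagged teammate signals -> future speech rate."""
        def __init__(self, n_features: int, hidden: int = 32):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, x):             # x: (batch, SEQ_LEN, n_features)
            _, (h, _) = self.lstm(x)      # h: (1, batch, hidden)
            return self.head(h[-1]).squeeze(-1)  # (batch,) words/sec

    # Synthetic stand-in data: one lagged window per sample, tagged with
    # the triad it came from.
    rng = np.random.default_rng(0)
    n_samples = 2000
    X = rng.normal(size=(n_samples, SEQ_LEN, N_FEATURES)).astype(np.float32)
    y = rng.normal(loc=2.0, scale=0.5, size=n_samples).astype(np.float32)
    groups = rng.integers(0, N_TRIADS, size=n_samples)  # triad membership

    # Group-level CV: every window from a given triad stays in one fold,
    # so each model is tested only on triads it never saw during training.
    for fold, (tr, te) in enumerate(GroupKFold(n_splits=5).split(X, y, groups)):
        model = SpeechRateLSTM(N_FEATURES)
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.MSELoss()
        xtr, ytr = torch.from_numpy(X[tr]), torch.from_numpy(y[tr])
        for _ in range(20):               # brief training loop for the sketch
            opt.zero_grad()
            loss = loss_fn(model(xtr), ytr)
            loss.backward()
            opt.step()
        with torch.no_grad():
            pred = model(torch.from_numpy(X[te]))
            mse = loss_fn(pred, torch.from_numpy(y[te])).item()
        print(f"fold {fold}: test MSE = {mse:.3f}")

In the paper's A + B → C framing, each window's input features would be drawn from two teammates and the target from the third, with one such model per prediction direction; the split-by-triad discipline shown here is what makes the evaluation group-level rather than sample-level.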