Rory T Devine, Venelin Kovatchev, Imogen Grumley Traynor, Phillip Smith, Mark Lee
Psychological Assessment, 35(2), 165-177. Published 2023-02-01. DOI: 10.1037/pas0001186 (https://doi.org/10.1037/pas0001186)
Machine learning and deep learning systems for automated measurement of "advanced" theory of mind: Reliability and validity in children and adolescents.
Understanding individual differences in theory of mind (ToM; the ability to attribute mental states to others) in middle childhood and adolescence hinges on the availability of robust and scalable measures. Open-ended response tasks yield valid indicators of ToM but are labor intensive and difficult to compare across studies. We examined the reliability and validity of new machine learning and deep learning neural network automated scoring systems for measuring ToM in children and adolescents. Two large samples of British children and adolescents aged between 7 and 13 years (Sample 1: N = 1,135, mean age = 10.22 years, SD = 1.45; Sample 2: N = 1,020, mean age = 10.36 years, SD = 1.27) completed the silent film and strange stories tasks. Teachers rated Sample 2 children's social competence with peers. A single latent factor explained variation in performance on both the silent film and strange stories tasks (in Samples 1 and 2), and test performance was sensitive to age-related differences and individual differences within each age group. A deep learning neural network automated scoring system trained on Sample 1 exhibited interrater reliability and measurement invariance with manual ratings in Sample 2. Validity of ratings from the automated scoring system was supported by unique positive associations between ToM and teacher-rated social competence. The results demonstrate that reliable and valid measures of ToM can be obtained using the new freely available deep learning neural network automated scoring system to rate open-ended text responses. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
About the journal:
Psychological Assessment is concerned mainly with empirical research on measurement and evaluation relevant to the broad field of clinical psychology. Submissions are welcome in the areas of assessment processes and methods. Included are:
- clinical judgment and the application of decision-making models
- paradigms derived from basic psychological research in cognition, personality–social psychology, and biological psychology
- development, validation, and application of assessment instruments, observational methods, and interviews