Yusuke Shono, Berivan Ece, Emily H Ho, Aaron J Kaat, Erica M LaForte, Ezgi Ayturk, Richard Gershon
Journal: Psychological Assessment, Vol. 36, No. 12, pp. 760-771
DOI: 10.1037/pas0001350
Publication date: 2024-12-01
Publication type: Journal Article
A comparison of scoring algorithms for the NIH Toolbox executive function tasks in a U.S. norming sample.
Executive function (EF) has been extensively linked to behavioral, clinical, and educational outcomes. There have been, however, few systematic investigations into how best to score EF tasks using speed and accuracy performance, and in particular how to generate a summary, norm-referenced score. Using data from an updated norming study for the NIH Toolbox Version 3 (NIHTB V3) in the general U.S. population aged 3-85 (N = 3,794; 52.3% female; mean age = 25.06 years, SD = 22.92), we empirically evaluated and compared several scoring algorithms for two EF tests: the Dimensional Change Card Sort (a test of cognitive flexibility) and the Flanker (a test of inhibitory control). Results showed that joint scoring algorithms integrating speed and accuracy into a single score (namely, the rate-correct score, the linear integrated speed-accuracy score, and the speed-accuracy additive score) provided more robust psychometric evidence for the EF tests than single-index scores of accuracy or speed alone. These integrated speed-accuracy scores were consistent and stable within and across tasks and time points; correlated with another well-validated EF measure but, as predicted, unrelated to a crystallized intelligence measure; and increased rapidly from early childhood through late adolescence/early adulthood before declining toward late adulthood. The rate-correct score was notably free from ceiling effects and sensitive to age-related changes and variability in EF performance. Among the algorithms examined, we recommend the rate-correct score, which served as the basis for generating the new NIHTB V3 norm-referenced scores, with good test-retest reliability (Dimensional Change Card Sort = .77, Flanker = .81) and acceptable convergent and discriminant validity. (PsycInfo Database Record (c) 2024 APA, all rights reserved.)
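Two of the integrated algorithms named in the abstract have standard definitions in the speed-accuracy literature: the rate-correct score (correct responses per unit of total response time; Woltz & Was, 2006) and the linear integrated speed-accuracy score (LISAS; mean RT plus an error-rate penalty scaled by the RT-to-error SD ratio; Vandierendonck, 2017). The sketch below illustrates those textbook formulas from per-trial accuracy flags and response times; it is not the NIHTB V3 implementation, whose trial filtering, scaling, and norming steps are not specified in this abstract.

```python
import statistics

def rate_correct_score(correct_flags, rts):
    """Rate-correct score (RCS): number of correct responses per
    second of total response time (Woltz & Was, 2006).

    correct_flags: per-trial accuracy (1 = correct, 0 = error)
    rts:           per-trial response times in seconds
    """
    return sum(correct_flags) / sum(rts)

def lisas(correct_flags, rts):
    """Linear integrated speed-accuracy score (LISAS;
    Vandierendonck, 2017): mean RT penalized by the proportion of
    errors, scaled by the ratio of the RT and error-rate SDs."""
    n = len(correct_flags)
    pe = 1 - sum(correct_flags) / n           # proportion of errors
    mean_rt = statistics.mean(rts)
    sd_rt = statistics.stdev(rts)
    sd_pe = statistics.stdev(correct_flags)   # SD of the 0/1 accuracy vector
    if sd_pe == 0:                            # all trials correct (or all errors):
        return mean_rt                        # penalty term is undefined, RT only
    return mean_rt + (sd_rt / sd_pe) * pe
```

For example, with flags [1, 1, 1, 0] and RTs [0.5, 0.6, 0.7, 0.8] s, the RCS is 3 / 2.6 ≈ 1.15 correct responses per second, while LISAS yields the mean RT (0.65 s) inflated by the error penalty. Higher RCS is better; lower LISAS is better, a sign difference worth noting when comparing the two.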
Journal description:
Psychological Assessment is concerned mainly with empirical research on measurement and evaluation relevant to the broad field of clinical psychology. Submissions are welcome in the areas of assessment processes and methods, including:
- clinical judgment and the application of decision-making models
- paradigms derived from basic psychological research in cognition, personality-social psychology, and biological psychology
- development, validation, and application of assessment instruments, observational methods, and interviews