Evaluation of Within- and Between-Site Agreement for Direct Observation of Physical Behavior Across Four Research Groups

S. Keadle, Julian Martinez, S. Strath, J. Sirard, D. John, S. Intille, Diego Arguello, Marcos Amalbert-Birriel, Rachel Barnett, B. Thapa-Chhetry, Melanna Cox, John Chase, Erin E. Dooley, Robert Marcotte, Alex Tolas, John W. Staudemayer
Journal for the Measurement of Physical Behaviour. Published 2023-01-01. DOI: 10.1123/jmpb.2022-0048 (https://doi.org/10.1123/jmpb.2022-0048)

Abstract

Direct observation (DO) is a widely accepted ground-truth measure, but the field lacks standard operational definitions. Research groups develop project-specific annotation platforms, which limits the utility of DO when labels are not consistent across groups.

Purpose: To evaluate within- and between-site agreement for DO taxonomies (e.g., activity intensity category) across four independent research groups that have used video-recorded DO.

Methods: Each site contributed video files (508 min) and had two trained research assistants annotate the shared video files according to the site's existing annotation protocol. The authors calculated (a) within-site agreement between the two coders at the same site, expressed as an intraclass correlation, and (b) between-site agreement, defined as the proportion of seconds on which any two coders agree, regardless of site.

Results: Within-site agreement at all sites was good–excellent for both activity intensity categories (intraclass correlation range: .82–.9) and posture/whole-body movement (intraclass correlation range: .77–.98). Between-site agreement for intensity categories was 94.6% for sedentary, 80.9% for light, and 82.8% for moderate–vigorous. Three of the four sites had common labels for eight posture/whole-body movements, with within-site agreement of 94.5% and between-site agreement of 86.1%.

Conclusions: Distinct research groups can annotate key features of physical behavior with good-to-excellent interrater reliability. Operational definitions are provided for core metrics for researchers to consider in future studies. Adopting them would facilitate between-study comparisons and data pooling, which in turn would enable deep learning approaches to wearable device algorithm calibration.
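The two agreement measures named in the Methods can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' analysis code. The abstract does not state which form of the intraclass correlation was used, or what unit it was computed over, so the sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single rater) applied to a table with one row per video file and one column per coder. The function names, the label strings, and the toy data are all hypothetical.

```python
from itertools import combinations


def percent_agreement(labels_a, labels_b):
    """Proportion of seconds on which two coders assign the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both coders must label the same seconds")
    if not labels_a:
        raise ValueError("label sequences must not be empty")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def pairwise_agreement(coders):
    """Percent agreement for every pair of coders, regardless of site.

    `coders` maps a coder name to that coder's per-second label list.
    """
    return {
        (a, b): percent_agreement(coders[a], coders[b])
        for a, b in combinations(sorted(coders), 2)
    }


def icc_2_1(ratings):
    """ICC(2,1) from a table of n targets (rows) by k raters (columns).

    ASSUMPTION: the ICC form is our choice; the abstract does not name one.
    """
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two targets and two raters")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: targets, raters, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Toy data (made up): four seconds of video labeled by two coders.
coder_a = ["sedentary", "sedentary", "light", "mvpa"]
coder_b = ["sedentary", "light", "light", "mvpa"]
print(percent_agreement(coder_a, coder_b))  # 0.75
```

Percent agreement is simple to interpret but does not correct for chance agreement, which is one reason a study might report it alongside a reliability coefficient such as the intraclass correlation.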