Evaluating Expert Curation in a Baby Milestone Tracking App

Ayelet Ben-Sasson, Eli Ben-Sasson, Kayla Jacobs, Elisheva Rotman Argaman, Eden Saig

Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems
DOI: 10.1145/3290605.3300783
Published: 2019-05-02
Citations: 5
Abstract
Early childhood developmental screening is critical for timely detection and intervention. babyTRACKS (formerly Baby CROINC: CROwd INtelligence Curation) is a free, live, interactive developmental tracking mobile app with over 3,000 children's diaries. Parents write or select short milestone texts, like "began taking first steps," to record their babies' developmental achievements, and receive crowd-based percentiles to evaluate development and catch potential delays. Currently, an expert-based Curated Crowd Intelligence (CCI) process manually groups each incoming novel parent-authored milestone text according to its similarity to existing milestones in the database (for example, starting to walk), or determines that the milestone represents a new developmental concept not seen before in another child's diary. CCI cannot scale well, however, and babyTRACKS is mature enough, with a rich enough database of existing milestone texts, to now consider machine learning tools to replace or assist the human curators. Three new studies explore (1) the usefulness of automation, by analyzing the human cost of CCI and how the work is currently broken down; (2) the validity of automation, by testing the inter-rater reliability of curators; and (3) the value of automation, by appraising the "real world" clinical value of milestones when assessing child development. We conclude that automation can indeed be appropriate and helpful for a large percentage, though not all, of CCI work. We further establish realistic upper bounds for algorithm performance; confirm that the babyTRACKS milestones dataset is valid for training and testing purposes; and verify that it represents clinically meaningful developmental information.
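The core curation decision described above — match a new milestone text to an existing one, or declare a new concept — can be sketched as a thresholded text-similarity lookup. The sketch below is purely illustrative, not the paper's method: the `curate` function, the token-overlap (Jaccard) similarity, and the threshold value are all assumptions chosen for a minimal self-contained example.

```python
# Hypothetical sketch of automated milestone curation: match a new
# parent-authored milestone text against existing canonical milestones
# by token-overlap similarity, or flag it as a new concept (None).
# Function names and the threshold are illustrative assumptions only.

def tokens(text):
    """Lowercase word set for a milestone text."""
    return set(text.lower().replace(",", " ").replace(".", " ").split())

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def curate(new_text, existing, threshold=0.3):
    """Return the best-matching existing milestone, or None (new concept)."""
    new_toks = tokens(new_text)
    best, best_score = None, 0.0
    for milestone in existing:
        score = jaccard(new_toks, tokens(milestone))
        if score > best_score:
            best, best_score = milestone, score
    return best if best_score >= threshold else None

existing = ["began taking first steps", "said first word", "rolled over"]
print(curate("took her first steps", existing))    # grouped with the walking milestone
print(curate("started clapping hands", existing))  # None: a new developmental concept
```

A production system would use richer representations (e.g. TF-IDF or learned sentence embeddings) than raw token overlap, but the decision structure — nearest existing milestone versus "new concept" fallback — is the part the human curators currently perform by hand.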