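Using Machine Learning Technology (Early Artificial Intelligence-Supported Response With Social Listening Platform) to Enhance Digital Social Understanding for the COVID-19 Infodemic: Development and Implementation Study.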

Impact factor: 2.3; JCR quartile: Q1 (Health Care Sciences & Services)

JMIR infodemiology. Pub Date: 2023-08-21. DOI: 10.2196/47317

Becky K White, Arnault Gombert, Tim Nguyen, Brian Yau, Atsuyoshi Ishizumi, Laura Kirchner, Alicia León, Harry Wilson, Giovanna Jaramillo-Gutierrez, Jesus Cerquides, Marcelo D'Agostino, Cristiana Salvi, Ravi Shankar Sreenath, Kimberly Rambaud, Dalia Samhouri, Sylvie Briand, Tina D Purnat

JMIR infodemiology, volume 3, e47317. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10477919/pdf/

Citations: 0

Abstract


Using Machine Learning Technology (Early Artificial Intelligence-Supported Response With Social Listening Platform) to Enhance Digital Social Understanding for the COVID-19 Infodemic: Development and Implementation Study.


Background: Amid the COVID-19 pandemic, there has been a need for rapid social understanding to inform infodemic management and response. Although social media analysis platforms have traditionally been designed for commercial brands for marketing and sales purposes, they have been underused and adapted for a comprehensive understanding of social dynamics in areas such as public health. Traditional systems have challenges for public health use, and new tools and innovative methods are required. The World Health Organization Early Artificial Intelligence-Supported Response with Social Listening (EARS) platform was developed to overcome some of these challenges.

Objective: This paper describes the development of the EARS platform, including data sourcing, development, and validation of a machine learning categorization approach, as well as the results from the pilot study.

Methods: Data for EARS are collected daily from web-based conversations in publicly available sources in 9 languages. Public health and social media experts developed a taxonomy to categorize COVID-19 narratives into 5 relevant main categories and 41 subcategories. We developed a semisupervised machine learning algorithm to sort social media posts into these categories and into various filters. To validate the results obtained by the machine learning-based approach, we compared it to a search-filter approach, applying Boolean queries with the same amount of information, and measured recall and precision. The Hotelling T² test was used to determine the effect of the classification method on the combined variables.
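The comparison metric described in the Methods can be illustrated with a minimal Python sketch. It scores a Boolean keyword filter against a hand-labeled sample and reports precision and recall. Everything concrete here is an assumption made for illustration: the four posts, their labels, the query terms, and the function names `boolean_filter` and `precision_recall` are hypothetical and are not taken from the EARS taxonomy or the authors' code.

```python
from typing import Sequence, Tuple


def boolean_filter(text: str) -> bool:
    """Hypothetical Boolean query: vaccine AND (side effect OR reaction)."""
    t = text.lower()
    return "vaccine" in t and ("side effect" in t or "reaction" in t)


def precision_recall(predicted: Sequence[bool], actual: Sequence[bool]) -> Tuple[float, float]:
    """Return (precision, recall) for binary predictions against true labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))       # true positives
    fp = sum(p and not a for p, a in zip(predicted, actual))   # false positives
    fn = sum(a and not p for p, a in zip(predicted, actual))   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up sample: (post text, does it truly belong to the subcategory?)
posts = [
    ("My arm hurt after the vaccine, is that a normal reaction?", True),
    ("Felt feverish the day after my second jab", True),
    ("Vaccine rollout starts Monday in our district", False),
    ("Strong reaction from critics to the vaccine mandate", False),
]
labels = [label for _, label in posts]
keyword_predictions = [boolean_filter(text) for text, _ in posts]
precision, recall = precision_recall(keyword_predictions, labels)
print(f"keyword filter: precision={precision:.2f}, recall={recall:.2f}")
```

In this toy sample the keyword query misses the post that says "jab" instead of "vaccine" and wrongly matches the post about a political "reaction". These are the two kinds of error that recall and precision measure, respectively. The same `precision_recall` function could score a classifier's predictions on the same labeled sample for a like-for-like comparison.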

Results: The EARS platform was developed, validated, and applied to characterize conversations regarding COVID-19 since December 2020. A total of 215,469,045 social posts were collected for processing from December 2020 to February 2022. The machine learning algorithm outperformed the Boolean search filters method for precision and recall in both English and Spanish (P<.001). Demographic and other filters provided useful insights into the data, and the gender split of users in the platform was largely consistent with population-level data on social media use.
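The abstract does not say how the Hotelling T² test was set up, so the following Python sketch shows only one plausible design, stated here as an assumption. For each category, take the difference in precision and in recall between the two methods, then test whether the mean difference vector is zero. The function name `hotelling_t2_paired` and the difference values are hypothetical and made up for illustration. They are not results from the study.

```python
from statistics import mean
from typing import Sequence, Tuple


def hotelling_t2_paired(diffs: Sequence[Tuple[float, float]]) -> Tuple[float, float, int, int]:
    """One-sample Hotelling T-squared test of 2-variable differences against zero.

    diffs holds one (precision difference, recall difference) pair per category.
    Returns (t2, f_stat, df1, df2), where f_stat follows F(df1, df2) under H0.
    """
    n, p = len(diffs), 2
    if n <= p:
        raise ValueError("need more observations than variables")
    d1 = [d[0] for d in diffs]
    d2 = [d[1] for d in diffs]
    m1, m2 = mean(d1), mean(d2)
    # Sample covariance matrix entries (n - 1 denominator).
    s11 = sum((x - m1) ** 2 for x in d1) / (n - 1)
    s22 = sum((y - m2) ** 2 for y in d2) / (n - 1)
    s12 = sum((x - m1) * (y - m2) for x, y in diffs) / (n - 1)
    det = s11 * s22 - s12 ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    # T2 = n * m' S^-1 m, using the closed-form inverse of a 2x2 matrix.
    t2 = n * (s22 * m1 ** 2 - 2 * s12 * m1 * m2 + s11 * m2 ** 2) / det
    # Convert to an F statistic with (p, n - p) degrees of freedom.
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat, p, n - p


# Made-up per-category differences (machine learning minus Boolean filter).
example_diffs = [(0.12, 0.20), (0.08, 0.15), (0.15, 0.10), (0.05, 0.18), (0.10, 0.22)]
t2, f_stat, df1, df2 = hotelling_t2_paired(example_diffs)
print(f"T2={t2:.2f}, F({df1}, {df2})={f_stat:.2f}")
```

Testing precision and recall jointly, rather than with two separate univariate tests, accounts for the correlation between the two metrics. A P value would come from comparing `f_stat` with the F distribution on `(df1, df2)` degrees of freedom, which is left out here to keep the sketch within the standard library.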

Conclusions: The EARS platform was developed to address the changing needs of public health analysts during the COVID-19 pandemic. The application of public health taxonomy and artificial intelligence technology to a user-friendly social listening platform, accessible directly by analysts, is a significant step in better enabling understanding of global narratives. The platform was designed for scalability; iterations and new countries and languages have been added. This research has shown that a machine learning approach is more accurate than using only keywords and has the benefit of categorizing and understanding large amounts of digital social data during an infodemic. Further technical developments are needed and planned for continuous improvements, to meet the challenges in the generation of infodemic insights from social media for infodemic managers and public health professionals.


Source journal

JMIR infodemiology

CiteScore

4.80

Self-citation rate

0.00%

Articles published