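Initial Progress of Identification of the Appropriate NLP Technique for Content Evaluation in Textual Conversations of People Infected by Sars-Cov-2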

EASI: Ingeniería y Ciencias Aplicadas en la Industria. Pub Date: 2023-12-27. DOI: 10.53591/easi.v2i3.2488

Ivan L. Acosta-Guzmán, Eleanor Varela-Tapia, Alexandra E. Piza-Guale, Nory X. Acosta-Guzmán, Christopher I. Acosta Varela


Citations: 0

Abstract


When Covid-19 became a pandemic in March 2020, an urgent need arose for reliable information and advice, so virtual assistants were created to help teach the public how to avoid infection with the Alpha variant. When new variants such as Beta, Delta, and Omicron appeared with different symptoms, they caused new waves of infections and deaths. To address this, a Natural Language Processing prototype was created to analyze the experiences of 4422 people who had been infected in Ecuador and to detect which symptoms were most common in their conversations. The prototype was built in Python on the Google Colab platform, and several combinations of text-processing techniques and classifiers were tested. Finally, the results were measured with the quality metrics accuracy, precision, recall, and F1 to identify the most appropriate model. The combination of stop-word removal, tokenization, and stemming with an LSTM classifier reached high effectiveness among the options tested for a classifier model with multi-label output.
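The preprocessing steps and quality metrics named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the stop-word list, the suffix-stripping stemmer, the function names, and the English sample sentence are all assumptions made for illustration, and the LSTM classifier itself is omitted. Precision, recall, and F1 are micro-averaged over labels, and accuracy is taken as the exact-match ratio, which is one common convention for multi-label output; the paper does not state which averaging it used.

```python
import re

# Hypothetical stop-word list (illustrative only, not taken from the paper).
STOP_WORDS = {"i", "had", "a", "and", "the", "my", "was", "for", "of"}

# Hypothetical suffixes for a toy stemmer; a real pipeline would more likely
# rely on an established stemmer such as Snowball.
SUFFIXES = ("ing", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [stem(t) for t in remove_stop_words(tokenize(text))]


def multilabel_scores(y_true, y_pred):
    """Micro-averaged precision, recall, F1 and exact-match accuracy.

    y_true and y_pred are equal-length lists of label sets, one per message.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(len(t & p) for t, p in pairs)  # labels predicted and correct
    fp = sum(len(p - t) for t, p in pairs)  # labels predicted but wrong
    fn = sum(len(t - p) for t, p in pairs)  # labels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(preprocess("I had a fever and coughing for days"))
    truth = [{"fever", "cough"}, {"headache"}]
    predicted = [{"fever"}, {"headache", "cough"}]
    print(multilabel_scores(truth, predicted))
```

In a setup like the one the abstract describes, the token lists produced by `preprocess` would be converted to integer sequences and passed to the classifier, and the predicted symptom label sets would be scored against the annotated ones with a function such as `multilabel_scores`.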
