Cancer-related Keywords in 2023: Insights from Text Mining of a Major Consumer Portal

Impact factor: 2.3 · JCR: Q3 (Medical Informatics)
Healthcare Informatics Research, vol. 30, no. 4, pp. 398-408 · Pub Date: 2024-10-01 · Epub Date: 2024-10-31 · DOI: 10.4258/hir.2024.30.4.398
Wonjeong Jeong, Eunkyoung Song, Eunzi Jeong, Kyoung Hee Oh, Hye-Sun Lee, Jae Kwan Jun
Citations: 0

Abstract


Objectives: With the growing importance of monitoring cancer patients' internet usage, there is an increasing need for technology that expands access to relevant information through text mining. This study analyzed internet articles from portal sites in 2023 to identify trends in the information available to cancer patients and to derive meaningful insights.

Methods: This study analyzed 19,578 news articles published on Naver, a major Korean portal site, from January 1, 2023, to December 31, 2023. Natural language processing, text mining, network analysis, and word cloud analysis were employed. The search term "am" (Korean for "cancer") was used to identify keywords related to cancer.
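
The keyword counting and network steps named in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not say which tokenizer or network tool was used, so the sketch assumes the articles are already tokenized into keyword lists, and the function names `keyword_counts` and `cooccurrence_edges` and the toy documents are hypothetical.

```python
from collections import Counter
from itertools import combinations


def keyword_counts(docs):
    """Count how often each keyword occurs across all tokenized documents."""
    counts = Counter()
    for tokens in docs:
        counts.update(tokens)
    return counts


def cooccurrence_edges(docs):
    """Count keyword pairs that appear together in the same document.

    Each pair is stored once, in alphabetical order, so the result can be
    read as the weighted edge list of an undirected keyword network.
    """
    edges = Counter()
    for tokens in docs:
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical, already-tokenized toy articles (not data from the study).
docs = [
    ["lung cancer", "cure", "treatment"],
    ["breast cancer", "cure"],
    ["lung cancer", "smoking"],
]

counts = keyword_counts(docs)
edges = cooccurrence_edges(docs)

print(counts.most_common(2))
print(edges.most_common(3))
```

An edge list of this kind is what a clustering or community-detection step would take as input to group keywords into themes.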

Results: In 2023, an average of 1,631 cancer-related articles were published monthly, with a peak of 1,946 in September and a low of 1,371 in February. A total of 132,456 keywords were extracted, with "cure" (2,218 occurrences), "lung cancer" (1,652), and "breast cancer" (1,235) being the most frequent. Term frequency-inverse document frequency analysis ranked "struggle" (1064.172) as the most significant keyword, followed by "lung cancer" (839.988) and "breast cancer" (744.840). Network analysis revealed four distinct clusters focusing on treatment, celebrity-related issues, major cancer types, and cancer-causing factors.
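
The difference between the raw-frequency ranking and the TF-IDF ranking can be illustrated with a small sketch. The abstract does not state which TF-IDF variant produced the reported scores, so the formula here (term frequency times log(N / document frequency), summed over all documents) is an assumption, as are the function name `tfidf_scores` and the toy documents. It shows one way a keyword that appears in nearly every article can score low while a less widespread keyword scores high.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Sum tf * log(N / df) over all documents for each keyword.

    Assumed variant: raw term frequency and an unsmoothed logarithmic IDF.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each keyword appears.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    scores = Counter()
    for tokens in docs:
        tf = Counter(tokens)
        for term, freq in tf.items():
            scores[term] += freq * math.log(n_docs / df[term])
    return scores


# Hypothetical, already-tokenized toy articles (not data from the study).
docs = [
    ["cure", "lung cancer", "struggle", "struggle"],
    ["cure", "breast cancer"],
    ["cure", "lung cancer"],
]

scores = tfidf_scores(docs)
ranking = [term for term, _ in scores.most_common()]
print(ranking)
```

In this toy corpus "cure" is the most frequent keyword but appears in every document, so its IDF, and therefore its score, is zero, while "struggle" ranks first.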

Conclusions: The analysis of cancer-related keywords in 2023 indicates that news articles often prioritize gossip over essential information. These findings provide foundational data for future policy directions and strategies to address misinformation. This study underscores the importance of understanding the nature of cancer-related information consumed by the public and offers insights to guide official policies and healthcare practices.

Source journal
Healthcare Informatics Research (Medical Informatics)
CiteScore: 4.90
Self-citation rate: 6.90%
Articles published: 44