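Application of text mining to the development and validation of a geographic search filter to facilitate evidence retrieval in Ovid MEDLINE: An example from the United States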

Impact factor 2.2 · CAS Zone 4 (Medicine) · JCR Q2, Information Science & Library Science
Antoinette Cheung MPH, Evan Popoff MSc, Shelagh M. Szabo MSc
Health Information and Libraries Journal, 40(2), pages 169-180. Published 2022-12-21. Publication type: Journal Article.
DOI: 10.1111/hir.12471
https://onlinelibrary.wiley.com/doi/10.1111/hir.12471
Citations: 2

Application of text mining to the development and validation of a geographic search filter to facilitate evidence retrieval in Ovid MEDLINE: An example from the United States

Background

Given the increasing volume of published research in bibliographic databases, efficient retrieval of evidence is crucial and represents an opportunity to integrate novel techniques such as text mining.

Objectives

To develop and validate a geographic search filter for identifying research from the United States (US) in Ovid MEDLINE.

Methods

US and non-US citations were collected from bibliographies of evidence-based reviews. Citations were partitioned by US/non-US status and randomly divided into training and testing sets. Using text mining, common one- and two-word terms in the title/abstract fields were identified, and their frequencies were compared between US and non-US citations.
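
The abstract does not describe the implementation, so the following is only a minimal sketch of the general idea: split each class of citations at random, extract one- and two-word terms from the title/abstract text, and compare how often each term occurs in US versus non-US citations. The function names, the document-level counting, the 50/50 split, and the additive smoothing constant are all assumptions made for illustration, not details taken from the paper.

```python
import random
import re
from collections import Counter


def split_citations(citations, test_fraction=0.5, seed=0):
    """Randomly divide one class of citations into (training, testing) lists.

    Calling this once for US and once for non-US citations keeps the
    partition by US/non-US status. The fraction and seed are assumptions.
    """
    shuffled = list(citations)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def terms_in(text):
    """Return the set of one- and two-word terms in a title/abstract string."""
    tokens = re.findall(r"[a-z]+", text.lower())
    terms = set(tokens)
    terms.update(f"{a} {b}" for a, b in zip(tokens, tokens[1:]))
    return terms


def frequency_ratios(us_docs, non_us_docs, smoothing=0.5):
    """Ratio of the share of US citations containing a term to the share of
    non-US citations containing it. Smoothing (an assumption) avoids
    division by zero for terms seen in only one class.
    """
    us_counts = Counter(t for doc in us_docs for t in terms_in(doc))
    non_us_counts = Counter(t for doc in non_us_docs for t in terms_in(doc))
    ratios = {}
    for term in set(us_counts) | set(non_us_counts):
        us_rate = (us_counts[term] + smoothing) / (len(us_docs) + 2 * smoothing)
        non_us_rate = (non_us_counts[term] + smoothing) / (len(non_us_docs) + 2 * smoothing)
        ratios[term] = us_rate / non_us_rate
    return ratios


if __name__ == "__main__":
    # Toy strings, not data from the study.
    us = ["Health outcomes among Americans in Baltimore", "Americans and diabetes care"]
    non_us = ["Diabetes care in Japan", "A French cohort study"]
    ranked = sorted(frequency_ratios(us, non_us).items(), key=lambda kv: -kv[1])
    for term, ratio in ranked[:5]:
        print(f"{term}: {ratio:.2f}")
```

Under this sketch, a ratio well above 1 marks a term as US-associated and a ratio well below 1 marks it as non-US-associated; terms at either extreme would be candidates for inclusion in, or exclusion from, a filter.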

Results

Common US-related terms (reported with the ratio of their frequency in US versus non-US citations) included US populations and geographic terms [e.g., ‘Americans’ (15.5), ‘Baltimore’ (20.0)]. Common non-US terms were non-US geographic terms [e.g., ‘Japan’ (0.04), ‘French’ (0.05)]. A search filter was developed with 98.3% sensitivity and 82.7% specificity.
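
For readers unfamiliar with how a search filter is scored, here is a minimal sketch of the standard sensitivity and specificity calculation against a labelled gold standard set. The function name and the toy inputs are hypothetical; the abstract does not state how the authors computed their figures.

```python
def sensitivity_specificity(is_us, retrieved):
    """Score a filter against a labelled gold standard.

    is_us:     list of booleans, True if the citation is truly from the US
    retrieved: list of booleans, True if the filter retrieved the citation
    Returns (sensitivity, specificity) as fractions between 0 and 1.
    """
    pairs = list(zip(is_us, retrieved))
    tp = sum(1 for truth, hit in pairs if truth and hit)          # US, retrieved
    fn = sum(1 for truth, hit in pairs if truth and not hit)      # US, missed
    tn = sum(1 for truth, hit in pairs if not truth and not hit)  # non-US, excluded
    fp = sum(1 for truth, hit in pairs if not truth and hit)      # non-US, retrieved
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy labels, not data from the study.
    truth = [True, True, True, True, False, False, False, False]
    hits = [True, True, True, False, True, False, False, False]
    print(sensitivity_specificity(truth, hits))
```

Sensitivity is the share of true US citations that the filter retrieves, and specificity is the share of non-US citations that it correctly excludes.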

Discussion

This search filter will streamline the identification of evidence from the US. Periodic updates may be necessary to reflect changes in MEDLINE's controlled vocabulary.

Conclusion

Text mining was instrumental in the development of this search filter. A novel technique generated a gold standard set comprising >20,000 citations. This method may be adapted to develop subsequent geographic search filters.

Source journal
Health Information and Libraries Journal (Information Science & Library Science)
CiteScore: 6.70
Self-citation rate: 10.50%
Articles published: 52
About the journal: Health Information and Libraries Journal (HILJ) provides practitioners, researchers, and students in library and health professions an international and interdisciplinary forum. Its objectives are to encourage discussion and to disseminate developments at the frontiers of information management and libraries. A major focus is communicating practices that are evidence based both in managing information and in supporting health care. The Journal encompasses:
- Identifying health information needs and uses
- Managing programmes and services in the changing health environment
- Information technology and applications in health
- Educating and training health information professionals
- Outreach to health user groups