Text phrase-mining in identifying and classifying maternal proteins and genes across preeclampsia and similar pathologies.

Impact factor: 2.2 · JCR: Q3 (Physiology)
Jacqueline G Urdang, Stephanie Masters, Nneoma Edokobi, Chitra Mukherjee, Arnib Quazi, David A Liem, Monica Ahrens, Xuan Wang, Megan Whitham
Physiological Reports, vol. 13, no. 6, e70262. Published 2025-03-01. Journal Article.
DOI: https://doi.org/10.14814/phy2.70262
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11919630/pdf/
Citations: 0

Abstract

This study aims to demonstrate that text phrase-mining and natural language processing (NLP) can annotate large quantities of obstetric text to discover and evaluate maternal protein/gene (MPG)-disease interactions involved in the preeclampsia pathway. We employed a phrase-mining/NLP pipeline to evaluate unique MPGs involved in six cardiovascular derangements with overlapping presentations during pregnancy. The diseases were matched with Medical Subject Headings (MeSH), and a text corpus was built from PubMed abstracts matched to these terms. Each MPG-disease pair was processed and assigned a unique score, whose components were calculated and weighted for distinctness, integrity, and popularity. Statistical analyses were conducted to examine protein-disease relationships. Forty-four MPGs with known associations to cardiovascular disease and preeclampsia pathways were identified among the six diseases. MPGs shared across the greatest number of disease states were implicated in (1) angiogenesis and vasoconstriction, (2) hemodynamic regulation, (3) hormonal regulation of metabolism, and (4) inflammation. NLP and text phrase-mining were applied to obstetrics abstracts with accuracy and speed. This approach holds promise for synthesizing large volumes of data to present trends in the obstetric literature and to identify promising biomarkers.
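The abstract does not give the scoring formulas, so the following is only a minimal sketch of how a per-pair score built from distinctness, integrity, and popularity could be computed. Everything in it is an assumption for illustration: the toy corpus, the placeholder names (`GENE_A`, `disease_1`, and so on), the helper functions `count_terms` and `score_pairs`, the component definitions, and the geometric-mean combination. It is not the authors' pipeline.

```python
import math
from collections import Counter

# Toy corpus: disease -> list of abstracts (placeholder text, not real data).
corpus = {
    "disease_1": ["GENE_A and GENE_B are altered", "GENE_A rises before onset"],
    "disease_2": ["GENE_B was measured", "GENE_C activity was measured"],
}
terms = ["GENE_A", "GENE_B", "GENE_C"]


def count_terms(abstracts, terms):
    """Count case-insensitive occurrences of each term across the abstracts."""
    counts = Counter()
    for text in abstracts:
        lowered = text.lower()
        for term in terms:
            counts[term] += lowered.count(term.lower())
    return counts


def score_pairs(corpus, terms, integrity=1.0):
    """Return {(term, disease): score} using illustrative component formulas."""
    counts = {d: count_terms(a, terms) for d, a in corpus.items()}
    scores = {}
    for disease, c in counts.items():
        total = sum(c.values())
        for term in terms:
            # Popularity (assumed): log-damped frequency within this disease.
            popularity = math.log1p(c[term]) / math.log1p(total) if total else 0.0
            # Distinctness (assumed): share of the term's mentions in this disease.
            everywhere = sum(counts[d][term] for d in counts)
            distinctness = c[term] / everywhere if everywhere else 0.0
            # Integrity is fixed at 1.0 here (assumes curated, well-formed names).
            scores[(term, disease)] = integrity * math.sqrt(popularity * distinctness)
    return scores


scores = score_pairs(corpus, terms)
for pair, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 3))
```

In this sketch a term that is frequent in one disease and absent from the others ranks highest for that disease, while a term shared evenly across diseases is penalized by the distinctness factor. A real pipeline would also need proper tokenization and synonym handling for gene and protein names.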

Source journal
Physiological Reports (category: Physiology)
CiteScore: 4.20
Self-citation rate: 4.00%
Articles published: 374
Review time: 9 weeks
About the journal: Physiological Reports is an online-only, open access journal that publishes peer-reviewed research across all areas of basic, translational, and clinical physiology and allied disciplines. Physiological Reports is a collaboration between The Physiological Society and the American Physiological Society, and is therefore in a unique position to serve the international physiology community through quick time to publication while upholding a quality standard of sound research that constitutes a useful contribution to the field.