{"title":"Current Trends and New Approaches in Participatory Health Informatics.","authors":"Kerstin Denecke, Elia Gabarron, Carolyn Petersen","doi":"10.1055/s-0043-1777732","DOIUrl":"10.1055/s-0043-1777732","url":null,"abstract":"","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":"151-153"},"PeriodicalIF":1.7,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139075728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Use of Natural Language Processing to Identify Sexual and Reproductive Health Information in Clinical Text.","authors":"Elizabeth I Harrison, Laura A Kirkpatrick, Patrick W Harrison, Traci M Kazmerski, Yoshimi Sogawa, Harry S Hochheiser","doi":"10.1055/a-2233-2736","DOIUrl":"10.1055/a-2233-2736","url":null,"abstract":"<p><strong>Objectives: </strong>This study aimed to enable clinical researchers without expertise in natural language processing (NLP) to extract and analyze information about sexual and reproductive health (SRH), or other sensitive health topics, from large sets of clinical notes.</p><p><strong>Methods: </strong>(1) We retrieved text from the electronic health record as individual notes. (2) We segmented notes into sentences using one of scispaCy's NLP toolkits. (3) We exported sentences to the labeling application Watchful and annotated subsets of these as relevant or irrelevant to various SRH categories by applying a combination of regular expressions and manual annotation. (4) The labeled sentences served as training data to create machine learning models for classifying text; specifically, we used spaCy's default text classification ensemble, comprising a bag-of-words model and a neural network with attention. (5) We applied each model to unlabeled sentences to identify additional references to SRH with novel relevant vocabulary. We used this information and repeated steps 3 to 5 iteratively until the models identified no new relevant sentences for each topic. Finally, we aggregated the labeled data for analysis.</p><p><strong>Results: </strong>This methodology was applied to 3,663 Child Neurology notes for 971 female patients. Our search focused on six SRH categories. We validated the approach using two subject matter experts, who independently labeled a sample of 400 sentences. 
Cohen's kappa values were calculated for each category between the reviewers (menstruation: 1, sexual activity: 0.9499, contraception: 0.9887, folic acid: 1, teratogens: 0.8864, pregnancy: 0.9499). After removing the sentences on which reviewers did not agree, we compared the reviewers' labels to those produced via our methodology, again using Cohen's kappa (menstruation: 1, sexual activity: 1, contraception: 0.9885, folic acid: 1, teratogens: 0.9841, pregnancy: 0.9871).</p><p><strong>Conclusion: </strong>Our methodology is reproducible, enables analysis of large amounts of text, and has produced results that are highly comparable to subject matter expert manual review.</p>","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":"193-201"},"PeriodicalIF":1.7,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138832647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
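The agreement statistic reported in the abstract above, Cohen's kappa, compares the observed agreement between two raters with the agreement expected by chance. The following is a minimal sketch of that formula only; the function name `cohens_kappa` and the sample labels are illustrative assumptions, not the study's code or data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's
    marginal label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item set")

    n = len(labels_a)
    # Share of items on which the two raters chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = set(counts_a) | set(counts_b)
    # Chance agreement: product of the raters' marginal proportions per label.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical relevance labels (1 = relevant, 0 = irrelevant) from two reviewers.
reviewer_1 = [1, 1, 0, 0]
reviewer_2 = [1, 0, 0, 0]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this toy case the reviewers agree on three of four sentences, while chance agreement is 0.5, so kappa is well below the raw agreement rate.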
{"title":"Report from the 68th GMDS Annual Meeting: Science. Close to People.","authors":"Jonas Bienzeisler, Ariadna Perez-Garriga, Lea C Brandl, Ann-Kristin Kock-Schoppenhauer, Yasmin Hollenbenders, Maximilian Kurscheidt, Christina Schüttler","doi":"10.1055/s-0043-1777733","DOIUrl":"10.1055/s-0043-1777733","url":null,"abstract":"","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":"62 5-06","pages":"202-205"},"PeriodicalIF":1.7,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139913957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Exploratory Study on the Utility of Patient-Generated Health Data as a Tool for Health Care Professionals in Multiple Sclerosis Care.","authors":"Sharon Guardado, Vasiliki Mylonopoulou, Octavio Rivera-Romero, Nadine Patt, Jens Bansi, Guido Giunti","doi":"10.1055/s-0043-1775718","DOIUrl":"10.1055/s-0043-1775718","url":null,"abstract":"<p><strong>Background: </strong>Patient-generated health data (PGHD) are data collected through technologies such as mobile devices and health apps. The integration of PGHD into health care workflows can support the care of chronic conditions such as multiple sclerosis (MS). Patients are often willing to share data with health care professionals (HCPs) in their care team; however, the benefits of PGHD can be limited if HCPs do not find it useful, leading patients to discontinue data tracking and sharing eventually. Therefore, understanding the usefulness of mobile health (mHealth) solutions, which provide PGHD and serve as enablers of the HCPs' involvement in participatory care, could motivate them to continue using these technologies.</p><p><strong>Objective: </strong>The objective of this study is to explore the perceived utility of different types of PGHD from mHealth solutions which could serve as tools for HCPs to support participatory care in MS.</p><p><strong>Method: </strong>A mixed-methods approach was used, combining qualitative research and participatory design. This study includes three sequential phases: data collection, assessment of PGHD utility, and design of data visualizations. In the first phase, 16 HCPs were interviewed. The second and third phases were carried out through participatory workshops, where PGHD types were conceptualized in terms of utility.</p><p><strong>Results: </strong>The study found that HCPs are optimistic about PGHD in MS care. The most useful types of PGHD for HCPs in MS care are patients' habits, lifestyles, and fatigue-inducing activities. 
Although these subjective data seem more useful for HCPs, it is more challenging to visualize them in a useful and actionable way.</p><p><strong>Conclusion: </strong>HCPs are optimistic about mHealth and PGHD as tools to further understand their patients' needs and support care in MS. HCPs from different disciplines have different perceptions of what types of PGHD are useful; however, subjective types of PGHD seem potentially more useful for MS care.</p>","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":"165-173"},"PeriodicalIF":1.7,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10878743/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41137368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Learning Classification of Psychiatric Data Associated with Compensation Claims for Patient Injuries.","authors":"Martti Juhola, Tommi Nikkanen, Juho Niemi, Maiju Welling, Olli Kampman","doi":"10.1055/s-0043-1771378","DOIUrl":"10.1055/s-0043-1771378","url":null,"abstract":"<p><strong>Background: </strong>Adverse events are common in health care. In psychiatric treatment, compensation claims for patient injuries appear to be less common than in other medical specialties. The most common types of patient injury claims in psychiatry include diagnostic flaws, unprevented suicide, or coercive treatment deemed unnecessary or harmful.</p><p><strong>Objectives: </strong>The objective was to study whether it is possible to form different categories of patient injury types associated with the psychiatric evaluations of compensation claims and to base machine learning classification on these categories. A further objective was the binary classification of positive and negative decisions for compensation claims.</p><p><strong>Methods: </strong>Finnish psychiatric specialist evaluations for the compensation claims of patient injuries were classified into six different categories, called classes, by applying machine learning methods. In addition, another classification of the same data into two classes was performed to test whether it was possible to classify data cases according to their known decisions, either an accepted or a declined compensation claim.</p><p><strong>Results: </strong>The former classification task produced relatively good results in separating the different classes. The latter task, in contrast, was more complex. However, the classification accuracies of both tasks could be improved by generating artificial data cases in the preprocessing phase before classification. 
This preprocessing improved the classification accuracy for the six classes to as much as 88% when random forests were used for classification, and the accuracy of the binary classification to 89%.</p><p><strong>Conclusion: </strong>The results show that the defined objectives could be achieved reasonably well.</p>","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":"174-182"},"PeriodicalIF":1.7,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10878742/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9868179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Proposal for a Robust Validated Weighted General Data Protection Regulation-Based Scale to Assess the Quality of Privacy Policies of Mobile Health Applications: An eDelphi Study.","authors":"Jaime Benjumea, Jorge Ropero, Enrique Dorronzoro-Zubiete, Octavio Rivera-Romero, Alejandro Carrasco","doi":"10.1055/a-2155-2021","DOIUrl":"10.1055/a-2155-2021","url":null,"abstract":"<p><strong>Background: </strong>Health care services are undergoing a digital transformation in which the Participatory Health Informatics field has a key role. Within this field, studies aimed at assessing the quality of digital tools, including mHealth apps, are conducted. Privacy is one dimension of the quality of an mHealth app. Privacy consists of several components, including organizational, technical, and legal safeguards. Within legal safeguards, giving transparent information to the users on how their data are handled is crucial. This information is usually disclosed to users through the privacy policy document. Assessing the quality of a privacy policy is a complex task and several scales supporting this process have been proposed in the literature. However, these scales are heterogeneous and not even very objective. In our previous study, we proposed a checklist of items guiding the assessment of the quality of an mHealth app privacy policy, based on the General Data Protection Regulation.</p><p><strong>Objective: </strong>To refine the robustness of our General Data Protection Regulation-based privacy scale to assess the quality of an mHealth app privacy policy, to identify new items, and to assign weights for every item in the scale.</p><p><strong>Methods: </strong>A two-round modified eDelphi study was conducted involving a privacy expert panel.</p><p><strong>Results: </strong>After the Delphi process, all the items in the scale were considered \"important\" or \"very important\" (4 and 5 on a 5-point Likert scale, respectively) by most of the experts. 
One of the original items was suggested to be reworded, while eight tentative items were suggested. Only two of them were finally added after Round 2. Eleven of the 16 items in the scale were considered \"very important\" (weight of 1), while the other 5 were considered \"important\" (weight of 0.5).</p><p><strong>Conclusion: </strong>The Benjumea privacy scale is a new robust tool to assess the quality of an mHealth app privacy policy, providing a deeper and complementary analysis to other scales. Also, this robust scale provides a guideline for the development of high-quality privacy policies of mHealth apps.</p>","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":"154-164"},"PeriodicalIF":1.7,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10878744/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10077516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
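The weighting reported in the abstract above (11 items with weight 1 and 5 items with weight 0.5) can be illustrated with a small sketch. How the published scale aggregates item results is not stated in the abstract, so the scoring rule below (sum of the weights of satisfied items) and the names `build_weights` and `weighted_score` are assumptions made only for illustration.

```python
WEIGHT_VERY_IMPORTANT = 1.0  # weight reported for "very important" items
WEIGHT_IMPORTANT = 0.5       # weight reported for "important" items


def build_weights(n_very_important=11, n_important=5):
    """Return one weight per item; the counts follow the abstract (11 + 5 = 16)."""
    return ([WEIGHT_VERY_IMPORTANT] * n_very_important
            + [WEIGHT_IMPORTANT] * n_important)


def weighted_score(satisfied, weights):
    """Return (achieved, maximum) under an assumed sum-of-weights scoring rule.

    `satisfied` holds one boolean per item, True when the privacy policy
    meets that item. This aggregation is hypothetical, not the authors' method.
    """
    if len(satisfied) != len(weights):
        raise ValueError("need exactly one result per scale item")
    achieved = sum(w for ok, w in zip(satisfied, weights) if ok)
    return achieved, sum(weights)


weights = build_weights()
# Hypothetical policy meeting every "very important" item and no "important" one.
achieved, maximum = weighted_score([True] * 11 + [False] * 5, weights)
```

Under this assumed rule the maximum attainable score is 13.5, so a policy meeting only the 11 higher-weighted items would reach 11 of 13.5.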
{"title":"Prehospital Cardiac Arrest Should be Considered When Evaluating Coronavirus Disease 2019 Mortality in the United States.","authors":"Nick Williams","doi":"10.1055/a-2015-1244","DOIUrl":"https://doi.org/10.1055/a-2015-1244","url":null,"abstract":"<p><strong>Background: </strong>Public health emergencies leave little time to develop novel surveillance efforts. Understanding which preexisting clinical datasets are fit for surveillance use is of high value. Coronavirus disease 2019 (COVID-19) offers a natural applied informatics experiment to understand the fitness of clinical datasets for use in disease surveillance.</p><p><strong>Objectives: </strong>This study evaluates the agreement between legacy surveillance time series data and discovers their relative fitness for use in understanding the severity of the COVID-19 emergency. Here fitness for use means the statistical agreement between events across series.</p><p><strong>Methods: </strong>Thirteen weekly clinical event series from before and during the COVID-19 era for the United States were collected and integrated into a (multi) time series event data model. The Centers for Disease Control and Prevention (CDC) COVID-19 attributable mortality, CDC's excess mortality model, national Emergency Medical Services (EMS) calls, and Medicare encounter level claims were the data sources considered in this study. Cases were indexed by week from January 2015 through June of 2021 and fit to Distributed Random Forest models. Models returned the variable importance when predicting the series of interest from the remaining time series.</p><p><strong>Results: </strong>Model r2 statistics ranged from 0.78 to 0.99 for the share of the volumes predicted correctly. Prehospital data were of high value, and cardiac arrest (CA) prior to EMS arrival was on average the best predictor (tied with study week). 
COVID-19 Medicare claims volumes can predict COVID-19 death certificates (agreement), while viral respiratory Medicare claim volumes cannot predict Medicare COVID-19 claims (disagreement).</p><p><strong>Conclusion: </strong>Prehospital EMS data should be considered when evaluating the severity of COVID-19 because prehospital CA known to EMS was the strongest predictor on average across indices.</p>","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":"62 3-04","pages":"100-109"},"PeriodicalIF":1.7,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/81/24/10-1055-a-2015-1244.PMC10462431.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10512033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trans-O-MIM-An International Research Project on Open Access Transformation: Outcomes and Lessons Learned.","authors":"Reinhold Haux, Esther Greussing, Stefanie Kuballa, Corinna Mielke, Mareike Schulze, Monika Taddicken","doi":"10.1055/s-0043-1761499","DOIUrl":"https://doi.org/10.1055/s-0043-1761499","url":null,"abstract":"<p><strong>Background: </strong>During the last decades, the Open Access paradigm has become an important approach for publishing new scientific knowledge. From 2015 to 2020, the Trans-O-MIM research project was undertaken with the intention to identify and to explore solutions in transforming subscription-based journals into Open Access journals. Trans-O-MIM stands for strategies, models, and evaluation metrics for the goal-oriented, stepwise, sustainable, and fair transformation of established subscription-based scientific journals into Open-Access-based journals with <i>Methods of Information in Medicine</i> as an example.</p><p><strong>Objectives: </strong>To present an overview of the outcomes of the Trans-O-MIM research project as a whole and to share our major lessons learned.</p><p><strong>Methods: </strong>As an approach for transforming journals, a Tandem Model has been proposed and implemented for <i>Methods of Information in Medicine</i>. For developing a metric to observe and assess journal transformations, scenario analysis has been used. A qualitative and a two-tier quantitative study on drivers and obstacles of Open Access publishing for medical informatics researchers was designed and conducted. A project setup with a research team, a steering committee, and an international advisory board was established. 
Major international medical informatics events have been used for reporting and for receiving feedback.</p><p><strong>Results: </strong>Based on the Tandem Model, the journal <i>Methods of Information in Medicine</i> has been transformed into a journal where, in addition to its subscription-based track, from 2017 onwards a Gold Open Access track has been successfully added. An evaluation metric, composed of 5 scenarios and 65 parameters, has been developed, which can assist respective decision makers in assessing such transformations. The studies on drivers and obstacles of Open Access publishing showed that, while most researchers support the idea of making scientific knowledge freely accessible to everyone, they are hesitant about actually following this practice by choosing Open Access journals to publish their own work. Article-processing charges and quality issues are perceived as the main obstacles in this respect, revealing a two-sided evaluation of Open Access models, reflecting the different viewpoints of researchers as authors or readers. Researchers from low-income countries, in particular, benefit from barrier-free communication mainly in their role as readers and much less in their role as authors of scientific information. This also became evident at the institutional level, as Open Access policies or financial support through funding bodies are most prevalent in Europe and North America.</p><p><strong>Conclusion: </strong>With Trans-O-MIM, an international research project was performed. An existing journal has been transformed. 
In addition, with the support of the International Medical Informatics Association, as well","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":"62 3-04","pages":"140-150"},"PeriodicalIF":1.7,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/d3/5e/10-1055-s-0043-1761499.PMC10462433.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10139694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Defining and Scoping Participatory Health Informatics: An eDelphi Study.","authors":"Kerstin Denecke, Octavio Rivera Romero, Carolyn Petersen, Marge Benham-Hutchins, Miguel Cabrer, Shauna Davies, Rebecca Grainger, Rada Hussein, Guillermo Lopez-Campos, Fernando Martin-Sanchez, Mollie McKillop, Mark Merolli, Talya Miron-Shatz, Jesús Daniel Trigo, Graham Wright, Rolf Wynn, Carol Hullin Lucay Cossio, Elia Gabarron","doi":"10.1055/a-2035-3008","DOIUrl":"https://doi.org/10.1055/a-2035-3008","url":null,"abstract":"<p><strong>Background: </strong>Health care has evolved to support the involvement of individuals in decision making by, for example, using mobile apps and wearables that may help empower people to actively participate in their treatment and health monitoring. While the term \"participatory health informatics\" (PHI) has emerged in literature to describe these activities, along with the use of social media for health purposes, the scope of the research field of PHI is not yet well defined.</p><p><strong>Objective: </strong>This article proposes a preliminary definition of PHI and defines the scope of the field.</p><p><strong>Methods: </strong>We used an adapted Delphi study design to gain consensus from participants on a definition developed from a previous review of literature. From the literature we derived a set of attributes describing PHI as comprising 18 characteristics, 14 aims, and 4 relations. We invited researchers, health professionals, and health informaticians to score these characteristics and aims of PHI and their relations to other fields over three survey rounds. In the first round participants were able to offer additional attributes for voting.</p><p><strong>Results: </strong>The first round had 44 participants, with 28 participants participating in all three rounds. These 28 participants were gender-balanced and comprised participants from industry, academia, and health sectors from all continents. 
Consensus was reached on 16 characteristics, 9 aims, and 6 related fields.</p><p><strong>Discussion: </strong>The consensus reached on attributes of PHI describes PHI as a multidisciplinary field that uses information technology and delivers tools with a focus on individual-centered care. It studies various effects of the use of such tools and technology. Its aims address the individuals in the role of patients, but also the health of a society as a whole. There are relationships to the fields of health informatics, digital health, medical informatics, and consumer health informatics.</p><p><strong>Conclusion: </strong>We have proposed a preliminary definition, aims, and relationships of PHI based on literature and expert consensus. These can begin to be used to support development of research priorities and outcomes measurements.</p>","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":"62 3-04","pages":"90-99"},"PeriodicalIF":1.7,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/67/87/10-1055-a-2035-3008.PMC10462430.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10139697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Alternative Application of Natural Language Processing to Express a Characteristic Feature of Diseases in Japanese Medical Records.","authors":"Yoshinori Yamanouchi, Taishi Nakamura, Tokunori Ikeda, Koichiro Usuku","doi":"10.1055/a-2039-3773","DOIUrl":"https://doi.org/10.1055/a-2039-3773","url":null,"abstract":"<p><strong>Background: </strong>Owing to the linguistic situation, Japanese natural language processing (NLP) requires morphological analyses for word segmentation using dictionary techniques.</p><p><strong>Objective: </strong>We aimed to clarify whether it can be substituted with an open-end discovery-based NLP (OD-NLP), which does not use any dictionary techniques.</p><p><strong>Methods: </strong>Clinical texts at the first medical visit were collected for comparison of OD-NLP with word dictionary-based-NLP (WD-NLP). Topics were generated in each document using a topic model, which later corresponded to the respective diseases determined in International Statistical Classification of Diseases and Related Health Problems, 10th Revision. The prediction accuracy and expressivity of each disease were examined in an equivalent number of entities/words after filtration with either term frequency and inverse document frequency (TF-IDF) or dominance value (DMV).</p><p><strong>Results: </strong>In documents from 10,520 observed patients, 169,913 entities and 44,758 words were segmented using OD-NLP and WD-NLP, respectively. Without filtering, accuracy and recall levels were low, and there was no difference in the harmonic mean of the F-measure between NLPs. However, physicians reported OD-NLP contained more meaningful words than WD-NLP. When datasets were created in an equivalent number of entities/words with TF-IDF, F-measure in OD-NLP was higher than WD-NLP at lower thresholds. When the threshold increased, the number of datasets created decreased, resulting in increased values of F-measure, although the differences disappeared. 
Two datasets near the maximum threshold showing differences in F-measure were examined to determine whether their topics were associated with diseases. The results showed that more diseases were found in OD-NLP at lower thresholds, indicating that the topics described characteristics of diseases. The superiority remained as much as that of TF-IDF when filtration was changed to DMV.</p><p><strong>Conclusion: </strong>The current findings favor the use of OD-NLP to express characteristics of diseases from Japanese clinical texts and may help in the construction of document summaries and retrieval in clinical settings.</p>","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":"62 3-04","pages":"110-118"},"PeriodicalIF":1.7,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/2b/3b/10-1055-a-2039-3773.PMC10462427.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10141870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}