Title: Continued Supporting a Systematic Literature Review by Applying Text Mining Methods
Author: T. Georgieva-Trifonova
Venue: 2022 21st International Symposium INFOTEH-JAHORINA (INFOTEH), pp. 1-5
Published: 2022-03-16
DOI: 10.1109/INFOTEH53737.2022.9751318
URL: https://doi.org/10.1109/INFOTEH53737.2022.9751318
Citation count: 0
Abstract
This paper proposes a framework for the ongoing support of a systematic literature review (SLR). The framework applies text mining methods to automate the classification of scientific publications and to enable a more in-depth analysis of their content. To this end, a dataset is built from the titles, abstracts, and keywords of the papers included in a systematic literature review on the application of semantic technologies in bibliographic databases. Three data analytics methods are applied: frequency analysis of words and word combinations; linear regression for trend exploration; and text classification, in which the categories are the applied semantic technologies or the researched problems, in accordance with a pre-defined classification framework. Classification uses the vector space model enriched with the pointwise mutual information (PMI) measure. The classification performance is assessed in terms of several measures, and the obtained results are summarized.
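The abstract does not say how PMI enters the vector space model, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes document-level co-occurrence counts, PMI(t, c) = log2(P(t, c) / (P(t) P(c))) between a term t and a category c, and a score that sums term frequencies weighted by positive PMI. The function names (`tokenize`, `train_pmi`, `classify`) and the toy corpus are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def train_pmi(docs):
    """Compute PMI(term, category) from (text, category) pairs.

    Probabilities are estimated from document counts:
    PMI = log2( N * n(t, c) / (n(t) * n(c)) ).
    Returns {category: {term: pmi}} for co-occurring pairs only.
    """
    n_docs = len(docs)
    cat_count = Counter()        # documents per category
    term_count = Counter()       # documents containing each term
    joint = defaultdict(Counter) # documents of a category containing a term
    for text, cat in docs:
        cat_count[cat] += 1
        for term in set(tokenize(text)):
            term_count[term] += 1
            joint[cat][term] += 1
    return {
        cat: {
            term: math.log2(n * n_docs / (term_count[term] * cat_count[cat]))
            for term, n in terms.items()
        }
        for cat, terms in joint.items()
    }


def classify(text, weights):
    """Pick the category with the highest PMI-weighted term-frequency score.

    Assumption: negative and missing PMI values contribute zero.
    """
    tf = Counter(tokenize(text))
    scores = {
        cat: sum(freq * max(w.get(term, 0.0), 0.0) for term, freq in tf.items())
        for cat, w in weights.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy corpus; the categories are illustrative only.
docs = [
    ("ontology based indexing of bibliographic records", "ontologies"),
    ("an ontology for describing library catalogues", "ontologies"),
    ("publishing bibliographic records as linked data", "linked data"),
    ("linked data for library catalogues", "linked data"),
]
weights = train_pmi(docs)
label_a = classify("ontology driven search", weights)
label_b = classify("linked open data", weights)
```

In this toy corpus, terms that appear in both categories, such as "bibliographic", get a PMI of zero and so do not influence the decision, while category-specific terms dominate the score.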