{"title":"Topic Evolution and Emerging Topic Analysis Based on Open Source Software","authors":"Xiang Shen, Li Wang","doi":"10.2478/jdis-2020-0033","DOIUrl":"https://doi.org/10.2478/jdis-2020-0033","url":null,"abstract":"Abstract Purpose We present an analytical, open source and flexible natural language processing and text mining method for topic evolution, emerging topic detection and research trend forecasting for all kinds of data-tagged text. Design/methodology/approach We make full use of the functions provided by the open source VOSviewer and Microsoft Office, including a thesaurus for data clean-up and a LOOKUP function for comparative analysis. Findings Through application and verification in the domain of perovskite solar cells research, this method proves to be effective. Research limitations A certain amount of manual data processing and a specific research domain background are required for better, more illustrative analysis results. Adequate time for analysis is also necessary. Practical implications We try to set up an easy, useful, and flexible interdisciplinary text analyzing procedure for researchers, especially those without solid computer programming skills or who cannot easily access complex software. This procedure can also serve as a wonderful example for teaching information literacy. 
Originality/value This text analysis approach has not been reported before.","PeriodicalId":92237,"journal":{"name":"Journal of data and information science (Warsaw, Poland)","volume":"5 1","pages":"126 - 136"},"PeriodicalIF":0.0,"publicationDate":"2020-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44822114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Global Collaboration in Artificial Intelligence: Bibliometrics and Network Analysis from 1985 to 2019","authors":"Haotian Hu, Dongbo Wang, Sanhong Deng","doi":"10.2478/jdis-2020-0027","DOIUrl":"https://doi.org/10.2478/jdis-2020-0027","url":null,"abstract":"Abstract Purpose This study aims to explore the trend and status of international collaboration in the field of artificial intelligence (AI) and to understand the hot topics, core groups, and major collaboration patterns in global AI research. Design/methodology/approach We selected 38,224 papers in the field of AI from 1985 to 2019 in the core collection database of Web of Science (WoS) and studied international collaboration from the perspectives of authors, institutions, and countries through bibliometric analysis and social network analysis. Findings The bibliometric results show that in the field of AI, the number of published papers is increasing every year, and 84.8% of them are cooperative papers. Collaboration with more than three authors, collaboration between two countries and collaboration within institutions are the three main levels of collaboration patterns. Through social network analysis, this study found that the US, the UK, France, and Spain led global collaboration research in the field of AI at the country level, while Vietnam, Saudi Arabia, and United Arab Emirates had a high degree of international participation. Collaboration at the institution level reflects obvious regional and economic characteristics. There are the Developing Countries Institution Collaboration Group led by Iran, China, and Vietnam, as well as the Developed Countries Institution Collaboration Group led by the US, Canada, and the UK. Also, the Chinese Academy of Sciences (China) plays an important, pivotal role in connecting these institutional collaboration groups. 
Research limitations First, participant contributions in international collaboration may have varied, but in our research they are viewed equally when building collaboration networks. Second, although the edge weight in the collaboration network is considered, it is only used to help reduce the network and does not reflect the strength of collaboration. Practical implications The findings fill the current shortage of research on international collaboration in AI. They will help inform scientists and policy makers about the future of AI research. Originality/value This work is the longest to date regarding international collaboration in the field of AI. This research explores the evolution, future trends, and major collaboration patterns of international collaboration in the field of AI over the past 35 years. It also reveals the leading countries, core groups, and characteristics of collaboration in the field of AI.","PeriodicalId":92237,"journal":{"name":"Journal of data and information science (Warsaw, Poland)","volume":"5 1","pages":"86 - 115"},"PeriodicalIF":0.0,"publicationDate":"2020-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47767991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Can Crossref Citations Replace Web of Science for Research Evaluation? The Share of Open Citations","authors":"Tomás Chudlarský, J. Dvorák","doi":"10.2478/jdis-2020-0037","DOIUrl":"https://doi.org/10.2478/jdis-2020-0037","url":null,"abstract":"Abstract Purpose We study the proportion of Web of Science (WoS) citation links that are represented in the Crossref Open Citation Index (COCI), with the possible aim of using COCI in research evaluation instead of the WoS, if the level of coverage was sufficient. Design/methodology/approach We calculate the proportion of citation links where both publications have a WoS accession number and a DOI simultaneously, and where the cited publications have had at least one author from our institution, the Czech Technical University in Prague. We attempt to look up each such citation link in COCI. Findings We find that 53.7% of WoS citation links are present in the COCI. The proportion varies widely by discipline. The total figures differ significantly from 40% in the large-scale study by Van Eck, Waltman, Larivière, and Sugimoto (blog 2018, https://www.cwts.nl/blog?article=n-r2s234). Research limitations The sample does not cover all science areas uniformly; it is heavily focused on Engineering and Technology, and only some disciplines of Natural Sciences are present. However, this reflects the real scientific orientation and publication profile of our institution. Practical implications The current level of coverage is not sufficient for the WoS to be replaced by COCI for research evaluation. 
Originality/value The present study illustrates a COCI vs WoS comparison on the scale of a larger technical university in Central Europe.","PeriodicalId":92237,"journal":{"name":"Journal of data and information science (Warsaw, Poland)","volume":"5 1","pages":"35 - 42"},"PeriodicalIF":0.0,"publicationDate":"2020-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43546453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scientometric Analysis of Research Output from Brazil in Response to the Zika Crisis Using e-Lattes","authors":"R. Sampaio, António Abreu, B. Ferreira, M. Barreto, Jesús P. Mena-Chalco","doi":"10.2478/jdis-2020-0038","DOIUrl":"https://doi.org/10.2478/jdis-2020-0038","url":null,"abstract":"Abstract Purpose This paper aims to test the use of e-Lattes to map the Brazilian scientific output in a recent research health subject: Zika Virus. Design/methodology/approach From a set of Lattes CVs of Zika researchers registered on the Lattes Platform, we used the e-Lattes to map the Brazilian scientific response to the Zika crisis. Findings Brazilian science articulated quickly during the public health emergency of international concern (PHEIC) due to the creation of mechanisms to streamline funding of scientific research. Research limitations We did not assess any dimension of research quality, including the scientific impact and societal value. Practical implications e-Lattes can provide useful guidelines for different stakeholders in research groups from Lattes CVs of members. Originality/value The information included in Lattes CVs permits us to assess science from a broader perspective taking into account not only scientific research production but also the training of human resources and scientific collaboration.","PeriodicalId":92237,"journal":{"name":"Journal of data and information science (Warsaw, Poland)","volume":"5 1","pages":"137 - 146"},"PeriodicalIF":0.0,"publicationDate":"2020-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42912574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novel Approaches to the Development and Application of Informetric and Scientometric Tools","authors":"G. Catalano, C. Daraio, J. Leta, H. Moed, G. Ruocco, Xiaolin Zhang","doi":"10.2478/jdis-2020-0022","DOIUrl":"https://doi.org/10.2478/jdis-2020-0022","url":null,"abstract":"This volume (Vol. 5, No. 3) of the Journal of Data and Information Science (JDIS) is Part I of the Special Issue on ISSI 2019, the 17th International Conference on Scientometrics and Informetrics (ISSI2019) held in Rome, on 2–5 September 2019 and includes the first part of the selected posters presented during the conference and extended by the authors afterward. The goal of ISSI 2019 was to bring together scholars and practitioners in the area of informetrics, bibliometrics, scientometrics, webometrics and altmetrics to discuss new research directions, methods and theories, and to highlight the best research in this area. The 13 selected papers included in this issue relate to the general topic of novel approaches to the development and application of informetric and scientometric tools and have been grouped in four themes:","PeriodicalId":92237,"journal":{"name":"Journal of data and information science (Warsaw, Poland)","volume":"5 1","pages":"1 - 4"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43348632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Micro Perspective of Research Dynamics Through “Citations of Citations” Topic Analysis","authors":"Xiaoli Chen, T. Han","doi":"10.2478/jdis-2020-0034","DOIUrl":"https://doi.org/10.2478/jdis-2020-0034","url":null,"abstract":"Abstract Purpose Research dynamics have long been a research interest. It is a macro perspective tool for discovering temporal research trends of a certain discipline or subject. A micro perspective of research dynamics, however, concerning a single researcher or a highly cited paper in terms of their citations and “citations of citations” (forward chaining) remains unexplored. Design/methodology/approach In this paper, we use a cross-collection topic model to reveal the research dynamics of topic disappearance, topic inheritance, and topic innovation in each generation of forward chaining. Findings For highly cited work, scientific influence exists in indirect citations. Topic modeling can reveal how long this influence exists in forward chaining, as well as its influence. Research limitations This paper measures scientific influence and indirect scientific influence only if the relevant words or phrases are borrowed or used in direct or indirect citations. Paraphrasing or semantically similar concepts may be neglected in this research. Practical implications This paper demonstrates that a scientific influence exists in indirect citations through its analysis of forward chaining. This can serve as an inspiration on how to adequately evaluate research influence. Originality The main contributions of this paper are the following three aspects. First, besides research dynamics of topic inheritance and topic innovation, we model topic disappearance by using a cross-collection topic model. Second, we explore the length and character of the research impact through “citations of citations” content analysis. 
Finally, we analyze the research dynamics of artificial intelligence researcher Geoffrey Hinton's publications and the topic dynamics of forward chaining.","PeriodicalId":92237,"journal":{"name":"Journal of data and information science (Warsaw, Poland)","volume":"5 1","pages":"19 - 34"},"PeriodicalIF":0.0,"publicationDate":"2020-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49228151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classification of Paper Values Based on Citation Rank and PageRank","authors":"W. Souma, I. Vodenska, Lubomir T. Chitkushev","doi":"10.2478/jdis-2020-0031","DOIUrl":"https://doi.org/10.2478/jdis-2020-0031","url":null,"abstract":"Abstract Purpose The number of citations has been widely used to measure the significance of a paper. However, there is a need to introduce another index to determine superiority or inferiority of papers with the same number of citations. We determine superiority or inferiority of papers by using the ranking based on the number of citations and PageRank. Design/methodology/approach We show the positive linear correlation between Citation Rank (the ranking of the number of citations) and PageRank. On this basis, we identify high-quality, prestige, emerging, and popular papers. Findings We found that the high-quality papers belong to the subjects of biochemistry and molecular biology, chemistry, and multidisciplinary sciences. The prestige papers correspond to the subjects of computer science, engineering, and information science. The emerging papers are related to biochemistry and molecular biology, as well as those published in the journal “Cell.” The popular papers belong to the subject of multidisciplinary sciences. Research limitations We analyze the Science Citation Index Expanded (SCIE) from 1981 to 2015 to calculate Citation Rank and PageRank within a citation network consisting of 34,666,719 papers and 591,321,826 citations. Practical implications Our method is applicable to forecast emerging fields of research subjects in science and helps policymakers to consider science policy. 
Originality/value We calculated PageRank for a giant citation network that is far larger than the citation networks investigated by previous researchers.","PeriodicalId":92237,"journal":{"name":"Journal of data and information science (Warsaw, Poland)","volume":"5 1","pages":"57 - 70"},"PeriodicalIF":0.0,"publicationDate":"2020-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42121497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Scientometric Study of Digital Literacy, ICT Literacy, Information Literacy, and Media Literacy","authors":"H. Park, Hansol Kim, H. Park","doi":"10.2478/jdis-2021-0001","DOIUrl":"https://doi.org/10.2478/jdis-2021-0001","url":null,"abstract":"Abstract Purpose Digital literacy and related fields have received interest from scholars and practitioners for more than 20 years; nonetheless, academic communities need to systematically review how the fields have developed. This study aims to investigate the research trends of digital literacy and related concepts since the year of 2000, especially in education. Design/methodology/approach The current study analyzes keywords, co-authorship, and cited publications in digital literacy through the scientometric method. The journal articles have been retrieved from the WoS (Web of Science) using four keywords: “Digital literacy,” “ICT literacy,” “information literacy,” and “media literacy.” Further, keywords, publications, and co-authorship are examined and further classified into clusters for more in-depth investigation. Findings Digital literacy is a multidisciplinary field that widely embraces literacy, ICT, the Internet, computer skill proficiency, science, nursing, health, and language education. The participants, or study subjects, in digital literacy research range from primary students to professionals, and the co-authorship clusters are distinctive by countries in America and Europe. Research limitations This paper analyzes one fixed chunk of a dataset obtained by searching for all four keywords at once. Further studies will retrieve the data from diverse disciplines and will trace the change of the leading research themes by time spans. Practical implications To shed light on the findings, using customized digital literacy curriculums and technology is critical for learners at different ages to nurture digital literacy according to their learning aims. 
They need to cultivate their understanding of the social impact of exploiting technology and computational thinking. To increase the originality of digital literacy-related studies, researchers from different countries and cultures may collaborate to investigate a broader range of digital literacy environments. Originality/value The present study reviews research trends in digital literacy and related areas by performing a scientometric study to analyze multidimensional aspects in the fields, including keywords, journal titles, co-authorship, and cited publications.","PeriodicalId":92237,"journal":{"name":"Journal of data and information science (Warsaw, Poland)","volume":"54 13","pages":"116 - 138"},"PeriodicalIF":0.0,"publicationDate":"2020-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41265068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Discrimination Index Based on Jain's Fairness Index to Differentiate Researchers with Identical H-index Values","authors":"Adian Fatchur Rochim, Abdul Muis, R. F. Sari","doi":"10.2478/jdis-2020-0026","DOIUrl":"https://doi.org/10.2478/jdis-2020-0026","url":null,"abstract":"Abstract Purpose This paper proposes a discrimination index method based on the Jain's fairness index to distinguish researchers with the same H-index. Design/methodology/approach A validity test is used to measure the correlation of D-offset with the parameters, i.e. H-index, the number of cited papers, the total number of citations, the number of indexed papers, and the number of uncited papers. The correlation test is based on the Shapiro-Wilk method and Pearson's product-moment correlation. Findings The result from the discrimination index calculation is a two-digit decimal value called the discrimination-offset (D-offset), with a range of D-offset from 0.00 to 0.99. The result of the correlation value between the D-offset and the number of uncited papers is 0.35, D-offset with the number of indexed papers is 0.24, and the number of cited papers is 0.27. The test provides the result that it is very unlikely that there exists no relationship between the parameters. Practical implications For this reason, D-offset is proposed as an additional parameter for H-index to differentiate researchers with the same H-index. The H-index for researchers can be written with the format of “H-index: D-offset”. Originality/value D-offset is worthy to be considered as a complement value to add the H-index value. 
If the D-offset is added in the H-index value, the H-index will have more discrimination power to differentiate the rank of the researchers who have the same H-index.","PeriodicalId":92237,"journal":{"name":"Journal of data and information science (Warsaw, Poland)","volume":"5 1","pages":"5 - 18"},"PeriodicalIF":0.0,"publicationDate":"2020-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46340395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Method for Resolving and Completing Authors’ Country Affiliation Data in Bibliographic Records","authors":"B. Nguyen, J. Dinneen, Markus Luczak-Rösch","doi":"10.2478/jdis-2020-0020","DOIUrl":"https://doi.org/10.2478/jdis-2020-0020","url":null,"abstract":"Abstract Purpose Our work seeks to overcome data quality issues related to incomplete author affiliation data in bibliographic records in order to support accurate and reliable measurement of international research collaboration (IRC). Design/methodology/approach We propose, implement, and evaluate a method that leverages the Web-based knowledge graph Wikidata to resolve publication affiliation data to particular countries. The method is tested with general and domain-specific data sets. Findings Our evaluation covers the magnitude of improvement, accuracy, and consistency. Results suggest the method is beneficial, reliable, and consistent, and thus a viable and improved approach to measuring IRC. Research limitations Though our evaluation suggests the method works with both general and domain-specific bibliographic data sets, it may perform differently with data sets not tested here. Further limitations stem from the use of the R programming language and R libraries for country identification as well as imbalanced data coverage and quality in Wikidata that may also change over time. Practical implications The new method helps to increase the accuracy in IRC studies and provides a basis for further development into a general tool that enriches bibliographic data using the Wikidata knowledge graph. 
Originality This is the first attempt to enrich bibliographic data using a peer-produced, Web-based knowledge graph like Wikidata.","PeriodicalId":92237,"journal":{"name":"Journal of data and information science (Warsaw, Poland)","volume":"5 1","pages":"115 - 97"},"PeriodicalIF":0.0,"publicationDate":"2020-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43735062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}