Completeness degree of publication metadata in eight free-access scholarly databases
Lorena Delgado-Quirós, José Luis Ortega
DOI: https://doi.org/10.1162/qss_a_00286
Publication date: 2024-02-08
Citations: 0
Abstract
The main objective of this study is to compare the amount of metadata and the degree of completeness of research publication records in new academic databases. Using a quantitative approach, we selected a random Crossref sample of more than 115k records, which was then searched in seven databases (Dimensions, Google Scholar, Microsoft Academic, OpenAlex, Scilit, Semantic Scholar, and The Lens). Seven characteristics were analyzed (abstract, access, bibliographic info, document type, publication date, language, and identifiers) to observe the fields that describe this information, the completeness rate of these fields, and the agreement among databases. The results show that academic search engines (Google Scholar, Microsoft Academic, and Semantic Scholar) gather less information and have a low degree of completeness. Conversely, third-party databases (Dimensions, OpenAlex, Scilit, and The Lens) have higher metadata quality and a higher completeness rate. We conclude that academic search engines lack the ability to retrieve reliable descriptive data by crawling the Web, while the main problem of third-party databases is the loss of information that results from integrating different sources.
https://www.webofscience.com/api/gateway/wos/peer-review/10.1162/qss_a_00286
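As a rough illustration of the "completeness rate" the abstract refers to, the sketch below computes, for each metadata field, the share of records in which that field is filled. It is not the authors' procedure: the function names (`is_filled`, `completeness_rate`), the rule for what counts as "missing", and the sample records are all hypothetical assumptions made for this example.

```python
from typing import Any, Dict, Iterable, List


def is_filled(value: Any) -> bool:
    """Assumed rule: None, blank strings, and empty containers count as missing."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    if isinstance(value, (list, tuple, dict, set)):
        return len(value) > 0
    # Numbers and booleans (including 0 and False) count as filled.
    return True


def completeness_rate(
    records: Iterable[Dict[str, Any]], fields: List[str]
) -> Dict[str, float]:
    """Return, per field, the fraction of records where the field is filled."""
    records = list(records)
    if not records:
        return {field: 0.0 for field in fields}
    return {
        field: sum(is_filled(r.get(field)) for r in records) / len(records)
        for field in fields
    }


# Hypothetical records, invented for illustration only.
sample = [
    {"abstract": "Some text", "language": "en", "doi": "10.1000/a"},
    {"abstract": "", "language": None, "doi": "10.1000/b"},
    {"abstract": "More text", "doi": "10.1000/c"},
    {"abstract": None, "language": "es", "doi": "10.1000/d"},
]

rates = completeness_rate(sample, ["abstract", "language", "doi"])
for field, rate in rates.items():
    print(f"{field}: {rate:.0%}")
```

Running the same function over records retrieved from several databases, one call per database, would give per-field rates that can be compared side by side; how each database marks a missing value is an assumption that would need checking against its actual export format.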
About the journal:
ACS Applied Energy Materials is an interdisciplinary journal publishing original research covering all aspects of materials, engineering, chemistry, physics and biology relevant to energy conversion and storage. The journal is devoted to reports of new and original experimental and theoretical research of an applied nature that integrate knowledge in the areas of materials, engineering, physics, bioscience, and chemistry into important energy applications.