International Journal on Digital Libraries: Latest Articles

Methods for generation, recommendation, exploration and analysis of scholarly publications

IF 1.5

International Journal on Digital Libraries Pub Date : 2024-09-03 DOI: 10.1007/s00799-024-00409-1

Gianmaria Silvello, Oscar Corcho, Paolo Manghi

Citations: 0

Comparing free reference extraction pipelines

IF 1.5

International Journal on Digital Libraries Pub Date : 2024-06-20 DOI: 10.1007/s00799-024-00404-6

Tobias Backes, Anastasiia Iurshina, Muhammad Ahsan Shahid, Philipp Mayr

Citations: 0

Digital detection of play characters’ relationships in Shakespeare’s plays: extended cross-correlation analysis of the character appearance frequencies

IF 1.5

International Journal on Digital Libraries Pub Date : 2024-05-27 DOI: 10.1007/s00799-024-00401-9

Miyuki Yamada, Yuichi Murai, Ichiro Kumagai

Citations: 0

Book recommendation system: reviewing different techniques and approaches

IF 1.5

International Journal on Digital Libraries Pub Date : 2024-05-14 DOI: 10.1007/s00799-024-00403-7

P. Devika, A. Milton

Abstract: E-reading has become more popular, increasing the number of book readers. With online book reading websites, it is much simpler to read any book at any time by simply typing its name into a search engine. These websites offer a free reading platform to users with an unlimited number of choices without exceeding any rights. However, statistics reveal that reading is dwindling, particularly among young people. In this survey, we present several existing approaches employed to design book recommendation systems from 2012 to 2023. Different types of datasets used to extract information about books and users are discussed in terms of features, source, and usage. Six different categories of book recommendation techniques are recognized and discussed, which build the groundwork for future study in this area. The issues related to book recommendation systems are also briefly discussed. We discuss the performance analysis of various research works on book recommendation systems. We also highlight the research concerns and future scope for improving the performance of book recommender systems. We hope these findings will help researchers to explore book recommender systems further.

Citations: 0

Structured abstract generator (SAG) model: analysis of IMRAD structure of articles and its effect on extractive summarization

IF 1.5

International Journal on Digital Libraries Pub Date : 2024-05-07 DOI: 10.1007/s00799-024-00402-8

Ayşe Esra Özkan Çelik, Umut Al

Abstract: An abstract is the most crucial element that may convince readers to read the complete text of a scientific publication. However, studies show that in terms of organization, readability, and style, abstracts are also among the most troublesome parts of the pertinent manuscript. The ultimate goal of this article is to produce better understandable abstracts with automatic methods that will contribute to scientific communication in Turkish. We propose a summarization system based on extractive techniques combining general features that have been shown to be beneficial for Turkish. To construct the data set for this aim, a sample of 421 peer-reviewed Turkish articles in the field of librarianship and information science was developed. First, the structure of the full-texts, and their readability in comparison with author abstracts, were examined for text quality evaluation. A content-based evaluation of the system outputs was then carried out. System outputs, in cases of using and ignoring structural features of full-texts, were compared. Structured outputs outperformed classical outputs in terms of content and text quality. Each output group has better readability levels than their original abstracts. Additionally, it was discovered that higher-quality outputs are correlated with more structured full-texts, highlighting the importance of structural writing. Finally, it was determined that our system can facilitate the scholarly communication process as an auxiliary tool for authors and editors. Findings also indicate the significance of structural writing for better scholarly communication.

Citations: 0

Building datasets to support information extraction and structure parsing from electronic theses and dissertations

IF 1.5

International Journal on Digital Libraries Pub Date : 2024-05-03 DOI: 10.1007/s00799-024-00395-4

William A. Ingram, Jian Wu, Sampanna Yashwant Kahu, Javaid Akbar Manzoor, Bipasha Banerjee, Aman Ahuja, Muntabir Hasan Choudhury, Lamia Salsabil, Winston Shields, Edward A. Fox

Abstract: Despite the millions of electronic theses and dissertations (ETDs) publicly available online, digital library services for ETDs have not evolved past simple search and browse at the metadata level. We need better digital library services that allow users to discover and explore the content buried in these long documents. Recent advances in machine learning have shown promising results for decomposing documents into their constituent parts, but these models and techniques require data for training and evaluation. In this article, we present high-quality datasets to train, evaluate, and compare machine learning methods in tasks that are specifically suited to identify and extract key elements of ETD documents. We explain how we construct the datasets by manually labeling the data or by deriving labeled data through synthetic processes. We demonstrate how our datasets can be used to develop downstream applications and to evaluate, retrain, or fine-tune pre-trained machine learning models. We describe our ongoing work to compile benchmark datasets and exploit machine learning techniques to build intelligent digital libraries for ETDs.

Citations: 0

Robots still outnumber humans in web archives in 2019, but less than in 2015 and 2012

IF 1.5

International Journal on Digital Libraries Pub Date : 2024-03-07 DOI: 10.1007/s00799-024-00397-2

Himarsha R. Jayanetti, Kritika Garg, Sawood Alam, Michael L. Nelson, Michele C. Weigle

Abstract: The significance of the web and the crucial role of web archives in its preservation highlight the necessity of understanding how users, both human and robot, access web archive content, and how best to satisfy the disparate needs of both types of users. To identify robots and humans in web archives and analyze their respective access patterns, we used the Internet Archive’s (IA) Wayback Machine access logs from 2012, 2015, and 2019, as well as Arquivo.pt’s (Portuguese Web Archive) access logs from 2019. We identified user sessions in the access logs and classified those sessions as human or robot based on their browsing behavior. To better understand how users navigate through the web archives, we evaluated these sessions to discover user access patterns. Based on the two archives and between the three years of IA access logs (2012 vs. 2015 vs. 2019), we present a comparison of detected robots vs. humans and their user access patterns and temporal preferences. The total number of robots detected in IA 2012 (91% of requests) and IA 2015 (88% of requests) is greater than in IA 2019 (70% of requests). Robots account for 98% of requests in Arquivo.pt (2019). We found that the robots are almost entirely limited to “Dip” and “Skim” access patterns in IA 2012 and 2015, but exhibit all the patterns and their combinations in IA 2019. Both humans and robots show a preference for web pages archived in the near past.

Citations: 0

Stance prediction with a relevance attribute to political issues in comparing the opinions of citizens and city councilors

IF 1.5

International Journal on Digital Libraries Pub Date : 2024-02-26 DOI: 10.1007/s00799-024-00396-3

Ko Senoo, Yohei Seki, Wakako Kashino, Atsushi Keyaki, Noriko Kando

Abstract: This study focuses on a method for differentiating between the stance of citizens and city councilors on political issues (i.e., in favor or against) and attempts to compare the arguments of both sides. We created a dataset by annotating citizen tweets and city council minutes with labels for four attributes: stance, usefulness, regional dependence, and relevance. We then fine-tuned a pretrained large language model using this dataset to assign the attribute labels to a large quantity of unlabeled data automatically. We introduced multitask learning to train each attribute jointly with relevance to identify the clues by focusing on those sentences that were relevant to the political issues. Our prediction models are based on T5, a large language model suitable for multitask learning. We compared the results from our system with those that used BERT or RoBERTa. Our experimental results showed that the macro-F1-scores for stance were improved by 1.8% for citizen tweets and 1.7% for city council minutes with multitask learning. Using the fine-tuned model to analyze real opinion gaps, we found that although the vaccination regime was positively evaluated by city councilors in Fukuoka city, it was not rated very highly by citizens.

Citations: 0

Towards privacy-aware exploration of archived personal emails

IF 1.5

International Journal on Digital Libraries Pub Date : 2024-02-21 DOI: 10.1007/s00799-024-00394-5

Zoe Bartliff, Yunhyong Kim, Frank Hopfgartner

Abstract: This paper examines how privacy measures, such as anonymisation and aggregation processes for email collections, can affect the perceived usefulness of email visualisations for research, especially in the humanities and social sciences. The work is intended to inform archivists and data managers who are faced with the challenge of accessioning and reviewing increasingly sizeable and complex personal digital collections. The research in this paper provides a focused user study to investigate the usefulness of data visualisation as a mediator between privacy-aware management of data and maximisation of research value of data. The research is carried out with researchers and archivists with vested interest in using, making sense of, and/or archiving the data to derive meaningful results. Participants tend to perceive email visualisations as useful, with an average rating of 4.281 (out of 7) for all the visualisations in the study, with above average ratings for mountain graphs and word trees. The study shows that while participants voice a strong desire for information identifying individuals in email data, they perceive visualisations as almost equally useful for their research and/or work when aggregation is employed in addition to anonymisation.

Citations: 0

Exploiting the untapped functional potential of Memento aggregators beyond aggregation

IF 1.5

International Journal on Digital Libraries Pub Date : 2024-01-27 DOI: 10.1007/s00799-023-00391-0

Mat Kelly

Abstract: Web archives capture, retain, and present historical versions of web pages. Viewing web archives often amounts to a user visiting the Wayback Machine homepage, typing in a URL, then choosing a date and time significant of the capture. Other web archives also capture the web and use Memento as an interoperable point of querying their captures. Memento aggregators are web accessible software packages that allow clients to send requests for past web pages to a single endpoint source that then relays that request to a set of web archives. Though few deployed aggregator instances exist that exhibit this aggregation trait, they all, for the most part, align to a model of serving a request for a URI of an original resource (URI-R) to a client by first querying then aggregating the results of the responses from a collection of web archives. This single tier querying need not be the logical flow of an aggregator, so long as a user can still utilize the aggregator from a single URL. In this paper, we discuss theoretical aggregation models of web archives. We first describe the status quo as the conventional behavior exhibited by an aggregator. We then build on prior work to describe a multi-tiered, structured querying model that may be exhibited by an aggregator. We highlight some potential issues and high-level optimization to ensure efficient aggregation while also extending the state of the art of Memento aggregation. Part of our contribution is the extension of an open-source, user-deployable Memento aggregator to exhibit the capability described in this paper. We also extend a browser extension that typically consults an aggregator to have the ability to aggregate itself rather than needing to consult an external service. A purely client-side, browser-based Memento aggregator is novel to this work.

Citations: 0