{"title":"Human-AI interaction research agenda: A user-centered perspective","authors":"Tingting Jiang , Zhumo Sun , Shiting Fu , Yan Lv","doi":"10.1016/j.dim.2024.100078","DOIUrl":"10.1016/j.dim.2024.100078","url":null,"abstract":"<div><div>The rapid growth of artificial intelligence (AI) has given rise to the field of Human-AI Interaction (HAII). This study meticulously reviewed the research themes, theoretical foundations, and methodological frameworks of the HAII field, aiming to construct a comprehensive overview of this field and provide robust support for future investigations. HAII research themes include human-AI collaboration, competition, conflict, and symbiosis. Theories drawn from communication, psychology, and sociology support these studies, while the employed methods include both self-reporting and observational approaches commonly utilized in user studies. It is suggested that future research should broaden its focus to encompass diverse user groups, AI roles, and tasks. Moreover, it is necessary to develop multi-disciplinary theories and integrate multi-level research methods to support the sustained development of the field. This study not only furnishes indispensable theoretical and practical insights for forthcoming research endeavors but also catalyzes the realization of a future distinguished by seamless interaction between humans and AI.</div></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"8 4","pages":"Article 100078"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141691159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Patterns in paradata preferences among the makers and reusers of archaeological data","authors":"Isto Huvila, Lisa Andersson, Olle Sköld","doi":"10.1016/j.dim.2024.100077","DOIUrl":"10.1016/j.dim.2024.100077","url":null,"abstract":"<div><div>Knowledge of data reusers' and makers' preferences for data that describe processes and practices (paradata) remains limited, especially concerning broader patterns of such priorities. The aim of this study is to address this gap. Drawing on an exploratory factor analysis of a survey of makers and users of archaeological data, the study investigates 1) what patterns related to types of informational content can be identified in data makers' and users’ views of the usefulness of specific types of paradata, 2) how the patterns differ between data makers and users, and 3) how the patterns can be explained in terms of information needs and preferences. The findings show that paradata preferences are patterned and that data makers' and data users' ideas of what is useful differ. However, the differences are limited to details that make data-related processes and practices understandable rather than extending to the broader patterns of what types of information are needed. We identified five broad categories of uses for paradata (Data collection procedures and tools, Data in context, Standards and guidelines, Credentials, Data processing), and corresponding, applicable types of paradata. The findings also point to indicative possibilities of linking paradata preferences to orientational, contextualising and content-oriented data practices. From a practical perspective, this study underlines the importance of approaching paradata not as a monolith but rather as an arrangement that is structured by different understandings of (para)data and how it is acted upon. Instead of caring for paradata in general, it is crucial to engage with specific types of paradata for different data practices. 
Keywords: paradata, archaeology, data management, data reuse, research data management.</div></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"8 4","pages":"Article 100077"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141714890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impact factor does not predict long-term article impact across 15 journals","authors":"Bjarne Bartlett , Carter M. Zamora , Jon-Paul Bingham , Amy S Ebesu Hubbard , Michelle Tseng , Bryan Runck , Michael Kantar","doi":"10.1016/j.dim.2024.100079","DOIUrl":"10.1016/j.dim.2024.100079","url":null,"abstract":"<div><div>Academic journals are ranked using a variety of methods with the most common metric being ‘journal impact factor’. Authors who publish in journals with higher impact factors are deemed to contribute more to their discipline. However, the impact factor of a journal does not indicate how long a specific article stays in the scientific discourse, and metrics that measure the length of time articles within a journal continue to be cited are not typically used. We examined citations of 443,732 research articles [786,064 total] between 1980 and 2020 across 15 journals. We explored the range of longevity values found across different journals as well as the relationship between impact factor and longevity. We found no relationship between impact factor and longevity, indicating that immediate attention to an article is not correlated with longer-term impact. In the set of journals that we examined, articles published in some journals (e.g., Ecology, Genetics) continued to be cited at a steady rate long beyond their initial publication date. 
This slow but steady citation accumulation resulted in the total citations in these journals approaching those of higher impact journals (e.g., Science, Nature) within the length of a typical academic career (30–40 years).</div></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"8 4","pages":"Article 100079"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143146513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How readers attentive and inattentive to task-related information scan: An eye-tracking study","authors":"Jing Chen , Lu Zhang , Quan Lu","doi":"10.1016/j.dim.2024.100073","DOIUrl":"10.1016/j.dim.2024.100073","url":null,"abstract":"<div><h3>Purpose</h3><div>Attending to task-related information should be the highest priority for all readers engaged in task-reading. Investigating the scanning behaviors of attentive versus inattentive readers can shed new light on their sequential cognitive processes, but this has seldom been studied. This study investigates their global patterns on a global scale, pertaining to the whole length of scanpaths, and further compares local tactics, local strategies, and local strategy transitions on a local scale, related to the isolated regions of scanpaths.</div></div><div><h3>Design/methodology/approach</h3><div>A regular style reading system with the question, navigating, and text areas on its interface, and two types of task, namely fact-finding (FF) and content understanding (CU), were designed in an eye-tracking experiment. Twenty-four participants were placed into attentive (AR) or inattentive (IAR) reader groups according to their fixation duration on task-related paragraphs. A global sequence analysis algorithm, Needleman-Wunsch, was applied to uncover global patterns across the whole length of scanpaths (whole-scanpaths). A local sequence analysis method related to frequent sub-scanpaths was adopted to extract local tactics specific to the reader and task. Coding was performed to identify local strategies by classifying local tactics. A local strategy transition was further identified as a sequence of frequent local strategies at the beginning, middle, and ending phases.</div></div><div><h3>Findings</h3><div>Whole-scanpaths of AR significantly differed from those of IAR, despite the absence of global patterns for each group. 
Five types of local strategy were identified, namely <em>locating information (LI)</em>, <em>evaluating and verifying text relevance (EVR)</em>, <em>navigation heuristics (NH)</em>, <em>synthesizing information (SI)</em>, and <em>contextual clues (CC)</em>. AR applied all types in both tasks, whereas IAR applied only two types and stuck with <em>EVR</em>. Furthermore, two types of local strategy transition were identified: <em>comprehensive exploration</em> and <em>iterative content evaluation</em>. AR employed the former with the linear feature in FF and the spiral feature in CU, while IAR employed the latter in both tasks.</div></div><div><h3>Originality</h3><div>This study advances the knowledge of dynamic cognitive processing from the perspective of readers' attentiveness to task-related information. It provides an objective analysis perspective for obtaining global patterns, local tactics, local strategies, and local strategy transitions, which can offer new insights into automatically classifying readers. The results also generate detailed and valuable guidance for improving reading system design and training readers.</div></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"8 4","pages":"Article 100073"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141136574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empirical insights into the interaction effects of groups at high risk of depression on online social platforms with NLP-based sentiment analysis","authors":"Yi Xiao , Yutong Yang , Haozhe Xu , Shijuan Li","doi":"10.1016/j.dim.2024.100080","DOIUrl":"10.1016/j.dim.2024.100080","url":null,"abstract":"<div><div>With the proliferation of digital technology and the increasing prevalence of social media, some users at high risk of depression have opted to seek solace, acceptance, and assistance in online communities. However, the extant research is deficient in terms of the segmentation of groups, particularly subcultural groups. By analyzing the “Super Hashtags” and “Tree Hole” groups on Sina Weibo from January to March 2023 using a crawler and the ERNIE 3.0-Base model for sentiment analysis, the study uncovers distinct sentiment profiles and interaction patterns, revealing significant correlations between interaction metrics and sentiment levels. The findings indicate that while there are no significant differences in sentiment levels between the two communities, the “Tree Hole” community exhibits greater sentiment variability. Moreover, the study identifies that interaction behaviors are closely linked to sentiment states, emphasizing the importance of understanding the complex dynamics between online interactions and mental well-being. 
These insights contribute to the development of more effective support mechanisms within online platforms for individuals at risk of depression.</div></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"8 4","pages":"Article 100080"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143146512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Erratum regarding missing Declaration of Competing Interest statements in previously published articles (Volume 6, Issues 1–4)","authors":"","doi":"10.1016/j.dim.2024.100085","DOIUrl":"10.1016/j.dim.2024.100085","url":null,"abstract":"","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"8 4","pages":"Article 100085"},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142328312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Responsibility toward society: A review and prospect of Savolainen's everyday information practice","authors":"","doi":"10.1016/j.dim.2024.100070","DOIUrl":"10.1016/j.dim.2024.100070","url":null,"abstract":"<div><p>The emphasis on social phenomena that defines the Everyday Information Practice (EIP) domain sets it apart from information behavior fields. This study highlights the importance of researching everyday information practices in contemporary social-cultural contexts by using Savolainen's EIP-related models as examples. A synopsis of the characteristics of earlier studies in terms of research contexts, participants, research questions, and research methods was created by evaluating the pertinent studies using EIP-related models. A trend of social responsibility-focused EIP research was presented, along with recommendations for future research in the field of EIP from the perspectives of participants and research methods.</p></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"8 3","pages":"Article 100070"},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2543925124000068/pdfft?md5=5a4d2516d88e2f08572ccfc67abb9576&pid=1-s2.0-S2543925124000068-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140792044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Does internet use affect public risk perception? — From the perspective of political participation","authors":"","doi":"10.1016/j.dim.2023.100059","DOIUrl":"10.1016/j.dim.2023.100059","url":null,"abstract":"<div><p>Internet use has resulted in the flow and interweaving of risks and increased the difficulty of risk governance. Strengthening public risk perception research can not only make up for the shortcomings of traditional government-centered risk governance research but also improve the ability of risk governance. By employing data from Chinese Social Survey (CSS) and the mediating test with the process plug-in in SPSS, this paper tries to explore the influence mechanism of Internet use on public risk perception, as well as the mediating effect of different types of political participation. The results show that Internet use has a significantly positive impact on comprehensive public risk perception. Network political participation has significantly enhanced the public risk perception, while traditional political participation has significantly reduced the public risk perception. 
Besides, network political participation plays a mediating role in the relationship between Internet use and public risk perception.</p></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"8 3","pages":"Article 100059"},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2543925123000335/pdfft?md5=963a1f5301e23f6905ef0c6d6fe962ed&pid=1-s2.0-S2543925123000335-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139306100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive K-means clustering based under-sampling methods to solve the class imbalance problem","authors":"","doi":"10.1016/j.dim.2023.100064","DOIUrl":"10.1016/j.dim.2023.100064","url":null,"abstract":"<div><p>In the field of machine learning, class imbalance is a common problem. It refers to an imbalance in the quantity of data collected, where one class has significantly more data than another class, which can negatively affect the classification efficiency of algorithms. Under-sampling methods address class imbalance by reducing the quantity of data in the majority class, thereby achieving a balanced dataset and mitigating the class imbalance problem. Traditional under-sampling methods based on k-means clustering either set a unified value of <em>k</em> (number of clusters) or determine it directly based on the quantity of data in the minority or majority class. This paper proposes an adaptive k-means clustering under-sampling algorithm that calculates an appropriate <em>k</em> for each dataset. After clustering the majority class dataset into <em>k</em> clusters, our algorithm calculates the distances between the data within each cluster and the cluster centroids from two perspectives and selects data based on these distances. Subsequently, the selected subset of the majority class dataset is combined with the minority class dataset to generate a new balanced dataset, which is then used for classification algorithms. The performance of our algorithm is evaluated on 45 datasets. Experimental results demonstrate that our algorithm can dynamically determine an appropriate <em>k</em> for different datasets and output a balanced dataset, thus enhancing the classification efficiency of machine learning algorithms. 
This work can provide new algorithmic ensemble strategies for addressing the class imbalance problem.</p></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"8 3","pages":"Article 100064"},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2543925123000384/pdfft?md5=25a3920a1a4e803650366fa56c8a9827&pid=1-s2.0-S2543925123000384-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139189188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved detection of transient events in wide area sky survey using convolutional neural networks","authors":"","doi":"10.1016/j.dim.2023.100035","DOIUrl":"10.1016/j.dim.2023.100035","url":null,"abstract":"<div><p>The aim of data science is to keep pace with the data-intensive lifestyle as well as the demand for decision support, which has become common in various domains such as medicine, education and other smart solutions. As such, high-quality data analysis is greatly desired for accurate and effective downstream exploitation. This is also true for astronomical surveys like GOTO (Gravitational-wave Optical Transient Observer), where a large amount of raw data is collected daily. GOTO is one of the recognised projects that search for transient events with the new breed of optical survey telescopes that can survey the sky faster and deeper. This is accomplished by comparing the night-specific data with the reference such that new bright sources are obtained for further study. However, the huge size of the data makes it difficult to sift by eye, thus requiring an automated system. Yet, many conventional machine-learning models have been sub-optimal for this task, as true positives can hardly be recognised due to the imbalanced nature of the data. This motivates the exploration of convolutional neural networks (CNNs) for this binary classification problem. Based on existing technologies, the paper reports the original application of a basic CNN model to a representative dataset, which has been designed and generated within the GOTO project. 
In addition to the improvement over those previous works, this empirical study also includes details of parameter analysis, which will be useful for practice and further investigation.</p></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"8 3","pages":"Article 100035"},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2543925123000098/pdfft?md5=2a55597016759c169b3af4100cbcbbb7&pid=1-s2.0-S2543925123000098-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44409897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}