{"title":"Information science: Why it is not data science","authors":"Michael Seadle , Stefanie Havelka","doi":"10.1016/j.dim.2023.100027","DOIUrl":"https://doi.org/10.1016/j.dim.2023.100027","url":null,"abstract":"","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"7 1","pages":"100027"},"PeriodicalIF":0.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49766884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A review on method framework construction of Chinese Information Science","authors":"Bowen Li , Liang Tian , Yingyi Zhang , Heng Zhang , Chengzhi Zhang","doi":"10.1016/j.dim.2022.100023","DOIUrl":"10.1016/j.dim.2022.100023","url":null,"abstract":"<div><p>As the unique academic culture in Chinese philosophy and social sciences, researches on method framework provide an opportunity for understanding the thinking model and value orientation of the ancient eastern civilization. The field of information science has achieved fruitful results in the method framework research closely related to the unique discipline history and academic mission. The paper reviews information science method frameworks in China and presents their academic features from three aspects: 1. levels of the framework, 2. research strategies, and 3. essential techniques. At the same time, we summarize the value of this Chinese academic wisdom and the practical experience of information science in China to promote this research.</p></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"6 4","pages":"Article 100023"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2543925122001218/pdfft?md5=1fee3c0d19e079d9d009c3cbdcf4cd80&pid=1-s2.0-S2543925122001218-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85556175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Collaborative Filtering system based on multi-level user clustering and aspect sentiment","authors":"Samin Poudel, Marwan Bikdash","doi":"10.1016/j.dim.2022.100021","DOIUrl":"10.1016/j.dim.2022.100021","url":null,"abstract":"<div><p>A Collaborative Filtering (CF) method predicts an unknown overall rating of a target user towards an item based on the known overall ratings of the users that are similar to the target user. The similarity between two users is generally found based on their overall ratings toward items that both have reviewed. Two users may have similar overall ratings towards a given item, but different sentiments towards various aspects of the item. Understanding the effect of user sentiment towards specific aspects on overall ratings will sharpen estimates of user similarity as well as provide a rationale for making specific recommendations. We propose an Aspect-Sentiments based Multi-level Clustering of Users (ASMCU) approach that finds the multiple clusters of users similar to a specific user where similarity between users is based on various aspect sentiments. The proposed ASMCU CF approach can be used to predict both the overall ratings and the aspect-sentiments. The ASMCU based CF approach performed mostly better than and sometimes comparable to the eight well-established CF methods that rely only on the overall ratings or a particular aspect-sentiment. Note however that the ASMCU can also explicitly justify the recommendation in terms of aspect sentiments. We evaluated our approach using three datasets: one Hotel dataset and two Beer datasets. The Hotel dataset involved six aspects and each Beer dataset has four aspects. Each dataset has one overall rating matrix and one sentiment tensor.</p></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"6 4","pages":"Article 100021"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S254392512200119X/pdfft?md5=ec69cff03eac43d1ad77febf5cf90c67&pid=1-s2.0-S254392512200119X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79950741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The illusion of data validity: Why numbers about people are likely wrong","authors":"Bernard J. Jansen , Joni Salminen , Soon-gyo Jung , Hind Almerekhi","doi":"10.1016/j.dim.2022.100020","DOIUrl":"10.1016/j.dim.2022.100020","url":null,"abstract":"<div><p>This reflection article addresses a difficulty faced by scholars and practitioners working with numbers about people, which is that <em>those who study people want numerical data about these people. Unfortunately, time and time again, this numerical data about people is wrong.</em> Addressing the potential causes of this wrongness, we present examples of analyzing people numbers, i.e., numbers derived from digital data by or about people, and discuss the comforting illusion of data validity. We first lay a foundation by highlighting potential inaccuracies in collecting people data, such as selection bias. Then, we discuss inaccuracies in analyzing people data, such as the flaw of averages, followed by a discussion of errors that are made when trying to make sense of people data through techniques such as posterior labeling. Finally, we discuss a root cause of people data often being wrong – the conceptual conundrum of thinking the numbers are <em>counts</em> when they are actually <em>measures</em>. Practical solutions to address this illusion of data validity are proposed. The implications for theories derived from people data are also highlighted, namely that these people theories are generally wrong as they are often derived from people numbers that are wrong.</p></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"6 4","pages":"Article 100020"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2543925122001188/pdfft?md5=ea40bf274d8b7c53f1cf69e1a4a2e214&pid=1-s2.0-S2543925122001188-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82717217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Twenty important theories and applications of empirical research on IS","authors":"Chuanhui Wu , Shijing Huang","doi":"10.1016/j.dim.2022.100003","DOIUrl":"10.1016/j.dim.2022.100003","url":null,"abstract":"","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"6 4","pages":"Article 100003"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2543925122001012/pdfft?md5=46a5a60b82b3aacd8a6ee305453be2c0&pid=1-s2.0-S2543925122001012-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81758474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PROVOKE: Toxicity trigger detection in conversations from the top 100 subreddits","authors":"Hind Almerekhi , Haewoon Kwak , Joni Salminen , Bernard J. Jansen","doi":"10.1016/j.dim.2022.100019","DOIUrl":"10.1016/j.dim.2022.100019","url":null,"abstract":"<div><p>Promoting healthy discourse on community-based online platforms like Reddit can be challenging, especially when conversations show ominous signs of toxicity. Therefore, in this study, we find the turning points (i.e., toxicity triggers) making conversations toxic. Before finding toxicity triggers, we built and evaluated various machine learning models to detect toxicity from Reddit comments.</p><p>Subsequently, we used our best-performing model, a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model that achieved an area under the receiver operating characteristic curve (AUC) score of 0.983 to detect toxicity. Next, we constructed conversation threads and used the toxicity prediction results to build a training set for detecting toxicity triggers. This procedure entailed using our large-scale dataset to refine toxicity triggers' definition and build a trigger detection dataset using 991,806 conversation threads from the top 100 communities on Reddit. Then, we extracted a set of sentiment shift, topical shift, and context-based features from the trigger detection dataset, using them to build a dual embedding biLSTM neural network that achieved an AUC score of 0.789. Our trigger detection dataset analysis showed that specific triggering keywords are common across all communities, like ‘racist’ and ‘women’. In contrast, other triggering keywords are specific to certain communities, like ‘overwatch’ in r/Games. Implications are that toxicity trigger detection algorithms can leverage generic approaches but must also tailor detections to specific communities.</p></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"6 4","pages":"Article 100019"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2543925122001176/pdfft?md5=a441c48620ba8685678f44afdb856b82&pid=1-s2.0-S2543925122001176-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85301907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of interoperability, security and usability of digital repositories in Kenyan Institutions of Higher Learning","authors":"Johnson Mulongo Masinde , Otuoma Sanya","doi":"10.1016/j.dim.2022.100011","DOIUrl":"10.1016/j.dim.2022.100011","url":null,"abstract":"<div><p>Kenya has experienced a significant growth in the number of institutional repositories in the recent past. The number grew from a paltry two (2) in 2009 to 42 in August 2020. The growth is a positive indicator as repositories play a crucial role in solving some of the problems experienced in the broader area of scholarly communication. This study sought to establish the current extent to which institutions of higher learning in Kenya have established and implemented digital repositories, from a technical perspective. To achieve this goal, the study undertook a technical analysis of institutional repositories implemented by accredited universities in Kenya by the Commission for University Education as at June 2020. The analysis focused on numerous metrics on interoperability, security and usability of the analyzed institutional repositories. The study employed an exploratory approach to collecting the data. The data collected was stored on a MySQL database using the phpMyAdmin tool. Data analysis was done by SQL querying and the result set copied to MS Excel for generation of graphical visualizations. From a total of 49 institutions examined, 34 (69%) had institutional repositories while 15 (31%) did not have institutional repositories. All the 34 institutions with repositories were using DSpace software. Of all the metrics analyzed, the study established that most of the institutional repositories did not implement essential features that improve interoperability, security and usability of their repository platforms. The study recommends either further training for repository managers or outsourcing of the technical process of establishing and maintaining functional institutional repositories. We further recommend more comprehensive studies to cover all the aspects of the FAIR principles of data management in Kenya.</p></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"6 4","pages":"Article 100011"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2543925122001097/pdfft?md5=fd66351c4bd813fb9395e180deb5f70d&pid=1-s2.0-S2543925122001097-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88986941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gain-framed product descriptions are more appealing to elderly consumers in live streaming E-commerce: Implications from a controlled experiment","authors":"Zhumo Sun , Shiting Fu , Tingting Jiang","doi":"10.1016/j.dim.2022.100022","DOIUrl":"10.1016/j.dim.2022.100022","url":null,"abstract":"<div><p>Live streaming e-commerce has become increasingly popular among elderly consumers. This new form of online shopping allows the elderly, who might be less effective in making purchase decisions than younger people, to better understand the products sold through the comprehensive descriptions provided by the anchors. This study is interested in investigating the effects of the gain-loss framing of product descriptions on the elderly's purchase intention. A total of 36 participants between the ages of 60 and 70 were invited to watch a number of live streaming videos involving either gain- or loss-framed product descriptions in a controlled experiment. The results show that the gain-framed descriptions of the products engendered significantly higher purchase intention among the participants than the loss-framed ones. In particular, the gain-framed descriptions were effective for the participants with high approach motivation, but not for those with low approach motivation, which suggests the significant moderating effect of approach motivation. This study focused on the elderly customers whose life quality can be greatly enhanced by live streaming e-commerce. The findings not only add to the knowledge about the effects of message framing, but also provide useful implications for live-streaming e-commerce practitioners to increase the persuasiveness of their product descriptions.</p></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"6 4","pages":"Article 100022"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2543925122001206/pdfft?md5=8a433638ccccae0818a89fa816f9ac93&pid=1-s2.0-S2543925122001206-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73773429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empowering linked data in cultural heritage institutions: A knowledge management perspective","authors":"Lei Zhang","doi":"10.1016/j.dim.2022.100013","DOIUrl":"10.1016/j.dim.2022.100013","url":null,"abstract":"<div><p>This reported research explores the barriers and challenges in linked data implementation in cultural heritage institutions, i.e., libraries, archives, and museums. Various data were collected from different sources regarding the linked data use cases related to libraries, archives, and museums over the past decade and analyzed from multiple facets. The analysis revealed very few activities of effective knowledge management in the linked data implementation and suggested that the crucial role of knowledge management and innovation should deserve enough attention in linked data projects and services. The findings will add value to the literature on knowledge management in the context of linked data and the semantic web.</p></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"6 3","pages":"Article 100013"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2543925122001115/pdfft?md5=ec643c8a122444273c5fb106e23ab65f&pid=1-s2.0-S2543925122001115-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75079078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Knowledge management and innovation","authors":"Lu An, Alton Y.K. Chua, Md Anwarul Islam","doi":"10.1016/j.dim.2022.100018","DOIUrl":"10.1016/j.dim.2022.100018","url":null,"abstract":"","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"6 3","pages":"Article 100018"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2543925122001164/pdfft?md5=4a5c7ccb600c67f9a5032cd2d910de22&pid=1-s2.0-S2543925122001164-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76928293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}