{"title":"Investigating customer churn in banking: A machine learning approach and visualization app for data science and management","authors":"Pahul Preet Singh , Fahim Islam Anik , Rahul Senapati , Arnav Sinha , Nazmus Sakib , Eklas Hossain","doi":"10.1016/j.dsm.2023.09.002","DOIUrl":"10.1016/j.dsm.2023.09.002","url":null,"abstract":"<div><p>Customer attrition in the banking industry occurs when consumers quit using the goods and services offered by the bank for some time and, after that, end their connection with the bank. Therefore, customer retention is essential in today’s extremely competitive banking market. Additionally, having a solid customer base helps attract new consumers by fostering confidence and a referral from a current clientele. These factors make reducing client attrition a crucial step that banks must pursue. In our research, we aim to examine bank data and forecast which users will most likely discontinue using the bank’s services and become paying customers. We use various machine learning algorithms to analyze the data and show comparative analysis on different evaluation metrics. In addition, we developed a Data Visualization RShiny app for data science and management regarding customer churn analysis. 
Analyzing this data will help the bank indicate the trend and then try to retain customers on the verge of attrition.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666764923000401/pdfft?md5=cfc2f4530901aaf2ea8c8c1c0289f259&pid=1-s2.0-S2666764923000401-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134993822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding the antecedents of patients’ missed appointments: The perspective of attribution theory","authors":"Guorui Fan , Zhaohua Deng , Lai C. Liu","doi":"10.1016/j.dsm.2023.09.004","DOIUrl":"10.1016/j.dsm.2023.09.004","url":null,"abstract":"<div><p>The occurrence of missed appointments from online outpatient bookings significantly hinders the operational efficiency of outpatient services. This study aimed to investigate various factors influencing patients’ missed appointments from online outpatient bookings. Drawing on attribution theory, an empirical analysis was conducted using 382,004 authentic online outpatient appointments. The empirical findings revealed that appointment lead-time, appointment time, weekday appointments, online doctor rating, appointment doctor’s expertise, patient distance, and previous outpatient visit experience significantly influenced patients’ missed appointment behaviors from online outpatient bookings. Importantly, previous outpatient experience positively moderated the relationship between the appointment doctor’s expertise and patients’ missed-appointment behavior. This study provides insights into the factors influencing patients’ missed-appointment behavior from online outpatient bookings. 
It further offers a theoretical foundation for medical institutions in China to mitigate the likelihood and adverse effects of patients’ missed-appointment behavior from online outpatient bookings.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666764923000425/pdfft?md5=71ebf712a9bb6a9bf75d7915cb4d0602&pid=1-s2.0-S2666764923000425-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134918275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual-stage ensemble approach using online knowledge distillation for forecasting carbon emissions in the electric power industry","authors":"Ruibin Lin, Xing Lv, Huanling Hu, Liwen Ling, Zehui Yu, Dabin Zhang","doi":"10.1016/j.dsm.2023.09.001","DOIUrl":"10.1016/j.dsm.2023.09.001","url":null,"abstract":"<div><p>The electric power industry is the key to achieving the goals of carbon peak and neutrality. Accurate forecasting of carbon emissions in the electric power industry can aid in the prompt adjustment of power generation policies and the early achievement of carbon reduction targets. This study proposes a new approach that combines the decomposition-ensemble paradigm with knowledge distillation to forecast daily carbon emissions. First, seasonal and trend decomposition using locally weighted scatterplot smoothing (STL) is used to decompose the data into three subcomponents. Second, two heterogeneous deep neural network models are jointly trained to predict each subcomponent based on online knowledge distillation. During training, the two models learn and provide feedback to each other. The first model-ensemble stage is performed by synthesizing the predictions for each subcomponent of the two models. Finally, the second model-ensemble stage is performed. The predictions for each subcomponent are integrated using linear addition to obtain the final results. In addition, to avoid leakage of test data caused by decomposing the entire time series, a recursive forecasting strategy is applied. Multistep predictions are obtained by forecasting 7, 15, and 30 days in the future. 
Experimental results using metaheuristic algorithms to optimize hyperparameters show that the proposed method evaluated on the daily carbon emissions dataset has better forecasting performance than all baselines.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666764923000395/pdfft?md5=f20a2e0ce3f1de499a3c6ddbd9113351&pid=1-s2.0-S2666764923000395-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74621919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explainable prediction of loan default based on machine learning models","authors":"Xu Zhu , Qingyong Chu , Xinchang Song , Ping Hu , Lu Peng","doi":"10.1016/j.dsm.2023.04.003","DOIUrl":"https://doi.org/10.1016/j.dsm.2023.04.003","url":null,"abstract":"<div><p>Owing to the convenience of online loans, an increasing number of people are borrowing money on online platforms. With the emergence of machine learning technology, predicting loan defaults has become a popular topic. However, machine learning models have a black-box problem that cannot be disregarded. To make the prediction model rules more understandable and thereby increase the user’s faith in the model, an explanatory model must be used. Logistic regression, decision tree, XGBoost, and LightGBM models are employed to predict a loan default. The prediction results show that LightGBM and XGBoost outperform logistic regression and decision tree models in terms of the predictive ability. The area under curve for LightGBM is 0.7213. The accuracies of LightGBM and XGBoost exceed 0.8. The precisions of LightGBM and XGBoost exceed 0.55. Simultaneously, we employed the local interpretable model-agnostic explanations approach to undertake an explainable analysis of the prediction findings. The results show that factors such as the loan term, loan grade, credit rating, and loan amount affect the predicted outcomes.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49765243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint decision-making of virtual module formation and scheduling considering queuing time","authors":"Liang Mei , Liu Yue , Shilun Ge","doi":"10.1016/j.dsm.2023.04.002","DOIUrl":"https://doi.org/10.1016/j.dsm.2023.04.002","url":null,"abstract":"<div><p>Formation and scheduling are the most important decisions in the virtual modular manufacturing system; however, the global performance optimization of the system may be sacrificed via the superposition of two independent decision-making results. The joint decision of formation and scheduling is very important for system design. Complex and discrete manufacturing enterprises such as shipbuilding and aerospace often comprise multiple tasks, processes, and parallel machines, resulting in complex routes. The queuing time of parts in front of machines may account for 90% of the production cycle time. This study established a weighted allocation model of a formation-scheduling joint decision problem considering queuing time in system. To solve this nondeterministic polynomial (NP) problem, an adaptive differential evolution-simulated annealing (ADE-SA) algorithm is proposed. Compared with the standard differential evolution (DE) algorithm, the adaptive mutation factor overcomes the disadvantage that the scale of DE’s differential vector is difficult to control. The selection strategy of the SA algorithm compensates for the deficiency that DE’s greedy strategy may fall into a local optimal solution. The comparison results of four algorithms of a series of random examples demonstrate that the overall performance of ADE-SA is superior to the genetic algorithm, and average iteration, maximum completion time, and move time are 24%, 11%, and 7% lower than the average of other three algorithms, respectively. 
The method can generate the joint decision-making scheme with better overall performance, and effectively identify production bottlenecks through quantitative analysis of queuing time.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49749903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novel method for ranking batsmen in Indian Premier League","authors":"M.K. Manju , Abin Oommen Philip","doi":"10.1016/j.dsm.2023.06.004","DOIUrl":"https://doi.org/10.1016/j.dsm.2023.06.004","url":null,"abstract":"<div><p>Sports analytics have benefited immensely from the growth and popularity of artificial intelligence and machine learning. These techniques enable sports analysts to evaluate player performance more effectively. A literature review of player performance evaluation methods shows the need to develop a new performance evaluation index for Twenty20 (T20) cricket. A novel framework was proposed to evaluate batsman strength based on individual performance, role in the team, and team interactions. Traditionally, proposed ranking systems are derived from static networks, that is, the aggregation of game results over time. However, the scores of the players (or teams) fluctuate over time. Intuitively, defeating a renowned player during peak performance is more rewarding than defeating the same player during other periods. To account for this, we propose a new method and apply it to the T20 format Indian Premier League. The method serves three main purposes: First, it creates a new performance index for players to rank them more accurately and effectively. Second, the players are clustered based on their expertise. In the third phase, a social network analysis approach is applied to visualize and analyze cricket as a network to gain better insights into players’ team interactions. 
This novel approach is a helpful index for sports coaches, analysts, cricket fans, and managers to evaluate player performance and rank for future aspects.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49749481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effects of economic factors on median list and selling prices in the U.S. housing market","authors":"Durga Vaidynathan, Parthajit Kayal, Moinak Maiti","doi":"10.1016/j.dsm.2023.08.001","DOIUrl":"https://doi.org/10.1016/j.dsm.2023.08.001","url":null,"abstract":"","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86397210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The impact of the digital economy on the servitization of industrial structures: the moderating effect of human capital","authors":"Rong Ran, Xinyuan Wang, Ting Wang, Lei Hua","doi":"10.1016/j.dsm.2023.06.003","DOIUrl":"https://doi.org/10.1016/j.dsm.2023.06.003","url":null,"abstract":"<div><p>The digital economy, which was born during the late third technological revolution, has caused significant economic and societal changes. Amid sluggish global economic growth, China’s economy is facing upgrades and transformations. The sample selection for this study was conducted from 2013 to 2020. Data related to the digital economy and servitization of the industrial structure of 30 Chinese provinces, municipalities, and autonomous regions were collected. This study presents the human capital variable, based on which an econometric analysis was conducted, and examines its moderating effect. The findings indicate that even after the replacement variable indicator’s robustness test, the relationship between the digital economy and the servitization of industrial structures remains unchanged. This study demonstrates that the quality of human capital plays a positive role in this effect. Finally, a heterogeneity test demonstrated that there are different pathways for the impact of the digital economy on the servitization of industrial structures in the eastern, central, and western regions. 
This study provides evidence to help researchers understand the moderating utility of human capital.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49749640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Systematic review of data-centric approaches in artificial intelligence and machine learning","authors":"Prerna Singh","doi":"10.1016/j.dsm.2023.06.001","DOIUrl":"https://doi.org/10.1016/j.dsm.2023.06.001","url":null,"abstract":"<div><p>Artificial intelligence (AI) relies on data and algorithms. State-of-the-art (SOTA) AI smart algorithms have been developed to improve the performance of AI-oriented structures. However, model-centric approaches are limited by the absence of high-quality data. Data-centric AI is an emerging approach for solving machine learning (ML) problems. It is a collection of various data manipulation techniques that allow ML practitioners to systematically improve the quality of the data used in an ML pipeline. However, data-centric AI approaches are not well documented. Researchers have conducted various experiments without a clear set of guidelines. This survey highlights six major data-centric AI aspects that researchers are already using to intentionally or unintentionally improve the quality of AI systems. These include big data quality assessment, data preprocessing, transfer learning, semi-supervised learning, machine learning operations (MLOps), and the effect of adding more data. In addition, it highlights recent data-centric techniques adopted by ML practitioners. We addressed how adding data might harm datasets and how HoloClean can be used to restore and clean them. Finally, we discuss the causes of technical debt in AI. Technical debt builds up when software design and implementation decisions run into, or outright collide with, business goals and timelines. 
This survey lays the groundwork for future data-centric AI discussions by summarizing various data-centric approaches.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49749915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Blockchain and its derived technologies shape the future generation of digital businesses: a focus on decentralized finance and the Metaverse","authors":"Saeed Banaeian Far , Azadeh Imani Rad , Maryam Rajabzadeh Asaar","doi":"10.1016/j.dsm.2023.06.002","DOIUrl":"https://doi.org/10.1016/j.dsm.2023.06.002","url":null,"abstract":"<div><p>Without a doubt, blockchain, as one of the most valuable technological advancements, has been introduced over the past decade and has played a significant role in the industrial revolution. The emergence of other technologies derived from blockchain, such as decentralized finance (DeFi) and the Metaverse, has fundamentally transformed people’s daily lives and profoundly impacted future versions of digital businesses. This study explored the evolution of digital businesses in the near future, with a specific focus on the two primary technologies mentioned above. First, we reviewed DeFi-based technologies, including GameFi, SciFi, SocialFi, and others which serve as foundational building blocks for future jobs and businesses. Second, we examined Metaverse-based jobs, such as Metaverse-based academies and markets which are expected to be launched as commonly used businesses. Ultimately, this study provides several guidelines, such as how to use DeFi 2.0 and apply centralized decentralized finance (CeDeFi)-based platforms. 
Additionally, it offers future directions for launching these businesses, including controlling the progress of artificial intelligence (AI) in practical applications, utilizing cloud-assisted models for the Metaverse, and providing conditional privacy for future Metaverse-based businesses.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49758595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}