{"title":"Novel time slicing approach for customer defection models in e-commerce: a case study","authors":"Kyriakos Georgiou, Alexandros Chasapis","doi":"10.1016/j.dsm.2022.07.001","DOIUrl":"10.1016/j.dsm.2022.07.001","url":null,"abstract":"<div><p>In this study, we examine the problem of predicting customer defection in a noncontractual setting. Motivated by recent work on machine learning using multiple time slices, we develop a novel training and testing framework, the sliding multi-time slicing (SMTS) method. We apply this method to data from the largest marketplace in Greece, namely, Skroutz, considering the standard features that account for the important characteristics of customer activity and custom performance metrics aimed at capturing business-related goals established by the company. The dataset covers customers over a relatively short period, starting in April 2018, and the number of customers has increased significantly in recent months. Despite these difficulties and the inherent seasonality of customer defection, our results demonstrate that, with SMTS, it is possible to develop models that outperform previous approaches and optimize decision-making. 
We validate the approach on a benchmark dataset from the commerce sector and discuss the practical considerations and requirements of the proposed method.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666764922000285/pdfft?md5=90cc770a3700d52be7c17ade53d2e0ae&pid=1-s2.0-S2666764922000285-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89955151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Monitoring machine learning models: a categorization of challenges and methods","authors":"Tim Schröder, Michael Schulz","doi":"10.1016/j.dsm.2022.07.004","DOIUrl":"10.1016/j.dsm.2022.07.004","url":null,"abstract":"<div><p>The importance of software based on machine learning is growing rapidly, but the potential of prototypes may not be realized in operation. This study identified six categories of challenges for verification and validation of machine learning applications during production. Subsequently, monitoring was analyzed as a possible solution to mitigate those challenges. Capturing relevant data and model metrics may reveal problems at an early stage, allowing for targeted countermeasures. This study presents a taxonomy of methods and metrics currently addressed in scientific literature and compares these categories with case studies from practice.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666764922000303/pdfft?md5=55f9a032588179192732a092b760d946&pid=1-s2.0-S2666764922000303-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77178317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessing spread risk of COVID-19 in early 2020","authors":"S. Lai, I. Bogoch, N. Ruktanonchai, A. Watts, Xin Lu, Weizhong Yang, Hongjie Yu, K. Khan, A. Tatem","doi":"10.1016/j.dsm.2022.08.004","DOIUrl":"https://doi.org/10.1016/j.dsm.2022.08.004","url":null,"abstract":"","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88883488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hybrid differential evolution algorithm for a stochastic location-inventory-delivery problem with joint replenishment","authors":"Sirui Wang, Lin Wang, Yingying Pi","doi":"10.1016/j.dsm.2022.07.003","DOIUrl":"https://doi.org/10.1016/j.dsm.2022.07.003","url":null,"abstract":"","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91333492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting hourly retail customer flow on intermittent time series with multiple seasonality","authors":"Martim Sousa, Ana Maria Tom, José Moreira","doi":"10.1016/j.dsm.2022.07.002","DOIUrl":"https://doi.org/10.1016/j.dsm.2022.07.002","url":null,"abstract":"","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77170243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effect of data resampling on feature importance in imbalanced blockchain data: comparison studies of resampling techniques","authors":"Ismail Alarab, Simant Prakoonwit","doi":"10.1016/j.dsm.2022.04.003","DOIUrl":"https://doi.org/10.1016/j.dsm.2022.04.003","url":null,"abstract":"<div><p>Cryptocurrency blockchain data suffer from a class-imbalance problem because only a few labels of illicit or fraudulent activities in the blockchain network are known. To address this, we compare various resampling methods applied to two highly imbalanced datasets derived from the Bitcoin and Ethereum blockchains after further dimensionality reduction, which differs from previous studies on these datasets. First, we study the performance of various classical supervised learning methods in classifying illicit transactions or accounts on the Bitcoin and Ethereum datasets, respectively. We then apply various resampling techniques to these datasets using the best-performing learning algorithm on each dataset. Subsequently, we study the feature importance of the resulting models, where the resampled datasets directly influence the explainability of the model. Our main finding is that undersampling using the edited nearest-neighbour technique attains an accuracy of more than 99% on the given datasets by removing the noisy data points from the whole dataset. Moreover, the best-performing learning algorithms show superior performance after feature reduction on these datasets in comparison to their original studies. 
The unique contribution lies in discussing the effect of data resampling on feature importance, which is interconnected with explainable artificial intelligence (XAI) techniques.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666764922000145/pdfft?md5=6bf238fdec6a4e856548c8bd4110e94a&pid=1-s2.0-S2666764922000145-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"137349425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on value co-creation elements in full-scene intelligent service","authors":"Weina Wang, Hong Zhang, Sumeet Gupta","doi":"10.1016/j.dsm.2022.05.001","DOIUrl":"10.1016/j.dsm.2022.05.001","url":null,"abstract":"<div><p>Compared with common intelligent service, full-scene intelligent service is distinguished by its high integration, synergy, and technological spillover. However, traditional service and business model theories cannot precisely elaborate its sociotechnical contextual nature and value creation logic. To fill this knowledge gap, we provide initial insights into the value co-creation logic in full-scene intelligent service by exploring the value co-creation elements using a data-driven text mining approach. We analyzed 171 business reports on full-scene intelligent service using topic modeling with Latent Dirichlet Allocation (LDA). The findings reveal three main clusters: value proposition, participants, and connection platform. This study presents a theoretical framework for further exploratory case studies and quantitative research on full-scene intelligent service. This study also helps small and medium-sized enterprises to explore and exploit value co-creation opportunities.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666764922000200/pdfft?md5=f889499c5423ccdc293ff1c543976dcd&pid=1-s2.0-S2666764922000200-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88454510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"New developments in wind energy forecasting with artificial intelligence and big data: a scientometric insight","authors":"Erlong Zhao, Shaolong Sun, Shouyang Wang","doi":"10.1016/j.dsm.2022.05.002","DOIUrl":"10.1016/j.dsm.2022.05.002","url":null,"abstract":"<div><p>Accurate forecasting results are crucial for increasing energy efficiency and lowering energy consumption in wind energy. Big data and artificial intelligence (AI) have great potential in wind energy forecasting. Although the literature on this subject is extensive, it lacks a comprehensive survey of the state of research. To identify how big data and AI methods in wind energy forecasting have evolved, this paper summarizes the studies on big data and AI in wind energy forecasting over the last two decades. The existing big data types, analysis techniques, and forecasting methods are classified and sorted by combining literature reviews and scientometric methods. Furthermore, the research trend of wind energy forecasting methods based on big data and artificial intelligence is determined by combing the existing research hotspots and frontier progress. Finally, this paper summarizes the opportunities, challenges, and implications of existing research from various perspectives. 
The research results serve as a foundation for future research and promote the further development of wind energy forecasting.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666764922000212/pdfft?md5=53efdc16a677b8948a6955a1f86304a5&pid=1-s2.0-S2666764922000212-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73276920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A combined forecasting method for intermittent demand using the automotive aftermarket data","authors":"Xiaotian Zhuang, Ying Yu, Aihui Chen","doi":"10.1016/j.dsm.2022.04.001","DOIUrl":"10.1016/j.dsm.2022.04.001","url":null,"abstract":"<div><p>Intermittent demand forecasting is an important challenge in the process of smart supply chain transformation, and accurate demand forecasting can reduce costs and increase efficiency for enterprises. This study proposes an intermittent demand combination forecasting method based on internal and external data, builds intermittent demand feature engineering from the perspective of machine learning, predicts the occurrence of demand with a classification model, and predicts non-zero demand quantity with a regression model. Based on the strategy selection on the inventory side and the stocking needs on the replenishment side, this study focuses on the optimization of the classification problem, incorporates the internal and external data of the enterprise, and proposes two combination forecasting optimization methods based on best classification threshold searching and transfer learning, respectively. These methods are evaluated and validated in multiple dimensions using real data from an auto after-sales business. Compared with other intermittent forecasting methods, the models proposed in this study improve significantly in terms of classification accuracy and forecasting precision, which validates the potential of the combined forecasting framework for intermittent demand and provides an empirical study of the framework in industry practice. 
The results show that this research can further provide accurate upstream inputs for smart inventory and guarantee intelligent supply chain decision-making in terms of accuracy and efficiency.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666764922000121/pdfft?md5=b4e7fd469fe0882d34c5c1fd97ff6fc7&pid=1-s2.0-S2666764922000121-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80828936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feature extraction of search product based on multi-feature fusion-oriented to Chinese online reviews","authors":"Xunjiang Huang, Yaqian Liu, Yang Wang, Xue Wang","doi":"10.1016/j.dsm.2022.04.002","DOIUrl":"10.1016/j.dsm.2022.04.002","url":null,"abstract":"<div><p>The growing volume of Chinese online reviews contains rich product demand information, especially for search products. This study proposes a model for extracting product features from online reviews based on multi-feature fusion, named the PFEMF (products features extraction based on multi-feature fusion) model. Combining sentence and word characteristics of Chinese online reviews, the model explores the lexical features, frequency features, span features, and semantic similarity features of words. These features are then fused to identify the product features that customers are most concerned about through sequential relationship analysis. The identified product features provide direction for product innovation and facilitate product selection for customers. Finally, the study takes the iPad Air as an example to validate this model. 
The results show that the extraction performance of the PFEMF model is superior to the traditional term frequency-inverse document frequency (tf-idf) algorithm, word span algorithm, and semantic similarity algorithm.</p></div>","PeriodicalId":100353,"journal":{"name":"Data Science and Management","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666764922000133/pdfft?md5=6b7a8eaaa58360079b9b945c9422c32c&pid=1-s2.0-S2666764922000133-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75872743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}