{"title":"Handling Imbalanced Data With Weighted Logistic Regression and Propensity Score Matching methods","authors":"L. Agrawal, Pavankumar Mulgund, Raj Sharman","doi":"10.4018/jdm.335888","DOIUrl":"https://doi.org/10.4018/jdm.335888","url":null,"abstract":"The adoption of empirical methods for secondary data analysis has witnessed a significant surge in IS research. However, the secondary data is often incomplete, skewed, and imbalanced at best. Consequently, there is a growing recognition of the importance of empirical techniques and methodological decisions made to navigate through such issues. However, there is not enough methodological guidance, especially in the form of a worked case study that demonstrates the challenges of imbalanced datasets and offers prescriptive on how to deal with them. Using data on P2P money transfer services, this article presents a running example by analyzing the same dataset using several different methods. It then compares the outcomes of these choices and explicates the rationale behind some decisions such as inclusion and categorization of variables, parameter setting, and model selection. Finally, the article discusses certain regressions models such as weighted logistic regression and propensity matching, and when they should be used.","PeriodicalId":51086,"journal":{"name":"Journal of Database Management","volume":"5 12","pages":""},"PeriodicalIF":2.6,"publicationDate":"2024-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139448800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RDF(S) Store in Object-Relational Databases","authors":"Z. Ma, Daiyi Li, Jiawen Lu, Ruizhe Ma, Li Yan","doi":"10.4018/jdm.334710","DOIUrl":"https://doi.org/10.4018/jdm.334710","url":null,"abstract":"The Resource Description Framework (RDF) and RDF Schema (RDFS) recommended by World Wide Web Consortium (W3C) provide a flexible model for semantically representing data on the web. With the widespread acceptance of RDF(S) (RDF and RDFS for short), a large number of RDF(S) is available. Databases play an important role in managing RDF(S). However, there are few studies on using object-relational databases to store RDF(S). In this paper, the authors propose the formal definitions of RDF(S) model and object-relational databases model. Then they introduce the approach for storing RDF(S) in object-relational databases based on the formal definitions. They implement a prototype system to demonstrate the feasibility of the approach and test the performance and semantic retention ability of this prototype system with the benchmark dataset.","PeriodicalId":51086,"journal":{"name":"Journal of Database Management","volume":"42 4","pages":""},"PeriodicalIF":2.6,"publicationDate":"2023-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138979725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Sample-Aware Database Tuning System With Deep Reinforcement Learning","authors":"Zhongliang Li, Yaofeng Tu, Zongmin Ma","doi":"10.4018/jdm.333519","DOIUrl":"https://doi.org/10.4018/jdm.333519","url":null,"abstract":"Based on the relationship between client load and overall system performance, the authors propose a sample-aware deep deterministic policy gradient model. Specifically, they improve sample quality by filtering out sample noise caused by the fluctuations of client load, which accelerates the model convergence speed of the intelligent tuning system and improves the tuning effect. Also, the hardware resources and client load consumed by the database in the working process are added to the model for training. This can enhance the performance characterization ability of the model and improve the recommended parameters of the algorithm. Meanwhile, they propose an improved closed-loop distributed comprehensive training architecture of online and offline training to quickly obtain high-quality samples and improve the efficiency of parameter tuning. Experimental results show that the configuration parameters can make the performance of the database system better and shorten the tuning time.","PeriodicalId":51086,"journal":{"name":"Journal of Database Management","volume":" 0","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135192052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Narrativization in Information Systems Development","authors":"Pasi Raatikainen, Samuli Pekkola, Maria Mäkelä","doi":"10.4018/jdm.333471","DOIUrl":"https://doi.org/10.4018/jdm.333471","url":null,"abstract":"People see the world and convey their perception of it with narratives. In an information system context, stories are told and collected when the systems are developed. Requirements elicitation is largely dependent on communication between systems designers and users. Thus, stories have a significant impact on conceptualizing future users' needs. This paper presents a literature review on how stories and narratives have been considered in central IS literature. Narrative-theoretical parameters are used as a lens to analyze the literature. This shows that explicit discussion is non-existent, and the characteristics are considered partially. The result is a biased and narrow understanding of the informants' needs and wishes. This may be significant in the requirements because narratives are not as simple a form of communication as is usually assumed. It is proposed that better understanding narratives would equip systems analysts with an in-depth understanding about the nuances inherent in communication when communicating with users.","PeriodicalId":51086,"journal":{"name":"Journal of Database Management","volume":"5 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135480524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Artificial Intelligence and Machine Learning for Job Automation","authors":"Gang Peng, R. Bhaskar","doi":"10.4018/jdm.318455","DOIUrl":"https://doi.org/10.4018/jdm.318455","url":null,"abstract":"Job automation is a critical decision that has brought about profound changes in the workplace. However, the question of what drives job automation remains unclear. This study conducts an interdisciplinary review of five theoretical frameworks on job automation, paying particular attention to the role played by artificial intelligence and machine learning. It highlights the concepts and mechanisms underlying each of the frameworks, compares and contrasts their similarities and differences, and highlights challenges and suggests opportunities of job automation. It also proposes an integrated framework on job automation by addressing the research gaps in extant frameworks and thereby contributes to the research and practice on this important topic.","PeriodicalId":51086,"journal":{"name":"Journal of Database Management","volume":"1 1","pages":""},"PeriodicalIF":2.6,"publicationDate":"2023-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47109311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"College English Intelligent Writing Score System Based on Big Data Analysis and Deep Learning Algorithm","authors":"Fei Qin","doi":"10.4018/jdm.314561","DOIUrl":"https://doi.org/10.4018/jdm.314561","url":null,"abstract":"With the development of technologies such as big data analysis and deep learning, various industries have begun to integrate with big data analysis and deep learning and continue to promote the development of the industry. This system is an intelligent writing scoring system for college English teaching. It uses popular big data analysis and deep learning to distinguish training algorithms. From 2015 to 2022, the number of college students taking exams has increased yearly, with an increase of more than 50%. Therefore, the system proposes a text vector calculation method that can find matching samples in the text set after the text is weighted by the weight function and uses deep learning to distinguish the algorithm evaluates the matched text, and finally can get the final score according to the content quality, semantic coherence, text readability, and other aspects of the text. Compared with traditional manual scoring, this technology is more convenient, quick, concise, and effective. This system is significant for improving the efficiency of teaching English writing in college.","PeriodicalId":51086,"journal":{"name":"Journal of Database Management","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2022-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41382831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Convolutional Neural Networks With Transfer Learning for Automobile Damage Image Classification","authors":"Xiaoguang Tian, Henry Han","doi":"10.4018/jdm.309738","DOIUrl":"https://doi.org/10.4018/jdm.309738","url":null,"abstract":"Deep learning models are more capable of handling large and complex datasets that generally appear in the insurance industry than traditional machine learning models. In this study, transfer learning was employed to build and optimize a simulated automobile damage assessment system. Several classic deep learning methods were applied to extract features from original and augmented automobile damage images. Then, traditional machine learning and cross-validation techniques were applied to train and validate the system. The proposed deep learning model demonstrated advantages over traditional machine learning models regarding features extraction and accuracy. Deep learning approaches fused with logistic regression and support vector machine were found performing as well as those with artificial neural networks under two simulated scenarios. With the proposed method, automobile damage images can be evaluated for insurance adjustment purposes automatically, based on the acquired input. Hence, insurers can automate the claim and adjustment process, thereby achieving cost and time savings.","PeriodicalId":51086,"journal":{"name":"Journal of Database Management","volume":"1 1","pages":""},"PeriodicalIF":2.6,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41312529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Circuit Implementation of Respiratory Information Extracted from Electrocardiograms","authors":"Shih-Yi Cheng, Jinbao Zhang, Zhan Gao, Jiehua Wang","doi":"10.4018/jdm.314211","DOIUrl":"https://doi.org/10.4018/jdm.314211","url":null,"abstract":"Breathing is an important physiological process in the human body. The wavelet transform method can extract respiratory information from electrocardiogram (ECG) data; thus, the authors designed an integrated circuit of ECG-derived respiration (EDR). They propose a discrete wavelet transform (DWT) EDR algorithm based on an analysis of the heartbeat frequency and respiration. They verified the algorithm in both the time domain and the frequency domain using Matlab. Next, the DWT EDR digital circuit was designed using the QUARTUS program. Finally, they used a field programmable gate array (FPGA) for downloading and simulation, and they verified the designed circuits using a logic analyzer, where they compared the waveform of the data obtained from the EDR circuit with the waveform obtained after processing the wavelet transform EDR in Matlab. The experimental results showed that the circuit can allow the extraction of respiratory information from ECG data.","PeriodicalId":51086,"journal":{"name":"Journal of Database Management","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2022-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47665298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on Reverse Skyline Query Algorithm Based on Decision Set","authors":"Lan Huang, Yuanwei Zhao, P. Mestre, Laipeng Han, Kangping Wang, Wenjuan Gao, Rui Zhang","doi":"10.4018/jdm.313971","DOIUrl":"https://doi.org/10.4018/jdm.313971","url":null,"abstract":"Reverse skyline query is an extension of the classical skyline query, widely used in the decision support in e-business. The vast burst of big data in e-business challenges the classical algorithms for such queries. This paper provides a novel definition of decision set and a decision set based reverse skyline query method called DRS on the double-layer R tree indexing in a map-reduce manner. Theoretical proofs are provided for the correctness and complexity of the DRS algorithm. Experiments made using several large data sets are presented and analyzed to illustrate the applicability and the outperformance of DRS over the state-of-the-art reverse skyline query methods.","PeriodicalId":51086,"journal":{"name":"Journal of Database Management","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41507217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measuring the Determining Factors of Financial Development of Commercial Banks in Selected SAARC Countries","authors":"Arodh Lal Karn, G. Bagale, Bhavana Raj Kondamudi., D. Srivastava, R. Gupta, Sudhakar Sengan","doi":"10.4018/jdm.311092","DOIUrl":"https://doi.org/10.4018/jdm.311092","url":null,"abstract":"Traditional banks face the issue of risk diversification, and it is dealt with when they evolve into financial institutions. So, the present study aims to investigate banking and off-balance sheet (OBS)-based risks and regulatory changes in certain age-old South Asian (SA) banks and finds the tenacity of the OBS in the long run. For these research goals, two estimates are applied: fixed effects (FE) and generalized method of moments (GMM). Using FE, the researchers estimate the realm and time for finding financial shocks and other time-related factors affecting the SA countries. The majority of findings reveal a constant market theory stating the performance of SA in assessing OBS-related risks. Banks in SA also seem to follow the market regulatory and TT in capital needs that will incentivize banks to take too much risk in off-balance sheet activities (OBSA). The research findings are practically applied to bank-related risks, pressure from regulatory restructuring, and dangers from the systematic factors beneficial to policymakers and practitioners.","PeriodicalId":51086,"journal":{"name":"Journal of Database Management","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49092300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}