Big Data | Pub Date: 2025-02-01 | Epub Date: 2023-04-17 | DOI: 10.1089/big.2022.0261
Shadi A Aljawarneh, Romesaa Al-Quraan
{"title":"Pneumonia Detection Using Enhanced Convolutional Neural Network Model on Chest X-Ray Images.","authors":"Shadi A Aljawarneh, Romesaa Al-Quraan","doi":"10.1089/big.2022.0261","DOIUrl":"10.1089/big.2022.0261","url":null,"abstract":"<p><p>Pneumonia, caused by microorganisms, is a severely contagious disease that damages one or both lungs of patients. Early detection and treatment are typically favored to recover infected patients since untreated pneumonia can lead to major complications in the elderly (>65 years) and children (<5 years). The objectives of this work are to develop several models to evaluate big X-ray images (XRIs) of the chest, to determine whether the images show/do not show signs of pneumonia, and to compare the models based on their accuracy, precision, recall, loss, and area under the receiver operating characteristic (ROC) curve scores. Enhanced convolutional neural network (CNN), VGG-19, ResNet-50, and ResNet-50 with fine-tuning are some of the deep learning (DL) algorithms employed in this study. These techniques are used to identify pneumonia by training the transfer learning models and the enhanced CNN model on a big data set. The data set for the study was obtained from Kaggle. It should be noted that the data set has been expanded to include further records. This data set included 5863 chest XRIs, which were categorized into 3 different folders (i.e., train, val, test). These data are produced every day from personnel records and Internet of Medical Things devices. According to the experimental findings, the ResNet-50 model showed the lowest accuracy, that is, 82.8%, while the enhanced CNN model showed the highest accuracy of 92.4%. Owing to its high accuracy, enhanced CNN was regarded as the best model in this study. The techniques developed in this study outperformed the popular ensemble techniques, and the models showed better results than those generated by cutting-edge methods. 
The implication of our study is that DL models can detect the progression of pneumonia, which improves the general diagnostic accuracy and gives patients new hope for speedy treatment. Since enhanced CNN and ResNet-50 showed the highest accuracy compared with other algorithms, it was concluded that these techniques could be effectively used to identify pneumonia after performing fine-tuning.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"16-29"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9737399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
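The abstract above compares models on accuracy, precision, and recall. As a rough illustration of how these three scores follow from a binary confusion matrix, here is a minimal sketch. The function name `binary_metrics` and the label vectors are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for binary labels (1 = pneumonia, 0 = normal)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels, for illustration only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting, recall is often watched most closely, because a false negative corresponds to a missed pneumonia case.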
Big Data | Pub Date: 2025-02-01 | Epub Date: 2023-06-16 | DOI: 10.1089/big.2022.0299
Yi Gao, Dawei Yan, Xiangyu Kong, Ning Liu, Zhiyu Zou, Bixuan Gao, Yang Wang, Yue Chen, Shuai Luo
{"title":"A Data-Driven Analysis Method for the Trajectory of Power Carbon Emission in the Urban Area.","authors":"Yi Gao, Dawei Yan, Xiangyu Kong, Ning Liu, Zhiyu Zou, Bixuan Gao, Yang Wang, Yue Chen, Shuai Luo","doi":"10.1089/big.2022.0299","DOIUrl":"10.1089/big.2022.0299","url":null,"abstract":"<p><p>\"Industry 4.0\" aims to build a highly versatile, individualized digital production model for goods and services. The carbon emission (CE) issue needs to be addressed by changing from centralized control to decentralized and enhanced control. Based on a solid CE monitoring, reporting, and verification system, it is necessary to study future power system CE dynamics simulation technology. In this article, a data-driven approach based on empirical mode decomposition is proposed for analyzing the trajectory of urban electricity CEs. The approach combines macro-energy thinking and big data thinking by removing the barriers among power systems and related technological, economic, and environmental domains. Based on multisource heterogeneous mass data acquisition, effective secondary data can be extracted through the integration of statistical analysis, causal analysis, and behavior analysis, which can help construct a simulation environment supporting the dynamic interaction among mathematical models, multi-agents, and human participants.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"42-58"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9634989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2025-02-01 | DOI: 10.1089/big.2024.0132
Victor Chang, Péter Kacsuk, Gary Wills, Reinhold Behringer
{"title":"Introduction to the Special Issue on Big Data and the Internet of Things in Complex Information Systems.","authors":"Victor Chang, Péter Kacsuk, Gary Wills, Reinhold Behringer","doi":"10.1089/big.2024.0132","DOIUrl":"https://doi.org/10.1089/big.2024.0132","url":null,"abstract":"","PeriodicalId":51314,"journal":{"name":"Big Data","volume":"13 1","pages":"1-2"},"PeriodicalIF":2.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143450644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2025-01-17 | DOI: 10.1089/big.2024.0113
Yiting Bai, Baiqian Gu, Chao Tang
{"title":"Enhancing Real-Time Patient Monitoring in Intensive Care Units with Deep Learning and the Internet of Things.","authors":"Yiting Bai, Baiqian Gu, Chao Tang","doi":"10.1089/big.2024.0113","DOIUrl":"https://doi.org/10.1089/big.2024.0113","url":null,"abstract":"<p><p>The demand for intensive care units (ICUs) is steadily increasing, yet there is a relative shortage of medical staff to meet this need. Intensive care work is inherently heavy and stressful, highlighting the importance of optimizing these units' working conditions and processes. Such optimization is crucial for enhancing work efficiency and elevating the level of diagnosis and treatment provided in ICUs. The intelligent ICU concept represents a novel ward management model that has emerged through advancements in modern science and technology. This includes communication technology, the Internet of Things (IoT), artificial intelligence (AI), robotics, and big data analytics. By leveraging these technologies, the intelligent ICU aims to significantly reduce potential risks associated with human error and improve patient monitoring and treatment outcomes. Deep learning (DL) and IoT technologies have huge potential to revolutionize the surveillance of patients in the ICUs due to the critical and complex nature of their conditions. This article provides an overview of the most recent research and applications of clinical data for critically ill patients, with a focus on the execution of AI. In the ICU, seamless and continuous monitoring is critical, as even small delays in patient care decision-making can result in irreparable repercussions or death. This article looks at how modern technologies like DL and the IoT can improve patient monitoring, clinical results, and ICU processes. 
Furthermore, it investigates the function of wearable and advanced health sensors coupled with IoT networking systems, which enable the secure connection and analysis of various forms of patient data for predictive and remote analysis by medical professionals. By assessing existing patient monitoring systems, outlining the roles of DL and IoT, and analyzing the benefits and limitations of their integration, this study hopes to shed light on the future of ICU patient care and identify opportunities for further research.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143015934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2025-01-10 | DOI: 10.1089/big.2024.0036
Sofie Goethals, Sandra Matz, Foster Provost, David Martens, Yanou Ramon
{"title":"The Impact of Cloaking Digital Footprints on User Privacy and Personalization.","authors":"Sofie Goethals, Sandra Matz, Foster Provost, David Martens, Yanou Ramon","doi":"10.1089/big.2024.0036","DOIUrl":"https://doi.org/10.1089/big.2024.0036","url":null,"abstract":"<p><p>Our online lives generate a wealth of behavioral records, or <i>digital footprints</i>, which are stored and leveraged by technology platforms. These data can be used to create value for users by personalizing services. At the same time, however, they also pose a threat to people's privacy by offering a highly intimate window into their private traits (e.g., their personality, political ideology, sexual orientation). We explore the concept of <i>cloaking</i>: allowing users to hide parts of their digital footprints from predictive algorithms, to prevent unwanted inferences. This article addresses two open questions: (i) can cloaking be effective in the longer term, as users continue to generate new digital footprints? And (ii) what is the potential impact of cloaking on the accuracy of <i>desirable</i> inferences? We introduce a novel strategy focused on cloaking \"metafeatures\" and compare its efficacy against just cloaking the raw footprints. The main findings are (i) while cloaking effectiveness does indeed diminish over time, using metafeatures slows the degradation; (ii) there is a tradeoff between privacy and personalization: cloaking undesired inferences also can inhibit desirable inferences. 
Furthermore, the metafeature strategy, which yields more stable cloaking, also incurs a larger reduction in desirable inferences.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142958560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
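The cloaking idea in the abstract above (hiding parts of a footprint so that a predictive model no longer draws an unwanted inference) can be illustrated with a toy example. The sketch below assumes a simple linear scoring model that infers a trait when the score is positive. That model, the greedy selection rule, the function name `cloak`, and all feature names and weights are assumptions for illustration. They are not the authors' method and do not cover the metafeature strategy.

```python
from typing import Dict, Set


def cloak(footprint: Set[str], weights: Dict[str, float], bias: float = 0.0) -> Set[str]:
    """Greedily choose footprint items to hide until the linear score is <= 0.

    Assumed model: score = bias + sum of weights of visible items; the trait
    is inferred when score > 0. Returns the set of items to hide.
    """
    score = bias + sum(weights.get(item, 0.0) for item in footprint)
    hidden: Set[str] = set()
    # Visit items from the largest weight down; ties are broken by name so the
    # result is deterministic.
    for item in sorted(footprint, key=lambda i: (-weights.get(i, 0.0), i)):
        if score <= 0:
            break  # inference already prevented
        weight = weights.get(item, 0.0)
        if weight <= 0:
            break  # hiding the remaining items cannot lower the score
        hidden.add(item)
        score -= weight
    return hidden


# Hypothetical weights and footprint, for illustration only.
weights = {"page_a": 1.5, "page_b": 0.8, "page_c": -0.4, "page_d": 0.3}
footprint = {"page_a", "page_b", "page_c", "page_d"}
hidden = cloak(footprint, weights, bias=-1.0)
print(hidden)
```

The tradeoff reported in the abstract also shows up in this toy setting: any other (desirable) model that relies on `page_a` loses that evidence once it is hidden.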
{"title":"Prognostic Modeling for Liver Cirrhosis Mortality Prediction and Real-Time Health Monitoring from Electronic Health Data.","authors":"Chengping Zhang, Muhammad Faisal Buland Iqbal, Imran Iqbal, Minghao Cheng, Nadia Sarhan, Emad Mahrous Awwad, Yazeed Yasin Ghadi","doi":"10.1089/big.2024.0071","DOIUrl":"https://doi.org/10.1089/big.2024.0071","url":null,"abstract":"<p><p>Liver cirrhosis stands as a prominent contributor to mortality, impacting millions across the United States. Enabling health care providers to predict early mortality among patients with cirrhosis holds the potential to enhance treatment efficacy significantly. Our hypothesis centers on the correlation between mortality and laboratory test results along with relevant diagnoses in this patient cohort. Additionally, we posit that a deep learning model could surpass the predictive capabilities of the existing Model for End-Stage Liver Disease score. This research seeks to advance prognostic accuracy and refine approaches to address the critical challenges posed by cirrhosis-related mortality. This study evaluates the performance of an artificial neural network model for liver disease classification using various training dataset sizes. Through meticulous experimentation, three distinct training proportions were analyzed: 70%, 80%, and 90%. The model's efficacy was assessed using precision, recall, F1-score, accuracy, and support metrics, alongside receiver operating characteristic (ROC) and precision-recall (PR) curves. The ROC curves were quantified using the area under the curve (AUC) metric. Results indicated that the model's performance improved with an increased size of the training dataset. Specifically, the 80% training data model achieved the highest AUC, suggesting superior classification ability over the models trained with 70% and 90% data. 
PR analysis revealed a steep trade-off between precision and recall across all datasets, with 80% training data again demonstrating a slightly better balance. This is indicative of the challenges faced in achieving high precision with a concurrently high recall, a common issue in imbalanced datasets such as those found in medical diagnostics.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2024-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142803050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
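The abstract above quantifies ROC curves by the area under the curve (AUC). One way to make the metric concrete: AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the sample labels and scores are hypothetical, not data from the study.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores, for illustration only.
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
auc = roc_auc(y_true, scores)
print(round(auc, 4))
```

The pairwise form is quadratic in the number of examples, so it is only suitable for small samples. A rank-based implementation gives the same value in O(n log n).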
Big Data | Pub Date: 2024-12-01 | Epub Date: 2023-09-04 | DOI: 10.1089/big.2022.0021
Xinjun Lai, Guitao Huang, Ziyue Zhao, Shenhe Lin, Sheng Zhang, Huiyu Zhang, Qingxin Chen, Ning Mao
{"title":"Social Listening for Product Design Requirement Analysis and Segmentation: A Graph Analysis Approach with User Comments Mining.","authors":"Xinjun Lai, Guitao Huang, Ziyue Zhao, Shenhe Lin, Sheng Zhang, Huiyu Zhang, Qingxin Chen, Ning Mao","doi":"10.1089/big.2022.0021","DOIUrl":"10.1089/big.2022.0021","url":null,"abstract":"<p><p>This study investigates customers' product design requirements through online comments from social media, and quickly translates these needs into product design specifications. First, the exponential discriminative snowball sampling method was proposed to generate a product-related subnetwork. Second, natural language processing (NLP) was utilized to mine user-generated comments, and a Graph SAmple and aggreGatE method was employed to embed the user's node neighborhood information in the network to jointly define a user's persona. Clustering was used for market and product model segmentation. Finally, a deep learning bidirectional long short-term memory with conditional random fields framework was introduced for opinion mining. A comment frequency-invert group frequency indicator was proposed to quantify all user groups' positive and negative opinions for various specifications of different product functions. A case study of smartphone design analysis is presented with data from a large Chinese online community called Baidu Tieba. Eleven layers of social relationships were snowball sampled, with 14,018 users and 30,803 comments. The proposed method produced a more reasonable user group clustering result than the conventional method. With our approach, user groups' dominating likes and dislikes for specifications could be immediately identified, and the similar and different preferences of product features by different user groups were instantly revealed. 
Managerial and engineering insights were also discussed.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"456-477"},"PeriodicalIF":2.6,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10508327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2024-12-01 | Epub Date: 2023-05-18 | DOI: 10.1089/big.2022.0158
Farah Haneef, Muddassar A Sindhu
{"title":"IDLIQ: An Incremental <i>Deterministic Finite Automaton</i> Learning Algorithm Through Inverse Queries for Regular Grammar Inference.","authors":"Farah Haneef, Muddassar A Sindhu","doi":"10.1089/big.2022.0158","DOIUrl":"10.1089/big.2022.0158","url":null,"abstract":"<p><p>We present an efficient incremental learning algorithm for <i>Deterministic Finite Automaton</i> (DFA) with the help of inverse query (IQ) and membership query (MQ). This algorithm is an extension of the <i>Identification of Regular Languages</i> (ID) algorithm from a complete to an incremental learning setup. The learning algorithm learns by making use of a set of labeled examples and by posing queries to a knowledgeable teacher, which is equipped to answer IQs along with MQs and equivalence queries. Based on the examples (elements of the live complete set) and responses against IQs from the <i>minimally adequate teacher</i> (MAT), the learning algorithm constructs the hypothesis automaton, consistent with all observed examples. The Incremental DFA Learning algorithm through Inverse Queries (IDLIQ) takes <math><mstyle><mi>O</mi></mstyle><mrow><mo>(</mo><mrow><mo>|</mo><mi>Σ</mi><mo>|</mo><mi>N</mi><mo>+</mo><mo>|</mo><msub><mrow><mi>P</mi></mrow><mrow><mi>c</mi></mrow></msub><mo>|</mo><mo>|</mo><mi>F</mi><mo>|</mo></mrow><mo>)</mo></mrow></math> time complexity in the presence of a MAT and ensures convergence to a minimal representation of the target DFA with a finite number of labeled examples. Existing incremental learning algorithms, namely the Incremental ID and the Incremental Distinguishing Strings algorithms, have polynomial (cubic) time complexity in the presence of a MAT. Consequently, these algorithms sometimes fail to learn large, complex software systems. In this research work, we have reduced the complexity of DFA learning in an incremental setup from cubic to quadratic. 
Finally, we prove the correctness and termination of the IDLIQ algorithm.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"446-455"},"PeriodicalIF":2.6,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9492270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
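To make the query-based learning setup in the abstract above more concrete, the following minimal sketch shows a DFA and how a teacher could answer a membership query (MQ) by running the queried word through the target automaton. The `DFA` class, the `membership_query` helper, and the example automaton (even number of `a` symbols) are hypothetical illustrations. Inverse queries and the IDLIQ construction itself are not shown, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass
class DFA:
    start: str
    accepting: FrozenSet[str]
    delta: Dict[Tuple[str, str], str]  # (state, symbol) -> next state

    def accepts(self, word: str) -> bool:
        """Run the word from the start state and test the final state."""
        state = self.start
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.accepting


def membership_query(target: DFA, word: str) -> bool:
    """Teacher's answer to an MQ: is the word in the target language?"""
    return target.accepts(word)


# Hypothetical target language: words over {a, b} with an even number of 'a'.
even_a = DFA(
    start="even",
    accepting=frozenset({"even"}),
    delta={
        ("even", "a"): "odd",
        ("even", "b"): "even",
        ("odd", "a"): "even",
        ("odd", "b"): "odd",
    },
)

for word in ["", "ab", "abba"]:
    print(repr(word), membership_query(even_a, word))
```

A learner only sees the answers to such queries, not the transition table, and uses them together with labeled examples to build a hypothesis automaton.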
Big Data | Pub Date: 2024-12-01 | Epub Date: 2022-02-10 | DOI: 10.1089/big.2021.0013
Rajalakshmi Gurusamy, Siva Ranjani Seenivasan
{"title":"DGSLSTM: Deep Gated Stacked Long Short-Term Memory Neural Network for Traffic Flow Forecasting of Transportation Networks on Big Data Environment.","authors":"Rajalakshmi Gurusamy, Siva Ranjani Seenivasan","doi":"10.1089/big.2021.0013","DOIUrl":"10.1089/big.2021.0013","url":null,"abstract":"<p><p>Deep learning and big data techniques have become increasingly popular in traffic flow forecasting. Deep neural networks have also been applied to traffic flow forecasting. Furthermore, it is difficult to determine whether neural networks can be used for accurate traffic flow prediction. Moreover, since the network model is poorly structured and the parameter optimization technique is inappropriate, the traffic flow prediction is inaccurate because of the lack of certainty. The proposed system overcomes these problems by combining multiple simple recurrent long short-term memory (LSTM) neural networks with time traits to predict traffic flow using a deep gated stacked neural network. To deepen the model, the hidden layers have been trained using an unsupervised layer-by-layer approach. This approach provides a systematic representation of the time series data. A systematic representation of hidden layers improves the accuracy of time series forecasting by capturing information at multiple levels. Furthermore, it emphasizes the importance of model structure, random weight initialization, and hyperparameters used in stacked LSTM to enhance predictive performance. 
The prediction efficacy of the deep gated stacked LSTM model is compared with that of the gated recurrent unit model and the stacked autoencoder model.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"504-517"},"PeriodicalIF":2.6,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39906258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Big Data | Pub Date: 2024-12-01 | Epub Date: 2022-02-08 | DOI: 10.1089/big.2021.0200
Sima Attar-Khorasani, Ricardo Chalmeta
{"title":"Internet of Things Data Visualization for Business Intelligence.","authors":"Sima Attar-Khorasani, Ricardo Chalmeta","doi":"10.1089/big.2021.0200","DOIUrl":"10.1089/big.2021.0200","url":null,"abstract":"<p><p>This study contributes to the research on Internet of Things data visualization for business intelligence processes, an area of growing interest to scholars, by conducting a systematic review of the literature. A total of 237 articles published over the past 11 years were obtained and compared. This made it possible to identify the top contributing and most influential authors, countries, publishers, institutions, papers, and research findings, together with the challenges facing current research. Based on these results, this work provides a thorough insight into the field by proposing four research categories (Technology infrastructure, Case examples, Final-user experience, and Big Data tools), together with the development of these research streams over time and their future research directions.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"478-503"},"PeriodicalIF":2.6,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39899264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}