{"title":"C3E: A framework for chart classification and content extraction","authors":"Muhammad Suhaib Kanroo , Hadia Showkat Kawoosa , Kapil Rana , Puneet Goyal","doi":"10.1016/j.compeleceng.2024.109861","DOIUrl":"10.1016/j.compeleceng.2024.109861","url":null,"abstract":"<div><div>Incorporating charts into technical documents enhances richness by simplifying complex data representation and improving comprehension. However, automated chart content extraction (CCE) presents a significant challenge within the domain of document analysis and understanding. The CCE problem can be viewed through a series of six sub-tasks: chart classification (CC), text detection and recognition (TDR), text role classification (TRC), axis analysis, legend analysis, and data extraction. Improving these sub-tasks is important for enhancing the effectiveness of CCE. This paper introduces the chart classification and content extraction (C3E) framework, with a primary focus on the first three sub-tasks of CCE: CC, TDR, and TRC. We propose a ChartVision model for the CC, an EfficientNet-based model coupled with a dual-branch architecture incorporating a novel hybrid convolutional and dilated attention module. For text detection and TRC, we introduce a novel CCE method based on YOLOv5, CCE-YOLO, designed for localizing and classifying textual components of varying sizes. Further, for text recognition, we employ a convolutional recurrent neural network with connectionist temporal classification loss. We conducted experimental analysis on benchmark datasets to assess the effectiveness of our approach across each sub-task. Specifically, we evaluated CC, TDR, and TRC methods using the UB-PMC 2020 and UB-PMC 2022 datasets from the ICPR2020 and ICPR2022 CHART-Infographics competitions. 
The C3E framework achieved notable F1-scores of 94.26%, 92.44%, and 80.64% for CC, TDR, and TRC, respectively on the UB-PMC 2020 dataset and 94.0%, 91.98%, and 84.48% for CC, TDR, and TRC, respectively on the UB-PMC 2022 dataset.</div></div>","PeriodicalId":50630,"journal":{"name":"Computers & Electrical Engineering","volume":"121 ","pages":"Article 109861"},"PeriodicalIF":4.0,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonlinear robust integral based actor–critic reinforcement learning control for a perturbed three-wheeled mobile robot with mecanum wheels","authors":"Phuong Nam Dao, Minh Hiep Phung","doi":"10.1016/j.compeleceng.2024.109870","DOIUrl":"10.1016/j.compeleceng.2024.109870","url":null,"abstract":"<div><div>In this article, a novel Robust Integral of the Sign of the Error (RISE)-based Actor/Critic reinforcement learning control structure is established, which addresses the trajectory tracking control problem, optimality performance and observer effectiveness of a three mecanum wheeled mobile robot to be subject to slipping effect. The Actor–Critic Reinforcement Learning algorithm with a discount factor is introduced in integration with the Nonlinear RISE feedback term, which is designated to eliminate the dynamic uncertainties/disturbances from the affine nominal system. On the other hand, the persistence of excitation (PE) condition can be ignored due to the presence of RISE term. Stability analyses in two proposed theorems demonstrate all the signals in the closed-loop system and learning weights would be Uniformly Ultimate Boundedness (UUB) and the consideration of the system under the impact of RISE that can promote the tracking effectiveness. 
Finally, simulation results and a comparative study illustrate the capability of the proposed algorithm and its economical use of control resources.</div></div>","PeriodicalId":50630,"journal":{"name":"Computers & Electrical Engineering","volume":"121 ","pages":"Article 109870"},"PeriodicalIF":4.0,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A monocular three-dimensional object detection model based on uncertainty-guided depth combination for autonomous driving","authors":"Xin Zhou , Xiaolong Xu","doi":"10.1016/j.compeleceng.2024.109864","DOIUrl":"10.1016/j.compeleceng.2024.109864","url":null,"abstract":"<div><div>Three-Dimensional (3D) object detection is a crucial task for enhancing safety and efficiency in autonomous driving. However, estimating depth from monocular images remains a challenging task. Most existing monocular 3D object detection methods rely on additional auxiliary data sources to compensate for the lack of spatial information in monocular images. Nevertheless, these methods bring substantial computational overhead and time-consuming preprocessing steps. To address this issue, we propose a novel depth estimation method for monocular images that does not rely on any auxiliary information. Leveraging both the texture and geometric cues of detected objects, our method generates two depth estimates for each object based on the extracted Region of Interest (RoI) features: a direct depth estimate and a height-based depth estimate with uncertainty modeling. Our model dynamically assigns weights to these depth estimates based on their respective uncertainties and combines them to obtain the final depth. During the training process, the model assigns higher weights to depth branches with higher uncertainties, as these estimates exhibit greater tolerance to errors. As the combined depth network introduces increased complexity, we utilize Group Normalization (GN) to better capture spatial information in the prediction branch outputs. Furthermore, we leverage the Two-Dimensional (2D) information of objects to predict the residual in the 2D center after downsampling, aiding in the regression of 3D center. 
On the KITTI benchmark, our model achieves an average precision (AP) of 16.65 % and 23.19 % on 3D and bird's-eye view (BEV) detection for the moderate category, surpassing the state-of-the-art (SOTA) models in each category.</div></div>","PeriodicalId":50630,"journal":{"name":"Computers & Electrical Engineering","volume":"120 ","pages":"Article 109864"},"PeriodicalIF":4.0,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142698895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A smart contract solution for transparent auctions on permissioned blockchain platform","authors":"Sujata Swain , Vikas Chouhan","doi":"10.1016/j.compeleceng.2024.109859","DOIUrl":"10.1016/j.compeleceng.2024.109859","url":null,"abstract":"<div><div>The rapid and advanced development of the Internet and Technology in recent years has led to the increased popularity of online electronic auctioning (e-auction) systems like eBay. These e-auctions are becoming increasingly essential to the global economy and hold a promising future as they increase user participation and provide advantages such as time-saving, low cost, and ubiquity (not limited to geographic location). However, existing real-time e-auction systems are centralized and rely on intermediaries, leading to issues such as lack of transparency, corruption, security risks, interoperability, and low trust from bidders and auctioneers. To address these issues, we present a permissioned blockchain-based transparent auction system that allows participants to participate in the auction and bidder organization in a single decentralized platform. Auctioneer announces the auction then bidders submit an individual bid for the auction during the auction period. 
In this paper, we present the implementation of smart contracts on the Hyperledger Fabric platform, run testbed experiments using the Caliper benchmark tool, and report the results.</div></div>","PeriodicalId":50630,"journal":{"name":"Computers & Electrical Engineering","volume":"121 ","pages":"Article 109859"},"PeriodicalIF":4.0,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection and restoration of abnormal band data in photometric images","authors":"Guoqing Wang , Bo Qiu , Ali Luo , Xiao Kong , Zhiren Pan , Qi Li , Fuji Ren , Guanlong Cao","doi":"10.1016/j.compeleceng.2024.109871","DOIUrl":"10.1016/j.compeleceng.2024.109871","url":null,"abstract":"<div><div>Addressing the issue of abnormal band data processing in photometric surveys is imperative. Restoring of abnormal band data not only salvages a significant amount of existing astronomical observation data, but also has profound implications on the data processing of new optical telescopes in the future. This paper firstly designs Band Data MogaNet(BDMogaNet) classification model for normal or abnormal band data, which can automatically identify abnormal data. Then, for the restoration of abnormal band data, Global–Local Recursive Generalization(GLRG) restoration network is designed. The experiment used the SDSS image library, and the results proved that the classification accuracy of normal band data and abnormal band data using BDMogaNet reached 99.2% in the training set and 98.0% in the validation set, which had a better classification comparing to some newest methods. Moreover, PSNR of restoring abnormal band data using GLRG reached 33.96 dB, SSIM reached 0.73, and CM reached 6.09, which are all better compared to some newest methods.</div></div>","PeriodicalId":50630,"journal":{"name":"Computers & Electrical Engineering","volume":"121 ","pages":"Article 109871"},"PeriodicalIF":4.0,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A machine learning-based approach for maximizing system profit in a power system by imbalance price curtailment","authors":"Shreya Shree Das , Priyanka Singh , Jayendra Kumar , Subhojit Dawn , Anumoy Ghosh","doi":"10.1016/j.compeleceng.2024.109874","DOIUrl":"10.1016/j.compeleceng.2024.109874","url":null,"abstract":"<div><div>The integration of wind farms into the power grid is difficult due to unpredictable wind speed fluctuations. This variation has an impact on power generation profitability, demanding effective forecasting to lessen pricing risks. A novel optimization approach is proposed in this paper to expand social welfare and profitability while increasing revenue for power generators. This method is crucial for avoiding financial risks related to variable wind patterns. Narrowing the gap between anticipated and actual wind speeds (WS<sub>AN</sub>, WS<sub>AC</sub>) can lessen the negative impact of imbalanced prices on profitability. This reduction is necessary to enhance the economic performance of the power system. The paper endorses the use of machine learning (ML) techniques, specifically Long Short-Term Memory (LSTM) and Random Forest (RF) methods, to precisely anticipate wind speed. These models serve as analytical tools for enlightening decision-making and resource allocation in wind energy generation. According to the study, pricing imbalances have a major impression on profit calculations in deregulated systems. The empirical data show that effective forecasting can expand financial outcomes for energy companies, reducing risks and maximizing revenue. Finally, the empirical results highlight the significance of accurate wind speed forecasts and the use of advanced optimization approaches in growing the profitability and efficiency of renewable energy-dependent power systems. These findings offer a strong foundation for further research and use of machine learning techniques in the energy sector. 
The optimization model was accomplished with modified IEEE 14 bus test systems in this work.</div></div>","PeriodicalId":50630,"journal":{"name":"Computers & Electrical Engineering","volume":"121 ","pages":"Article 109874"},"PeriodicalIF":4.0,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying social concerns in virtual reality technology through text mining and large language models, and prioritizing them with the fuzzy hierarchized analytic network process","authors":"Esmaeil Rezaei , Behzad Mosallanezhad","doi":"10.1016/j.compeleceng.2024.109770","DOIUrl":"10.1016/j.compeleceng.2024.109770","url":null,"abstract":"<div><div>Virtual reality technology has rapidly gained popularity as an entertainment medium, drawing interest from diverse age groups. However, its widespread adoption depends on effectively addressing public concerns and achieving market acceptance. While some studies have acknowledged these concerns, a significant gap persists in comprehensive research that incorporates both individual and expert perspectives. Consequently, certain underlying social issues related to virtual reality systems remain unexplored and unprioritized. To address this gap, this paper proposes a methodology that utilizes Latent Semantic Analysis (LSA) to identify and assess social concerns from various sources, including user perspectives. Large Language Models (LLMs) assist in retrieving relevant chunks of articles during analysis, enhancing data quality. Furthermore, we introduce a novel decision-making tool, the Hierarchized Analytic Network Process (HANP) and its fuzzy form, to effectively rank these concerns. This approach addresses a limitation of the traditional Analytic Network Process (ANP), which can overemphasize dependent attributes, potentially leading to zero-weighted, less important attributes and making comparisons impossible. By prioritizing social concerns based on their significance, our approach aims to facilitate broader social acceptance of virtual reality technologies among the general public. 
To further demonstrate the advantages of our proposed approach, the results obtained from F-HANP (in situations where fuzzy judgments are available) and HANP are compared with other popular decision-making methods.</div></div>","PeriodicalId":50630,"journal":{"name":"Computers & Electrical Engineering","volume":"120 ","pages":"Article 109770"},"PeriodicalIF":4.0,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142699938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Artificial neural network-based virtual synchronous generator for frequency stability improving of grid integrating distributed generators","authors":"Abderrahmane Smahi , Salim Makhloufi","doi":"10.1016/j.compeleceng.2024.109877","DOIUrl":"10.1016/j.compeleceng.2024.109877","url":null,"abstract":"<div><div>The integration of renewable energy sources (RESs) is becoming increasingly prevalent in contemporary power grids. RESs, including distributed generators (DGs), utilize power electronics converters to interface with the grid, contributing to a reduction in grid inertia and an increase in vulnerability to stability issues. This shift has led to a gradual displacement of the traditional role of synchronous generators (SGs) in providing frequency regulation, with power electronics converters such as inverters taking on a more prominent role. Virtual synchronous generators (VSGs) or virtual synchronous machines (VSMs) offer a solution by emulating SG behavior in power electronics converters. However, these techniques encounter limitations in mathematical calculations and precision. This article proposes an artificial intelligent based VSM controller (AIVSM) designed to overcome these limitations. The AIVSM system leverages artificial neural networks (ANNs) to emulate real SGs. The ANN is trained using a substantial dataset derived from a SG of a diesel generator. 
Simulation results demonstrate the performance superiority of the AIVSM when compared to a conventional proportional integral (PI) VSM controller and an adaptive VSM controller.</div></div>","PeriodicalId":50630,"journal":{"name":"Computers & Electrical Engineering","volume":"120 ","pages":"Article 109877"},"PeriodicalIF":4.0,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142699500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High frequency domain enhancement and channel attention module for multi-view stereo","authors":"Yongjuan Yang , Jie Cao , Hong Zhao , Zhaobin Chang , Weijie Wang","doi":"10.1016/j.compeleceng.2024.109855","DOIUrl":"10.1016/j.compeleceng.2024.109855","url":null,"abstract":"<div><div>Multi-view stereo based on deep learning is increasingly popular as a method for 3D reconstruction. Existing methods have made significant advancements in pixel-level depth estimation. However, challenges such as occlusions and non-Lambertian surfaces in images hinder accurate confidence estimation. Moreover, cost volume regularization often results in excessive smoothing at object boundaries. To tackle these challenges, we propose integrating the High Frequency Information Compensator and 3D Channel Attention Module into the Multi-View Stereo Network, termed HFCA-MVS. Firstly, in the feature volume aggregation stage, we introduce a high-frequency information compensator module to enhance the correlation between 2D semantics and 3D space. Subsequently, in the cost volume regularization stage, a 3D channel attention module is introduced to enhance the representation of channel features by capturing relationships among different channels. Lastly, the 3DCNN network employs the GELU activation function to boost the activation response and mitigate excessive object boundary smoothing. HFCA-MVS demonstrates competitive performance in 3D reconstruction across three benchmark datasets: DTU, BlendMVS, and Tanks&Temples. Particularly, compared to CasMVSNet, MVSTER, and Geo-MVSNet on the DTU benchmark, HFCA-MVS achieves a relative improvement in completeness of 33%, 6.5%, and 0.4%, respectively, and an enhancement in overall performance of 15% and 4.2% compared to CasMVSNet and MVSTER. 
Furthermore, our model yields comparable reconstruction results to existing models on the Tanks&Temples dataset.</div></div>","PeriodicalId":50630,"journal":{"name":"Computers & Electrical Engineering","volume":"121 ","pages":"Article 109855"},"PeriodicalIF":4.0,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RCTrans-Net: A spatiotemporal model for fast-time human detection behind walls using ultrawideband radar","authors":"Cries Avian , Jenq-Shiou Leu , Hang Song , Jun-ichi Takada , Nur Achmad Sulistyo Putro , Muhammad Izzuddin Mahali , Setya Widyawan Prakosa","doi":"10.1016/j.compeleceng.2024.109873","DOIUrl":"10.1016/j.compeleceng.2024.109873","url":null,"abstract":"<div><div>Ultrawideband (UWB) radar systems are becoming increasingly popular for detecting human presence, even through walls. Recent advancements in signal processing use deep learning techniques, which are known for their accuracy. While earlier methods focused on spatial information using Convolutional Neural Networks (CNNs), newer research highlights the importance of temporal information, such as how data peaks shift over time. This study introduces RCTrans-Net, a deep-learning architecture that combines RCNet (a Residual CNN) for spatial features with TransNet (a Transformer) for temporal features. This fusion improves human presence classification in fast-time signal processing. Tested under various conditions—different materials, body orientations, ranges, and radar heights—RCTrans-Net achieved high performance with F1-scores of 0.997±0.000 for static, 0.967±0.004 for dynamic, and 0.978±0.001 for combined scenarios. 
The architecture outperforms previous methods and offers real-time processing with an inference time of about one millisecond.</div></div>","PeriodicalId":50630,"journal":{"name":"Computers & Electrical Engineering","volume":"120 ","pages":"Article 109873"},"PeriodicalIF":4.0,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142699502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}