{"title":"Research on the generation and evaluation of bridge defect datasets for underwater environments utilizing CycleGAN networks","authors":"","doi":"10.1016/j.eswa.2024.125576","DOIUrl":"10.1016/j.eswa.2024.125576","url":null,"abstract":"<div><div>Surface cracks on underwater structures critically damage their overall reliability and reduce their strength, so it is important to monitor these cracks in a timely manner. Recently, deep learning algorithms have been used for large-scale data analysis and prediction. However, deep supervised learning algorithms must be trained on large-scale datasets, which is time-consuming and difficult to apply to underwater structures. To address these issues, this research proposes an improved cycle-consistent generative adversarial network (CycleGAN) for the timely detection of surface cracks in underwater structures. The proposed algorithm uses image processing techniques, including DeblurGAN and the dark channel prior method, to improve the quality of the dataset obtained from underwater structures. It also introduces a novel cross-domain VGG-cosine similarity assessment to precisely evaluate how well the algorithm retains crack information. The performance of the proposed algorithm is evaluated through both qualitative and quantitative methods. The qualitative results are obtained directly from the visual results generated by the proposed algorithm, whereas the quantitative results are obtained from metrics including PSNR, SSIM, and FID. Experimental results indicate that the proposed algorithm outperforms the original CycleGAN: it decreased the FID by 20 % and increased the PSNR and SSIM by 2.37 % and 3.33 %, respectively. 
Both the quantitative and qualitative results show that the proposed algorithm offers significant advantages in generating surface crack images.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142593683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid genetic algorithm with Wiener process for multi-scale colored balanced traveling salesman problem","authors":"","doi":"10.1016/j.eswa.2024.125610","DOIUrl":"10.1016/j.eswa.2024.125610","url":null,"abstract":"<div><div>The colored traveling salesman problem (CTSP) can be applied to multi-machine engineering systems (MES) in industry. The colored balanced traveling salesman problem (CBTSP) is a variant of CTSP that can be used to model optimization problems with partially overlapped workspaces, such as planning optimization (for example, process planning, assembly planning, and production scheduling). Traditional algorithms have been used to solve CBTSP; however, they are limited in both solution quality and solving speed, and the scale of CBTSP they can handle is restricted. Moreover, traditional algorithms lack theoretical support from mathematical physics. To address these limitations, this paper proposes a novel hybrid genetic algorithm (NHGA) based on the Wiener process (Itô process) and generating neighborhood solution (GNS) to solve the multi-scale CBTSP. NHGA first uses dual-chromosome coding to construct solutions of CBTSP, which are then updated by the crossover operator, mutation operator, and GNS. The crossover length of the crossover operator and the city number of the mutation operator are controlled by an activity intensity based on the Itô process, while the city keeping probability of GNS can be learned or obtained from the Wiener process. 
The experiments show that NHGA improves on state-of-the-art algorithms for multi-scale CBTSP in terms of solution quality.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142586046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GCENet: A geometric correspondence estimation network for tracking and loop detection in visual–inertial SLAM","authors":"","doi":"10.1016/j.eswa.2024.125659","DOIUrl":"10.1016/j.eswa.2024.125659","url":null,"abstract":"<div><div>Establishing robust and effective data correlation has been one of the core problems in visual based SLAM (Simultaneous Localization and Mapping). In this paper, we propose a geometric correspondence estimation network, GCENet, tailored for visual tracking and loop detection in visual–inertial SLAM. GCENet considers both local and global correlation in frames, enabling deep feature matching in scenarios involving noticeable displacement. Building upon this, we introduce a tightly-coupled visual–inertial state estimation system. To address challenges in extreme environments, such as strong illumination and weak texture, where manual feature matching tends to fail, a compensatory deep optical flow tracker is incorporated into our system. In such cases, our approach utilizes GCENet for dense optical flow tracking, replacing manual pipelines to conduct visual tracking. Furthermore, a deep loop detector based on GCENet is constructed, which utilizes estimated flow to represent scene similarity. Spatial consistency discrimination on candidate loops is conducted with GCENet to establish long-term data association, effectively suppressing false negatives and false positives in loop closure. Dedicated experiments are conducted in EuRoC drone, TUM-4Seasons and private robot datasets to evaluate the proposed method. 
The results demonstrate that our system exhibits superior robustness and accuracy in extreme environments compared to the state-of-the-art methods.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142593550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CE-DCVSI: Multimodal relational extraction based on collaborative enhancement of dual-channel visual semantic information","authors":"","doi":"10.1016/j.eswa.2024.125608","DOIUrl":"10.1016/j.eswa.2024.125608","url":null,"abstract":"<div><div>Visual information implied by the images in multimodal relation extraction (MRE) usually contains details that are difficult to describe in text sentences. Integrating textual and visual information is the mainstream method to enhance the understanding and extraction of relations between entities. However, existing MRE methods neglect the semantic gap caused by data heterogeneity. Besides, some approaches map the relations between target objects in image scene graphs to text, but massive invalid visual relations introduce noise. To alleviate the above problems, we propose a novel multimodal relation extraction method based on cooperative enhancement of dual-channel visual semantic information (CE-DCVSI). Specifically, to mitigate the semantic gap between modalities, we realize fine-grained semantic alignment between entities and target objects through multimodal heterogeneous graphs, aligning feature representations of different modalities into the same semantic space using the heterogeneous graph Transformer, thus promoting the consistency and accuracy of feature representations. To eliminate the effect of useless visual relations, we perform multi-scale feature fusion between different levels of visual information and textual representations to increase the complementarity between features, improving the comprehensiveness and robustness of the multimodal representation. Finally, we utilize the information bottleneck principle to filter out invalid information from the multimodal representation to mitigate the negative impact of irrelevant noise. 
The experiments demonstrate that the method achieves an F1 score of 86.08% on the publicly available MRE dataset, outperforming other baseline methods.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142578463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of gene regulatory networks associated with breast cancer patient survival using an interpretable deep neural network model","authors":"","doi":"10.1016/j.eswa.2024.125632","DOIUrl":"10.1016/j.eswa.2024.125632","url":null,"abstract":"<div><div>Artificial neural networks have recently gained significant attention in biomedical research. However, their utility in survival analysis still faces many challenges. In addition to designing models for high accuracy, it is essential to optimize models that provide biologically meaningful insights. With these considerations in mind, we developed a deep neural network model, MaskedNet, to identify genes and pathways whose expression at the time of diagnosis is associated with overall survival. MaskedNet was trained using TCGA breast cancer transcriptome and clinical data, and the model’s final output was the predicted logarithm of the hazard ratio for death. The trained model was interpreted using SHapley Additive exPlanations (SHAP), a technique grounded in robust mathematical principles that assigns importance scores to input features. Compared to traditional Cox proportional hazards regression, MaskedNet had higher accuracy, as measured by Harrell’s C-index. We also found that aggregating outputs from several model runs identified multiple genes and pathways associated with overall survival, including <em>IFNG</em> and <em>PIK3CA</em> genes<em>,</em> along with their related pathways. To further elucidate the role of the <em>IFNG</em> gene, tumors were partitioned into two groups based on low and high <em>IFNG</em> SHAP values, respectively. Tumors with lower <em>IFNG</em> SHAP values exhibited higher <em>IFNG</em> expression and better overall survival, which were linked to more abundant presence of M1 macrophages and activated CD4+ and CD8+ T cells in the tumor microenvironment. 
The association of the <em>IFNG</em> pathway with overall survival was validated in the trastuzumab arm of the NCCTG-N9831 trial, an independent breast cancer study.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142578466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning face super-resolution through identity features and distilling facial prior knowledge","authors":"","doi":"10.1016/j.eswa.2024.125625","DOIUrl":"10.1016/j.eswa.2024.125625","url":null,"abstract":"<div><div>Deep learning techniques in electronic surveillance have shown impressive performance for super-resolution (SR) of captured low-quality face images. Most of these techniques adopt facial priors to improve the feature details in the resultant super-resolved images. However, the estimation of facial priors from the captured low-quality images is often inaccurate in real-life situations because of their tiny, noisy, and blurry nature. Thus, the fusion of such priors badly affects the performance of these models. Therefore, this work presents a teacher–student-based face SR framework that efficiently preserves the personal facial structure information in the super-resolved faces. In the proposed framework, the teacher network exploits the facial heatmap-based ground-truth-prior to learn the facial structure that is utilized by the student network. The student network is trained with the identity feature loss for maintaining the identity and facial structure information in reconstructed high-resolution (HR) face images. The performance of the proposed framework is evaluated by conducting the experimental study on standard datasets namely CelebA-HQ and LFW face. 
The experimental results reveal that the proposed technique outperforms existing methods on the face SR task.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142578464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vehicle trajectory extraction with interacting multiple model for low-channel roadside LiDAR","authors":"","doi":"10.1016/j.eswa.2024.125662","DOIUrl":"10.1016/j.eswa.2024.125662","url":null,"abstract":"<div><div>High-precision and consistent vehicle trajectories encompass microscopic traffic parameters, mesoscopic traffic flow characteristics, and macroscopic traffic flow features, which are the cornerstone of innovation in data-driven traffic management and control applications. However, occlusion and trajectory interruption remain challenging in multivehicle tracking under complex traffic environments using low-channel roadside LiDAR. To address this challenge, a novel framework for vehicle trajectory extraction using low-channel roadside LiDAR was proposed. First, the geometric features of the cluster and its L-shape bounding box were used to address the over-segmentation in vehicle detection arising from occlusion and point cloud sparsity. Then, objects within adjacent point cloud frames were associated by developing an improved Hungarian algorithm integrated with an adaptive distance threshold to solve the mismatching problem caused by objects entering and exiting a new point cloud frame. Finally, an improved interacting multiple model that considers vehicle driving patterns was deployed to predict the location of missing vehicles and connect the interrupted trajectories. Experimental results showed that the proposed methods achieve a vehicle detection accuracy of 98.76 % and a data association precision of 97.40 %. The mean absolute error (MAE) and mean square error (MSE) of the vehicle position estimation are 0.2252 m and 0.0729 m<sup>2</sup>, respectively. 
The trajectory extraction precision outperforms most of the state-of-the-art algorithms.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142586044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new look of dispatching for multi-objective interbay AMHS in semiconductor wafer manufacturing: A T–S fuzzy-based learning approach","authors":"","doi":"10.1016/j.eswa.2024.125615","DOIUrl":"10.1016/j.eswa.2024.125615","url":null,"abstract":"<div><div>Semiconductor wafer fabrication systems (SWFS) are among the most intricate discrete processing environments globally. Since the costs associated with automated material handling systems (AMHS) within fabs account for 20%–50% of manufacturing expenses, it is crucial to enhance the efficiency of material handling in semiconductor production lines. However, optimizing AMHS is difficult due to the complexities inherent in large-scale, nonlinear, dynamic, and stochastic production settings, as well as differing objectives and goals. To overcome these challenges, this paper presents a novel fuzzy-based learning algorithm to enhance the multi-objective dispatching model, which incorporates both transportation and production aspects for interbay AMHS in wafer fabrication manufacturing, aligning it more closely with real-world conditions. Furthermore, we formulate a new constrained nonlinear dispatching problem. To tackle the inherent nonlinearity, a Takagi-Sugeno (T–S) fuzzy modeling approach is developed, which transforms nonlinear terms into a fuzzy linear dispatching model and optimizes the weight in multi-objective problems to obtain the optimal solution. The effectiveness and superiority of the proposed approach are demonstrated through extensive simulations and comparative analysis with existing methods. 
As a result, the proposed method significantly improves transport efficiency, increases wafer throughput, and reduces processing cycle times.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142572124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development of A deep Learning-based algorithm for High-Pitch helical computed tomography imaging","authors":"","doi":"10.1016/j.eswa.2024.125663","DOIUrl":"10.1016/j.eswa.2024.125663","url":null,"abstract":"<div><div>High-pitch X-ray helical computed tomography (HCT) imaging has recently drawn considerable attention in biomedical fields due to its ability to reduce the scanning time and thus lower the radiation dose that the objects being imaged may receive. However, the issue of compromised reconstruction quality caused by incomplete data in these high-pitch CT scans remains, limiting its applications. To address this issue, this paper presents our study on the development of a novel deep learning (DL)-based algorithm, ViT-U, for high-pitch X-ray propagation-based imaging HCT (PBI-HCT) reconstruction. ViT-U consists of two key processing modules, a vision transformer (ViT) and a convolutional neural network (i.e., U-Net), where ViT addresses the missing information in the data domain and U-Net enhances the post data-processing in the reconstruction domain. For verification, we designed and conducted simulations and experiments with both low-density-biomaterial samples and biological-tissue samples to exemplify the biomedical applications, and then examined the ViT-U performance with varying pitches of 3, 3.5, 4, and 4.5, respectively, for comparison in terms of radiation dose and reconstruction quality. Our results showed that the high-pitch PBI-HCT allowed for dose reductions ranging from 72% to 93%. Importantly, our results demonstrated that ViT-U exhibited outstanding performance by effectively removing the missing wedge artifacts, thus enhancing the reconstruction quality of high-pitch PBI-HCT imaging. 
Our results also showed that ViT-U can achieve high-quality reconstruction from high-pitch images with helical pitch values up to 4 (which allowed for a substantial reduction of radiation dose). Taken together, our DL-based ViT-U algorithm not only enables high-speed imaging with low radiation dose but also maintains high reconstruction quality, thereby offering significant potential for biomedical imaging applications.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142578467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Informer-FDR: A short-term vehicle speed prediction model in car-following scenario based on traffic environment","authors":"","doi":"10.1016/j.eswa.2024.125655","DOIUrl":"10.1016/j.eswa.2024.125655","url":null,"abstract":"<div><div>Drivers’ car-following behaviors on urban roads are influenced by various factors, including pedestrians, cyclists, adjacent vehicles, and roadside parking. However, few models consider these factors’ influence on drivers’ speed selections during car-following, limiting the human-like driving capability of advanced driver assistance systems (ADAS). This paper proposes a vehicle speed prediction model for car-following scenarios that considers the influences of the traffic environment. The vehicle speed is predicted using Informer-FDR (Informer with fusion features, dilated causal convolution, and residual connection), which adopts an improved encoder-decoder structure based on the Informer model. Fusing features of traffic environment characteristics and vehicle dynamics parameters enables the dynamic interaction characteristics between drivers and the traffic environment and potential traffic conflicts to be effectively reflected, which enhances the model’s understanding of the complex driving environment. Moreover, the high computational complexity is reduced by using the ProbSparse self-attention mechanism, which helps to address the difficulty of applying Transformer-class models to on-board platforms. A total of 3,980 car-following cases were extracted from naturalistic driving data (NDD), and vehicle dynamics parameters and traffic environment characteristics in the car-following scenarios were obtained through target detection and ranging algorithms. The optimal feature set was mined using the combined feature selection method. The dilated causal convolution and average pooling layer are introduced to expand the receptive field of the model, enhance global feature extraction, and ensure the causality of temporal predictions. 
Furthermore, the residual connection was added to the encoder, realizing the direct deep transfer of cross-layer information. Verification on the test set shows that Informer-FDR has the lowest MAE (0.583), MSE (2.942), and RMSE (1.715) and the highest speed prediction accuracy (97.76%), spacing gap accuracy (94.27%), and acceleration accuracy (95.35%), outperforming other baseline models in prediction performance. The ablation study confirms the importance of the improved distilling layer module, residual connection module, and fusion features for predictive performance improvement. Additionally, the road-type experiment reveals performance differences of the model on different road types, emphasizing the importance of incorporating the traffic environment on urban roads.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":null,"pages":null},"PeriodicalIF":7.5,"publicationDate":"2024-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142593549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}