Pengju Lyu, Wenjian Liu, Tingyi Lin, Jie Zhang, Yao Liu, Cheng Wang, Jianjun Zhu
{"title":"Semi-Supervised Segmentation of Abdominal Organs and Liver Tumor: Uncertainty Rectified Curriculum Labeling Meets X-Fuse","authors":"Pengju Lyu, Wenjian Liu, Tingyi Lin, Jie Zhang, Yao Liu, Cheng Wang, Jianjun Zhu","doi":"10.1088/2632-2153/ad4c38","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4c38","url":null,"abstract":"Precise segmentation of liver tumors and associated organs holds immense value for surgical and radiological intervention, enabling anatomical localization for pre-operative planning and intra-operative guidance. Modern deep learning models for medical image segmentation have evolved from convolutional neural networks to transformer architectures, significantly boosting global context understanding. However, accurate delineation, especially of hepatic lesions, remains an enduring challenge because models focus predominantly on spatial feature extraction, which fails to adequately characterize complex medical anatomies. Moreover, the relative paucity of expertly annotated medical imaging data restricts model exposure to diverse pathological presentations. In this paper, we present a three-phase cascaded segmentation framework featuring an X-Fuse model that synergistically integrates complementary information from the spatial and frequency domains in dual encoders to enrich latent feature representation. To enhance model generalizability, building upon the X-Fuse topology and taking advantage of additional unlabeled pathological data, our proposed integration of curriculum pseudo-labeling with Jensen-Shannon variance-based uncertainty rectification promotes optimized pseudo-supervision in the context of semi-supervised learning. We further introduce a tumor-focus augmentation technique, including training-free copy-paste and knowledge-based synthesis, that is simple yet effective and substantially improves model adaptability to diverse lesion morphologies. 
Extensive experiments and modular evaluations on a holdout test set demonstrate that our methods significantly outperform existing state-of-the-art segmentation models in both supervised and semi-supervised settings, as measured by the Dice similarity coefficient, achieving superior delineation of bones (95.42%), liver (96.26%), and liver tumors (89.53%), with a 16.41% increase compared to V-Net in the supervised-only, augmentation-free scenario. Our method marks a significant step toward the realization of more reliable and robust AI-assisted diagnostic tools for liver tumor intervention. We have made the code publicly available.","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":"57 7","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140975988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Physics-inspired spatiotemporal-graph AI ensemble for the detection of higher order wave mode signals of spinning binary black hole mergers","authors":"Minyang Tian, Eliu Huerta, Huihuo Zheng, Prayush Kumar","doi":"10.1088/2632-2153/ad4c37","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4c37","url":null,"abstract":"We present a new class of AI models for the detection of quasi-circular, spinning, non-precessing binary black hole mergers whose waveforms include the higher order gravitational wave modes $(\\ell, |m|)=\\{(2, 2), (2, 1), (3, 3), (3, 2), (4, 4)\\}$, and mode mixing effects in the $(\\ell = 3, |m| = 2)$ harmonics. These AI models combine hybrid dilated convolutional neural networks to accurately model both short- and long-range temporal sequential information of gravitational waves; and graph neural networks to capture spatial correlations among gravitational wave observatories to consistently describe and identify the presence of a signal in a three detector network encompassing the Advanced LIGO and Virgo detectors. We first trained these spatiotemporal-graph AI models using synthetic noise, using 1.2 million modeled waveforms to densely sample this signal manifold, within 1.7 hours using 256 NVIDIA A100 GPUs in the Polaris supercomputer at the Argonne Leadership Computing Facility. This distributed training approach exhibited optimal classification performance, and strong scaling up to 512 NVIDIA A100 GPUs. With these AI ensembles we processed data from a three detector network, and found that an ensemble of 4 AI models achieves state-of-the-art performance for signal detection, and reports two misclassifications for every decade of searched data. We distributed AI inference over 128 GPUs in the Polaris supercomputer and 128 nodes in the Theta supercomputer, and completed the processing of a decade of gravitational wave data from a three detector network within 3.5 hours. 
Finally, we fine-tuned these AI ensembles to process the entire month of February 2020, which is part of the O3b LIGO/Virgo observation run, and found 6 gravitational waves, concurrently identified in Advanced LIGO and Advanced Virgo data, and zero false positives. This analysis was completed in one hour using one NVIDIA A100 GPU.","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":"59 9","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140972241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interpolation of Environmental Data Using Deep Learning and Model Inference","authors":"C. Ibebuchi, Itohan-Osa Abu","doi":"10.1088/2632-2153/ad4b94","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4b94","url":null,"abstract":"The temporal resolution of environmental data sets plays a major role in the granularity of the information that can be derived from the data. In most cases, it is required that different data sets have a common temporal resolution to enable their consistent evaluations and applications in making informed decisions. This study leverages deep learning with long short-term memory (LSTM) neural networks and model inference to enhance the temporal resolution of climate datasets, specifically temperature, and precipitation, from daily to sub-daily scales. We trained our model to learn the relationship between daily and sub-daily data, subsequently applying this knowledge to increase the resolution of a separate dataset with a coarser (daily) temporal resolution. Our findings reveal a high degree of accuracy for temperature predictions, evidenced by a correlation of 0.99 and a mean absolute error of 0.21 °C, between the actual and predicted sub-daily values. In contrast, the approach was less effective for precipitation, achieving an explained variance of only 37%, compared to 98% for temperature. Further, besides the sub-daily interpolation of the climate data sets, we adapted our approach to increase the resolution of the Normalized difference vegetation index of Landsat (from 16-day to 5-day interval) using the LSTM model pre-trained from the Sentinel 2 Normalized difference vegetation index - that exists at a relatively higher temporal resolution. The explained variance between the predicted Landsat and Sentinel 1 data is 70% with a mean absolute error of 0.03. 
These results suggest that our method is particularly suitable for environmental datasets with less pronounced short-term variability, offering a promising tool for improving the resolution and utility of the data.","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":"12 26","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140980742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aya Messai, Ahlem Drif, A. Ouyahia, Meriem Guechi, Mounira Rais, Lars Kaderali, Hocine Cherifi
{"title":"Towards XAI agnostic explainability to assess differential diagnosis for Meningitis diseases","authors":"Aya Messai, Ahlem Drif, A. Ouyahia, Meriem Guechi, Mounira Rais, Lars Kaderali, Hocine Cherifi","doi":"10.1088/2632-2153/ad4a1f","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4a1f","url":null,"abstract":"Meningitis, characterized by meninges and cerebrospinal fluid (CSF) inflammation, poses diagnostic challenges due to diverse clinical manifestations. This work introduces an explainable AI automatic medical decision methodology that determines critical features and their relevant values for the differential diagnosis of various meningitis cases. We proceed with knowledge acquisition to define the rules for this research. Currently, we have established the etiological diagnosis of Meningococcaemia, Meningococcal Meningitis, Tuberculous Meningitis, Aseptic Meningitis, Haemophilus influenzae Meningitis, and Pneumococcal Meningitis. The data preprocessing was conducted after collecting data from samples with meningitis diseases at Setif Hospital in Algeria. Tree-based ensemble methods were then applied to assess the model’s performance. Finally, we implement an XAI agnostic explainability approach based on the SHapley Additive exPlanations technique to attribute each feature’s contribution to the model’s output. Experiments were conducted on the collected dataset and the SINAN database, obtained from the Brazilian Government’s Health Information System on Notifiable Diseases, which comprises 6729 patients aged over 18 years. The Extreme Gradient Boosting model was chosen for its superior performance metrics (Accuracy: 0.90, AUROC: 0.94, and F1-score: 0.98). Setif’s hospital data revealed notable performance metrics (Accuracy: 0.7143, F1-Score: 0.7857). This study's findings showcase each feature's contribution to the model’s predictions and diagnosis. It also reveals critical biomarker ranges associated with distinct types of Meningitis. 
Significant diagnostic effect was found for Meningococcal Meningitis with elevated neutrophil levels (>40%) and balanced lymphocyte levels (40-60%). Tuberculous Meningitis demonstrated low neutrophil levels (<60%) and elevated lymphocyte levels (>60%). Haemophilus influenzae meningitis exhibited a predominance of neutrophils (>80%), while Aseptic meningitis showed lower neutrophil levels (<40%) and lymphocyte levels within the range of 50-60%. The majority of the AI automatic medical decision results are twinned with validation by our team of infectious disease experts, confirming the alignment of algorithmic diagnoses with clinical practices.","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":" 38","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140993021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Antonii Belyshev, Alexander Kovrigin, Andrey Ustyuzhanin
{"title":"Beyond Dynamics: Learning to Discover Conservation Principles","authors":"Antonii Belyshev, Alexander Kovrigin, Andrey Ustyuzhanin","doi":"10.1088/2632-2153/ad4a20","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4a20","url":null,"abstract":"The discovery of conservation principles is crucial for understanding the fundamental behavior of both classical and quantum physical systems across numerous domains. This paper introduces an innovative method that merges representation learning and topological analysis to explore the topology of conservation law spaces. Notably, the robustness of our approach to noise makes it suitable for complex experimental setups and its aptitude extends to the analysis of quantum systems, as successfully demonstrated in our paper. We exemplify our method’s potential to unearth previously unknown conservation principles and endorse interdisciplinary research through a variety of physical simulations. In conclusion, this work emphasizes the significance of data-driven techniques in deepening our comprehension of the principles governing classical and quantum physical systems.","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":" 40","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140993444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hussain Ahmad Madni, Rao Muhammad Umer, G. Foresti
{"title":"Exploiting Data Diversity in Multi-Domain Federated Learning","authors":"Hussain Ahmad Madni, Rao Muhammad Umer, G. Foresti","doi":"10.1088/2632-2153/ad4768","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4768","url":null,"abstract":"Federated Learning (FL) is an evolving machine learning technique that allows collaborative model training without sharing the original data among participants. In real-world scenarios, data residing at multiple clients are often heterogeneous in terms of different resolutions, magnifications, scanners, or imaging protocols, and thus challenging for global FL model convergence in collaborative training. Most of the existing FL methods consider data heterogeneity within one domain by assuming the same data variation in each client site. In this paper, we consider data heterogeneity in FL with different domains of heterogeneous data by raising the problems of domain-shift, class-imbalance, and missing data. We propose a method, MDFL (Multi-Domain Federated Learning), as a solution to heterogeneous training data from multiple domains by training a robust Transformer model. We use two loss functions, one for correctly predicting class labels and the other for encouraging similarity and dissimilarity over latent features, to optimize the global FL model. We perform various experiments using different convolution-based networks and non-convolutional Transformer architectures on multi-domain datasets. We evaluate the proposed approach on benchmark datasets and compare with the existing FL methods. 
Our results show the superiority of the proposed approach, which yields a more robust global FL model than the existing methods.","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":"16 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141016885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Machine Learning Prediction Reliability based on Sampling Distance Evaluation with Feature Decorrelation","authors":"Evan Askanazi, Ilya Grinberg","doi":"10.1088/2632-2153/ad4231","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4231","url":null,"abstract":"Despite successful use in a wide variety of disciplines for data analysis and prediction, machine learning (ML) methods suffer from a lack of understanding of the reliability of predictions due to the lack of transparency and black-box nature of ML models. In materials science and other fields, typical ML model results include a significant number of low-quality predictions. This problem is known to be particularly acute for target systems which differ significantly from the data used for ML model training. However, to date, a general method for uncertainty quantification (UQ) of ML predictions has not been available. Focusing on the intuitive and computationally efficient similarity-based UQ, we show that a simple metric based on Euclidean feature space distance and sampling density together with the decorrelation of the features using Gram-Schmidt orthogonalization allows effective separation of the accurately predicted data points from data points with poor prediction accuracy. To demonstrate the generality of the method, we apply it to support vector regression models for various small data sets for materials science and other fields. We also show that the proposed metric is a more effective UQ tool than the standard approach of using the average distance of k nearest neighbors (k=1-10) in feature space for similarity evaluation. Our method is computationally simple, can be used with any ML method and enables analysis of the sources of the ML prediction errors. 
Therefore, it is suitable for use as a standard technique for the estimation of ML prediction reliability for small data sets and as a tool for data set design.","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":"28 41","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140672092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
C. Cakir, Can Bogoclu, Franziska Emmerling, Christina Streli, A. Guilherme Buzanich, Martin Radtke
{"title":"Machine Learning for Efficient Grazing-Exit X-ray Absorption Near Edge Structure Spectroscopy Analysis : Bayesian Optimization Approach","authors":"C. Cakir, Can Bogoclu, Franziska Emmerling, Christina Streli, A. Guilherme Buzanich, Martin Radtke","doi":"10.1088/2632-2153/ad4253","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4253","url":null,"abstract":"In materials science, traditional techniques for analysing layered structures are essential for obtaining information about local structure, electronic properties and chemical states. While valuable, these methods often require high vacuum environments and have limited depth profiling capabilities. The Grazing Exit X-ray Absorption Near-Edge Structure (GE-XANES) technique addresses these limitations by providing depth-resolved insight at ambient conditions, facilitating in situ material analysis without special sample preparation. However, GE-XANES is limited by long data acquisition times, which hinders its practicality for various applications. To overcome this, we have incorporated Bayesian Optimization (BO) into the GE-XANES data acquisition process. This innovative approach significantly reduces data acquisition time from 20 hours to 25 minutes. We used a standard GE-XANES experiment as a reference to validate the effectiveness and accuracy of the BO-informed experimental setup. 
Our results show that this optimized approach maintains data quality while significantly improving efficiency, making GE-XANES more accessible to a wider range of materials science applications.","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":"35 12","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140667613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quadratic hyper-surface kernel-free large margin distribution machine-based regression and its least-square form","authors":"Hao He, Kuaini Wang, Yuzhu Jiang, Huimin Pei","doi":"10.1088/2632-2153/ad40fc","DOIUrl":"https://doi.org/10.1088/2632-2153/ad40fc","url":null,"abstract":"ε-Support vector regression (ε-SVR) is a powerful machine learning approach that focuses on minimizing the margin, which represents the tolerance range between predicted and actual values. However, recent theoretical studies have highlighted that simply minimizing structural risk does not necessarily result in a good margin distribution. Instead, it has been shown that the distribution of margins plays a more crucial role in achieving better generalization performance. Furthermore, the kernel-free technique offers a significant advantage as it effectively reduces the overall running time and simplifies the parameter selection process compared to the kernel trick. Based on existing kernel-free regression methods, we present two efficient and robust approaches named quadratic hyper-surface kernel-free large margin distribution machine-based regression (QLDMR) and quadratic hyper-surface kernel-free least squares large margin distribution machine-based regression (QLSLDMR). The QLDMR optimizes the margin distribution by considering both ε-insensitive loss and quadratic loss function similar to the large-margin distribution machine-based regression (LDMR). QLSLDMR aims to reduce the computational cost of QLDMR by transforming inequality constraints into an equality constraint, inspired by least squares support vector machines (LSSVR). Both models combine the spirit of optimal margin distribution with the kernel-free technique and, after simplification, are convex, so they can be solved by classical methods. 
Experimental results demonstrate the superiority of the optimal margin distribution combined with the kernel-free technique in robustness, generalization, and efficiency.","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":" 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140685194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Protein Representation Learning and Target-aware Variational Auto-encoders for Protein-binding Ligand Generation","authors":"Nhat-Khang Ngô, T. Hy","doi":"10.1088/2632-2153/ad3ee4","DOIUrl":"https://doi.org/10.1088/2632-2153/ad3ee4","url":null,"abstract":"Without knowledge of specific pockets, generating ligands based on the global structure of a protein target plays a crucial role in drug discovery as it helps reduce the search space for potential drug-like candidates in the pipeline. However, contemporary methods require optimizing tailored networks for each protein, which is arduous and costly. To address this issue, we introduce TargetVAE, a target-aware variational auto-encoder that generates ligands with desirable properties including high binding affinity and high synthesizability to arbitrary target proteins, guided by a multimodal deep neural network built based on geometric and sequence models, named Protein Multimodal Network (PMN), as the prior for the generative model. PMN unifies different representations of proteins (e.g., primary structure - sequence of amino acids, 3D tertiary structure, and residue-level graph) into a single representation. Our multimodal architecture learns from the entire protein structure and is able to capture their sequential, topological, and geometrical information by utilizing language modeling, graph neural networks, and geometric deep learning. We showcase the superiority of our approach by conducting extensive experiments and evaluations, including predicting protein-ligand binding affinity in the PDBbind v2020 dataset as well as the assessment of generative model quality, ligand generation for unseen targets, and docking score computation. Empirical results demonstrate the promising and competitive performance of our proposed approach. 
Our software package is publicly available at https://github.com/HySonLab/Ligand_Generation","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":"43 36","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140701581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}