IEEE Access | Pub Date: 2023-08-22 | DOI: 10.1109/ACCESS.2023.3307577
Jiahao Lin;Qi Miao;Chuthaporn Surawech;Steven S. Raman;Kai Zhao;Holden H. Wu;Kyunghyun Sung
{"title":"High-Resolution 3D MRI With Deep Generative Networks via Novel Slice-Profile Transformation Super-Resolution","authors":"Jiahao Lin;Qi Miao;Chuthaporn Surawech;Steven S. Raman;Kai Zhao;Holden H. Wu;Kyunghyun Sung","doi":"10.1109/ACCESS.2023.3307577","DOIUrl":"10.1109/ACCESS.2023.3307577","url":null,"abstract":"High-resolution magnetic resonance imaging (MRI) sequences, such as 3D turbo or fast spin-echo (TSE/FSE) imaging, are clinically desirable but suffer from long scanning time-related blurring when reformatted into preferred orientations. Instead, multi-slice two-dimensional (2D) TSE imaging is commonly used because of its high in-plane resolution but is limited clinically by poor through-plane resolution due to elongated voxels and the inability to generate multi-planar reformations due to staircase artifacts. Therefore, multiple 2D TSE scans are acquired in various orthogonal imaging planes, increasing the overall MRI scan time. In this study, we propose a novel slice-profile transformation super-resolution (SPTSR) framework with deep generative learning for through-plane super-resolution (SR) of multi-slice 2D TSE imaging. The deep generative networks were trained by synthesized low-resolution training input via slice-profile downsampling (SP-DS), and the trained networks inferred on the slice profile convolved (SP-conv) testing input for 5.5x through-plane SR. The network output was further slice-profile deconvolved (SP-deconv) to achieve an isotropic super-resolution. Compared to SMORE SR method and the networks trained by conventional downsampling, our SPTSR framework demonstrated the best overall image quality from 50 testing cases, evaluated by two abdominal radiologists. The quantitative analysis cross-validated the expert reader study results. 3D simulation experiments confirmed the quantitative improvement of the proposed SPTSR and the effectiveness of the SP-deconv step, compared to 3D ground-truths. 
Ablation studies were conducted on the individual contributions of SP-DS and SP-conv, networks structure, training dataset size, and different slice profiles.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"11 ","pages":"95022-95036"},"PeriodicalIF":3.9,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10226181","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10653939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
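The abstract above describes synthesizing low-resolution training input via slice-profile downsampling (SP-DS). A minimal one-dimensional sketch of that general idea follows: blur a through-plane signal with a slice profile, then keep every `factor`-th sample. The function name, the triangular example profile, the edge handling, and the integer factor are all illustrative assumptions; the paper's actual profile and its 5.5x factor are not reproduced here.

```python
def slice_profile_downsample(signal, profile, factor):
    """Blur a 1D through-plane signal with a slice profile, then decimate.

    Hypothetical helper for illustration only; not the authors' code.
    `profile` holds non-negative weights (assumed symmetric) and is
    normalized to sum to 1. Edges are handled by replicating the
    boundary samples.
    """
    total = sum(profile)
    kernel = [w / total for w in profile]
    half = len(kernel) // 2
    n = len(signal)
    blurred = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Clamp the index so samples outside the volume reuse the edge value.
            j = min(max(i + k - half, 0), n - 1)
            acc += w * signal[j]
        blurred.append(acc)
    # Keep every `factor`-th sample to mimic thick, widely spaced slices.
    return blurred[::factor]


if __name__ == "__main__":
    thin_slices = [0.0, 0.0, 0.0, 4.0, 0.0, 0.0]
    print(slice_profile_downsample(thin_slices, [1, 2, 1], 1))
    print(slice_profile_downsample(thin_slices, [1, 2, 1], 3))
```

Weighting neighbouring samples by the profile before decimating is what distinguishes this from plain subsampling, which is the contrast the abstract draws with conventional downsampling.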
IEEE Access | Pub Date: 2023-07-27 | DOI: 10.1109/ACCESS.2023.3299242
Yongkang Liu;Mohamad Omar Al Kalaa
{"title":"Testbed as a RegUlatory Science Tool (TRUST): A Testbed Design for Evaluating 5G-Enabled Medical Devices","authors":"Yongkang Liu;Mohamad Omar Al Kalaa","doi":"10.1109/ACCESS.2023.3299242","DOIUrl":"10.1109/ACCESS.2023.3299242","url":null,"abstract":"The fifth-generation (5G) cellular communication technology introduces technical advances that can expand medical device access to connectivity services. However, assessing the safety and effectiveness of emerging 5G-enabled medical devices is challenging as relevant evaluation methods have not yet been established. In this paper, we propose a design model for 5G testbed as a regulatory science tool (TRUST) for assessing 5G connectivity enablers of medical device functions. Specifically, we first identify application specific testing needs and general testing protocols. Next, we outline the selection and customization of key system components to create a 5G testbed. A TRUST demonstration is documented through a realistic 5G testbed implementation along with the deployment of a custom-built example use-case for 5G-enabled medical extended reality (MXR). Detailed configurations, example collected data, and implementation challenges are presented. 
The openness of the TRUST design model allows a TRUST testbed to be easily extended and customized to incorporate available resources and address the evaluation needs of different stakeholders.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"11 ","pages":"81563-81576"},"PeriodicalIF":3.9,"publicationDate":"2023-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10196310","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10241376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE Access | Pub Date: 2023-07-25 | DOI: 10.1109/ACCESS.2023.3298569
Boyu Zhang;Aleksandar Vakanski;Min Xian
{"title":"BI-RADS-NET-V2: A Composite Multi-Task Neural Network for Computer-Aided Diagnosis of Breast Cancer in Ultrasound Images With Semantic and Quantitative Explanations","authors":"Boyu Zhang;Aleksandar Vakanski;Min Xian","doi":"10.1109/ACCESS.2023.3298569","DOIUrl":"10.1109/ACCESS.2023.3298569","url":null,"abstract":"Computer-aided Diagnosis (CADx) based on explainable artificial intelligence (XAI) can gain the trust of radiologists and effectively improve diagnosis accuracy and consultation efficiency. This paper proposes BI-RADS-Net-V2, a novel machine learning approach for fully automatic breast cancer diagnosis in ultrasound images. The BI-RADS-Net-V2 can accurately distinguish malignant tumors from benign ones and provides both semantic and quantitative explanations. The explanations are provided in terms of clinically proven morphological features used by clinicians for diagnosis and reporting mass findings, i.e., Breast Imaging Reporting and Data System (BI-RADS). The experiments on 1,192 Breast Ultrasound (BUS) images indicate that the proposed method improves the diagnosis accuracy by taking full advantage of the medical knowledge in BI-RADS while providing both semantic and quantitative explanations for the decision.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"11 ","pages":"79480-79494"},"PeriodicalIF":3.9,"publicationDate":"2023-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/9c/90/nihms-1922478.PMC10443928.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10114845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE Access | Pub Date: 2023-07-25 | DOI: 10.1109/ACCESS.2023.3298750
Vy C. B. Bui;Ziv Yaniv;Michael Harris;Feng Yang;Karthik Kantipudi;Darrell Hurt;Alex Rosenthal;Stefan Jaeger
{"title":"Combining Radiological and Genomic TB Portals Data for Drug Resistance Analysis","authors":"Vy C. B. Bui;Ziv Yaniv;Michael Harris;Feng Yang;Karthik Kantipudi;Darrell Hurt;Alex Rosenthal;Stefan Jaeger","doi":"10.1109/ACCESS.2023.3298750","DOIUrl":"10.1109/ACCESS.2023.3298750","url":null,"abstract":"Tuberculosis (TB) drug resistance is a worldwide public health problem. It decreases the likelihood of a positive outcome for the individual patient and increases the likelihood of disease spread. Therefore, early detection of TB drug resistance is crucial for improving outcomes and controlling disease transmission. While drug-sensitive tuberculosis cases are declining worldwide because of effective treatment, the threat of drug-resistant tuberculosis is growing, and the success rate of drug-resistant tuberculosis treatment is only around 60%. The TB Portals program provides a publicly accessible repository of TB case data with an emphasis on collecting drug-resistant cases. The dataset includes multi-modal information such as socioeconomic/geographic data, clinical characteristics, pathogen genomics, and radiological features. The program is an international collaboration whose participants are typically under a substantial burden of drug-resistant tuberculosis, with data collected from standard clinical care provided to the patients. Consequentially, the TB Portals dataset is heterogenous in nature, with data representing multiple treatment centers in different countries and containing cross-domain information. This study presents the challenges and methods used to address them when working with this real-world dataset. Our goal was to evaluate whether combining radiological features derived from a chest X-ray of the host and genomic features from the pathogen can potentially improve the identification of the drug susceptibility type, drug-sensitive (DS-TB) or drug-resistant (DR-TB), and the length of the first successful drug regimen. 
To perform these studies, significantly imbalanced data needed to be processed, which included a much larger number of DR-TB cases than DS-TB, many more cases with radiological findings than genomic ones, and the sparse high dimensional nature of the genomic information. Three evaluation studies were carried out. First, the DR-TB/DS-TB classification model achieved an average accuracy of 92.4% when using genomic features alone or when combining radiological and genomic features. Second, the regression model for the length of the first successful treatment had a relative error of 53.5% using radiological features, 25.6% using genomic features, and 22.0% using both radiological and genomic features. Finally, the relative error of the third regression model predicting the length of the first treatment using the most common drug combination varied depending on the feature type used. When using radiological features alone, the relative error was 17.8%. For genomic features alone, the relative error increased to 19.9%. The model had a relative error of 19.0% when both radiological and genomic features were combined. Although combining radiological and genomic features did not improve upon the use of genomic features when classifying DR-TB/DS-TB, the combination of the two feature types improved the relative error of ","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"11 ","pages":"84228-84240"},"PeriodicalIF":3.9,"publicationDate":"2023-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/72/e4/nihms-1924913.PMC10473876.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10149871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessing Inter-Annotator Agreement for Medical Image Segmentation","authors":"Feng Yang;Ghada Zamzmi;Sandeep Angara;Sivaramakrishnan Rajaraman;André Aquilina;Zhiyun Xue;Stefan Jaeger;Emmanouil Papagiannakis;Sameer K. Antani","doi":"10.1109/ACCESS.2023.3249759","DOIUrl":"10.1109/ACCESS.2023.3249759","url":null,"abstract":"Artificial Intelligence (AI)-based medical computer vision algorithm training and evaluations depend on annotations and labeling. However, variability between expert annotators introduces noise in training data that can adversely impact the performance of AI algorithms. This study aims to assess, illustrate and interpret the inter-annotator agreement among multiple expert annotators when segmenting the same lesion(s)/abnormalities on medical images. We propose the use of three metrics for the qualitative and quantitative assessment of inter-annotator agreement: 1) use of a common agreement heatmap and a ranking agreement heatmap; 2) use of the extended Cohen’s kappa and Fleiss’ kappa coefficients for a quantitative evaluation and interpretation of inter-annotator reliability; and 3) use of the Simultaneous Truth and Performance Level Estimation (STAPLE) algorithm, as a parallel step, to generate ground truth for training AI models and compute Intersection over Union (IoU), sensitivity, and specificity to assess the inter-annotator reliability and variability. 
Experiments are performed on two datasets, namely cervical colposcopy images from 30 patients and chest X-ray images from 336 tuberculosis (TB) patients, to demonstrate the consistency of inter-annotator reliability assessment and the importance of combining different metrics to avoid bias assessment.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"11 ","pages":"21300-21312"},"PeriodicalIF":3.9,"publicationDate":"2023-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/68/67/nihms-1880749.PMC10062409.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9336135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
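The abstract above names Cohen's kappa and Intersection over Union (IoU) among its agreement measures. As a rough illustration, the sketch below computes both for two annotators' binary masks, flattened to lists of 0/1 pixel labels. It uses the standard two-rater Cohen's kappa, not the extended variant the paper proposes, and the function names and example masks are made up for this example.

```python
def iou(mask_a, mask_b):
    """Intersection over Union of two flattened binary masks."""
    inter = sum(1 for x, y in zip(mask_a, mask_b) if x and y)
    union = sum(1 for x, y in zip(mask_a, mask_b) if x or y)
    # Two empty masks agree perfectly by convention (an assumption here).
    return inter / union if union else 1.0


def cohen_kappa(mask_a, mask_b):
    """Standard two-rater Cohen's kappa on binary pixel labels."""
    n = len(mask_a)
    # Observed agreement: fraction of pixels labeled identically.
    p_observed = sum(1 for x, y in zip(mask_a, mask_b) if x == y) / n
    # Chance agreement from each annotator's foreground rate.
    rate_a = sum(mask_a) / n
    rate_b = sum(mask_b) / n
    p_chance = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if p_chance == 1:
        return 1.0  # both annotators used a single identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)


if __name__ == "__main__":
    annotator_1 = [1, 1, 1, 0, 0, 0, 0, 0]
    annotator_2 = [1, 1, 0, 1, 0, 0, 0, 0]
    print("IoU:", iou(annotator_1, annotator_2))
    print("kappa:", cohen_kappa(annotator_1, annotator_2))
```

In this toy case the masks agree on 6 of 8 pixels, yet kappa comes out well below 0.75 because much of that agreement is on background pixels and is expected by chance. That gap is one reason to report several metrics side by side.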
IEEE Access | Pub Date: 2023-01-11 | DOI: 10.1109/ACCESS.2023.3235948
Anton Kovalyov;Kashyap Patel;Issa Panahi
{"title":"DSENet: Directional Signal Extraction Network for Hearing Improvement on Edge Devices","authors":"Anton Kovalyov;Kashyap Patel;Issa Panahi","doi":"10.1109/ACCESS.2023.3235948","DOIUrl":"10.1109/ACCESS.2023.3235948","url":null,"abstract":"In this paper, we propose a directional signal extraction network (DSENet). DSENet is a low-latency, real-time neural network that, given a reverberant mixture of signals captured by a microphone array, aims at extracting the reverberant signal whose source is located within a directional region of interest. If there are multiple sources situated within the directional region of interest, DSENet will aim at extracting a combination of their reverberant signals. As such, the formulation of DSENet circumvents the well-known crosstalk problem in beamforming while providing an alternative and perhaps more practical approach to other spatially constrained signal extraction methods proposed in the literature. DSENet is based on a computationally efficient and low-distortion linear model formulated in the time domain. As a result, an important application of our work is hearing improvement on edge devices. Simulation results show that DSENet outperforms oracle beamformers, as well as state-of-the-art in low-latency causal speech separation, while incurring a system latency of only 4 ms. 
Additionally, DSENet has been successfully deployed as a real-time application on a smartphone.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"11 ","pages":"4350-4358"},"PeriodicalIF":3.9,"publicationDate":"2023-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10015009","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10095125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE Access | Pub Date: 2022-12-15 | DOI: 10.1109/ACCESS.2022.3230003
Susanna Mosleh;Jason B. Coder;Christopher G. Scully;Keith Forsyth;Mohamad Omar Al Kalaa
{"title":"Monitoring Respiratory Motion With Wi-Fi CSI: Characterizing Performance and the BreatheSmart Algorithm","authors":"Susanna Mosleh;Jason B. Coder;Christopher G. Scully;Keith Forsyth;Mohamad Omar Al Kalaa","doi":"10.1109/ACCESS.2022.3230003","DOIUrl":"10.1109/ACCESS.2022.3230003","url":null,"abstract":"Respiratory motion (i.e., motion pattern and rate) can provide valuable information for many medical situations. This information may help in the diagnosis of different health disorders and diseases. Wi-Fi-based respiratory monitoring schemes utilizing commercial off-the-shelf (COTS) devices can provide contactless, low-cost, simple, and scalable respiratory monitoring without requiring specialized hardware. Despite intense research efforts, an in-depth investigation on how to evaluate this type of technology is missing. We demonstrated and assessed the feasibility of monitoring and extracting human respiratory motion from Wi-Fi channel state information (CSI) data. This demonstration involves implementing an end-to-end system for a COTS-based hardware platform, control software, data acquisition, and a proposed processing algorithm. The processing algorithm is a novel deep-learning-based approach that exploits small changes in both CSI amplitude and phase information to learn high-level abstractions of breathing-induced chest movements and to reveal the unique characteristics of their difference. We also conducted extensive laboratory experiments demonstrating an assessment technique that can be replicated when quantifying the performance of similar systems. The results indicate that the proposed scheme can classify respiratory patterns and rates with an accuracy of 99.54% and 98.69%, respectively, in moderately degraded RF channels. Comprehensive data acquisition revealed the capability of the proposed system in detecting and classifying respiratory motions. 
Understanding the feasible limits and potential failure factors of Wi-Fi CSI-based respiratory monitoring scheme — and how to evaluate them — is an essential step toward the practical deployment of this technology. This study discusses ideas for further expansion of this technology.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"10 ","pages":"131932-131951"},"PeriodicalIF":3.9,"publicationDate":"2022-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9989347","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10528283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE Access | Pub Date: 2022-12-12 | DOI: 10.1109/ACCESS.2022.3228439
Arindam Biswas;Ashwin B. Parthasarathy
{"title":"Lossless Compressed Sensing of Photon Counts for Fast Diffuse Correlation Spectroscopy","authors":"Arindam Biswas;Ashwin B. Parthasarathy","doi":"10.1109/ACCESS.2022.3228439","DOIUrl":"10.1109/ACCESS.2022.3228439","url":null,"abstract":"Diffuse Correlation Spectroscopy (DCS), a noninvasive optical technique, measures deep tissue blood flow using avalanche photon counting modules and data acquisition devices such as FPGAs or correlator boards. Conventional DCS instruments use in-processor counter modules that consume 32 bits/channel which is inefficient for low-photon budget situations prevalent in diffuse optics. Scaling these photon counters for large-scale imaging applications is difficult due to bandwidth and processing time considerations. Here, we introduce a new, lossless compressed sensing approach for fast and efficient detection of photon counts. The compressed DCS method uses an array of binary-coded-decimal counters to record photon counts from 8 channels simultaneously as a single 32-bit number. We validate the compressed DCS approach by comparisons with conventional DCS in experiments on tissue simulating phantoms and in-vivo arm cuff occlusion. Lossless compressed DCS was implemented with 87.5% compression efficiency. In tissue simulating phantoms, it was able to accurately estimate a tissue blood flow index, with no statistically significant difference compared to conventional DCS. Compressed DCS also recorded blood flow in vivo, in human forearm, with signal-to-noise ratio and dynamic range comparable to conventional DCS. 
Lossless 87.5% efficient compressed sensing counting of photon counts meets and exceeds benchmarks set by conventional DCS systems, offering a low-cost alternative for fast (~100 Hz) deep tissue blood flow measurement with optics.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"10 ","pages":"129754-129762"},"PeriodicalIF":3.9,"publicationDate":"2022-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9980382","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10541343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
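The abstract above reports recording photon counts from 8 channels as a single 32-bit number with 87.5% compression efficiency. The sketch below shows where that figure comes from. It assumes each channel contributes one 4-bit binary-coded-decimal digit (a count of 0 to 9 per sampling interval) and that channel 0 occupies the lowest nibble. Both are assumptions made for illustration, not details taken from the paper, and the function names are hypothetical.

```python
def pack_counts(counts):
    """Pack eight per-channel counts (each 0-9) into one 32-bit word.

    Assumed layout: channel 0 in the lowest 4 bits, channel 7 in the highest.
    """
    if len(counts) != 8:
        raise ValueError("expected exactly 8 channel counts")
    word = 0
    for channel, count in enumerate(counts):
        if not 0 <= count <= 9:
            raise ValueError("each count must fit in one BCD digit (0-9)")
        word |= count << (4 * channel)
    return word


def unpack_counts(word):
    """Recover the eight per-channel counts from a packed 32-bit word."""
    return [(word >> (4 * channel)) & 0xF for channel in range(8)]


def compression_efficiency(channels=8, bits_per_channel=32, packed_bits=32):
    """Fraction of bits saved relative to one 32-bit counter per channel."""
    return 1 - packed_bits / (channels * bits_per_channel)


if __name__ == "__main__":
    counts = [1, 0, 3, 9, 2, 0, 0, 5]
    word = pack_counts(counts)
    print(hex(word), unpack_counts(word), compression_efficiency())
```

Eight conventional 32-bit counters take 256 bits, so a single 32-bit word saves 1 - 32/256 = 0.875 of the bandwidth. The round trip through `unpack_counts` returns the original counts, which is the sense in which such packing is lossless as long as no channel exceeds its digit range.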
IEEE Access | Pub Date: 2022-11-10 | DOI: 10.1109/ACCESS.2022.3221436
Mehrad Sarmashghi;Shantanu P. Jadhav;Uri T. Eden
{"title":"Integrating Statistical and Machine Learning Approaches for Neural Classification","authors":"Mehrad Sarmashghi;Shantanu P. Jadhav;Uri T. Eden","doi":"10.1109/ACCESS.2022.3221436","DOIUrl":"10.1109/ACCESS.2022.3221436","url":null,"abstract":"Neurons can code for multiple variables simultaneously and neuroscientists are often interested in classifying neurons based on their receptive field properties. Statistical models provide powerful tools for determining the factors influencing neural spiking activity and classifying individual neurons. However, as neural recording technologies have advanced to produce simultaneous spiking data from massive populations, classical statistical methods often lack the computational efficiency required to handle such data. Machine learning (ML) approaches are known for enabling efficient large scale data analyses; however, they typically require massive training sets with balanced data, along with accurate labels to fit well. Additionally, model assessment and interpretation are often more challenging for ML than for classical statistical methods. To address these challenges, we develop an integrated framework, combining statistical modeling and machine learning approaches to identify the coding properties of neurons from large populations. 
In order to demonstrate this framework, we apply these methods to data from a population of neurons recorded from rat hippocampus to characterize the distribution of spatial receptive fields in this region.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"10 ","pages":"119106-119118"},"PeriodicalIF":3.9,"publicationDate":"2022-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10205093/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9892792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE Access | Pub Date: 2022-11-01 | DOI: 10.1109/ACCESS.2022.3218800
Shahina Rahman;Valen E. Johnson;Suhasini Subba Rao
{"title":"A Hyperparameter-Free, Fast and Efficient Framework to Detect Clusters From Limited Samples Based on Ultra High-Dimensional Features","authors":"Shahina Rahman;Valen E. Johnson;Suhasini Subba Rao","doi":"10.1109/ACCESS.2022.3218800","DOIUrl":"10.1109/ACCESS.2022.3218800","url":null,"abstract":"Clustering is a challenging problem in machine learning in which one attempts to group $N$ objects into $K_{0}$ groups based on $P$ features measured on each object. In this article, we examine the case where $N \\ll P$ and $K_{0}$ is not known. Clustering in such high dimensional, small sample size settings has numerous applications in biology, medicine, the social sciences, clinical trials, and other scientific and experimental fields. Whereas most existing clustering algorithms either require the number of clusters to be known a priori or are sensitive to the choice of tuning parameters, our method does not require the prior specification of $K_{0}$ or any tuning parameters. This represents an important advantage for our method because training data are not available in the applications we consider (i.e., in unsupervised learning problems). Without training data, estimating $K_{0}$ and other hyperparameters–and thus applying alternative clustering algorithms–can be difficult and lead to inaccurate results. Our method is based on a simple transformation of the Gram matrix and application of the strong law of large numbers to the transformed matrix. 
If the correlation between features decays as the number of features grows, we show that the transformed feature vectors concentrate tightly around their respective cluster expectations in a low-dimensional space. This result simplifies the detection and visualization of the unknown cluster configuration. We illustrate the algorithm by applying it to 32 benchmarked microarray datasets, each containing thousands of genomic features measured on a relatively small number of tissue samples. Compared to 21 other commonly used clustering methods, we find that the proposed algorithm is faster and twice as accurate in determining the “best” cluster configuration.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"10 ","pages":"116844-116857"},"PeriodicalIF":3.9,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/0f/90/nihms-1849399.PMC10237044.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9582641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}