Stefan Stiller, Juan F Dueñas, Stefan Hempel, Matthias C Rillig, Masahiro Ryo
{"title":"Deep learning image analysis for filamentous fungi taxonomic classification: Dealing with small datasets with class imbalance and hierarchical grouping.","authors":"Stefan Stiller, Juan F Dueñas, Stefan Hempel, Matthias C Rillig, Masahiro Ryo","doi":"10.1093/biomethods/bpae063","DOIUrl":"https://doi.org/10.1093/biomethods/bpae063","url":null,"abstract":"<p><p>Deep learning applications in taxonomic classification for animals and plants from images have become popular, while those for microorganisms are still lagging behind. Our study investigated the potential of deep learning for the taxonomic classification of hundreds of filamentous fungi from colony images, which is typically a task that requires specialized knowledge. We isolated soil fungi, annotated their taxonomy using standard molecular barcode techniques, and took images of the fungal colonies grown in petri dishes (<i>n</i> = 606). We applied a convolutional neural network with multiple training approaches and model architectures to deal with some common issues in ecological datasets: small amounts of data, class imbalance, and hierarchically structured grouping. Model performance was overall low, mainly due to the relatively small dataset, class imbalance, and the high morphological plasticity exhibited by fungal colonies. However, our approach indicates that morphological features like color, patchiness, and colony extension rate could be used for the recognition of fungal colonies at higher taxonomic ranks (i.e. phylum, class, and order). Model explanation implies that image recognition characters appear at different positions within the colony (e.g. outer or inner hyphae) depending on the taxonomic resolution. Our study suggests the potential of deep learning applications for a better understanding of the taxonomy and ecology of filamentous fungi amenable to axenic culturing. 
Meanwhile, our study also highlights some technical challenges in deep learning image analysis in ecology, showing that the domain of applicability of these methods needs to be carefully considered.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"9 1","pages":"bpae063"},"PeriodicalIF":2.5,"publicationDate":"2024-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11387011/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142297454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DLKcat cannot predict meaningful <i>k</i> <sub>cat</sub> values for mutants and unfamiliar enzymes.","authors":"Alexander Kroll, Martin J Lercher","doi":"10.1093/biomethods/bpae061","DOIUrl":"https://doi.org/10.1093/biomethods/bpae061","url":null,"abstract":"<p><p>The recently published DLKcat model, a deep learning approach for predicting enzyme turnover numbers (<i>k</i> <sub>cat</sub>), claims to enable high-throughput <i>k</i> <sub>cat</sub> predictions for metabolic enzymes from any organism and to capture <i>k</i> <sub>cat</sub> changes for mutated enzymes. Here, we critically evaluate these claims. We show that for enzymes with <60% sequence identity to the training data DLKcat predictions become worse than simply assuming a constant average <i>k</i> <sub>cat</sub> value for all reactions. Furthermore, DLKcat's ability to predict mutation effects is much weaker than implied, capturing none of the experimentally observed variation across mutants not included in the training data. These findings highlight significant limitations in DLKcat's generalizability and its practical utility for predicting <i>k</i> <sub>cat</sub> values for novel enzyme families or mutants, which are crucial applications in fields such as metabolic modeling.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"9 1","pages":"bpae061"},"PeriodicalIF":2.5,"publicationDate":"2024-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11427335/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142355762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced image generation for cancer using diffusion models.","authors":"Benjamin L Kidder","doi":"10.1093/biomethods/bpae062","DOIUrl":"https://doi.org/10.1093/biomethods/bpae062","url":null,"abstract":"<p><p>Deep neural networks have significantly advanced the field of medical image analysis, yet their full potential is often limited by relatively small dataset sizes. Generative modeling, particularly through diffusion models, has unlocked remarkable capabilities in synthesizing photorealistic images, thereby broadening the scope of their application in medical imaging. This study specifically investigates the use of diffusion models to generate high-quality brain MRI scans, including those depicting low-grade gliomas, as well as contrast-enhanced spectral mammography (CESM) and chest and lung X-ray images. By leveraging the DreamBooth platform, we have successfully trained stable diffusion models utilizing text prompts alongside class and instance images to generate diverse medical images. This approach not only preserves patient anonymity but also substantially mitigates the risk of patient re-identification during data exchange for research purposes. To evaluate the quality of our synthesized images, we used the Fréchet inception distance metric, demonstrating high fidelity between the synthesized and real images. 
Our application of diffusion models effectively captures oncology-specific attributes across different imaging modalities, establishing a robust framework that integrates artificial intelligence in the generation of oncological medical imagery.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"9 1","pages":"bpae062"},"PeriodicalIF":2.5,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11387006/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142297452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Methods in cancer research: Assessing therapy response of spheroid cultures by life cell imaging using a cost-effective live-dead staining protocol.","authors":"Jaison Phour, Erik Vassella","doi":"10.1093/biomethods/bpae060","DOIUrl":"10.1093/biomethods/bpae060","url":null,"abstract":"<p><p>Spheroid cultures of cancer cell lines or primary cells represent a more clinically relevant model for predicting therapy response compared to two-dimensional cell culture. However, current live-dead staining protocols used to assess treatment response in spheroid cultures are often expensive, toxic to the cells, or limited in their ability to monitor therapy response over an extended period due to reduced stability. In our study, we have developed a cost-effective method utilizing calcein-AM and Helix NP™ Blue for live-dead staining, enabling the monitoring of therapy response of spheroid cultures for up to 10 days. Additionally, we used ICY BioImage Analysis and Z-stacks projection to calculate viability, which is a more accurate method for assessing treatment response compared to traditional methods based on spheroid size. Using the example of glioblastoma cell lines and primary glioblastoma cells, we show that spheroid cultures typically exhibit a green outer layer of viable cells, a turquoise mantle of hypoxic quiescent cells, and a blue core of necrotic cells when visualized using confocal microscopy. Upon treatment of spheroids with the alkylating agent temozolomide, we observed a reduction in the viability of glioblastoma cells after an incubation period of 7 days. 
This method can also be adapted for monitoring therapy response in different cancer systems, offering a versatile and cost-effective approach for assessing therapy efficacy in three-dimensional culture models.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"9 1","pages":"bpae060"},"PeriodicalIF":2.5,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11374025/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142134110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
John M Ryniawec, Anastasia Amoiroglou, Gregory C Rogers
{"title":"Generating CRISPR-edited clonal lines of cultured <i>Drosophila</i> S2 cells.","authors":"John M Ryniawec, Anastasia Amoiroglou, Gregory C Rogers","doi":"10.1093/biomethods/bpae059","DOIUrl":"https://doi.org/10.1093/biomethods/bpae059","url":null,"abstract":"<p><p>CRISPR/Cas9 genome editing is a pervasive research tool due to its relative ease of use. However, some systems are not amenable to generating edited clones due to genomic complexity and/or difficulty in establishing clonal lines. For example, <i>Drosophila</i> Schneider 2 (S2) cells possess a segmental aneuploid genome and are challenging to single-cell select. Here, we describe a streamlined CRISPR/Cas9 methodology for knock-in and knock-out experiments in S2 cells, whereby an antibiotic resistance gene is inserted in-frame with the coding region of a gene-of-interest. By using selectable markers, we have improved the ease and efficiency for the positive selection of null cells using antibiotic selection in feeder layers followed by cell expansion to generate clonal lines. Using this method, we generated the first acentrosomal S2 cell lines by knocking-out centriole genes Polo-like Kinase 4/Plk4 or Ana2 as proof of concept. 
These strategies for generating gene-edited clonal lines will add to the collection of CRISPR tools available for cultured <i>Drosophila</i> cells by making CRISPR more practical and therefore improving gene function studies.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"9 1","pages":"bpae059"},"PeriodicalIF":2.5,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11357795/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142113011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mauro Pazmiño-Betancourth, Ivan Casas Gómez-Uribarri, Karina Mondragon-Shem, Simon A Babayan, Francesco Baldini, Lee Rafuse Haines
{"title":"Advancing age grading techniques for <i>Glossina morsitans morsitans</i>, vectors of African trypanosomiasis, through mid-infrared spectroscopy and machine learning.","authors":"Mauro Pazmiño-Betancourth, Ivan Casas Gómez-Uribarri, Karina Mondragon-Shem, Simon A Babayan, Francesco Baldini, Lee Rafuse Haines","doi":"10.1093/biomethods/bpae058","DOIUrl":"10.1093/biomethods/bpae058","url":null,"abstract":"<p><p>Tsetse are the insects responsible for transmitting African trypanosomes, which cause sleeping sickness in humans and animal trypanosomiasis in wildlife and livestock. Knowing the age of these flies is important when assessing the effectiveness of vector control programs and modelling disease risk. Current methods to assess fly age are, however, labour-intensive, slow, and often inaccurate as skilled personnel are in short supply. Mid-infrared spectroscopy (MIRS), a fast and cost-effective tool to accurately estimate several biological traits of insects, offers a promising alternative. This is achieved by characterising the biochemical composition of the insect cuticle using infrared light coupled with machine-learning (ML) algorithms to estimate the traits of interest. We tested the performance of MIRS in estimating tsetse sex and age for the first time using spectra obtained from their cuticle. We used 541 insectary-reared <i>Glossina m. morsitans</i> of two different age groups for males (5 and 7 weeks) and three age groups for females (3 days, 5 weeks, and 7 weeks). Spectra were collected from the head, thorax, and abdomen of each sample. ML models differentiated between male and female flies with a 96% accuracy and predicted the age group with 94% and 87% accuracy for males and females, respectively. The key infrared regions important for discriminating sex and age classification were characteristic of lipid and protein content. 
Our results support the use of MIRS as a rapid and accurate way to identify tsetse sex and age with minimal pre-processing. Further validation using wild-caught tsetse could pave the way for this technique to be implemented as a routine surveillance tool in vector control programmes.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"9 1","pages":"bpae058"},"PeriodicalIF":2.5,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11407438/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142297453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Malte B Hallgren, Philip T L C Clausen, Frank M Aarestrup
{"title":"NanoMGT: Marker gene typing of low complexity mono-species metagenomic samples using noisy long reads.","authors":"Malte B Hallgren, Philip T L C Clausen, Frank M Aarestrup","doi":"10.1093/biomethods/bpae057","DOIUrl":"https://doi.org/10.1093/biomethods/bpae057","url":null,"abstract":"<p><p>Rapid advancements in sequencing technologies have led to significant progress in microbial genomics, yet challenges persist in accurately identifying microbial strain diversity in metagenomic samples, especially when working with noisy long-read data from platforms like Oxford Nanopore Technologies (ONT). In this article, we introduce NanoMGT, a tool designed to enhance marker gene typing in low-complexity mono-species samples, leveraging the unique properties of long reads. NanoMGT excels in its ability to accurately identify mutations amidst high error rates, ensuring the reliable detection of multiple strain-specific marker genes. Our tool implements a novel scoring system that rewards mutations co-occurring across different reads and penalizes densely grouped, likely erroneous variants, thereby achieving a good balance between sensitivity and precision. A comparative evaluation of NanoMGT, using a simulated multi-strain sample of seven bacterial species, demonstrated superior performance relative to existing tools and the advantages of using a threshold-based filtering approach to calling minority variants in ONT's sequencing data. NanoMGT's potential as a post-binning tool in metagenomic pipelines is particularly notable, enabling researchers to more accurately determine specific alleles and understand strain diversity in microbial communities. Our findings have significant implications for clinical diagnostics, environmental microbiology, and the broader field of genomics. 
The findings offer a reliable and efficient approach to marker gene typing in complex metagenomic samples.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"9 1","pages":"bpae057"},"PeriodicalIF":2.5,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11387619/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142297456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nora N Hanson, James P Ounsley, Jason Henry, Kasim Terzić, Bruno Caneco
{"title":"Automatic detection of fish scale circuli using deep learning.","authors":"Nora N Hanson, James P Ounsley, Jason Henry, Kasim Terzić, Bruno Caneco","doi":"10.1093/biomethods/bpae056","DOIUrl":"10.1093/biomethods/bpae056","url":null,"abstract":"<p><p>Teleost fish scales form distinct growth rings deposited in proportion to somatic growth in length, and are routinely used in fish ageing and growth analyses. Extraction of incremental growth data from scales is labour intensive. We present a fully automated method to retrieve this data from fish scale images using Convolutional Neural Networks (CNNs). Our pipeline of two CNNs automatically detects the centre of the scale and individual growth rings (circuli) along multiple radial transects emanating from the centre. The focus detector was trained on 725 scale images and achieved an average precision of 99%; the circuli detector was trained on 40 678 circuli annotations and achieved an average precision of 95.1%. Circuli detections were made with less confidence in the freshwater zone of the scale image where the growth bands are most narrowly spaced. However, the performance of the circuli detector was similar to that of another human labeller, highlighting the inherent ambiguity of the labelling process. The system predicts the location of scale growth rings rapidly and with high accuracy, enabling the calculation of spacings and thereby growth inferences from salmon scales. 
The success of our method suggests its potential for expansion to other species.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"9 1","pages":"bpae056"},"PeriodicalIF":2.5,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11330318/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142000884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Harmonizing immune cell sequences for computational analysis with large language models.","authors":"Areej Alsaafin, Hamid R Tizhoosh","doi":"10.1093/biomethods/bpae055","DOIUrl":"https://doi.org/10.1093/biomethods/bpae055","url":null,"abstract":"<p><p>We present SEQuence Weighted Alignment for Sorting and Harmonization (Seqwash), an algorithm designed to process sequencing profiles utilizing large language models. Seqwash <i>harmonizes</i> immune cell sequences into a unified representation, empowering LLMs to embed meaningful patterns while eliminating irrelevant information. Evaluations using immune cell sequencing data showcase Seqwash's efficacy in standardizing profiles, leading to improved feature quality and enhanced performance in both supervised and unsupervised downstream tasks for sequencing data.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"9 1","pages":"bpae055"},"PeriodicalIF":2.5,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11407694/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142297455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
David Fröhlich, Michaela Bodner, Günther Raspotnig, Christoph Hahn
{"title":"Simple protocol for combined extraction of exocrine secretions and RNA in small arthropods.","authors":"David Fröhlich, Michaela Bodner, Günther Raspotnig, Christoph Hahn","doi":"10.1093/biomethods/bpae054","DOIUrl":"10.1093/biomethods/bpae054","url":null,"abstract":"<p><p>The integration of data from multiple sources and analytical techniques to obtain novel insights and answer challenging questions is a hallmark of modern science. In arthropods, exocrine secretions may act as pheromones, defensive substances, antibiotics, as well as surface protectants, and as such they play a crucial role in ecology and evolution. Exocrine chemical compounds are frequently characterized by gas chromatography-mass spectrometry. Technological advances of recent years now allow us to routinely characterize the total gene complement transcribed in a particular biological tissue, often in the context of experimental treatment, via RNAseq. We here introduce a novel methodological approach to successfully characterize exocrine secretions <i>and</i> full transcriptomes of one and the same individual of oribatid mites. We found that chemical extraction prior to RNA extraction had only minor effects on the total RNA integrity. De novo transcriptomes obtained from such combined extractions were of comparable quality to those assembled for samples that were subject to RNA extraction only, indicating that combined chemical/RNA extraction is perfectly suitable for phylotranscriptomic studies. However, in-depth analysis of RNA expression indicates that chemical extraction prior to RNAseq may affect transcript degradation rates, similar to the effects reported in previous studies comparing RNA extraction protocols. 
With this pilot study, we demonstrate that profiling chemical secretions and RNA expression levels from the same individual is methodologically feasible, paving the way for future research to understand the genes and pathways underlying the syntheses of biogenic chemical compounds. Our approach should be applicable broadly to most arachnids, insects, and other arthropods.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"9 1","pages":"bpae054"},"PeriodicalIF":2.5,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11316613/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141917564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}