Dan Eiju, Yoichi Hashida, Taro Maeda, Koji Iwayama, Atsushi J Nagano
Simulation study of factors affecting the accuracy of transcriptome models under complex environments.
Plant Molecular Biology 115(2): 52. Published 2025-03-28. DOI: 10.1007/s11103-025-01578-6 (https://doi.org/10.1007/s11103-025-01578-6)
Citation count: 0
Abstract
Characterizing molecular responses in real, complex field environments is essential for understanding how plants respond to their environment. Field transcriptome prediction models transcriptomes in outdoor fields subject to many environmental variables: meteorological parameters, atmospheric gases, soil conditions, herbivores, management, etc. It is the most comprehensive method of studying gene expression dynamics in complex environments. However, it is not clear what factors influence the accuracy of field transcriptome models. In this study, we developed a novel simulation system and used it to perform a large-scale simulation to reveal the factors affecting model accuracy. We found that the factors with the greatest impact on accuracy were, in order of importance, the expression pattern of the gene, the number of samples in the training data, the diurnal coverage of the training data, and the temperature coverage of the training data. Validation using measured transcriptome data gave results similar to the simulations. Our simulation system and analysis results will be helpful for developing efficient sampling strategies for training data and for generating simulated data to benchmark new modelling methods. They will also be valuable for dissecting the relative importance of the various factors behind transcriptome dynamics in real environments.
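The training-data factors named in the abstract (number of samples, diurnal coverage, temperature coverage) can be illustrated with a toy simulation. The sketch below is not the authors' simulation system: the single-gene expression model, its coefficients, the ordinary least-squares fit, and the function names `design` and `mean_test_rmse` are all assumptions chosen for illustration. It simulates one gene with a 24 h rhythm and a linear temperature response, fits it on training sets that differ in size and coverage, and scores each fit against a test set spanning the full day and the full temperature range.

```python
import numpy as np

# Made-up toy coefficients: intercept, sine, cosine, temperature slope.
TRUE_COEF = np.array([5.0, 1.5, -1.0, 0.1])


def design(hours, temps):
    """Feature matrix: intercept, 24 h sine and cosine terms, temperature."""
    phase = 2 * np.pi * hours / 24.0
    return np.column_stack(
        [np.ones_like(hours), np.sin(phase), np.cos(phase), temps]
    )


def mean_test_rmse(n_train, hour_range, temp_range,
                   n_rep=20, noise_sd=0.5, seed=0):
    """Average test RMSE over n_rep simulated training sets.

    Training samples are drawn uniformly from hour_range and temp_range;
    the test set always covers 0-24 h and 5-35 degrees C.
    """
    rng = np.random.default_rng(seed)
    x_test = design(rng.uniform(0, 24, 500), rng.uniform(5, 35, 500))
    y_test = x_test @ TRUE_COEF  # noise-free truth
    rmses = []
    for _ in range(n_rep):
        hours = rng.uniform(*hour_range, n_train)
        temps = rng.uniform(*temp_range, n_train)
        x_train = design(hours, temps)
        y_train = x_train @ TRUE_COEF + rng.normal(0, noise_sd, n_train)
        # Ordinary least squares stands in for the (unspecified) model.
        coef, *_ = np.linalg.lstsq(x_train, y_train, rcond=None)
        rmses.append(np.sqrt(np.mean((x_test @ coef - y_test) ** 2)))
    return float(np.mean(rmses))


if __name__ == "__main__":
    scenarios = {
        "full coverage, n=50": (50, (0, 24), (5, 35)),
        "midday only, n=50": (50, (11, 13), (5, 35)),
        "narrow temperature, n=50": (50, (0, 24), (19, 21)),
        "full coverage, n=10": (10, (0, 24), (5, 35)),
        "full coverage, n=200": (200, (0, 24), (5, 35)),
    }
    for name, args in scenarios.items():
        print(f"{name}: mean test RMSE = {mean_test_rmse(*args):.3f}")
```

In this toy setting, restricting the training samples to a narrow time-of-day or temperature window leaves the corresponding coefficients poorly constrained, so the fit extrapolates badly to conditions outside the window. Shrinking the sample size raises the error in a similar way. The sketch says nothing about the relative ranking of factors reported in the paper.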
About the journal:
Plant Molecular Biology is an international journal dedicated to rapid publication of original research articles in all areas of plant biology. The Editorial Board welcomes full-length manuscripts that address important biological problems of broad interest, including research in comparative genomics, functional genomics, proteomics, bioinformatics, computational biology, biochemical and regulatory networks, and biotechnology. Because space in the journal is limited, however, preference is given to publication of results that provide significant new insights into biological problems and that advance the understanding of structure, function, mechanisms, or regulation. Authors must ensure that results are of high quality and that manuscripts are written for a broad plant science audience.