Multipla: Multiscale Pangenomic Locus Analysis
A. van den Brandt, E. Ståhlbom, F.J.M. van Workum, H. van de Wetering, C. Lundström, S. Smit, A. Vilanova
Computer Graphics Forum, volume 44, issue 3. Published 2025-05-23. DOI: 10.1111/cgf.70147
https://onlinelibrary.wiley.com/doi/10.1111/cgf.70147
Abstract
Comparing gene organization across genomic sequences reveals insights into evolutionary and functional diversity among different organisms and varieties. Performing this task across many sequences, such as those from a pangenome, is challenging because of the scale, the density of information, and the inherent variation. Often, analyses are centered on a genomic region of interest: a locus that might be associated with a trait or contain genes within the same family or biological pathway. Within these regions, researchers examine the conservation of gene order and orientation across organisms and assess sequence similarity, along with other gene content features such as gene size, to find biological variations or potential errors in the data. Automated methods in comparative genomics struggle to identify meaningful patterns because the features of interest vary and are often unknown, leaving manual visualization, which is time-intensive and scales poorly, as the primary alternative. To address these challenges, we present a multiscale design for studying gene organization within pangenomes, developed in close collaboration with domain experts. Our tool, Multipla, enables users to explore gene organization at multiple levels of detail in a decluttered manner through layout abstractions, semantic zooming, and layouts with flexible distance definitions and feature selections, combining the advantages of the manual and automated methods used in practice. We evaluate the design of Multipla through two pangenomic use cases and conclude with lessons learned from designing multiscale views for pangenomic locus analysis.
About the journal:
Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.