R Dustin Schaeffer, Rui Guo, Jing Zhang, Qian Cong, Nick V Grishin
Protein Science, vol. 35, no. 5, e70586. Published 2026-05-01. DOI: 10.1002/pro.70586 (https://doi.org/10.1002/pro.70586). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13114784/pdf/
Family-level specialization in protein domain insertion architectures.
Domain insertion creates architectures where one domain interrupts another's sequence. Analysis across 2.7 million classified domains reveals that insertions occur in 20% of multidomain proteins, with 331 families exhibiting consistent architectural roles: 162 function exclusively as hosts, while 169 exclusively serve as inserted modules, such as zinc-binding dehydrogenases appearing as insertions across 450 events. The remaining 1116 families with sufficient insertion activity demonstrate versatile behavior, adopting different roles depending on partnership context. Size analysis shows inserted domains are consistently smaller than their hosts (median 115 vs. 199 residues), with role-consistent families exhibiting 1.7-fold size differences. Insertions frequently involve domains from different structural superfamilies: 31,925 events (65.8% of total) occur between families from different H-groups, such as P-loop hydrolases with tRNA modification domains. While most insertions are simple single-level architectures, insertion mechanisms can create complex organizations, including six-level nested structures in cyanobacterial RNA polymerase. This work provides a comprehensive dataset of 48,551 insertion events across 5701 families, with quantitative characterization of size relationships and partnership patterns that can inform structure prediction and protein design efforts.
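The definition in the abstract, one domain interrupting another's sequence, can be illustrated with a minimal sketch. It assumes each domain is given as a list of inclusive residue ranges and treats a domain as inserted when all of its residues fall inside a single gap between two consecutive segments of another domain. The function name `find_insertions`, the range representation, and the example domains A, B, and C are hypothetical; the abstract does not describe the authors' actual detection procedure.

```python
from typing import Dict, List, Tuple

# Inclusive (start, end) residue range of one contiguous domain segment.
Segment = Tuple[int, int]


def find_insertions(domains: Dict[str, List[Segment]]) -> List[Tuple[str, str]]:
    """Return (host, insert) pairs for one protein chain.

    A domain counts as an insert when all of its residues lie inside a
    single gap between two consecutive segments of another (host) domain.
    This is an illustrative criterion, not the published method.
    """
    pairs = []
    for host, host_segs in domains.items():
        segs = sorted(host_segs)
        # Gaps between consecutive host segments; a single-segment
        # (continuous) domain has no gaps and cannot act as a host.
        gaps = [
            (a_end + 1, b_start - 1)
            for (_, a_end), (b_start, _) in zip(segs, segs[1:])
        ]
        for insert, ins_segs in domains.items():
            if insert == host:
                continue
            lo = min(start for start, _ in ins_segs)
            hi = max(end for _, end in ins_segs)
            if any(g_lo <= lo and hi <= g_hi for g_lo, g_hi in gaps):
                pairs.append((host, insert))
    return pairs


# Hypothetical chain: domain A is split by domain B; domain C follows A.
example = {
    "A": [(1, 80), (200, 320)],
    "B": [(81, 199)],
    "C": [(321, 400)],
}
print(find_insertions(example))
```

In this made-up example, A is the host and B the inserted domain, while C is a sequential neighbor and is not reported. Under this simple criterion a domain nested several levels deep would be paired with every enclosing host, so counting nesting levels as in the abstract would need additional bookkeeping.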
About the journal:
Protein Science, the flagship journal of The Protein Society, is a publication that focuses on advancing fundamental knowledge in the field of protein molecules. The journal welcomes original reports and review articles that contribute to our understanding of protein function, structure, folding, design, and evolution.
Additionally, Protein Science encourages papers that explore the applications of protein science in various areas such as therapeutics, protein-based biomaterials, bionanotechnology, synthetic biology, and bioelectronics.
The journal accepts manuscript submissions in any suitable format for review, with the requirement of converting the manuscript to journal-style format only upon acceptance for publication.
Protein Science is indexed and abstracted in numerous databases, including the Agricultural & Environmental Science Database (ProQuest), Biological Science Database (ProQuest), CAS: Chemical Abstracts Service (ACS), Embase (Elsevier), Health & Medical Collection (ProQuest), Health Research Premium Collection (ProQuest), Materials Science & Engineering Database (ProQuest), MEDLINE/PubMed (NLM), Natural Science Collection (ProQuest), and SciTech Premium Collection (ProQuest).