Melissa A Stephen, Chris R Burke, Jennie E Pryce, Nicole M Steele, Peter R Amer, Susanne Meier, Claire V C Phyn, Dorian J Garrick
{"title":"从不完全观察数据中获得表型的方法与应用于奶牛青春期年龄的比较。","authors":"Melissa A Stephen, Chris R Burke, Jennie E Pryce, Nicole M Steele, Peter R Amer, Susanne Meier, Claire V C Phyn, Dorian J Garrick","doi":"10.1186/s40104-023-00921-5","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Many phenotypes in animal breeding are derived from incomplete measures, especially if they are challenging or expensive to measure precisely. Examples include time-dependent traits such as reproductive status, or lifespan. Incomplete measures for these traits result in phenotypes that are subject to left-, interval- and right-censoring, where phenotypes are only known to fall below an upper bound, between a lower and upper bound, or above a lower bound respectively. Here we compare three methods for deriving phenotypes from incomplete data using age at first elevation (> 1 ng/mL) in blood plasma progesterone (AGEP4), which generally coincides with onset of puberty, as an example trait.</p><p><strong>Methods: </strong>We produced AGEP4 phenotypes from three blood samples collected at about 30-day intervals from approximately 5,000 Holstein-Friesian or Holstein-Friesian × Jersey cross-bred dairy heifers managed in 54 seasonal-calving, pasture-based herds in New Zealand. We used these actual data to simulate 7 different visit scenarios, increasing the extent of censoring by disregarding data from one or two of the three visits. Three methods for deriving phenotypes from these data were explored: 1) ordinal categorical variables which were analysed using categorical threshold analysis; 2) continuous variables, with a penalty of 31 d assigned to right-censored phenotypes; and 3) continuous variables, sampled from within a lower and upper bound using a data augmentation approach.</p><p><strong>Results: </strong>Credibility intervals for heritability estimations overlapped across all methods and visit scenarios, but estimated heritabilities tended to be higher when left censoring was reduced. For sires with at least 5 daughters, the correlations between estimated breeding values (EBVs) from our three-visit scenario and each reduced data scenario varied by method, ranging from 0.65 to 0.95. The estimated breed effects also varied by method, but breed differences were smaller as phenotype censoring increased.</p><p><strong>Conclusion: </strong>Our results indicate that using some methods, phenotypes derived from one observation per offspring for a time-dependent trait such as AGEP4 may provide comparable sire rankings to three observations per offspring. This has implications for the design of large-scale phenotyping initiatives where animal breeders aim to estimate variance parameters and estimated breeding values (EBVs) for phenotypes that are challenging to measure or prohibitively expensive.</p>","PeriodicalId":14928,"journal":{"name":"Journal of Animal Science and Biotechnology","volume":"14 1","pages":"119"},"PeriodicalIF":7.0000,"publicationDate":"2023-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10492402/pdf/","citationCount":"0","resultStr":"{\"title\":\"Comparison of methods for deriving phenotypes from incomplete observation data with an application to age at puberty in dairy cattle.\",\"authors\":\"Melissa A Stephen, Chris R Burke, Jennie E Pryce, Nicole M Steele, Peter R Amer, Susanne Meier, Claire V C Phyn, Dorian J Garrick\",\"doi\":\"10.1186/s40104-023-00921-5\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Many phenotypes in animal breeding are derived from incomplete measures, especially if they are challenging or expensive to measure precisely. Examples include time-dependent traits such as reproductive status, or lifespan. Incomplete measures for these traits result in phenotypes that are subject to left-, interval- and right-censoring, where phenotypes are only known to fall below an upper bound, between a lower and upper bound, or above a lower bound respectively. Here we compare three methods for deriving phenotypes from incomplete data using age at first elevation (> 1 ng/mL) in blood plasma progesterone (AGEP4), which generally coincides with onset of puberty, as an example trait.</p><p><strong>Methods: </strong>We produced AGEP4 phenotypes from three blood samples collected at about 30-day intervals from approximately 5,000 Holstein-Friesian or Holstein-Friesian × Jersey cross-bred dairy heifers managed in 54 seasonal-calving, pasture-based herds in New Zealand. We used these actual data to simulate 7 different visit scenarios, increasing the extent of censoring by disregarding data from one or two of the three visits. Three methods for deriving phenotypes from these data were explored: 1) ordinal categorical variables which were analysed using categorical threshold analysis; 2) continuous variables, with a penalty of 31 d assigned to right-censored phenotypes; and 3) continuous variables, sampled from within a lower and upper bound using a data augmentation approach.</p><p><strong>Results: </strong>Credibility intervals for heritability estimations overlapped across all methods and visit scenarios, but estimated heritabilities tended to be higher when left censoring was reduced. For sires with at least 5 daughters, the correlations between estimated breeding values (EBVs) from our three-visit scenario and each reduced data scenario varied by method, ranging from 0.65 to 0.95. The estimated breed effects also varied by method, but breed differences were smaller as phenotype censoring increased.</p><p><strong>Conclusion: </strong>Our results indicate that using some methods, phenotypes derived from one observation per offspring for a time-dependent trait such as AGEP4 may provide comparable sire rankings to three observations per offspring. This has implications for the design of large-scale phenotyping initiatives where animal breeders aim to estimate variance parameters and estimated breeding values (EBVs) for phenotypes that are challenging to measure or prohibitively expensive.</p>\",\"PeriodicalId\":14928,\"journal\":{\"name\":\"Journal of Animal Science and Biotechnology\",\"volume\":\"14 1\",\"pages\":\"119\"},\"PeriodicalIF\":7.0000,\"publicationDate\":\"2023-09-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10492402/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Animal Science and Biotechnology\",\"FirstCategoryId\":\"97\",\"ListUrlMain\":\"https://doi.org/10.1186/s40104-023-00921-5\",\"RegionNum\":1,\"RegionCategory\":\"农林科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"Agricultural and Biological Sciences\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Animal Science and Biotechnology","FirstCategoryId":"97","ListUrlMain":"https://doi.org/10.1186/s40104-023-00921-5","RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Agricultural and Biological Sciences","Score":null,"Total":0}
Comparison of methods for deriving phenotypes from incomplete observation data with an application to age at puberty in dairy cattle.
Background: Many phenotypes in animal breeding are derived from incomplete measures, especially if they are challenging or expensive to measure precisely. Examples include time-dependent traits such as reproductive status, or lifespan. Incomplete measures for these traits result in phenotypes that are subject to left-, interval- and right-censoring, where phenotypes are only known to fall below an upper bound, between a lower and upper bound, or above a lower bound respectively. Here we compare three methods for deriving phenotypes from incomplete data using age at first elevation (> 1 ng/mL) in blood plasma progesterone (AGEP4), which generally coincides with onset of puberty, as an example trait.
Methods: We produced AGEP4 phenotypes from three blood samples collected at about 30-day intervals from approximately 5,000 Holstein-Friesian or Holstein-Friesian × Jersey cross-bred dairy heifers managed in 54 seasonal-calving, pasture-based herds in New Zealand. We used these actual data to simulate 7 different visit scenarios, increasing the extent of censoring by disregarding data from one or two of the three visits. Three methods for deriving phenotypes from these data were explored: 1) ordinal categorical variables which were analysed using categorical threshold analysis; 2) continuous variables, with a penalty of 31 d assigned to right-censored phenotypes; and 3) continuous variables, sampled from within a lower and upper bound using a data augmentation approach.
Results: Credibility intervals for heritability estimations overlapped across all methods and visit scenarios, but estimated heritabilities tended to be higher when left censoring was reduced. For sires with at least 5 daughters, the correlations between estimated breeding values (EBVs) from our three-visit scenario and each reduced data scenario varied by method, ranging from 0.65 to 0.95. The estimated breed effects also varied by method, but breed differences were smaller as phenotype censoring increased.
Conclusion: Our results indicate that using some methods, phenotypes derived from one observation per offspring for a time-dependent trait such as AGEP4 may provide comparable sire rankings to three observations per offspring. This has implications for the design of large-scale phenotyping initiatives where animal breeders aim to estimate variance parameters and estimated breeding values (EBVs) for phenotypes that are challenging to measure or prohibitively expensive.
期刊介绍:
Journal of Animal Science and Biotechnology is an open access, peer-reviewed journal that encompasses all aspects of animal science and biotechnology. That includes domestic animal production, animal genetics and breeding, animal reproduction and physiology, animal nutrition and biochemistry, feed processing technology and bioevaluation, animal biotechnology, and meat science.