{"title":"利用锚点改进线性预测及其在疾病进展预测中的应用。","authors":"Alex Karanevich, Jianghua He, Byron J Gajewski","doi":"10.15446/rce.v41n2.68535","DOIUrl":null,"url":null,"abstract":"<p><p>Linear models are some of the most straightforward and commonly used modelling approaches. Consider modelling approximately monotonic response data arising from a time-related process. If one has knowledge as to when the process began or ended, then one may be able to leverage additional assumed data to reduce prediction error. This assumed data, referred to as the \"anchor,\" is treated as an additional data-point generated at either the beginning or end of the process. The response value of the anchor is equal to an intelligently selected value of the response (such as the upper bound, lower bound, or 99<sup>th</sup> percentile of the response, as appropriate). The anchor reduces the variance of prediction at the cost of a possible increase in prediction bias, resulting in a potentially reduced overall mean-square prediction error. This can be extremely effective when few individual data-points are available, allowing one to make linear predictions using as little as a single observed data-point. We develop the mathematics showing the conditions under which an anchor can improve predictions, and also demonstrate using this approach to reduce prediction error when modelling the disease progression of patients with amyotrophic lateral sclerosis.</p>","PeriodicalId":54477,"journal":{"name":"Revista Colombiana De Estadistica","volume":"41 2","pages":"137-155"},"PeriodicalIF":0.0000,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6345390/pdf/nihms978249.pdf","citationCount":"2","resultStr":"{\"title\":\"Using an Anchor to Improve Linear Predictions with Application to Predicting Disease Progression.\",\"authors\":\"Alex Karanevich, Jianghua He, Byron J Gajewski\",\"doi\":\"10.15446/rce.v41n2.68535\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Linear models are some of the most straightforward and commonly used modelling approaches. Consider modelling approximately monotonic response data arising from a time-related process. If one has knowledge as to when the process began or ended, then one may be able to leverage additional assumed data to reduce prediction error. This assumed data, referred to as the \\\"anchor,\\\" is treated as an additional data-point generated at either the beginning or end of the process. The response value of the anchor is equal to an intelligently selected value of the response (such as the upper bound, lower bound, or 99<sup>th</sup> percentile of the response, as appropriate). The anchor reduces the variance of prediction at the cost of a possible increase in prediction bias, resulting in a potentially reduced overall mean-square prediction error. This can be extremely effective when few individual data-points are available, allowing one to make linear predictions using as little as a single observed data-point. We develop the mathematics showing the conditions under which an anchor can improve predictions, and also demonstrate using this approach to reduce prediction error when modelling the disease progression of patients with amyotrophic lateral sclerosis.</p>\",\"PeriodicalId\":54477,\"journal\":{\"name\":\"Revista Colombiana De Estadistica\",\"volume\":\"41 2\",\"pages\":\"137-155\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6345390/pdf/nihms978249.pdf\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Revista Colombiana De Estadistica\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.15446/rce.v41n2.68535\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Mathematics\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Revista Colombiana De Estadistica","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.15446/rce.v41n2.68535","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Mathematics","Score":null,"Total":0}
Using an Anchor to Improve Linear Predictions with Application to Predicting Disease Progression.
Linear models are some of the most straightforward and commonly used modelling approaches. Consider modelling approximately monotonic response data arising from a time-related process. If one has knowledge as to when the process began or ended, then one may be able to leverage additional assumed data to reduce prediction error. This assumed data, referred to as the "anchor," is treated as an additional data-point generated at either the beginning or end of the process. The response value of the anchor is equal to an intelligently selected value of the response (such as the upper bound, lower bound, or 99th percentile of the response, as appropriate). The anchor reduces the variance of prediction at the cost of a possible increase in prediction bias, resulting in a potentially reduced overall mean-square prediction error. This can be extremely effective when few individual data-points are available, allowing one to make linear predictions using as little as a single observed data-point. We develop the mathematics showing the conditions under which an anchor can improve predictions, and also demonstrate using this approach to reduce prediction error when modelling the disease progression of patients with amyotrophic lateral sclerosis.
期刊介绍:
The Colombian Journal of Statistics publishes original articles of theoretical, methodological and educational kind in any branch of Statistics. Purely theoretical papers should include illustration of the techniques presented with real data or at least simulation experiments in order to verify the usefulness of the contents presented. Informative articles of high quality methodologies or statistical techniques applied in different fields of knowledge are also considered. Only articles in English language are considered for publication.
The Editorial Committee assumes that the works submitted for evaluation
have not been previously published and are not being given simultaneously for publication elsewhere, and will not be without prior consent of the Committee, unless, as a result of the assessment, decides not publish in the journal. It is further assumed that when the authors deliver a document for publication in the Colombian Journal of Statistics, they know the above conditions and agree with them.