A Mathematical Model of the Hidden Feedback Loop Effect in Machine Learning Systems

Andrey Veprikov, Alexander Afanasiev, Anton Khritankov

arXiv - CS - Systems and Control · arXiv:2405.02726 · Published 2024-05-04
https://doi.org/arxiv-2405.02726 · Citations: 0
Abstract
Widespread deployment of societal-scale machine learning systems necessitates
a thorough understanding of the resulting long-term effects these systems have
on their environment, including loss of trustworthiness, bias amplification,
and violation of AI safety requirements. We introduce a repeated learning
process to jointly describe several phenomena attributed to unintended hidden
feedback loops, such as error amplification, induced concept drift, echo
chambers, and others. The process comprises the entire cycle of obtaining the
data, training the predictive model, and delivering predictions to end-users
within a single mathematical model. A distinctive feature of such a repeated
learning setting is that the state of the environment becomes causally
dependent on the learner itself over time, violating the usual assumptions
about the data distribution. We present a novel dynamical systems model of the
repeated learning process and derive the limiting sets of probability
distributions for the positive and negative feedback loop modes of system
operation. We conduct a series of computational experiments using an exemplary
supervised learning problem on two synthetic data sets. The results of the
experiments correspond to the theoretical predictions derived from the
dynamical model. Our results demonstrate the feasibility of the proposed
approach for studying repeated learning processes in machine learning
systems and open a range of opportunities for further research in the area.
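The repeated learning cycle described in the abstract can be sketched in code. The following is a minimal illustrative simulation, not the authors' actual model: a trivial mean predictor is retrained each round on data drawn from an environment whose state then shifts in proportion to the delivered predictions. The `adapt` gain and the linear shift rule are assumptions made for illustration; a positive gain plays the role of a positive feedback loop, a negative gain the role of a negative one.

```python
# Illustrative sketch of a repeated learning process with a hidden feedback
# loop. All names, the adaptation rule, and parameter values are assumptions
# for demonstration, not the model from the paper.
import random
import statistics

def repeated_learning(adapt, rounds=30, n=200, seed=0):
    """Simulate a learner whose predictions feed back into the environment.

    Each round: (1) obtain data from the current environment, (2) train a
    trivial predictive model (the sample mean), (3) deliver predictions,
    which shift the environment state by adapt * prediction. Returns the
    environment state after each round.
    """
    rng = random.Random(seed)
    env_mean = 1.0          # state of the environment (mean of the data source)
    history = [env_mean]
    for _ in range(rounds):
        # 1. obtain data from the current environment
        data = [rng.gauss(env_mean, 1.0) for _ in range(n)]
        # 2. "train" the predictive model (here: a sample-mean predictor)
        prediction = statistics.fmean(data)
        # 3. deliver predictions to end-users; their reaction shifts the
        #    environment, closing the hidden feedback loop
        env_mean += adapt * prediction
        history.append(env_mean)
    return history

positive = repeated_learning(adapt=0.5)   # positive feedback: state diverges
negative = repeated_learning(adapt=-0.5)  # negative feedback: state contracts
```

Under this toy dynamic the positive-feedback mode amplifies the environment state roughly geometrically (error amplification and induced drift), while the negative-feedback mode drives it toward a small neighborhood of zero, mirroring the two limiting regimes the abstract refers to.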