0.1 Introduction
Authors: K. Bergen, K. Chavez, A. Ioannidis, S. Schmit
DOI: 10.1515/9783112402245-003 (https://doi.org/10.1515/9783112402245-003)
Journal: Foreign Investment in the Sultanate of Oman, volume 17 1
Publication date: 2017-12-31
Publication type: Journal Article
Citations: 0
Abstract
Consider a function $F(w)$ that we seek to optimize, $\min_w F(w)$, which is the sum of constituent functions, $F(w) = \sum_{i=1}^{n} f_i(w)$. We assume that $n$ is large, that $w \in \mathbb{R}^d$, and that $d$ fits in memory on a single machine. The gradient of $F(w)$ is then simply the sum of the gradients of the constituent functions $f_i(w)$, $\nabla F(w) = \sum_{i=1}^{n} \nabla f_i(w)$, which we can compute in $O(nd)$ time. For example, for a least squares objective, i.e. $f_i(w) = (x_i^\top w - y_i)^2$, we have $\nabla f_i(w) = 2(x_i^\top w - y_i)\,x_i$, which is just a re-weighting of the original vector $x_i$.
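To make the $O(nd)$ cost concrete, the following is a minimal sketch of the full gradient for the least squares case, assuming each $f_i(w) = (x_i^\top w - y_i)^2$. The function names `grad_fi` and `full_gradient` and the toy data are hypothetical, chosen only for illustration; they do not come from the source. Each per-example gradient costs $O(d)$ and there are $n$ of them, which gives the $O(nd)$ total.

```python
from typing import List, Sequence

Vector = List[float]


def grad_fi(w: Sequence[float], x_i: Sequence[float], y_i: float) -> Vector:
    """Gradient of f_i(w) = (x_i^T w - y_i)^2, i.e. 2 (x_i^T w - y_i) x_i."""
    # Inner product x_i^T w costs O(d).
    residual = sum(xj * wj for xj, wj in zip(x_i, w)) - y_i
    # The result is x_i rescaled by the scalar 2 * residual.
    return [2.0 * residual * xj for xj in x_i]


def full_gradient(
    w: Sequence[float], X: Sequence[Sequence[float]], y: Sequence[float]
) -> Vector:
    """Gradient of F(w) = sum_i f_i(w): n terms of O(d) each, O(nd) total."""
    grad = [0.0] * len(w)
    for x_i, y_i in zip(X, y):
        g_i = grad_fi(w, x_i, y_i)
        grad = [g + gi for g, gi in zip(grad, g_i)]
    return grad


# Hypothetical toy data: n = 3 examples, d = 2 features.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
w = [0.0, 0.0]
grad = full_gradient(w, X, y)
```

On this toy data, `w = [1.0, 2.0]` fits every example exactly, so all residuals vanish and the full gradient there is zero.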