Submodularity and adaptation
J. Bilmes
2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), 2007-12-01
DOI: 10.1109/ASRU.2007.4430118 (https://doi.org/10.1109/ASRU.2007.4430118)
Summary form only given. Convexity is a property of real-valued functions that enables their efficient optimization. Moreover, a wide variety of practical problems can be cast as convex optimization problems. Submodularity, which has strong analogs to convexity, is a property of functions on discrete sets that allows their optimization to be done in polynomial time. Submodularity generalizes the common notion of diminishing returns. As with convexity, a large variety of discrete optimization problems can be cast in terms of submodular optimization. The first part of this talk will survey recent work in our lab on applying submodularity to machine learning, including discriminative structure learning and word clustering for language models. The second part of the talk will discuss recent work on adaptation, a technique that has been widely successful in speech recognition for many years. We will view adaptation in a setting where the training and test-time distributions are not assumed to be identical (unlike typical Bayes risk theory). We will derive generalization error and sample complexity bounds for adaptation, specified in terms of a natural divergence between the training and test distributions. These bounds, moreover, lead to practical and effective adaptation strategies for both generative models (e.g., GMMs, HMMs) and discriminative models (e.g., MLPs, SVMs). Joint work with Mukund Narasimhan and Xiao Li.
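The diminishing-returns property mentioned in the abstract can be made concrete with a small sketch. A set function f is submodular if, for all A ⊆ B and any element x not in B, f(A ∪ {x}) − f(A) ≥ f(B ∪ {x}) − f(B). The Python code below checks this inequality by brute force on a toy set-coverage function. The data, the function names (`coverage`, `has_diminishing_returns`), and the choice of coverage as the example are illustrative assumptions and are not taken from the talk; the check is exponential in the size of the ground set and is meant only for tiny examples.

```python
from itertools import combinations

# Hypothetical toy data (not from the talk): each item covers a few elements.
COVERAGE = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6},
    "d": {1, 6},
}


def coverage(selected):
    """Number of distinct elements covered by the selected items."""
    covered = set()
    for item in selected:
        covered |= COVERAGE[item]
    return len(covered)


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def has_diminishing_returns(f, ground_set):
    """Brute-force test of f(A+x) - f(A) >= f(B+x) - f(B) for all A <= B, x not in B."""
    ground = frozenset(ground_set)
    all_subsets = list(subsets(ground))
    for b in all_subsets:
        for a in all_subsets:
            if not a <= b:
                continue  # only compare pairs where A is a subset of B
            for x in ground - b:
                gain_a = f(a | {x}) - f(a)
                gain_b = f(b | {x}) - f(b)
                if gain_a < gain_b:
                    return False
    return True


if __name__ == "__main__":
    # Coverage functions are a standard example of submodular functions.
    print(has_diminishing_returns(coverage, COVERAGE))
    # |S|^2 has increasing marginal gains, so it fails the check.
    print(has_diminishing_returns(lambda s: len(s) ** 2, COVERAGE))
```

The second call uses a function whose marginal gain grows with the size of the set, which gives a simple contrast with the coverage function.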