{"title":"利用贝叶斯概率模型揭示对撞机事件中隐藏的新物理模式","authors":"D. Faroughy","doi":"10.22323/1.390.0238","DOIUrl":null,"url":null,"abstract":"Individual events at high-energy colliders like the LHC can be represented by a sequence of measurements, or ‘point patterns’. Starting from this generic data representation, we build a simple Bayesian probabilistic model for event measurements useful for unsupervised event classification in beyond the standard model (BSM) studies. In order to arrive to this model we assume that the event measurements are exchangeable (and apply De Finetti’s representation theorem), the data is discrete, and measurements are generated frommultiple ‘latent’ distributions (called themes). The resulting probabilistic model for collider events is a mixed-membership model known as Latent Dirichlet Allocation (LDA), a model extensively used in natural language processing applications. By training on mixed dijet samples of QCD and BSM, we demonstrate that a two-theme LDA model can learn to distinguish in (unlabelled) jet substructure data the hidden new physics patterns produced by a non-trivial BSM signature from a much larger QCD background.","PeriodicalId":20428,"journal":{"name":"Proceedings of 40th International Conference on High Energy physics — PoS(ICHEP2020)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2021-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Uncovering hidden new physics patterns in collider events using Bayesian probabilistic models\",\"authors\":\"D. Faroughy\",\"doi\":\"10.22323/1.390.0238\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Individual events at high-energy colliders like the LHC can be represented by a sequence of measurements, or ‘point patterns’. Starting from this generic data representation, we build a simple Bayesian probabilistic model for event measurements useful for unsupervised event classification in beyond the standard model (BSM) studies. In order to arrive to this model we assume that the event measurements are exchangeable (and apply De Finetti’s representation theorem), the data is discrete, and measurements are generated frommultiple ‘latent’ distributions (called themes). The resulting probabilistic model for collider events is a mixed-membership model known as Latent Dirichlet Allocation (LDA), a model extensively used in natural language processing applications. By training on mixed dijet samples of QCD and BSM, we demonstrate that a two-theme LDA model can learn to distinguish in (unlabelled) jet substructure data the hidden new physics patterns produced by a non-trivial BSM signature from a much larger QCD background.\",\"PeriodicalId\":20428,\"journal\":{\"name\":\"Proceedings of 40th International Conference on High Energy physics — PoS(ICHEP2020)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-02-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of 40th International Conference on High Energy physics — PoS(ICHEP2020)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.22323/1.390.0238\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of 40th International Conference on High Energy physics — PoS(ICHEP2020)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.22323/1.390.0238","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Uncovering hidden new physics patterns in collider events using Bayesian probabilistic models
Individual events at high-energy colliders like the LHC can be represented by a sequence of measurements, or ‘point patterns’. Starting from this generic data representation, we build a simple Bayesian probabilistic model for event measurements useful for unsupervised event classification in beyond the standard model (BSM) studies. In order to arrive to this model we assume that the event measurements are exchangeable (and apply De Finetti’s representation theorem), the data is discrete, and measurements are generated frommultiple ‘latent’ distributions (called themes). The resulting probabilistic model for collider events is a mixed-membership model known as Latent Dirichlet Allocation (LDA), a model extensively used in natural language processing applications. By training on mixed dijet samples of QCD and BSM, we demonstrate that a two-theme LDA model can learn to distinguish in (unlabelled) jet substructure data the hidden new physics patterns produced by a non-trivial BSM signature from a much larger QCD background.