Vishnu Monn Baskaran, Yoong Choon Chang, J. Loo, Koksheik Wong, Ming-Tao Gan
2015 IEEE International Conference on Consumer Electronics - Taiwan. Published 2015-06-06. DOI: 10.1109/ICCE-TW.2015.7217016 (https://doi.org/10.1109/ICCE-TW.2015.7217016)
Dominant speaker detection using discrete Markov chain for multi-user video conferencing
This paper proposes a discrete-time Markov chain algorithm for predicting a pair of active, or dominant, speakers in an ultra-high-definition multi-user video conferencing system. The Markov chain minimizes false dominant speaker classifications caused by transient noise during a video conferencing session. The algorithm also applies a linear weight-based assignment to both the initial state vector and the transition probability matrix, which improves its response to changes in the dominant speaker. Experimental results suggest that the algorithm correctly predicts the most dominant speaker at a rate of 83% for 11 clients in a combined video, with an 86% reduction in false dominant speaker classifications, based on a given set of artificial speaker data.
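The abstract does not spell out how the weights or the chain are constructed, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes the input is a short window of per-frame "loudest speaker" labels, that linear weights favour the most recent frames, and that one chain step is enough to smooth out a transient. All function names and the `steps` parameter are hypothetical.

```python
def linear_weights(n):
    """Weights 1, 2, ..., n scaled to sum to 1 (assumption: newest frame weighs most)."""
    total = n * (n + 1) / 2
    return [(i + 1) / total for i in range(n)]


def initial_state(labels, num_speakers):
    """Initial state vector: linearly weighted share of each speaker in the window."""
    pi = [0.0] * num_speakers
    for w, s in zip(linear_weights(len(labels)), labels):
        pi[s] += w
    return pi


def transition_matrix(labels, num_speakers):
    """Row-stochastic matrix built from linearly weighted transition counts."""
    counts = [[0.0] * num_speakers for _ in range(num_speakers)]
    pairs = list(zip(labels, labels[1:]))
    for w, (a, b) in zip(linear_weights(len(pairs)), pairs):
        counts[a][b] += w
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0.0:
            # Speaker never observed as a source state: assume it stays put.
            counts[i][i] = 1.0
        else:
            counts[i] = [c / total for c in row]
    return counts


def step(pi, P):
    """One discrete-time Markov chain update: pi' = pi * P."""
    n = len(pi)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]


def dominant_speaker(labels, num_speakers, steps=1):
    """Return the index of the speaker with the highest probability after `steps` updates."""
    pi = initial_state(labels, num_speakers)
    P = transition_matrix(labels, num_speakers)
    for _ in range(steps):
        pi = step(pi, P)
    return max(range(num_speakers), key=lambda s: pi[s])


# A one-frame burst from speaker 2 does not displace speaker 0.
noisy_window = [0, 0, 0, 0, 2, 0, 0, 0]
print(dominant_speaker(noisy_window, num_speakers=3))

# A sustained switch to speaker 1 is picked up.
switched_window = [0, 0, 0, 1, 1, 1, 1, 1]
print(dominant_speaker(switched_window, num_speakers=2))
```

In this sketch the linear weights are what let a sustained change outweigh older frames, while a single noisy frame contributes too little probability mass to win. Extending it to the pair of dominant speakers mentioned above would amount to taking the two largest entries of the final state vector.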