Voice Synchronization across Heterogeneous Telephony Systems: Problem and Solutions
Hsiao-Pu Lin, Hung-Yun Hsieh
2010 IEEE International Conference on Communications, 2010-05-23
DOI: 10.1109/ICC.2010.5502433 (https://doi.org/10.1109/ICC.2010.5502433)
As IP telephony gains popularity, interworking with conventional PSTN telephony has become increasingly important. In particular, a growing number of new telephony services now involve both packet-switched (IP telephony) and circuit-switched (PSTN telephony) voice legs in a single call session. A common problem in enabling such services is the need to synchronize voice streams that traverse heterogeneous telephony systems. In this paper, we first identify the key role of voice synchronization across heterogeneous telephony systems in services such as seamless handover between WLAN and cellular networks and multi-party audio conferencing with video overlay. We then explain the challenges in synchronizing circuit-switched and packet-switched voice streams, including codec distortion, packet loss, line noise, and overlapping utterances. To achieve voice synchronization, we investigate three approaches based on digital speech processing techniques in the waveform, cepstrum, and spectrum domains. Finally, we compare the performance benefits and tradeoffs of these approaches, motivating further research in this direction.
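The abstract does not describe how the waveform-domain approach is implemented. As a rough illustration of what synchronizing two voice legs in the waveform domain can mean, the sketch below estimates the delay between two sample streams with a generic normalized cross-correlation search. This is a textbook technique swapped in for illustration, not the authors' method; the function name `estimate_lag`, the `max_lag` search bound, and the synthetic test signal are all assumptions made for this example.

```python
import math
import random


def estimate_lag(reference, delayed, max_lag):
    """Return (lag, score): the lag in samples, within [0, max_lag], at which
    `delayed` best matches `reference` under normalized cross-correlation.

    Hypothetical helper for illustration; not taken from the paper.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Number of samples that overlap when `delayed` is shifted by `lag`.
        n = min(len(reference), len(delayed) - lag)
        if n <= 0:
            break
        a = reference[:n]
        b = delayed[lag:lag + n]
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        score = dot / norm if norm > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


# Synthetic stand-in for two voice legs (assumption: real streams would be
# decoded PCM samples at a common sampling rate).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(400)]
# Second leg: the same signal delayed by 37 samples, with mild additive noise.
delayed = [0.0] * 37 + [x + rng.gauss(0.0, 0.1) for x in reference]

lag, score = estimate_lag(reference, delayed, max_lag=100)
print(lag, round(score, 3))
```

A plain correlation search like this is sensitive to exactly the impairments the abstract lists (codec distortion, packet loss, line noise, overlapping utterances), which is presumably why the paper also examines cepstrum- and spectrum-domain features.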