Real-Time Facial Expression Recognition and Speech Transcripts over an On-Premise Video Conference Application
Authors: Sally Ahmed, N. Areed, M. Obayya, F. Khalifa
DOI: 10.21608/ijt.2022.266291 (https://doi.org/10.21608/ijt.2022.266291)
Published: 2022-08-21 (Journal Article)
Journal: International Journal of Interdisciplinary Telecommunications and Networking (JCR category: Telecommunications, Q4; impact factor 0.4)
Abstract: Since the outbreak of the Covid-19 pandemic, organizations and individuals have increasingly had to rely on video conference applications. However, commercial video conference applications are expensive and limited in features. This paper discusses how organizations can host on-premise video conference applications. It then explores how to help an organization's stakeholders make decisions based on the facial expressions of video conference attendees. Moreover, it facilitates transcribing speech into text so that deaf persons can participate in online conferences. The technologies and tools used to address these challenges are, respectively: (i) the Web Real-Time Communication (WebRTC) project, (ii) the TensorFlow.js library, and (iii) the Web Speech Application Programming Interface (API). The work depends on integrating a collection of technologies, libraries, standards, and protocols, most of which can be managed using a JavaScript framework. Hence, the processing load is distributed across the client-side devices. The proposed on-premise video conference application is enhanced with facial expression recognition at 66% accuracy and a speech-to-text feature with Word Error Rates (WER) of 0 and 0.12 for British English and Egyptian Arabic, respectively.
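The abstract reports Word Error Rates but does not say how they were computed. As a minimal sketch, assuming the conventional definition (substitutions + deletions + insertions, divided by the number of reference words), WER can be obtained from a word-level edit distance. The function name `word_error_rate` and the sample sentences are illustrative and do not come from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumes whitespace tokenization and exact string matching; a real
    evaluation would usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of five reference words.
    print(word_error_rate("the meeting starts at noon",
                          "the meeting start at noon"))
```

Under this definition, a WER of 0 means the transcript matched the reference word for word, and a WER of 0.12 corresponds to roughly 12 word errors per 100 reference words.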
About the journal:
The International Journal of Interdisciplinary Telecommunications and Networking (IJITN) examines timely and important telecommunications and networking issues, problems, and solutions from a multidimensional, interdisciplinary perspective for researchers and practitioners. IJITN emphasizes the cross-disciplinary viewpoints of electrical engineering, computer science, information technology, operations research, business administration, economics, sociology, and law. The journal publishes theoretical and empirical research findings, case studies, and surveys, as well as the opinions of leaders and experts in the field. The journal's coverage of telecommunications and networking is broad, ranging from cutting-edge research to practical implementations. Published articles must take an interdisciplinary, rather than a narrow, discipline-specific viewpoint. The context may be industry-wide, organizational, individual user, or societal.

Topics Covered:
- Emerging telecommunications and networking technologies
- Global telecommunications industry business modeling and analysis
- Network management and security
- New telecommunications applications, products, and services
- Social and societal aspects of telecommunications and networking
- Standards and standardization issues for telecommunications and networking
- Strategic telecommunications management
- Telecommunications and networking cultural issues and education
- Telecommunications and networking hardware and software design
- Telecommunications investments and new ventures
- Telecommunications network modeling and design
- Telecommunications regulation and policy issues
- Telecommunications systems economics