{"title":"新型冠状病毒咳嗽和打喷嚏识别的集成互联网音视频传感器的设计与开发","authors":"Sina Kiaei, S. Honarparvar, S. Saeedi, S. Liang","doi":"10.1109/iemcon53756.2021.9623141","DOIUrl":null,"url":null,"abstract":"There are a lot of ongoing efforts to combat the COVID-19 pandemic using different combinations of low-cost sensing technologies, information/communication technologies, and smart computation. To provide COVID-19 situational awareness and early warnings, a scalable, real-time sensing solution is needed to recognize risky behaviors in COVID-19 virus spreading such as coughing and sneezing. Various coughing and sneezing recognition methods use audio-only or video-only sensors and Deep Learning (DL) algorithms for smart event recognition. However, each of these recognition processes experiences several types of failure behaviors due to false detection. Sensor integration is a solution to overcome such failures. Moreover, it improves event recognition precision. With the wide availability of low-cost audio and video sensors, we proposed a real-time integrated Internet of Things (IoT) architecture to improve the results of coughing and sneezing recognition. Implemented architecture joins edge and cloud computing. In edge computing, the microphone and camera are connected to the internet and embedded with a DL engine. Audio and video streams are fed to edge computing to detect coughing and sneezing actions in realtime. Cloud computing, which is developed based on the Amazon Web Service (AWS), combines the results of audio and video processing. In this paper, a scenario of a person coughing and sneezing was developed to demonstrate the capabilities of the proposed architecture. The experimental results show that the proposed architecture improved the reliability of coughing and sneezing recognition in the integrated cloud system compared to audio-only and video-only detectors. 
Three factors have been considered to compare the results of the proposed architecture: F-score, precision, and recall. The precision and recall of the cloud detector are improved on average by %43 and %15, respectively, compared to audio-only and video-only detectors. The F-score improved on average 1.24 times.","PeriodicalId":272590,"journal":{"name":"2021 IEEE 12th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Design and Development of an Integrated Internet of Audio and Video Sensors for COVID-19 Coughing and Sneezing Recognition\",\"authors\":\"Sina Kiaei, S. Honarparvar, S. Saeedi, S. Liang\",\"doi\":\"10.1109/iemcon53756.2021.9623141\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"There are a lot of ongoing efforts to combat the COVID-19 pandemic using different combinations of low-cost sensing technologies, information/communication technologies, and smart computation. To provide COVID-19 situational awareness and early warnings, a scalable, real-time sensing solution is needed to recognize risky behaviors in COVID-19 virus spreading such as coughing and sneezing. Various coughing and sneezing recognition methods use audio-only or video-only sensors and Deep Learning (DL) algorithms for smart event recognition. However, each of these recognition processes experiences several types of failure behaviors due to false detection. Sensor integration is a solution to overcome such failures. Moreover, it improves event recognition precision. With the wide availability of low-cost audio and video sensors, we proposed a real-time integrated Internet of Things (IoT) architecture to improve the results of coughing and sneezing recognition. Implemented architecture joins edge and cloud computing. 
In edge computing, the microphone and camera are connected to the internet and embedded with a DL engine. Audio and video streams are fed to edge computing to detect coughing and sneezing actions in realtime. Cloud computing, which is developed based on the Amazon Web Service (AWS), combines the results of audio and video processing. In this paper, a scenario of a person coughing and sneezing was developed to demonstrate the capabilities of the proposed architecture. The experimental results show that the proposed architecture improved the reliability of coughing and sneezing recognition in the integrated cloud system compared to audio-only and video-only detectors. Three factors have been considered to compare the results of the proposed architecture: F-score, precision, and recall. The precision and recall of the cloud detector are improved on average by %43 and %15, respectively, compared to audio-only and video-only detectors. The F-score improved on average 1.24 times.\",\"PeriodicalId\":272590,\"journal\":{\"name\":\"2021 IEEE 12th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON)\",\"volume\":\"6 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE 12th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/iemcon53756.2021.9623141\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 12th Annual Information Technology, Electronics 
and Mobile Communication Conference (IEMCON)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/iemcon53756.2021.9623141","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Design and Development of an Integrated Internet of Audio and Video Sensors for COVID-19 Coughing and Sneezing Recognition
There are many ongoing efforts to combat the COVID-19 pandemic using different combinations of low-cost sensing technologies, information and communication technologies, and smart computation. To provide COVID-19 situational awareness and early warnings, a scalable, real-time sensing solution is needed to recognize behaviors that risk spreading the COVID-19 virus, such as coughing and sneezing. Various coughing and sneezing recognition methods use audio-only or video-only sensors together with Deep Learning (DL) algorithms for smart event recognition. However, each of these recognition processes suffers several types of failure due to false detections. Sensor integration is one solution to overcome such failures; moreover, it improves event recognition precision. Given the wide availability of low-cost audio and video sensors, we propose a real-time integrated Internet of Things (IoT) architecture to improve coughing and sneezing recognition. The implemented architecture combines edge and cloud computing. At the edge, the microphone and camera are connected to the internet and embedded with a DL engine; audio and video streams are processed at the edge to detect coughing and sneezing actions in real time. Cloud computing, built on Amazon Web Services (AWS), combines the results of the audio and video processing. In this paper, a scenario of a person coughing and sneezing was developed to demonstrate the capabilities of the proposed architecture. The experimental results show that the proposed architecture improves the reliability of coughing and sneezing recognition in the integrated cloud system compared to audio-only and video-only detectors. Three metrics were used to compare the results of the proposed architecture: F-score, precision, and recall. Compared to the audio-only and video-only detectors, the precision and recall of the cloud detector improved on average by 43% and 15%, respectively, and the F-score improved by a factor of 1.24 on average.
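The abstract states that the cloud layer combines the audio and video detection results but does not specify the fusion rule. As a minimal illustrative sketch (the weighted-average scheme, the weight, and the threshold below are assumptions, not the authors' method), fusing two per-window detector confidences might look like:

```python
def fuse_detections(audio_score: float, video_score: float,
                    w_audio: float = 0.5, threshold: float = 0.6) -> bool:
    """Fuse audio and video detector confidences for one time window.

    A weighted average of the two scores is compared against a threshold.
    The weight and threshold here are illustrative assumptions, not values
    taken from the paper.
    """
    fused = w_audio * audio_score + (1.0 - w_audio) * video_score
    return fused >= threshold

# A cough heard clearly on audio but partially occluded on camera:
# fused score = 0.5 * 0.9 + 0.5 * 0.4 = 0.65 >= 0.6
print(fuse_detections(audio_score=0.9, video_score=0.4))  # True
```

This kind of late fusion matches the architecture's division of labor: each edge device runs its own DL detector, and only lightweight scores need to travel to the cloud for combination.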
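The three comparison metrics are standard classification measures. As a reminder of how they are computed from detection counts (a generic illustration with made-up counts, not the paper's evaluation code):

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F-score (F1) from true positives,
    false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical counts: 80 coughs detected correctly, 10 false alarms, 20 missed
p, r, f = precision_recall_f1(tp=80, fp=10, fn=20)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.889 0.8 0.842
```

A false alarm (e.g. a door slam classified as a cough) lowers precision, while a missed event lowers recall; the F-score balances the two, which is why the paper reports all three.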