{"title":"Key frame extraction for falling detection","authors":"Jing Du, Yale Zhao, Shanna Zhuang, Zhengyou Wang","doi":"10.1109/ICITBE54178.2021.00032","DOIUrl":null,"url":null,"abstract":"The continuous advancement of computer vision and deep learning technology provides powerful technical support for supervising the occurrence of falling behaviors. Considering that some frames in the video contribute little to the recognition of falling behaviors, in order to eliminate the video frames irrelevant to falling behaviors, this paper proposes an algorithm to extract key frames in a video by combining LUV local maximum and Mask-RCNN. Firstly, candidate key frames are extracted based on the local maximum of LUV. Video frames containing motion changes can be obtained. Afterwards, in order to further eliminate the video frames that are less relevant to the fall action, Mask-RCNN is used for human body detection. According to the aspect ratio of the bounding box and the motion speed of the human body, the video frames that are more relevant to the fall action are selected as key frames. Experiments and results analysis are carried out on the UR fall detection dataset, Multiple cameras fall dataset, Le2i Fall detection dataset and falling video dataset of real scenes. The effectiveness and accuracy of the proposed method are verified.","PeriodicalId":207276,"journal":{"name":"2021 International Conference on Information Technology and Biomedical Engineering (ICITBE)","volume":"15 6","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Information Technology and Biomedical Engineering (ICITBE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICITBE54178.2021.00032","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
The continuous advancement of computer vision and deep learning provides powerful technical support for monitoring the occurrence of falling behaviors. Considering that some frames in a video contribute little to the recognition of falls, and in order to eliminate video frames irrelevant to falling behaviors, this paper proposes an algorithm that extracts key frames from a video by combining LUV local maxima with Mask R-CNN. First, candidate key frames are extracted based on local maxima in the LUV color space, yielding video frames that contain motion changes. Then, to further eliminate video frames that are less relevant to the fall action, Mask R-CNN is used for human body detection, and the frames most relevant to the fall action are selected as key frames according to the aspect ratio of the bounding box and the motion speed of the human body. Experiments and results analysis on the UR Fall Detection dataset, the Multiple Cameras Fall dataset, the Le2i Fall Detection dataset, and a falling-video dataset of real scenes verify the effectiveness and accuracy of the proposed method.
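
The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. The version below assumes the LUV stage works on mean inter-frame differences in the LUV color space and that person detection uses torchvision's pre-trained Mask R-CNN; the aspect-ratio and motion-speed thresholds are hypothetical placeholders, not the parameters used in the paper.

```python
# Hedged sketch of the two-stage key-frame extraction described in the abstract.
# Assumptions (not from the paper): OpenCV for the LUV conversion, torchvision's
# pre-trained Mask R-CNN for person detection, and illustrative thresholds.
import cv2
import numpy as np
import torch
import torchvision


def luv_candidate_frames(frames):
    """Stage 1: keep frames whose LUV inter-frame difference is a local maximum."""
    diffs = []
    prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2LUV).astype(np.float32)
    for f in frames[1:]:
        cur = cv2.cvtColor(f, cv2.COLOR_BGR2LUV).astype(np.float32)
        diffs.append(float(np.abs(cur - prev).mean()))  # mean LUV change vs. previous frame
        prev = cur
    # a frame is a candidate if its difference exceeds both of its neighbours
    return [i + 1 for i in range(1, len(diffs) - 1)
            if diffs[i] > diffs[i - 1] and diffs[i] > diffs[i + 1]]


def fall_key_frames(frames, candidates, ratio_thr=1.0, speed_thr=15.0):
    """Stage 2: among candidates, keep frames whose detected person looks fall-like."""
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True).eval()
    prev_center, key_frames = None, []
    for idx in candidates:
        # BGR (OpenCV) -> RGB tensor in [0, 1], as expected by torchvision detectors
        img = torch.from_numpy(frames[idx][:, :, ::-1].copy()).permute(2, 0, 1).float() / 255
        with torch.no_grad():
            det = model([img])[0]
        persons = [b for b, l, s in zip(det["boxes"], det["labels"], det["scores"])
                   if l.item() == 1 and s.item() > 0.8]  # COCO class 1 = person
        if not persons:
            continue
        x1, y1, x2, y2 = persons[0].tolist()
        aspect = (x2 - x1) / max(y2 - y1, 1e-6)            # bounding-box width / height
        center = np.array([(x1 + x2) / 2, (y1 + y2) / 2])
        speed = 0.0 if prev_center is None else float(np.linalg.norm(center - prev_center))
        prev_center = center
        # wide box (body near horizontal) or fast centre motion -> fall-relevant frame
        if aspect > ratio_thr or speed > speed_thr:
            key_frames.append(idx)
    return key_frames
```

In this sketch, stage 1 plays the role of the LUV local-maximum filter (frames with pronounced motion changes become candidates), and stage 2 plays the role of the Mask R-CNN refinement that uses bounding-box aspect ratio and body motion speed to retain fall-relevant key frames.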