{"title":"Action Recognition in Video Using Human Keypoint Detection","authors":"Luona Song, Xin Guo, Yiqi Fan","doi":"10.1109/ICCSE49874.2020.9201857","DOIUrl":null,"url":null,"abstract":"With the popularization of the internet and the increase of video facilities, the recognition and segmentation of actions in the video have become research highlights of high application value. Different from images, the information in the video is more complex and also brings time sequences as a new dimension. This paper proposes a video action recognition and segmentation model in the human keypoint detection task. The main contributions are as follows:1) Based on the speech signal processing method, this paper designs an analysis framework for video action, which consists of three steps. The first step is to obtain data from the key point frame of the human body; the second is the action segmentation model; the third is to visualize the model results;2)the dynamic time warping algorithm is used and improved from calculation cost and constraint conditions;3) a distance function is designed to measure the similarity between time series. Four kinds of features are introduced, and the final distance is the weighted sum of the four kinds of features;4) a non-maximum suppression method is designed to filter the overlapped segments to get the final results. Experiment design verifies the validity of the proposed model and the importance of proposed features is illustrated.","PeriodicalId":350703,"journal":{"name":"2020 15th International Conference on Computer Science & Education (ICCSE)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 15th International Conference on Computer Science & Education (ICCSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCSE49874.2020.9201857","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
With the popularization of the internet and the growing number of video devices, the recognition and segmentation of actions in video have become a research focus of high application value. Unlike images, video carries more complex information and adds the time dimension as a new factor. This paper proposes a video action recognition and segmentation model based on human keypoint detection. The main contributions are as follows: 1) Drawing on speech signal processing methods, an analysis framework for video action is designed, consisting of three steps: obtaining data from the human keypoints of each frame, applying the action segmentation model, and visualizing the model results. 2) The dynamic time warping algorithm is adopted and improved in terms of computational cost and constraint conditions. 3) A distance function is designed to measure the similarity between time series; four kinds of features are introduced, and the final distance is the weighted sum of the four feature distances. 4) A non-maximum suppression method is designed to filter overlapping segments and obtain the final results. Experiments verify the validity of the proposed model and illustrate the importance of the proposed features.
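The abstract only outlines the pipeline at a high level. Below is a minimal sketch, in Python with NumPy, of how such a pipeline could fit together: a frame-level distance built as a weighted sum of several keypoint-derived features, DTW with a band constraint to bound the computation, and temporal non-maximum suppression over candidate segments. The specific features, weights, band width, and segment format are illustrative assumptions, not the paper's exact design.

```python
# Sketch of the described pipeline (assumed details): weighted multi-feature
# frame distance, band-constrained DTW, and temporal NMS over segments.
import numpy as np


def frame_distance(p, q, weights=(0.4, 0.3, 0.2, 0.1)):
    """Weighted sum of simple keypoint features for two frames.

    p, q: (J, 2) arrays of 2D keypoints for one frame each.
    Assumed features: raw coordinates, body-centered coordinates,
    pairwise joint distances, and crude limb-orientation angles.
    """
    feats = []
    for x in (p, q):
        centered = x - x.mean(axis=0)                      # translation-invariant
        pairwise = np.linalg.norm(x[:, None] - x[None, :], axis=-1).ravel()
        angles = np.arctan2(np.diff(x[:, 1]), np.diff(x[:, 0]))
        feats.append((x.ravel(), centered.ravel(), pairwise, angles))
    return sum(w * np.linalg.norm(a - b)
               for w, (a, b) in zip(weights, zip(*feats)))


def dtw_distance(seq_a, seq_b, band=10):
    """DTW with a Sakoe-Chiba band (an assumed constraint) to cap the cost."""
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - band), min(m, i + band) + 1):
            cost = frame_distance(seq_a[i - 1], seq_b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)                               # length-normalized


def temporal_nms(segments, iou_threshold=0.5):
    """Keep the lowest-distance segment among heavily overlapping candidates.

    segments: list of (start_frame, end_frame, dtw_distance) tuples.
    """
    segments = sorted(segments, key=lambda s: s[2])        # best (smallest) first
    kept = []
    for s1, e1, d in segments:
        suppressed = False
        for s2, e2, _ in kept:
            inter = max(0, min(e1, e2) - max(s1, s2))
            union = max(e1, e2) - min(s1, s2)
            if union > 0 and inter / union > iou_threshold:
                suppressed = True
                break
        if not suppressed:
            kept.append((s1, e1, d))
    return kept
```

In use, candidate segments of the input video would be compared against template action sequences with `dtw_distance`, and the scored segments then filtered with `temporal_nms` to produce the final, non-overlapping segmentation; the band constraint both speeds up DTW and prevents degenerate alignments, which is one common way to address the cost and constraint issues the abstract mentions.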