Frame-Level Event Detection in Athletics Videos with Pose-Based Convolutional Sequence Networks
Moritz Einfalt, Charles Dampeyrou, D. Zecha, R. Lienhart
MMSports '19, 2019-10-15. DOI: 10.1145/3347318.3355525
In this paper we address the problem of automatic event detection in athlete motion for automated performance analysis in athletics. We specifically consider the detection of stride-, jump- and landing-related events from monocular recordings of the long and triple jump. Existing work on event detection in sports often uses manually designed features of the athlete's body and pose configuration to infer the occurrence of events. We present a two-step approach in which temporal 2D pose sequences extracted from the videos form the basis for learning an event detection model. We formulate the detection of discrete events as a sequence translation task and propose a convolutional sequence network that can accurately predict the timing of event occurrences. Our best-performing architecture achieves a precision/recall of 92.3%/89.0% in detecting the start and end of ground contact during the run-up and jump of an athlete, at a temporal precision of ±1 frame at 200 Hz. The results show that 2D pose sequences are a suitable motion representation for learning event detection in a sequence-to-sequence framework.
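The abstract does not specify the network in detail. As a rough illustration of the sequence-to-sequence idea (not the authors' implementation), the sketch below shows a minimal 1D temporal convolutional network in PyTorch that maps a 2D pose sequence (one set of keypoint coordinates per frame) to per-frame event class scores. The number of keypoints, layer sizes, dilation factors, and event classes are assumptions chosen only for illustration.

```python
# Minimal sketch (not the authors' code): a temporal convolutional network that
# translates a 2D pose sequence into per-frame event probabilities.
# Assumed for illustration: 14 keypoints per frame, 4 event classes + background.
import torch
import torch.nn as nn


class PoseEventNet(nn.Module):
    def __init__(self, num_keypoints=14, num_events=5, channels=128):
        super().__init__()
        in_dim = 2 * num_keypoints  # (x, y) coordinates per keypoint per frame
        self.backbone = nn.Sequential(
            nn.Conv1d(in_dim, channels, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=9, padding=4),
            nn.ReLU(),
            # Dilated layer to enlarge the temporal receptive field.
            nn.Conv1d(channels, channels, kernel_size=9, padding=8, dilation=2),
            nn.ReLU(),
        )
        # Per-frame classification head: one score per event class (incl. background).
        self.head = nn.Conv1d(channels, num_events, kernel_size=1)

    def forward(self, poses):
        # poses: (batch, T, 2 * num_keypoints) -> logits: (batch, T, num_events)
        x = poses.transpose(1, 2)           # to (batch, features, T) for Conv1d
        logits = self.head(self.backbone(x))
        return logits.transpose(1, 2)       # back to (batch, T, num_events)


# Example: score a 400-frame sequence (2 s at 200 Hz) of dummy pose data.
model = PoseEventNet()
poses = torch.randn(1, 400, 28)
frame_probs = model(poses).softmax(dim=-1)  # per-frame event probabilities
frame_labels = frame_probs.argmax(dim=-1)   # discrete per-frame event labels
```

At inference, a simple post-processing step (for example, taking local maxima of the per-frame probabilities for each event class) would turn these dense predictions into discrete event timestamps, which could then be matched against ground-truth annotations within a ±1 frame tolerance as in the evaluation described above.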