Karnati Sai Shashank, N. P. Prasad, K. S. Reddy, L. Rao
2023 International Conference on Sustainable Computing and Smart Systems (ICSCSS), published 2023-06-14. DOI: https://doi.org/10.1109/ICSCSS57650.2023.10169522
Upload Cricket Match Video to Generate Audio Commentary by YOLOv8 and Transformer
The aim of this work is to automatically generate audio commentary for uploaded cricket match videos. A YOLOv8 model extracts features from the image frames, and a Transformer-LSTM network then generates the commentary as text, which is converted to audio. The proposed model handles variable-length input data and produces consecutive outputs. In addition, it can use timing information to predict the pitch and length of the bowler's delivery, the batsman's shot selection, and the outcome of the ball. However, no standard dataset exists for these tasks, so this study covers the whole process from data collection to classification.
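The three-stage flow described in the abstract (feature extraction, text generation, speech synthesis) can be pictured as a simple chain of pluggable stages. The sketch below is only an illustration of that data flow, not the authors' implementation: the class `CommentaryPipeline`, the type aliases, and the placeholder stages `mean_intensity`, `describe`, and `to_bytes` are all hypothetical names introduced here, and the placeholders stand in for YOLOv8, the Transformer-LSTM network, and a text-to-speech engine respectively.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Frame = List[List[int]]   # stand-in for a decoded frame (rows of pixel intensities)
Features = List[float]    # stand-in for a per-frame feature vector


@dataclass
class CommentaryPipeline:
    """Chains the three stages named in the abstract; each stage is pluggable."""
    extract_features: Callable[[Frame], Features]       # role played by YOLOv8
    generate_text: Callable[[Sequence[Features]], str]  # role played by the Transformer-LSTM
    synthesize_audio: Callable[[str], bytes]            # role played by text-to-speech

    def run(self, frames: Sequence[Frame]) -> bytes:
        # One feature vector per frame, so clips of any length are accepted.
        features = [self.extract_features(frame) for frame in frames]
        text = self.generate_text(features)
        return self.synthesize_audio(text)


# Placeholder stages (hypothetical, for illustration only; not real models).
def mean_intensity(frame: Frame) -> Features:
    pixels = [p for row in frame for p in row]
    return [sum(pixels) / len(pixels)]


def describe(features: Sequence[Features]) -> str:
    return f"{len(features)} frames processed"


def to_bytes(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = CommentaryPipeline(mean_intensity, describe, to_bytes)
clip = [[[0, 10], [20, 30]], [[5, 5], [5, 5]], [[1, 2], [3, 4]]]
audio = pipeline.run(clip)
```

Keeping each stage behind a plain callable means a real detector, sequence model, or speech engine could be swapped in without changing `run`, and the per-frame list comprehension is one simple way to accept variable-length input.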