HAVS: Human action-based video summarization, Taxonomy, Challenges, and Future Perspectives

Ambreen Sabha, A. Selwal
Published in: 2021 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES), pp. 1-9
Publication date: 2021-09-24
DOI: 10.1109/ICSES52305.2021.9633804
URL: https://doi.org/10.1109/ICSES52305.2021.9633804
Citations: 6

Abstract

In computer vision, video summarization is a critical research problem because it aims at a more condensed and engaging portrayal of a video's original content. Deep learning models have lately been employed in various approaches to human action recognition. In this paper, we examine the most up-to-date methodologies for summarizing human behaviors in videos, as well as numerous deep learning and hybrid algorithms. We provide an in-depth analysis of the many forms of human activity, including gesture-based, interaction-based, human action-based, and group activity-based activities. Our study reviews the most recent benchmark datasets for recognizing human motion in video sequences. It also discusses the strengths and limitations of the existing methods, open research issues, and future directions for human action-based video summarization (HAVS). This work reveals that the majority of HAVS approaches rely on key-frame selection using convolutional neural networks (CNNs), which directs the research community to explore sequence learning models such as long short-term memory (LSTM) networks. Furthermore, inadequate datasets for training HAVS models are an additional challenge. Existing deep learning models for HAVS may be improved through transfer learning, which results in lower training overhead and higher accuracy.