Using Segmentation to Enhance Frame Prediction in a Multi-Scale Spatial-Temporal Feature Extraction Network

Michael Mu-Chien Hsu, Richard Jui-Chun Shyur
Published in: 2020 International Conference on Pervasive Artificial Intelligence (ICPAI), 2020-12-01
DOI: 10.1109/ICPAI51961.2020.00038 (https://doi.org/10.1109/ICPAI51961.2020.00038)
Citations: 0

Abstract

Designing a machine to predict future events is a challenging problem, even for existing state-of-the-art approaches. Such approaches require great computational power, whether in adversarial training or in segmentation and optical flow. By combining conventional segmentation with the DNN proposed in this paper, we obtain a simpler architecture that effectively and efficiently predicts both future frames and semantics more precisely than previous approaches. The input is a raw image sequence. In a top-down path, each frame is segmented for semantics, its spatial features are extracted, and its temporal features are analyzed at different scales; the predicted frames and segmentation are then synthesized in a bottom-up path. Results show that our model outperforms other state-of-the-art models in (1) the precision of predicted frames and (2) the accuracy of segmentation masks.
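The data flow described in the abstract (segment each frame, extract features at several scales in a top-down path, then synthesize the prediction in a bottom-up path) can be illustrated with a toy NumPy sketch. This is not the authors' network. Everything here is an assumption made for illustration: the function names, the intensity threshold standing in for conventional segmentation, average pooling standing in for learned spatial features, linear extrapolation standing in for learned temporal features, and the fixed 0.5 blending weights.

```python
import numpy as np

def downsample(x):
    """Halve spatial resolution with 2x2 average pooling; x has shape (..., H, W)."""
    h, w = x.shape[-2] // 2 * 2, x.shape[-1] // 2 * 2
    x = x[..., :h, :w]
    return x.reshape(*x.shape[:-2], h // 2, 2, w // 2, 2).mean(axis=(-3, -1))

def upsample(x):
    """Double spatial resolution by nearest-neighbour repetition."""
    return x.repeat(2, axis=-2).repeat(2, axis=-1)

def segment(frame, threshold=0.5):
    """Hypothetical stand-in for conventional segmentation: a binary threshold."""
    return (frame > threshold).astype(np.float32)

def predict_next(frames, num_scales=3):
    """Toy next-frame and next-mask prediction.

    frames: array of shape (T, H, W) with T >= 2, values in [0, 1], and
    H, W divisible by 2 ** (num_scales - 1).
    Returns (predicted_frame, predicted_mask), each of shape (H, W).
    """
    frames = np.asarray(frames, dtype=np.float32)
    masks = np.stack([segment(f) for f in frames])
    x = np.stack([frames, masks], axis=1)  # (T, 2, H, W): image and mask channels

    # Top-down path: at each scale, take the last-step change as a crude
    # temporal feature and extrapolate one step ahead, then move to a coarser scale.
    per_scale = []
    for _ in range(num_scales):
        motion = x[-1] - x[-2]
        per_scale.append(x[-1] + motion)
        x = downsample(x)

    # Bottom-up path: start from the coarsest prediction and merge it into
    # each finer one (fixed equal weights, an assumption of this sketch).
    out = per_scale[-1]
    for finer in reversed(per_scale[:-1]):
        out = 0.5 * upsample(out) + 0.5 * finer

    frame = np.clip(out[0], 0.0, 1.0)
    mask = (out[1] > 0.5).astype(np.float32)
    return frame, mask
```

Calling `predict_next(np.stack([f0, f1]))` on two or more equally sized grayscale frames returns one predicted frame and one predicted binary mask. In a learned model, the pooling, extrapolation, and blending steps would be replaced by trained layers; the sketch only shows how the two paths fit together.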