A Comparative Study of Encoding, Pooling and Normalization Methods for Action Recognition
Document type: Conference paper
Authors | Xingxing Wang; Limin Wang; Yu Qiao |
Publication date | 2012 |
Conference name | ACCV'12 Proceedings of the 11th Asian conference on Computer Vision |
Conference location | Germany |
English abstract | Bag of visual words (BoVW) models have been widely and successfully used in video-based action recognition. One key step in constructing a BoVW representation is to encode features with a codebook. Recently, a number of new encoding methods have been developed to improve the performance of BoVW-based object recognition and scene classification, such as soft assignment encoding [1], sparse encoding [2], locality-constrained linear encoding [3] and Fisher kernel encoding [4]. However, their effects on action recognition are still unknown. The main objective of this paper is to evaluate and compare these new encoding methods in the context of video-based action recognition. We also analyze and evaluate the combination of encoding methods with different pooling and normalization strategies. We carry out experiments on the KTH dataset [5] and the HMDB51 dataset [6]. The results show that the new encoding methods can significantly improve recognition accuracy compared with classical vector quantization (VQ). Among them, Fisher kernel encoding and sparse encoding perform best. By properly choosing the pooling and normalization methods, we achieve state-of-the-art performance on HMDB51. We will publish the MATLAB code used in this paper. |
Indexed in | EI |
Language | English |
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/3796] |
Collection | Shenzhen Institutes of Advanced Technology_Institute of Advanced Integration Technology |
Author affiliation | 2012 |
Recommended citation (GB/T 7714) | Xingxing Wang, Limin Wang, Yu Qiao. A Comparative Study of Encoding, Pooling and Normalization Methods for Action Recognition[C]. In: ACCV'12 Proceedings of the 11th Asian conference on Computer Vision. Germany. |
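The encode, pool, normalize pipeline that the abstract compares can be sketched in a few lines. This is only an illustrative sketch, not the authors' MATLAB code: every function name, the toy 2-D codebook, and the parameters `beta` and `alpha` are assumptions made for the example. It covers classical VQ and soft assignment only; sparse encoding, locality-constrained linear encoding and Fisher kernel encoding are not shown.

```python
import math

def _sq_dists(x, codebook):
    # Squared Euclidean distance from descriptor x to every codeword.
    return [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook]

def hard_assign(x, codebook):
    # Classical VQ: one-hot code selecting the nearest codeword.
    d = _sq_dists(x, codebook)
    k = d.index(min(d))
    return [1.0 if i == k else 0.0 for i in range(len(codebook))]

def soft_assign(x, codebook, beta=1.0):
    # Soft assignment: weights proportional to exp(-beta * squared distance).
    # beta is a hypothetical smoothing parameter chosen for illustration.
    d = _sq_dists(x, codebook)
    m = min(d)  # shift by the minimum for numerical stability
    w = [math.exp(-beta * (di - m)) for di in d]
    s = sum(w)
    return [wi / s for wi in w]

def pool(codes, how="sum"):
    # Aggregate per-descriptor codes into a single video-level vector.
    cols = list(zip(*codes))
    if how == "sum":
        return [sum(c) for c in cols]
    if how == "max":
        return [max(c) for c in cols]
    raise ValueError("unknown pooling: " + how)

def l2_normalize(v):
    # Scale the vector to unit Euclidean length (left unchanged if all zeros).
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v] if n > 0 else list(v)

def power_normalize(v, alpha=0.5):
    # Signed power normalization: sign(a) * |a|**alpha.
    return [math.copysign(abs(a) ** alpha, a) for a in v]

# Toy data: two codewords and three local descriptors (made up for the example).
codebook = [[0.0, 0.0], [1.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.0], [0.2, 0.1]]

vq_codes = [hard_assign(x, codebook) for x in descriptors]
sa_codes = [soft_assign(x, codebook) for x in descriptors]

vq_sum = pool(vq_codes, "sum")            # histogram of codeword counts
vq_max = pool(vq_codes, "max")            # which codewords were used at all
video_vector = l2_normalize(power_normalize(pool(sa_codes, "sum")))
```

With this toy data, two descriptors fall nearest the first codeword and one nearest the second, so sum pooling of the VQ codes gives a count histogram while max pooling only records which codewords were used. The final vector is what would typically be passed to a classifier.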
Ingestion method: OAI harvesting
Source: Shenzhen Institutes of Advanced Technology