Chinese Academy of Sciences Institutional Repositories Grid
A Comparative Study of Encoding, Pooling and Normalization Methods for Action Recognition

Document type: Conference paper

Authors: Xingxing Wang; Limin Wang; Yu Qiao
Publication date: 2012
Conference name: ACCV'12 Proceedings of the 11th Asian conference on Computer Vision
Conference location: Germany
Abstract: Bag of visual words (BoVW) models have been widely and successfully used in video-based action recognition. One key step in constructing a BoVW representation is to encode features with a codebook. Recently, a number of new encoding methods have been developed to improve the performance of BoVW-based object recognition and scene classification, such as soft assignment encoding [1], sparse encoding [2], locality-constrained linear encoding [3] and Fisher kernel encoding [4]. However, their effects on action recognition are still unknown. The main objective of this paper is to evaluate and compare these new encoding methods in the context of video-based action recognition. We also analyze and evaluate the combination of encoding methods with different pooling and normalization strategies. We carry out experiments on the KTH dataset [5] and the HMDB51 dataset [6]. The results show that the new encoding methods can significantly improve recognition accuracy compared with classical vector quantization (VQ). Among them, Fisher kernel encoding and sparse encoding perform best. By properly choosing the pooling and normalization methods, we achieve state-of-the-art performance on HMDB51. We will publish the MATLAB code used in this paper.
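The encode, pool, normalize pipeline named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' MATLAB code: the function names, the `beta` smoothing parameter, and the choice of sum/max pooling and L1/L2/power normalization are assumptions made for illustration. Only classical VQ and soft assignment are shown; sparse, locality-constrained linear and Fisher kernel encoding are omitted.

```python
import numpy as np

def squared_distances(X, C):
    """Squared Euclidean distance between each descriptor (row of X)
    and each codeword (row of C); result has shape (n_desc, n_words)."""
    return ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)

def vq_encode(X, C):
    """Classical VQ (hard assignment): one-hot code for the nearest codeword."""
    d2 = squared_distances(X, C)
    codes = np.zeros_like(d2)
    codes[np.arange(X.shape[0]), d2.argmin(axis=1)] = 1.0
    return codes

def soft_encode(X, C, beta=1.0):
    """Soft assignment: weights decay with distance; each row sums to 1.
    beta is a hypothetical smoothing parameter chosen for this sketch."""
    d2 = squared_distances(X, C)
    # Subtract the row minimum before exponentiating for numerical stability.
    w = np.exp(-beta * (d2 - d2.min(axis=1, keepdims=True)))
    return w / w.sum(axis=1, keepdims=True)

def pool(codes, method="sum"):
    """Aggregate per-descriptor codes into one vector per video."""
    if method == "sum":
        return codes.sum(axis=0)
    if method == "max":
        return codes.max(axis=0)
    raise ValueError(f"unknown pooling method: {method}")

def normalize(v, method="l2"):
    """Normalize a pooled vector; 'power' applies a signed square root
    followed by L2 normalization (an assumption for this sketch)."""
    v = np.asarray(v, dtype=float)
    if method == "l1":
        norm = np.abs(v).sum()
    elif method == "l2":
        norm = np.linalg.norm(v)
    elif method == "power":
        v = np.sign(v) * np.sqrt(np.abs(v))
        norm = np.linalg.norm(v)
    else:
        raise ValueError(f"unknown normalization method: {method}")
    return v / norm if norm > 0 else v

# Toy data: three 2-D descriptors and a two-word codebook (made up).
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.9, 1.1]])
C = np.array([[0.0, 0.0], [1.0, 1.0]])
video_vq = normalize(pool(vq_encode(X, C), "sum"), "l2")
video_soft = normalize(pool(soft_encode(X, C, beta=2.0), "max"), "power")
```

Swapping the encoder, the `pool` method and the `normalize` method independently gives the kind of combination grid that the abstract describes evaluating.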
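Indexed by: EI
Language: English
Source URL: [http://ir.siat.ac.cn:8080/handle/172644/3796]
Collection: Shenzhen Institutes of Advanced Technology_Institute of Integration Technology
Author affiliation: 2012
Recommended citation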
GB/T 7714
Xingxing Wang, Limin Wang, Yu Qiao. A Comparative Study of Encoding, Pooling and Normalization Methods for Action Recognition[C]. In: ACCV'12 Proceedings of the 11th Asian conference on Computer Vision. Germany.

Ingest method: OAI harvesting

Source: Shenzhen Institutes of Advanced Technology

