Saliency-Informed Spatio-Temporal Vector of Locally Aggregated Descriptors and Fisher Vectors for Visual Action Recognition

Zheming Zuo, Daniel Organisciak, Hubert P. H. Shum and Longzhi Yang
Proceedings of the 2018 British Machine Vision Conference Workshop on Image Analysis for Human Facial and Activity Recognition (IAHFAR), 2018


Abstract

Feature encoding has been extensively studied for the task of visual action recognition (VAR). Recently proposed super-vector-based encoding methods, such as the Vector of Locally Aggregated Descriptors (VLAD) and Fisher Vectors (FV), have significantly improved recognition performance. Despite this success, they still struggle with the superfluous information present during the training stage, which makes them computationally expensive when applied to a large number of extracted features. To address this challenge, this paper proposes a Saliency-Informed Spatio-Temporal VLAD (SST-VLAD) approach, which selects the extracted features corresponding to a small number of videos in the data set by considering both spatial and temporal video-wise saliency scores; the same extension principle is also applied to the FV approach. The experimental results indicate that the proposed feature encoding schemes consistently outperform existing ones at significantly lower computational cost.
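
The core idea of SST-VLAD (ranking videos by a combined spatio-temporal saliency score and building the codebook only from descriptors of the top-ranked subset) can be sketched in a few lines of Python. This is a minimal illustration under assumed inputs: the helper names (select_salient_videos, vlad_encode), the simple top-fraction selection and the toy data are hypothetical, the saliency scoring itself is omitted, and none of this is the paper's released implementation.

# Minimal, illustrative sketch of saliency-informed VLAD encoding.
# Assumes pre-extracted local descriptors per video and a per-video
# saliency score; helper names here are hypothetical, not from the paper.
import numpy as np
from sklearn.cluster import KMeans

def select_salient_videos(descriptors_per_video, saliency_scores, keep_ratio=0.1):
    """Keep descriptors only from the top-scoring fraction of videos."""
    order = np.argsort(saliency_scores)[::-1]          # most salient first
    n_keep = max(1, int(len(order) * keep_ratio))
    kept = [descriptors_per_video[i] for i in order[:n_keep]]
    return np.vstack(kept)

def vlad_encode(descriptors, codebook):
    """Standard VLAD: sum of residuals to the nearest codeword,
    followed by power and L2 normalisation."""
    assignments = codebook.predict(descriptors)
    k, d = codebook.cluster_centers_.shape
    vlad = np.zeros((k, d))
    for i in range(k):
        assigned = descriptors[assignments == i]
        if len(assigned) > 0:
            vlad[i] = (assigned - codebook.cluster_centers_[i]).sum(axis=0)
    vlad = vlad.flatten()
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))       # power normalisation
    return vlad / (np.linalg.norm(vlad) + 1e-12)       # L2 normalisation

# Toy data: 20 videos, 200 descriptors each, 64-dimensional.
rng = np.random.default_rng(0)
videos = [rng.normal(size=(200, 64)) for _ in range(20)]
scores = rng.random(20)                                # stand-in saliency scores
training_set = select_salient_videos(videos, scores)
codebook = KMeans(n_clusters=8, n_init=10, random_state=0).fit(training_set)
encoding = vlad_encode(videos[0], codebook)            # per-video encoding
print(encoding.shape)                                  # (8 * 64,) = (512,)

Here the k-means codebook is fitted on only a fraction of the descriptors, which is where the computational saving described in the abstract would come from.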


Citations

BibTeX

@inproceedings{zuo18saliency,
 author={Zuo, Zheming and Organisciak, Daniel and Shum, Hubert P. H. and Yang, Longzhi},
 booktitle={Proceedings of the 2018 British Machine Vision Conference Workshop on Image Analysis for Human Facial and Activity Recognition},
 series={IAHFAR '18},
 title={Saliency-Informed Spatio-Temporal Vector of Locally Aggregated Descriptors and Fisher Vectors for Visual Action Recognition},
 year={2018},
 month={9},
 numpages={11},
 location={Newcastle upon Tyne, UK},
}

RIS

TY  - CONF
AU  - Zuo, Zheming
AU  - Organisciak, Daniel
AU  - Shum, Hubert P. H.
AU  - Yang, Longzhi
T2  - Proceedings of the 2018 British Machine Vision Conference Workshop on Image Analysis for Human Facial and Activity Recognition
TI  - Saliency-Informed Spatio-Temporal Vector of Locally Aggregated Descriptors and Fisher Vectors for Visual Action Recognition
PY  - 2018
Y1  - 2018/09//
ER  - 

Plain Text

Zheming Zuo, Daniel Organisciak, Hubert P. H. Shum and Longzhi Yang, "Saliency-Informed Spatio-Temporal Vector of Locally Aggregated Descriptors and Fisher Vectors for Visual Action Recognition," in IAHFAR '18: Proceedings of the 2018 British Machine Vision Conference Workshop on Image Analysis for Human Facial and Activity Recognition, Newcastle upon Tyne, UK, Sep 2018.

Supporting Grants

Northumbria University

Postgraduate Research Scholarship (Ref: ): £65,000, Principal Investigator ()
Received from Faculty of Engineering and Environment, Northumbria University, UK, 2018-2021
Project Page

Similar Research

Jingtian Zhang, Hubert P. H. Shum, Jungong Han and Ling Shao, "Action Recognition from Arbitrary Views Using Transferable Dictionary Learning", IEEE Transactions on Image Processing (TIP), 2018
Jingtian Zhang, Lining Zhang, Hubert P. H. Shum and Ling Shao, "Arbitrary View Action Recognition via Transfer Dictionary Learning on Synthetic Training Data", Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), 2016
Meng Li, Howard Leung and Hubert P. H. Shum, "Human Action Recognition via Skeletal and Depth Based Feature Fusion", Proceedings of the 2016 ACM International Conference on Motion in Games (MIG), 2016
Zhengzhi Lu, He Wang, Ziyi Chang, Guoan Yang and Hubert P. H. Shum, "Hard No-Box Adversarial Attack on Skeleton-Based Human Action Recognition with Skeleton-Motion-Informed Gradient", Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023
Qianhui Men, Edmond S. L. Ho, Hubert P. H. Shum and Howard Leung, "Focalized Contrastive View-Invariant Learning for Self-Supervised Skeleton-Based Action Recognition", Neurocomputing, 2023


Last updated on 25 March 2024