Focalized Contrastive View-Invariant Learning for Self-Supervised Skeleton-Based Action Recognition

Qianhui Men, Edmond S. L. Ho, Hubert P. H. Shum and Howard Leung
Neurocomputing, 2023

Impact Factor: 5.5. Top 25% journal in Computer Science, Artificial Intelligence. Citations: 12#

# According to Google Scholar 2024

Abstract

Learning view-invariant representations is key to improving the discriminative power of features for skeleton-based action recognition. Existing approaches cannot effectively remove the impact of viewpoint because their representations are implicitly view-dependent. In this work, we propose a self-supervised framework called Focalized Contrastive View-invariant Learning (FoCoViL), which significantly suppresses view-specific information in a representation space where the viewpoints are coarsely aligned. By maximizing mutual information with an effective contrastive loss between multi-view sample pairs, FoCoViL associates actions that share common view-invariant properties and simultaneously separates dissimilar ones. We further propose an adaptive focalization method based on pairwise similarity that enhances contrastive learning, yielding clearer cluster boundaries in the learned space. Unlike many existing self-supervised representation learning methods that rely heavily on supervised classifiers, FoCoViL performs well with both unsupervised and supervised classifiers, achieving superior recognition performance. Extensive experiments also show that the proposed contrastive-based focalization generates a more discriminative latent representation.
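The paper's exact objective is not reproduced on this page; as a rough illustration of the idea, a similarity-focalized contrastive (InfoNCE-style) loss over paired multi-view embeddings might be sketched as follows. The function name, the `temperature`, and the focusing exponent `gamma` are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def focalized_contrastive_loss(z_a, z_b, temperature=0.1, gamma=2.0):
    """Illustrative similarity-focalized contrastive loss.

    z_a, z_b: (N, D) L2-normalised embeddings of the same N skeleton
    sequences seen from two viewpoints; row i of z_a and z_b form a
    positive pair, all other cross-view rows act as negatives.
    gamma: hypothetical focusing exponent that down-weights pairs the
    model already scores confidently, sharpening cluster boundaries.
    """
    # Cosine-similarity logits between every cross-view pair.
    logits = z_a @ z_b.T / temperature                       # (N, N)
    # Row-wise softmax over candidate matches for each anchor.
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)
    p_pos = np.diag(probs)                # probability of the true positive
    # Focal modulation: easy positives (p_pos near 1) contribute less,
    # so optimisation concentrates on hard, view-confused pairs.
    loss = -((1.0 - p_pos) ** gamma) * np.log(p_pos + 1e-12)
    return loss.mean()
```

With aligned views (each sequence paired with itself), the positives dominate and the loss is small; pairing mismatched rows drives the loss up, which is the behaviour a view-invariant objective of this kind relies on.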


Cite This Research

Plain Text

Qianhui Men, Edmond S. L. Ho, Hubert P. H. Shum and Howard Leung, "Focalized Contrastive View-Invariant Learning for Self-Supervised Skeleton-Based Action Recognition," Neurocomputing, vol. 537, pp. 198-209, Elsevier, 2023.

BibTeX

@article{men23focalized,
 author={Men, Qianhui and Ho, Edmond S. L. and Shum, Hubert P. H. and Leung, Howard},
 journal={Neurocomputing},
 title={Focalized Contrastive View-Invariant Learning for Self-Supervised Skeleton-Based Action Recognition},
 year={2023},
 volume={537},
 pages={198--209},
 numpages={12},
 doi={10.1016/j.neucom.2023.03.070},
 issn={0925-2312},
 publisher={Elsevier},
}

RIS

TY  - JOUR
AU  - Men, Qianhui
AU  - Ho, Edmond S. L.
AU  - Shum, Hubert P. H.
AU  - Leung, Howard
T2  - Neurocomputing
TI  - Focalized Contrastive View-Invariant Learning for Self-Supervised Skeleton-Based Action Recognition
PY  - 2023
VL  - 537
SP  - 198
EP  - 209
DO  - 10.1016/j.neucom.2023.03.070
SN  - 0925-2312
PB  - Elsevier
ER  - 


Similar Research

Meng Li, Howard Leung and Hubert P. H. Shum, "Human Action Recognition via Skeletal and Depth Based Feature Fusion", Proceedings of the 2016 ACM International Conference on Motion in Games (MIG), 2016
Ying Huang, Hubert P. H. Shum, Edmond S. L. Ho and Nauman Aslam, "High-Speed Multi-Person Pose Estimation with Deep Feature Transfer", Computer Vision and Image Understanding (CVIU), 2020
Qianhui Men, Howard Leung, Edmond S. L. Ho and Hubert P. H. Shum, "A Two-Stream Recurrent Network for Skeleton-Based Human Interaction Recognition", Proceedings of the 2020 International Conference on Pattern Recognition (ICPR), 2020
Zheming Zuo, Daniel Organisciak, Hubert P. H. Shum and Longzhi Yang, "Saliency-Informed Spatio-Temporal Vector of Locally Aggregated Descriptors and Fisher Vectors for Visual Action Recognition", Proceedings of the 2018 British Machine Vision Conference Workshop on Image Analysis for Human Facial and Activity Recognition (IAHFAR), 2018
Zhengzhi Lu, He Wang, Ziyi Chang, Guoan Yang and Hubert P. H. Shum, "Hard No-Box Adversarial Attack on Skeleton-Based Human Action Recognition with Skeleton-Motion-Informed Gradient", Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023

Last updated on 7 September 2024