Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising

Kanglei Zhou, Hubert P. H. Shum, Frederick W. B. Li and Xiaohui Liang
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024

Impact Factor: 4.7†
Top 25% Journal in Computer Science, Software Engineering†

† According to Journal Citation Reports 2023

Abstract

In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the interaction. Using only current motion for interaction can lead to lag, so predicting future movement is crucial for a faster response. Our solution is the Multi-task Spatial-Temporal Graph Auto-Encoder (Multi-STGAE), a model that accurately denoises and predicts hand motion by exploiting the inter-dependency of both tasks. The model ensures a stable and accurate prediction through denoising while maintaining motion dynamics to avoid over-smoothed motion and alleviate time delays through prediction. A gate mechanism is integrated to prevent negative transfer between tasks and further boost multi-task performance. Multi-STGAE also includes a spatial-temporal graph autoencoder block, which models hand structures and motion coherence through graph convolutional networks, reducing noise while preserving hand physiology. Additionally, we design a novel hand partition strategy and hand bone loss to improve natural hand motion generation. We validate the effectiveness of our proposed method by contributing two large-scale datasets with a data corruption algorithm based on two benchmark datasets. To evaluate the natural characteristics of the denoised and predicted hand motion, we propose two structural metrics. Experimental results show that our method outperforms the state-of-the-art, showcasing how the multi-task framework enables mutual benefits between denoising and prediction.
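
The abstract mentions a hand bone loss for keeping generated motion natural but does not give its formula. As a rough illustration only, the sketch below penalises differences in bone length between an output pose and a reference pose. The four-joint skeleton, the `BONES` list, and the function names are hypothetical and chosen for this example; the loss used in the paper may be defined differently.

```python
import math

# Hypothetical toy skeleton: wrist (0) plus one three-joint finger (1, 2, 3).
# Each bone is a (parent, child) pair of joint indices. This layout is an
# assumption for illustration, not the hand partition used in the paper.
BONES = [(0, 1), (1, 2), (2, 3)]


def bone_lengths(pose, bones=BONES):
    """Return the Euclidean length of every bone in a single pose.

    `pose` is a list of (x, y, z) joint positions.
    """
    return [math.dist(pose[parent], pose[child]) for parent, child in bones]


def bone_length_loss(predicted, target, bones=BONES):
    """Mean absolute difference between corresponding bone lengths."""
    pred_len = bone_lengths(predicted, bones)
    true_len = bone_lengths(target, bones)
    return sum(abs(p - t) for p, t in zip(pred_len, true_len)) / len(bones)


# A reference pose with bone lengths 4, 3 and 2.
clean = [(0.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 7.0, 0.0), (0.0, 9.0, 0.0)]
# Same pose with the fingertip pushed 1 unit further out, stretching the last bone.
stretched = clean[:3] + [(0.0, 10.0, 0.0)]
# Same pose translated rigidly; every bone keeps its length.
shifted = [(x + 1.0, y, z) for x, y, z in clean]

print(bone_length_loss(stretched, clean))  # one bone is off by 1, averaged over 3 bones
print(bone_length_loss(shifted, clean))    # rigid translation leaves bone lengths unchanged
```

A term of this kind ignores where the hand is in space and only reacts when a bone is stretched or compressed, which is why it is usually combined with a per-joint position loss rather than used alone.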

Citations

BibTeX

@article{zhou24multitask,
author={Zhou, Kanglei and Shum, Hubert P. H. and Li, Frederick W. B. and Liang, Xiaohui},
journal={IEEE Transactions on Visualization and Computer Graphics},
title={Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising},
year={2024},
numpages={17},
doi={10.1109/TVCG.2023.3337868},
publisher={IEEE},
}

RIS

TY  - JOUR
AU  - Zhou, Kanglei
AU  - Shum, Hubert P. H.
AU  - Li, Frederick W. B.
AU  - Liang, Xiaohui
T2  - IEEE Transactions on Visualization and Computer Graphics
TI  - Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising
PY  - 2024
DO  - 10.1109/TVCG.2023.3337868
PB  - IEEE
ER  -

Plain Text

Kanglei Zhou, Hubert P. H. Shum, Frederick W. B. Li and Xiaohui Liang, "Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising," IEEE Transactions on Visualization and Computer Graphics, IEEE, 2024.

Similar Research

Kanglei Zhou, Jiaying Chen, Hubert P. H. Shum, Frederick W. B. Li and Xiaohui Liang, "STGAE: Spatial Temporal Graph Auto-Encoder for Hand Motion Denoising", Proceedings of the 2021 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2021

Zhiying Leng, Jiaying Chen, Hubert P. H. Shum, Frederick W. B. Li and Xiaohui Liang, "Stable Hand Pose Estimation under Tremor via Graph Neural Network", Proceedings of the 2021 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2021

Qi Feng, Hubert P. H. Shum and Shigeo Morishima, "Resolving Hand-Object Occlusion for Mixed Reality with Joint Deep Learning and Model Optimization", Computer Animation and Virtual Worlds (CAVW) - Proceedings of the 2020 International Conference on Computer Animation and Social Agents (CASA), 2020

Last updated on 25 July 2024