HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention

Shuang Chen, Amir Atapour-Abarghouei and Hubert P. H. Shum
IEEE Transactions on Multimedia (TMM), 2024

Impact Factor: 7.3 — Top 10% Journal in Computer Science, Software Engineering

Abstract

Existing image inpainting methods leverage convolution-based downsampling to reduce spatial dimensions. This may result in information loss from corrupted images, where the available information is inherently sparse, especially in scenarios with large missing regions. Recent advances in self-attention mechanisms within transformers have led to significant improvements in many computer vision tasks, including inpainting. However, limited by computational costs, existing methods cannot fully exploit the long-range modelling capabilities of such models. In this paper, we propose an end-to-end High-quality INpainting Transformer, abbreviated as HINT, which consists of a novel mask-aware pixel-shuffle downsampling module (MPD) to preserve the visible information extracted from the corrupted image while maintaining the integrity of the information available for high-level inferences made within the model. Moreover, we propose a Spatially-activated Channel Attention Layer (SCAL), an efficient self-attention mechanism that incorporates spatial awareness to model the corrupted image at multiple scales. To further enhance the effectiveness of SCAL, motivated by recent advances in speech recognition, we introduce a sandwich structure that places feed-forward networks before and after the SCAL module. We demonstrate the superior performance of HINT compared to contemporary state-of-the-art models on four datasets: CelebA, CelebA-HQ, Places2, and Dunhuang.
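To illustrate the intuition behind the pixel-shuffle downsampling used in MPD: unlike strided convolution, a pixel-unshuffle (space-to-depth) rearrangement reduces spatial resolution without discarding any pixels — visible pixels from the corrupted image are moved into extra channels rather than averaged away. The following is a minimal sketch of that rearrangement in numpy; it is an illustration of the general operation only, not the paper's actual MPD module (which additionally conditions on the mask).

```python
import numpy as np

def pixel_unshuffle(x, r):
    """Space-to-depth rearrangement: (C, H, W) -> (C*r*r, H/r, W/r).

    Lossless: every input pixel is preserved, just relocated into
    one of r*r new channel groups, so no visible information from a
    sparsely-observed (masked) image is averaged away.
    """
    c, h, w = x.shape
    assert h % r == 0 and w % r == 0, "spatial dims must be divisible by r"
    # Split each spatial axis into (blocks, within-block offset) ...
    x = x.reshape(c, h // r, r, w // r, r)
    # ... move the within-block offsets next to the channel axis ...
    x = x.transpose(0, 2, 4, 1, 3)
    # ... and fold them into channels.
    return x.reshape(c * r * r, h // r, w // r)

# A binary validity mask can be downsampled the same way, so each
# output channel carries its own per-pixel visibility information.
img = np.arange(16, dtype=np.float32).reshape(1, 4, 4)
out = pixel_unshuffle(img, 2)   # shape (4, 2, 2), all 16 values kept
```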

Citations

BibTeX

@article{chen24hint,
 author={Chen, Shuang and Atapour-Abarghouei, Amir and Shum, Hubert P. H.},
 journal={IEEE Transactions on Multimedia},
 title={HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention},
 year={2024},
 numpages={12},
 doi={10.1109/TMM.2024.3369897},
 issn={1520-9210},
 publisher={IEEE},
}

RIS

TY  - JOUR
AU  - Chen, Shuang
AU  - Atapour-Abarghouei, Amir
AU  - Shum, Hubert P. H.
T2  - IEEE Transactions on Multimedia
TI  - HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention
PY  - 2024
DO  - 10.1109/TMM.2024.3369897
SN  - 1520-9210
PB  - IEEE
ER  - 

Plain Text

Shuang Chen, Amir Atapour-Abarghouei and Hubert P. H. Shum, "HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention," IEEE Transactions on Multimedia, IEEE, 2024.

Last updated on 28 April 2024