Papers
Topics
Authors
Recent
Search
2000 character limit reached

Towards Understanding Why Mask-Reconstruction Pretraining Helps in Downstream Tasks

Published 8 Jun 2022 in cs.LG, cs.CV, cs.NE, and stat.ML | (2206.03826v5)

Abstract: For unsupervised pretraining, mask-reconstruction pretraining (MRP) approaches, e.g. MAE and data2vec, randomly mask input patches and then reconstruct the pixels or semantic features of these masked patches via an auto-encoder. Then for a downstream task, supervised fine-tuning the pretrained encoder remarkably surpasses the conventional ``supervised learning'' (SL) trained from scratch. However, it is still unclear 1) how MRP performs semantic feature learning in the pretraining phase and 2) why it helps in downstream tasks. To solve these problems, we first theoretically show that on an auto-encoder of a two/one-layered convolution encoder/decoder, MRP can capture all discriminative features of each potential semantic class in the pretraining dataset. Then considering the fact that the pretraining dataset is of huge size and high diversity and thus covers most features in downstream dataset, in fine-tuning phase, the pretrained encoder can capture as much features as it can in downstream datasets, and would not lost these features with theoretical guarantees. In contrast, SL only randomly captures some features due to lottery ticket hypothesis. So MRP provably achieves better performance than SL on the classification tasks. Experimental results testify to our data assumptions and also our theoretical implications.

Citations (12)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (3)

Collections

Sign up for free to add this paper to one or more collections.