
Learning from True-False Labels via Multi-modal Prompt Retrieving

Published 24 May 2024 in cs.LG and cs.CV | arXiv:2405.15228v2

Abstract: Pre-trained Vision-Language Models (VLMs) exhibit strong zero-shot classification abilities, demonstrating great potential for generating weakly supervised labels. Unfortunately, existing weakly supervised learning methods struggle to generate accurate labels via VLMs. In this paper, we propose a novel weakly supervised labeling setting, namely True-False Labels (TFLs), which can achieve high accuracy when generated by VLMs. A TFL indicates whether an instance belongs to a label that is randomly and uniformly sampled from the candidate label set. Specifically, we theoretically derive a risk-consistent estimator to explore and utilize the conditional probability distribution information of TFLs. Besides, we propose a convolutional-based Multi-modal Prompt Retrieving (MRP) method to bridge the gap between the knowledge of VLMs and target learning tasks. Experimental results demonstrate the effectiveness of the proposed TFL setting and MRP learning method. The code to reproduce the experiments is at https://github.com/Tranquilxu/TMP.
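To make the TFL setting concrete, the sketch below illustrates how such labels could be generated with an off-the-shelf VLM. This is a minimal illustration rather than the authors' released pipeline: it assumes OpenAI's CLIP (ViT-B/32) as the VLM, a hypothetical candidate label set, and the plain zero-shot prompt "a photo of a {class}". For each image, one candidate label is sampled uniformly at random, and the True-False flag records whether CLIP's zero-shot prediction agrees with that sampled label.

```python
import random

import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hypothetical candidate label set; the paper's experiments use standard
# benchmarks (e.g., CIFAR, Tiny-ImageNet) instead.
candidate_labels = ["cat", "dog", "car", "airplane"]


def generate_tfl(image_path: str) -> tuple[str, bool]:
    """Return a uniformly sampled candidate label and a True-False flag
    indicating whether the VLM judges the image to match that label."""
    sampled_label = random.choice(candidate_labels)  # uniform over candidates

    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    text = clip.tokenize(
        [f"a photo of a {c}" for c in candidate_labels]
    ).to(device)

    with torch.no_grad():
        logits_per_image, _ = model(image, text)
        probs = logits_per_image.softmax(dim=-1).squeeze(0)

    predicted = candidate_labels[probs.argmax().item()]
    return sampled_label, predicted == sampled_label  # the True-False label
```

Under this setting the learner never observes the ground-truth class directly, only the sampled label and its binary flag, which is the weak supervision signal the paper's risk-consistent estimator is designed to exploit.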

