Optimizing Negative Prompts for Enhanced Aesthetics and Fidelity in Text-To-Image Generation

Published 12 Mar 2024 in cs.CV, cs.AI, and cs.LG (arXiv:2403.07605v3)

Abstract: In text-to-image generation, using negative prompts, which describe undesirable image characteristics, can significantly boost image quality. However, crafting good negative prompts is a manual, tedious process. To address this, we propose NegOpt, a novel method for optimizing negative prompt generation toward enhanced image generation, using supervised fine-tuning and reinforcement learning. Our combined approach yields a substantial 25% increase in Inception Score over other approaches and surpasses the ground-truth negative prompts from the test set. Furthermore, with NegOpt we can preferentially optimize the metrics most important to us. Finally, we construct Negative Prompts DB (https://huggingface.co/datasets/mikeogezi/negopt_full), a publicly available dataset of negative prompts.
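To make the pipeline the abstract describes concrete, here is a minimal sketch of the inference step: a sequence-to-sequence model fine-tuned to map a positive prompt to a negative prompt, whose output is then passed to a standard Stable Diffusion pipeline. The checkpoint name "negopt-t5" is a hypothetical placeholder (the paper does not release a model under that name); the `negative_prompt` argument is a standard part of the diffusers API.

```python
# A minimal sketch, assuming a seq2seq model fine-tuned (as in NegOpt's
# supervised stage) to map a positive prompt to a negative prompt.
# "negopt-t5" is a hypothetical checkpoint name, not a released model.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from diffusers import StableDiffusionPipeline

prompt = "a portrait of an astronaut, studio lighting, 85mm lens"

# Generate a negative prompt from the positive prompt.
tok = AutoTokenizer.from_pretrained("negopt-t5")            # hypothetical
model = AutoModelForSeq2SeqLM.from_pretrained("negopt-t5")  # hypothetical
input_ids = tok(prompt, return_tensors="pt").input_ids
out_ids = model.generate(input_ids, max_new_tokens=64)
negative_prompt = tok.decode(out_ids[0], skip_special_tokens=True)

# Standard diffusers usage: `negative_prompt` steers sampling away from
# the undesirable characteristics it names.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
image = pipe(prompt, negative_prompt=negative_prompt).images[0]
image.save("astronaut.png")
```

The reinforcement learning stage the abstract mentions would then fine-tune the generator against a reward built from image-quality metrics. The claim that NegOpt can "preferentially optimize the metrics most important to us" suggests a weighted reward along these lines; the metric mix and weights below are illustrative assumptions, not the paper's exact recipe.

```python
# A hedged sketch of a weighted reward for the RL stage; the metrics and
# default weights are illustrative assumptions, not the paper's values.
def reward(inception_score: float, clip_score: float, aesthetic_score: float,
           w_is: float = 1.0, w_clip: float = 0.5, w_aes: float = 0.5) -> float:
    # Raising one weight (e.g., w_is) preferentially optimizes that metric.
    return w_is * inception_score + w_clip * clip_score + w_aes * aesthetic_score
```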
