On the Challenges and Opportunities in Generative AI
Abstract: The field of deep generative modeling has grown rapidly in recent years. With the availability of massive training datasets, coupled with advances in scalable unsupervised learning paradigms, recent large-scale generative models show tremendous promise in synthesizing high-resolution images and text, as well as structured data such as videos and molecules. However, we argue that current large-scale generative AI models exhibit several fundamental shortcomings that hinder their widespread adoption across domains. In this work, our objective is to identify these issues and highlight key unresolved challenges in modern generative AI paradigms that should be addressed to further enhance their capabilities, versatility, and reliability. By identifying these challenges, we aim to provide researchers with insights into fruitful research directions, thereby fostering the development of more robust and accessible generative AI solutions.
- Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023a.
- Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023b.
- Language models don’t always say what they think: unfaithful explanations in chain-of-thought prompting. Advances in Neural Information Processing Systems, 36, 2024.
- Nvae: A deep hierarchical variational autoencoder. Advances in neural information processing systems, 33:19667–19679, 2020.
- Score-based generative modeling in latent space. Advances in Neural Information Processing Systems, 34:11287–11302, 2021.
- Applications of machine learning in drug discovery and development. Nature reviews Drug discovery, 18(6):463–477, 2019.
- Attention is all you need. In Advances in Neural Information Processing Systems, 2017.
- Venkat Venkatasubramanian. The promise of artificial intelligence in chemical engineering: Is it here, finally? AIChE Journal, 65(2):466–478, 2019.
- Nonparametric identifiability of causal representations from unknown interventions. Advances in Neural Information Processing Systems, 36, 2024.
- Counterfactual explanations without opening the black box: Automated decisions and the gdpr. Harv. JL & Tech., 31:841, 2017.
- On the robustness of ChatGPT: An adversarial and out-of-distribution perspective. In ICLR 2023 Workshop on Trustworthy and Reliable Large-Scale Machine Learning Models, 2023a.
- Lora ensembles for large language model fine-tuning. arXiv preprint arXiv:2310.00035, 2023b.
- Are explanations helpful? a comparative study of the effects of explanations in ai-assisted decision-making. In 26th international conference on intelligent user interfaces, pages 318–328, 2021.
- Desiderata for representation learning: A causal perspective. arXiv preprint arXiv:2109.03795, 2021.
- Do prompt-based models really understand the meaning of their prompts? In Marine Carpuat, Marie-Catherine de Marneffe, and Ivan Vladimir Meza Ruiz, editors, Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2300–2344, Seattle, United States, July 2022. Association for Computational Linguistics. doi: 10.18653/v1/2022.naacl-main.167. URL https://aclanthology.org/2022.naacl-main.167.
- Emergent abilities of large language models. Transactions on Machine Learning Research, 2022. ISSN 2835-8856. URL https://openreview.net/forum?id=yzkSU5zdwD. Survey Certification.
- Machine learning in materials science. InfoMat, 1(3):338–358, 2019.
- Ethical and social risks of harm from language models. ArXiv, abs/2112.04359, 2021.
- Tree-rings watermarks: Invisible fingerprints for diffusion images. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=Z57JrmubNl.
- Assaying out-of-distribution generalization in transfer learning. In Neural Information Processing Systems (NeurIPS), 2022.
- Data collection and quality challenges in deep learning: a data-centric ai perspective. The VLDB Journal, 32:791–813, 2021.
- Attention is not not explanation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 11–20, 2019.
- Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations. Chem. Sci., 10:1692–1701, 2019. doi: 10.1039/C8SC04175J. URL http://dx.doi.org/10.1039/C8SC04175J.
- Sustainable ai: Environmental implications, challenges and opportunities. In D. Marculescu, Y. Chi, and C. Wu, editors, Proceedings of Machine Learning and Systems, volume 4, pages 795–813, 2022. URL https://proceedings.mlsys.org/paper_files/paper/2022/file/462211f67c7d858f663355eff93b745e-Paper.pdf.
- AR-diffusion: Auto-regressive diffusion model for text generation. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=0EG6qUQ4xE.
- Towards generalist robots: A promising paradigm via generative simulation. arXiv preprint arXiv:2305.10455, 2023.
- Smoothquant: Accurate and efficient post-training quantization for large language models. In International Conference on Machine Learning, pages 38087–38099. PMLR, 2023.
- Tackling the generative learning trilemma with denoising diffusion GANs. In International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=JprM0p-q0Co.
- Can LLMs express their uncertainty? an empirical evaluation of confidence elicitation in LLMs. In The Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=gjeQKFxFpZ.
- A theory of usable information under computational constraints. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=r1eBeyHFDH.
- Lossy image compression with conditional diffusion models. In Neural Information Processing Systems, 2023.
- Diffusion probabilistic modeling for video generation. Entropy, 25(10):1469, 2023a.
- Variational Bayesian quantization. In International Conference on Machine Learning, pages 10670–10680. PMLR, 2020.
- An introduction to neural data compression. Foundations and Trends® in Computer Graphics and Vision, 15(2):113–200, 2023b.
- Zeroquant: Efficient and affordable post-training quantization for large-scale transformers. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems, volume 35, pages 27168–27183. Curran Associates, Inc., 2022.
- Diffusion models with deterministic normalizing flow priors. arXiv preprint arXiv:2309.01274, 2023.
- Root mean square layer normalization. Advances in Neural Information Processing Systems, 32, 2019.
- Adding conditional control to text-to-image diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 3836–3847, October 2023.
- Fast sampling of diffusion models with exponential integrator. In The Eleventh International Conference on Learning Representations, 2023.
- Opt: Open pre-trained transformer language models. arXiv preprint arXiv:2205.01068, 2022.
- Gptbias: A comprehensive framework for evaluating bias in large language models. arXiv preprint arXiv:2312.06315, 2023a.
- Bias and generalization in deep generative models: An empirical study. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 31. Curran Associates, Inc., 2018.
- A recipe for watermarking diffusion models. arXiv preprint arXiv:2303.10137, 2023b.
- Unlearnable examples for diffusion models: Protect data from unauthorized exploitation. arXiv preprint arXiv:2306.01902, 2023c.
- Truncated diffusion probabilistic models and diffusion-based adversarial auto-encoders. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=HDxgaKk956l.
- Causal discovery with heterogeneous observational data. In James Cussens and Kun Zhang, editors, Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, volume 180 of Proceedings of Machine Learning Research, pages 2383–2393. PMLR, 01–05 Aug 2022.
- Emerging synergies in causality and deep generative models: A survey. arXiv preprint arXiv:2301.12351, 2023.
- Hype: A benchmark for human eye perceptual evaluation of generative models. Advances in neural information processing systems, 32, 2019.
- Texygen: A benchmarking platform for text generation models. In The 41st international ACM SIGIR conference on research & development in information retrieval, pages 1097–1100, 2018.
- Revisiting adversarial robustness distillation: Robust soft labels make student better. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 16443–16452, 2021.
- Fine-tuning language models from human preferences. arXiv preprint arXiv:1909.08593, 2019.
- Universal and transferable adversarial attacks on aligned language models. ArXiv, abs/2307.15043, 2023.