LLMatic: Neural Architecture Search via Large Language Models and Quality Diversity Optimization
Abstract: Large Language Models (LLMs) have emerged as powerful tools capable of accomplishing a broad spectrum of tasks, and one domain where they have made a significant impact is code generation. Here, we propose using the coding abilities of LLMs to introduce meaningful variations to code defining neural networks. Meanwhile, Quality-Diversity (QD) algorithms are known to discover diverse and robust solutions. By merging the code-generating abilities of LLMs with the diversity and robustness of QD search, we introduce \texttt{LLMatic}, a Neural Architecture Search (NAS) algorithm. While LLMs struggle to conduct NAS directly through prompts, \texttt{LLMatic} takes a procedural approach, running QD search over both prompts and network architectures to create diverse and high-performing networks. We test \texttt{LLMatic} on the CIFAR-10 and NAS-bench-201 benchmarks, demonstrating that it can produce competitive networks while evaluating just $2{,}000$ candidates, even without prior knowledge of the benchmark domain or exposure to any previous top-performing models for the benchmark. The open-source code is available at \url{https://github.com/umair-nasir14/LLMatic}.
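The QD search described in the abstract can be illustrated with a minimal MAP-Elites sketch. This is a toy illustration under stated assumptions, not the authors' implementation: the LLM-driven code mutation is replaced by a hypothetical `mutate` function on a (depth, width) architecture tuple, `fitness` stands in for validated accuracy of a trained network, and the archive dimensions (depth and width bins) are illustrative choices.

```python
import random

random.seed(0)

# Hypothetical stand-in for the LLM mutation step: in LLMatic, an LLM
# rewrites the code defining the network; here we perturb a toy
# (depth, width) description instead.
def mutate(arch):
    depth, width = arch
    return (max(1, depth + random.choice([-1, 1])),
            max(8, width + random.choice([-8, 8])))

# Toy fitness: stands in for validation accuracy after training.
def fitness(arch):
    depth, width = arch
    return -abs(depth - 6) - abs(width - 64) / 8.0

# Behaviour descriptor that bins architectures into archive cells,
# so the archive retains diverse networks, not just the single best.
def descriptor(arch):
    depth, width = arch
    return (min(depth // 2, 4), min(width // 32, 4))

def map_elites(evaluations=200):
    archive = {}  # cell -> (fitness, architecture)
    for _ in range(evaluations):
        if archive:
            _, parent = random.choice(list(archive.values()))
            child = mutate(parent)
        else:
            child = (2, 16)  # seed network
        cell, f = descriptor(child), fitness(child)
        # Keep the child only if its cell is empty or it beats the elite.
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, child)
    return archive

archive = map_elites()
print(len(archive), max(archive.values()))
```

The loop keeps one elite per behaviour cell, which is how QD yields a set of diverse, high-performing networks rather than a single winner; swapping `mutate` for an LLM call on network code recovers the spirit of the method.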