Accelerating Materials Design via LLM-Guided Evolutionary Search

Published 26 Oct 2025 in cs.LG, cond-mat.mtrl-sci, cs.AI, and cs.NE | (2510.22503v1)

Abstract: Materials discovery requires navigating vast chemical and structural spaces while satisfying multiple, often conflicting, objectives. We present LLM-guided Evolution for MAterials design (LLEMA), a unified framework that couples the scientific knowledge embedded in LLMs with chemistry-informed evolutionary rules and memory-based refinement. At each iteration, an LLM proposes crystallographically specified candidates under explicit property constraints; a surrogate-augmented oracle estimates physicochemical properties; and a multi-objective scorer updates success/failure memories to guide subsequent generations. Evaluated on 14 realistic tasks spanning electronics, energy, coatings, optics, and aerospace, LLEMA discovers candidates that are chemically plausible, thermodynamically stable, and property-aligned, achieving higher hit-rates and stronger Pareto fronts than generative and LLM-only baselines. Ablation studies confirm the importance of rule-guided generation, memory-based refinement, and surrogate prediction. By enforcing synthesizability and multi-objective trade-offs, LLEMA delivers a principled pathway to accelerate practical materials discovery. Code: https://github.com/scientific-discovery/LLEMA

Summary

The paper introduces LLEMA (LLM-guided Evolution for MAterials design), a framework that integrates LLMs with evolutionary search for multi-objective materials discovery.
The framework combines material candidate generation, crystallographic representation, physicochemical property prediction, and memory-based fitness feedback to refine material designs iteratively.
Experimental results show that the method outperforms generative and LLM-only baselines, achieving higher hit rates and stronger Pareto fronts across 14 realistic tasks.

Introduction and Background

Materials discovery is crucial for technological advances across multiple domains, including electronics, energy, aerospace, and optics. Traditional discovery methods are slow and computationally expensive because the chemical and structural spaces that must be searched for materials with desired properties are vast. Machine learning has provided tools to speed up this process, but it relies heavily on large labeled datasets, which are often unavailable. LLMs offer a different approach: they draw on expansive textual corpora to inject prior scientific knowledge, even in data-scarce regimes. This paper introduces LLM-guided Evolution for MAterials design (LLEMA), a framework that couples the scientific knowledge embedded in LLMs with evolutionary search and chemistry-informed rules for materials design.

Figure 1: Overview of our multi-objective material discovery benchmark.

Methodology

LLEMA Framework

Figure 2: LLEMA framework, consisting of four main components: (A) Material Candidate Generation, (B) Crystallographic Representation, (C) Physicochemical Property Prediction, and (D) Fitness Assessment and Feedback.

The LLEMA framework comprises four key components:

Material Candidate Generation: An LLM generates material candidates based on task descriptions and property constraints.
Crystallographic Representation: The generated materials are converted into structured crystallographic information files (CIFs), enabling detailed structural analysis.
Physicochemical Property Prediction: A surrogate-assisted oracle evaluates task-relevant physicochemical properties, such as formation energy and band gap.
Fitness Assessment and Feedback: Success and failure memories guide iterative refinement, assessing material constraints and providing feedback to inform future generations.
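
The four components above can be read as one generate, evaluate, and feed-back loop. The following is a minimal sketch of that loop, not the authors' implementation: `build_prompt`, `propose_candidates`, `predict_properties`, and the constraint thresholds are all hypothetical stand-ins, the LLM call is replaced by dummy candidate names, and the oracle returns random numbers rather than physical predictions. The crystallographic (CIF) conversion step is omitted.

```python
import random
from dataclasses import dataclass, field

# Illustrative constraints only (assumed values, not taken from the paper).
CONSTRAINTS = {
    "band_gap_eV": lambda v: v >= 2.5,
    "formation_energy_eV_per_atom": lambda v: v <= 0.0,
}

@dataclass
class Candidate:
    name: str
    properties: dict = field(default_factory=dict)

def build_prompt(task, successes, failures, k=3):
    """Compose a prompt that exposes the most recent success/failure memories."""
    good = ", ".join(c.name for c in successes[-k:]) or "none"
    bad = ", ".join(c.name for c in failures[-k:]) or "none"
    return f"Task: {task}\nPassed constraints: {good}\nFailed constraints: {bad}"

def propose_candidates(prompt, iteration, batch):
    """Hypothetical placeholder for the LLM call; returns dummy candidates."""
    return [Candidate(f"cand-{iteration}-{i}") for i in range(batch)]

def predict_properties(candidate, rng):
    """Hypothetical placeholder for the oracle: random numbers, not physics."""
    return {
        "band_gap_eV": rng.uniform(0.0, 6.0),
        "formation_energy_eV_per_atom": rng.uniform(-2.0, 0.5),
    }

def satisfies(props):
    """True when every constraint holds for the predicted properties."""
    return all(check(props[name]) for name, check in CONSTRAINTS.items())

def run_search(task, n_iterations=5, batch=4, seed=0):
    rng = random.Random(seed)
    successes, failures = [], []
    for it in range(n_iterations):
        prompt = build_prompt(task, successes, failures)
        for cand in propose_candidates(prompt, it, batch):
            cand.properties = predict_properties(cand, rng)
            # Route each candidate into the success or failure memory.
            (successes if satisfies(cand.properties) else failures).append(cand)
    return successes, failures

if __name__ == "__main__":
    ok, bad = run_search("wide-bandgap semiconductor")
    print(len(ok), "passed;", len(bad), "failed")
```

The point of the sketch is the data flow: both memories are re-read when the next prompt is built, so each generation is conditioned on what previously passed and failed.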

Problem Formulation

The framework tackles the materials discovery task as a multi-objective optimization problem, searching for materials that satisfy multiple property constraints while achieving optimal trade-offs between competing objectives. This requires not only meeting each constraint but also navigating the vast chemical space to uncover novel, stable, and synthesizable compounds.
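
One common way to write such a problem down is shown below. The notation is assumed here for illustration and is not taken from the paper: $\mathcal{X}$ is the space of candidate materials, $f_1, \dots, f_m$ are the property objectives (each oriented so that larger is better), and $g_1, \dots, g_J$ encode the property constraints.

```latex
\begin{aligned}
\max_{x \in \mathcal{X}} \quad & \bigl( f_1(x), \dots, f_m(x) \bigr) \\
\text{subject to} \quad & g_j(x) \le 0, \qquad j = 1, \dots, J .
\end{aligned}
```

Under this convention a feasible candidate $x$ dominates another feasible candidate $x'$ when $f_i(x) \ge f_i(x')$ for every $i$ and $f_i(x) > f_i(x')$ for at least one $i$. The Pareto front is the set of feasible candidates that no other feasible candidate dominates.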

Experiments and Results

Quantitative Metrics

LLEMA was evaluated on 14 real-world tasks including wide-bandgap semiconductors, high-$k$ dielectrics, and photovoltaic absorbers. It consistently outperformed generative models like CDVAE and DiffCSP and LLM-based methods such as LLMatDesign by achieving higher hit rates and stronger Pareto fronts.

Figure 3: Pareto front analysis of candidate materials for two design tasks. (a) Wide-Bandgap Semiconductors; (b) Hard–Stiff Ceramics.
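
To make the two metrics concrete, here is a small sketch of how a hit rate and a Pareto front could be computed. It assumes that a "hit" means a candidate satisfies every constraint and that all objectives are maximized; the paper's exact definitions may differ. The property keys, thresholds, and data points are made up for illustration.

```python
def hit_rate(candidates, constraints):
    """Fraction of candidates whose properties meet every constraint."""
    if not candidates:
        return 0.0
    hits = sum(
        all(check(c[key]) for key, check in constraints.items())
        for c in candidates
    )
    return hits / len(candidates)

def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that no other point dominates (maximization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Synthetic example data, not results from the paper.
constraints = {
    "band_gap_eV": lambda v: v >= 2.5,
    "formation_energy_eV_per_atom": lambda v: v <= 0.0,
}
demo = [
    {"band_gap_eV": 3.0, "formation_energy_eV_per_atom": -0.5},
    {"band_gap_eV": 1.0, "formation_energy_eV_per_atom": -0.2},
    {"band_gap_eV": 4.0, "formation_energy_eV_per_atom": 0.3},
    {"band_gap_eV": 2.5, "formation_energy_eV_per_atom": 0.0},
]
print(hit_rate(demo, constraints))  # 0.5

points = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
print(pareto_front(points))  # [(1, 5), (2, 4), (3, 3)]
```

The quadratic scan in `pareto_front` is adequate for the few hundred candidates a search like this might produce; larger sets would call for a sort-based method.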

Convergence and Stability

Figure 4: Evolution of the Pareto front during multi-objective optimization for SAW/BAW Acoustics substrates.

The framework demonstrated rapid convergence towards feasible solutions, improving the stability and chemical validity of generated materials over iterations. LLEMA’s ability to balance exploration and exploitation resulted in superior material designs compared to baselines, as seen in the progressive expansion of the Pareto front.
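
One standard way to quantify how a Pareto front expands over generations is the hypervolume indicator: the region dominated by the front, measured against a fixed reference point. The sketch below handles the two-objective case with both objectives maximized. It is offered as an illustration only; the summary does not state which convergence measure the paper uses, and the reference point and per-generation data here are invented.

```python
def hypervolume_2d(points, ref=(0.0, 0.0)):
    """Area dominated by `points` and bounded by `ref` (both objectives maximized)."""
    # Ignore points that do not strictly improve on the reference point.
    pts = sorted(
        (p for p in points if p[0] > ref[0] and p[1] > ref[1]),
        key=lambda p: (-p[0], -p[1]),
    )
    area, best_y = 0.0, ref[1]
    # Sweep from the largest x downward; each new best y adds one rectangle strip.
    for x, y in pts:
        if y > best_y:
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area

# Synthetic candidate sets per generation (not data from the paper).
generations = [
    [(1, 1)],
    [(1, 1), (2, 1)],
    [(3, 1), (2, 2), (1, 3)],
]
for gen, pts in enumerate(generations):
    print(gen, hypervolume_2d(pts))
```

A hypervolume that keeps rising across generations corresponds to a front that is still expanding, while a plateau suggests the search has converged for the chosen reference point.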

Elemental Diversity

LLEMA significantly enhanced the diversity of elemental compositions in generated outcomes, which is critical for discovering novel compounds that are not only theoretically plausible but also experimentally realizable.

Figure 5: Evolution of periodic table coverage during SAW/BAW acoustic substrate optimization.
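
Periodic table coverage of this kind can be tracked by counting the distinct elements seen so far in the proposed formulas. The sketch below assumes candidates are available as plain chemical formula strings and uses a simple regular expression that does not validate element symbols. The formulas are illustrative inputs chosen for the example, not candidates reported in the paper.

```python
import re

# One uppercase letter optionally followed by one lowercase letter, e.g. "N", "Nb".
ELEMENT_TOKEN = re.compile(r"[A-Z][a-z]?")

def elements_in(formula):
    """Return the set of element-like symbols appearing in a formula string."""
    return set(ELEMENT_TOKEN.findall(formula))

def coverage_by_generation(generations):
    """Cumulative count of distinct elements after each generation."""
    seen, counts = set(), []
    for formulas in generations:
        for formula in formulas:
            seen |= elements_in(formula)
        counts.append(len(seen))
    return counts

# Illustrative input only.
generations = [
    ["AlN", "GaN"],
    ["LiNbO3", "ZnO"],
    ["SiO2"],
]
print(coverage_by_generation(generations))  # [3, 7, 8]
```

Because the count is cumulative, it can only stay flat or grow. A curve that flattens early would indicate that later generations are recombining the same elements rather than exploring new ones.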

Implications and Future Directions

LLEMA provides a robust pathway for accelerating the discovery of practical materials. By integrating knowledge-driven generative models with chemistry-informed constraints and evolutionary search, it helps address the need for materials that meet complex, multi-objective criteria. Future developments could expand LLEMA’s applicability across even broader tasks, incorporating real-time experimental feedback and adapting its search strategies to dynamically changing goals.

Conclusion

This study demonstrates that LLEMA is an effective LLM-driven framework for the autonomous discovery of novel materials. It combines LLM-based candidate generation with rigorous property assessment, which allows it to surpass generative and LLM-only baselines in both hit rate and Pareto-front quality. As the field progresses, LLEMA paves the way for more scalable and reliable automated materials discovery.