
SimXRD-4M: Big Simulated X-ray Diffraction Data Accelerate the Crystal Symmetry Classification

Published 15 Jun 2024 in cond-mat.mtrl-sci | (2406.15469v2)

Abstract: Spectroscopic data, particularly diffraction data, contain detailed crystal and microstructure information and are thus crucial for materials discovery. Powder X-ray diffraction (XRD) patterns are highly effective for identifying crystals. Although machine learning (ML) has significantly advanced the analysis of powder XRD patterns, progress is hindered by a lack of training data. To address this, we introduce SimXRD, the largest open-source simulated XRD pattern dataset to date, to accelerate the development of crystallographic informatics. SimXRD comprises 4,065,346 simulated powder X-ray diffraction patterns, representing 119,569 distinct crystal structures under 33 simulated conditions that mimic real-world variations. We find that crystal symmetry inherently follows a long-tailed distribution, and we evaluate 21 sequence learning models on SimXRD. The results indicate that existing neural networks struggle to classify low-frequency crystal symmetry classes. The present work highlights the academic significance and the engineering novelty of simulated XRD patterns in this interdisciplinary field.
