
ZeoNet: Geometric Encoding for Zeolite Discovery

Updated 31 January 2026
  • ZeoNet representation is a deep learning paradigm that converts periodic zeolite structures into normalized 3D distance grids, capturing both atomic coordination and mesoscale pore architectures.
  • A compact 3D CNN, built with a ResNet-style architecture, extracts a 512-dimensional fingerprint and achieves over 95% classification accuracy, outperforming traditional geometric filters.
  • By filtering over 300,000 hypothetical structures and reducing false positives to 0.4%, ZeoNet streamlines the discovery of synthesizable zeolite frameworks for experimental validation.

The ZeoNet representation is a specialized geometric encoding and deep learning approach designed to distinguish experimentally realizable zeolite frameworks from large computational libraries of hypothetical candidates. Unlike atom-centric graphs or purely topological descriptors, ZeoNet transforms the periodic crystal lattice of a zeolite into a 3D volumetric "distance-to-surface" grid, which encodes both local coordination geometries and mesoscale pore architectures. A compact 3D CNN extracts a 512-dimensional fingerprint from this grid and enables classification with accuracy exceeding prior geometric filters and machine learning baselines. ZeoNet thus enables the systematic assessment of synthetic feasibility for silicate and aluminophosphate zeolite-like materials absent exhaustive physics-based criteria, effectively reducing the practical search space for new synthesizable frameworks (Liu et al., 24 Jan 2026).

1. Construction of the 3D Volumetric Distance Grid

A periodic zeolite unit cell is embedded in a Cartesian box whose side length $L$ is set to a crystallographic axis or minimum-image equivalent. A uniform grid of $N \times N \times N$ voxels is placed within this box, with spacing $\Delta = L/N$ and grid point coordinates $r_{ijk} = r_0 + (i \Delta, j \Delta, k \Delta)$ for $i,j,k \in \{0, \ldots, N-1\}$. At each grid point $r$, the minimum Euclidean distance to any atom in the framework, $d_0(r) = \min_j \|r - R_j\|_2$, is calculated. Subtracting a probe radius $R_p = 1.2\,\text{Å}$ (chosen to reflect typical structure-directing species) yields a signed distance field $d(r) = d_0(r) - R_p$; negative $d(r)$ values indicate overlap with atomic spheres, while positive values indicate accessible pore volume at least $R_p$ away from the framework. The resulting field is clipped and normalized to $[-1, 1]$ via $d_{\text{norm}}(r) = d(r)/d_{\max}$, where $d_{\max}$ is typically $5\,\text{Å}$.
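The construction above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' code: the function names `distance_grid` and `min_image_distance` are hypothetical, the box is assumed cubic, the grid origin $r_0$ defaults to the box corner, and periodicity is handled with a minimum-image wrap, which is one plausible reading of "minimum-image equivalent".

```python
import math


def min_image_distance(r, atom, L):
    """Distance from point r to the nearest periodic image of atom (cubic box, side L)."""
    total = 0.0
    for rc, ac in zip(r, atom):
        diff = rc - ac
        diff -= L * round(diff / L)  # wrap the separation into [-L/2, L/2]
        total += diff * diff
    return math.sqrt(total)


def distance_grid(atoms, L, N, r_p=1.2, d_max=5.0, origin=(0.0, 0.0, 0.0)):
    """Return d_norm on an N x N x N grid as nested lists, indexed [i][j][k].

    atoms : list of (x, y, z) positions in angstroms
    L     : box side length in angstroms
    r_p   : probe radius R_p; d_max : clipping distance (values from the text)
    """
    delta = L / N  # voxel spacing, Delta = L / N
    grid = [[[0.0] * N for _ in range(N)] for _ in range(N)]
    for i in range(N):
        for j in range(N):
            for k in range(N):
                r = (origin[0] + i * delta,
                     origin[1] + j * delta,
                     origin[2] + k * delta)
                d0 = min(min_image_distance(r, a, L) for a in atoms)
                d = d0 - r_p                      # signed distance d(r)
                d = max(-d_max, min(d_max, d))    # clip to [-d_max, d_max]
                grid[i][j][k] = d / d_max         # normalize to [-1, 1]
    return grid
```

For a single atom at the origin with `L=10.0` and `N=4`, the voxel on the atom evaluates to $-1.2/5 = -0.24$, while the voxel at the box center is more than $5\,\text{Å}$ from the probe surface and is clipped to $1.0$.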

This volumetric field enables the ZeoNet architecture to capture atomic-scale to mesoscale structural detail, including pore geometry, connectivity, local bond angles, and crystallographic symmetry, in a format compatible with convolutional processing.

2. Input Tensor Formats and Channel Variants

Standard ZeoNet implementations provide input as either a single-channel or two-channel tensor:

  • Single-channel: All framework atoms within the unit cell contribute to one distance field, resulting in $X \in \mathbb{R}^{1 \times N \times N \times N}$, with $X[0,i,j,k] = d_{\text{norm}}(r_{ijk})$.
  • Two-channel: Separates T-atoms (Si or P) and O-atoms, yielding $X \in \mathbb{R}^{2 \times N \times N \times N}$; channel 0 encodes the distance to the nearest T-atom, channel 1 to the nearest O-atom.
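As a rough illustration of the two-channel variant, the sketch below builds one grid per atom group and stacks them. The names `channel_grid` and `two_channel_grid` are hypothetical. Two simplifications are assumptions rather than statements from the source: periodic images are ignored for brevity, and each channel is given the same probe-radius shift and $d_{\max}$ normalization as the single-channel field.

```python
import math

T_ELEMENTS = {"Si", "P"}  # T-atom species as named in the text


def channel_grid(positions, L, N, r_p, d_max):
    """Normalized distance-to-nearest-atom grid for one group of atoms."""
    delta = L / N
    grid = [[[0.0] * N for _ in range(N)] for _ in range(N)]
    for i in range(N):
        for j in range(N):
            for k in range(N):
                r = (i * delta, j * delta, k * delta)
                # Non-periodic nearest distance (assumption made for brevity)
                d = min(math.dist(r, p) for p in positions) - r_p
                grid[i][j][k] = max(-d_max, min(d_max, d)) / d_max
    return grid


def two_channel_grid(atoms, L, N, r_p=1.2, d_max=5.0):
    """atoms: list of (element, (x, y, z)). Returns X indexed [channel][i][j][k]."""
    t_pos = [p for el, p in atoms if el in T_ELEMENTS]
    o_pos = [p for el, p in atoms if el == "O"]
    return [channel_grid(t_pos, L, N, r_p, d_max),   # channel 0: T-atoms
            channel_grid(o_pos, L, N, r_p, d_max)]   # channel 1: O-atoms
```

Both atom groups must be non-empty, since the nearest-atom minimum is undefined for an empty list.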

Each channel is mean-centered and scaled to unit variance across the training set. No further data augmentation is applied; the only additional step is weighted random sampling among the four classes ("synthesizable as silicate," "as aluminophosphate," "as both," and "not synthesizable") using WeightedRandomSampler.
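The source does not state how the sampling weights are chosen. A common choice, assumed here purely for illustration, is to weight each sample by the inverse of its class frequency so that every class carries equal total weight. The sketch uses only the standard library as a stand-in for a weighted sampler; `inverse_frequency_weights` and `sample_indices` are hypothetical names.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """One weight per sample: 1 / (number of samples sharing that label)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def sample_indices(labels, num_samples, seed=0):
    """Draw sample indices with replacement, rare classes drawn more often."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


# Toy label list: 8 majority-class samples and 2 minority-class samples
labels = ["not synthesizable"] * 8 + ["silicate"] * 2
weights = inverse_frequency_weights(labels)
batch = sample_indices(labels, num_samples=6)
```

With these weights each class sums to a total weight of 1, so a drawn index is equally likely to come from either class regardless of how many samples each class holds.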

3. ZeoNet 3D Convolutional Network Architecture

ZeoNet's encoder is a small-scale 3D ResNet-style or simple convolutional-pooling stack. The network ingests the distance grid input ($N \sim 32$–$48$) and passes it through successive layers:

  • Conv3D ($C \to 32$), BatchNorm3D, ReLU, MaxPool3D ($N/2$).
  • Conv3D ($32 \to 64$), BatchNorm3D, ReLU, MaxPool3D ($N/4$).
  • Conv3D ($64 \to 128$), BatchNorm3D, ReLU, MaxPool3D ($N/8$).
  • Conv3D ($128 \to 256$), BatchNorm3D, ReLU, MaxPool3D ($N/16$).
  • Fully connected (Flatten $\to$ Linear $256 \cdot (N/16)^3 \to 1024$), ReLU, Dropout ($p = 0.2$).
  • Linear ($1024 \to 512$), ReLU, Dropout ($p = 0.2$).
  • Final classification head: Linear ($512 \to 4$) $\to$ Softmax.

All convolutions use ReLU activations, batch normalization, and weight decay ($L_2 \approx 1 \times 10^{-4}$). The penultimate 512-dimensional vector is designated as the "ZeoNet embedding" and serves as a fixed-length structural fingerprint for visualization and downstream classification tasks.

| Layer Type | Output Shape | Comments |
| --- | --- | --- |
| Conv3D + Pool | $32 \times (N/2)^3$ | Initial feature extraction |
| Conv3D + Pool | $64 \times (N/4)^3$ | Intermediate depth |
| Conv3D + Pool | $128 \times (N/8)^3$ | Deeper receptive field |
| Conv3D + Pool | $256 \times (N/16)^3$ | Last conv layer |
| Fully Connected | 1024, then 512 | Embedding extraction |
| Linear + Softmax | 4 | Four-class output |
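The shape bookkeeping for this stack can be checked with a short trace. The sketch assumes that each convolution preserves the spatial size (for example through padding) and that each pooling step halves every spatial dimension, so that $N$ must be divisible by 16. These are assumptions consistent with the listed shapes, not details confirmed by the source, and `trace_shapes` is a hypothetical helper.

```python
def trace_shapes(N):
    """Return (stage, output shape) pairs for the layer stack described above."""
    if N % 16 != 0:
        raise ValueError("N must be divisible by 16 for four halving pool steps")
    shapes = []
    side = N
    for out_channels in (32, 64, 128, 256):
        side //= 2  # each MaxPool3D halves every spatial dimension
        shapes.append((f"conv{out_channels}+pool", (out_channels, side, side, side)))
    shapes.append(("flatten", (256 * side ** 3,)))  # 256 * (N/16)^3 features
    shapes.append(("fc1", (1024,)))
    shapes.append(("embedding", (512,)))            # the "ZeoNet embedding"
    shapes.append(("logits", (4,)))                 # four-class output
    return shapes


for stage, shape in trace_shapes(32):
    print(f"{stage:14s} {shape}")
```

For $N = 32$ the flattened feature vector has $256 \cdot 2^3 = 2048$ entries, and for $N = 48$ it has $256 \cdot 3^3 = 6912$, which shows how strongly the first fully connected layer's size depends on the grid resolution.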

4. Structural Features Encoded by ZeoNet

The distance-grid encoding integrates atomic coordination and pore-scale topology:

  • Pore Geometry and Connectivity: Level sets $d(r) = \text{const}$ map local pore diameters and channel connectivity; bottlenecks appear as narrow regions.
  • Local Coordination: Near T–O bonds, the distance field exhibits conical troughs encoding bond lengths and T–O–T angles, allowing recognition of realistic tetrahedral distortions.
  • Global Cell Metrics: The grid directly encodes crystallographic symmetry and cell aspect ratio, empirically correlated with zeolite family classification.
  • Probe Radius Selection: $R_p = 1.2\,\text{Å}$ reflects the scale of structure-directing and hydrothermal species, emphasizing functional pore environments.
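To make the link between the grid values and pore geometry concrete, the sketch below reads two crude descriptors off a normalized grid: the fraction of voxels with positive signed distance (probe-accessible volume), and the largest distance from any voxel to its nearest atom center. Both descriptors and the name `pore_summary` are illustrative assumptions, not quantities defined in the source; the network itself learns from the full grid rather than from such summaries.

```python
def pore_summary(d_norm, r_p=1.2, d_max=5.0):
    """Summarize a normalized distance grid given as nested lists [i][j][k]."""
    values = [v for plane in d_norm for row in plane for v in row]
    # Voxels with d(r) > 0 lie at least R_p away from every atom center
    accessible = sum(1 for v in values if v > 0.0) / len(values)
    # Undo the normalization and the probe shift: d_0 = d_norm * d_max + R_p.
    # If the grid was clipped at d_max, this is only a lower bound.
    max_center_distance = max(values) * d_max + r_p
    return {"accessible_fraction": accessible,
            "max_center_distance": max_center_distance}


# Toy 2 x 2 x 2 grid of normalized values
toy = [[[-0.24, 0.1], [0.2, 0.3]],
       [[0.0, -0.1], [0.4, 0.5]]]
summary = pore_summary(toy)
```

In the toy grid five of eight voxels are positive, so the accessible fraction is 0.625, and the largest value 0.5 corresponds to a point $0.5 \cdot 5 + 1.2 = 3.7\,\text{Å}$ from the nearest atom center.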

This unified geometric-semantic encoding enables the network to learn empirical patterns that distinguish the roughly 200 known IZA (International Zeolite Association) topologies from the vastly larger pool of hypothetical PCOD (Predicted Crystallographic Online Database) structures.

5. Performance Metrics and Comparative Results

As reported by Liu et al. (24 Jan 2026), ZeoNet demonstrates substantive improvements over prior geometric filters and ML methods:

  • IZA Four-Class Accuracy: Of 383 IZA structures (2014 release), only 18 are misclassified, giving a top-1 accuracy of $\approx 95.3\%$.
  • Balanced Recall Across Classes: Si-only, P-only, and Si/P classes all exhibit recalls exceeding $92\%$.
  • False Positive Suppression: Among more than 300,000 hypothetical PCOD structures, only $0.4\%$ are incorrectly classified as synthesizable, reducing the search space by over $95\%$ compared to baseline approaches.
  • Baseline Comparison: Geometric filter methods achieve IZA recall of $\sim 85\%$, while ZeoNet elevates it to $95\%$ and halves the false-positive rate.
  • Misclassified Candidates: The small cohort ($\sim 1{,}200$) of false positives, i.e. hypothetical structures classified as synthesizable, likely represents promising structures for future experimental synthesis, given that their formation energies and bond metrics closely match known zeolites.
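The headline figures in this list follow from simple arithmetic, which the snippet below reproduces. The only assumption is that 300,000 is used as the size of the hypothetical set, whereas the text gives it as a lower bound.

```python
# IZA four-class accuracy: 18 of 383 structures misclassified
n_iza = 383
n_misclassified = 18
accuracy = (n_iza - n_misclassified) / n_iza

# Hypothetical structures labeled synthesizable at a 0.4% rate
n_hypothetical = 300_000   # lower bound quoted in the text
flagged_rate = 0.004
n_flagged = flagged_rate * n_hypothetical

print(f"top-1 accuracy: {100 * accuracy:.1f}%")
print(f"flagged hypothetical structures: about {round(n_flagged)}")
```

The accuracy works out to $365/383 \approx 95.3\%$, and $0.4\%$ of 300,000 is 1,200, which matches the size of the cohort quoted in the last bullet.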

6. Implications, Applications, and Limitations

ZeoNet's probe-corrected volumetric grids and convolutional architecture yield a differentiable representation strongly correlated with empirical synthesizability. The approach does not rely on comprehensive physics-based criteria for framework formation, and its high selectivity suggests immediate utility for guiding experimental prioritization in large computational libraries. A plausible implication is that the ZeoNet embedding encodes heretofore unquantified combinations of local and global motifs influential in zeolite formation chemistry.

However, ZeoNet's predictions are contingent on the scope of training data and the particular probe radius used. The subset of hypothetical structures misclassified as "synthesizable" may indicate gaps in current chemical understanding or unrecognized experimental potential. The use of volumetric grids is tailored for periodic, crystalline frameworks and may require adaptation for disordered or non-porous materials.

Together, ZeoNet's representation and classifier provide a robust tool for structural informatics in zeolite science, streamlining the identification of frameworks suitable for synthetic exploration and advancing the methodological state of zeolite discovery (Liu et al., 24 Jan 2026).
