
Ab Initio-Parametrized TB Model

Updated 4 February 2026
  • Ab initio-parametrized TB models directly map first-principles DFT data to tight-binding parameters for accurate electronic structure descriptions.
  • They employ machine learning techniques to parameterize environment-dependent on-site and hopping integrals, enhancing transferability across diverse systems.
  • These models enable efficient simulations of systems containing millions to hundreds of millions of atoms, validated against DFT band structures and experimental data.

An ab initio-parametrized tight-binding (TB) model is a computational formalism in which the Hamiltonian is constructed to reproduce first-principles reference data—typically density functional theory (DFT) band structures, eigenstates, and/or wavefunctions—by systematically optimizing the (semi-)empirical parameters of a localized-orbital TB representation. Such models combine the accuracy of ab initio methods with the computational efficiency and transferability needed for large-scale simulations, often reaching millions to hundreds of millions of atoms. Recent work leverages ML architectures to parameterize the environmental dependence of on-site and hopping integrals directly from ab initio data, delivering both accuracy and scalability. The following sections review the foundational concepts, parametrization strategies, state-of-the-art workflows, transferability benchmarks, and large-scale applications of ab initio-parametrized TB models, focusing on the latest advances such as GPUTB (Wang et al., 8 Sep 2025).

1. Fundamental Concepts and Motivation

The tight-binding approach approximates the electronic structure of solids by describing electrons that hop within a finite basis of localized orbitals. Traditional empirical TB models are typically parameterized by fitting to experimental data or a limited subset of ab initio electronic properties, which can result in poor transferability to non-bulk geometries, heterostructures, or varying environments. Ab initio-parametrized TB models address this limitation by employing a direct mapping from first-principles calculations (e.g., DFT, hybrid functional, GW) to the TB Hamiltonian parameters, ensuring that band dispersions, effective masses, bandgaps, and even wavefunctions in the desired energy range are faithfully captured (Tan et al., 2015, Dick et al., 6 Aug 2025).

Machine-learning approaches, such as GPUTB and DeePTB, further extend the traditional paradigm by enabling nontrivial environment dependence and flexibility under structural perturbations, facilitating predictive simulations for device-scale, disordered, or finite-temperature systems (Wang et al., 8 Sep 2025, Gu et al., 2023).

2. Parametrization Workflow and Model Construction

2.1. Choice of Hamiltonian Structure

The standard ab initio-parametrized TB Hamiltonian operates in an orthogonal localized-orbital basis:

H = \sum_{i,a} \varepsilon_{i,a}\,|i,a\rangle \langle i,a| + \sum_{i \ne j} \sum_{a,b} t_{i,a;j,b}\,|i,a\rangle \langle j,b|

where \varepsilon_{i,a} are on-site energies and t_{i,a;j,b} are hopping (Slater–Koster) integrals. These matrix elements are explicitly parameterized to match DFT eigenvalues and, when possible, projected wavefunction properties (Tan et al., 2015, Agapito et al., 2015).
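As a concrete illustration of this Hamiltonian form, a minimal nearest-neighbor p_z model of graphene can be assembled and diagonalized in a few lines. The parameter values below (\varepsilon = 0, t = -2.7 eV) are common textbook choices used for illustration, not a fitted ab initio parametrization:

```python
import numpy as np

# Minimal sketch: nearest-neighbor pz tight-binding model of graphene
# in the orthogonal two-orbital (A/B sublattice) basis. Parameter
# values are common textbook choices, not a fitted ab initio set.
eps = 0.0   # on-site energy epsilon_{i,a} (eV)
t = -2.7    # hopping integral t_{i,a;j,b} (eV)
a = 1.42    # C-C bond length (angstrom)

# Vectors from an A-sublattice atom to its three B-sublattice neighbors
deltas = a * np.array([[1.0, 0.0],
                       [-0.5, np.sqrt(3) / 2],
                       [-0.5, -np.sqrt(3) / 2]])

def hamiltonian(k):
    """2x2 Bloch Hamiltonian H(k)."""
    f = sum(np.exp(1j * np.dot(k, d)) for d in deltas)
    return np.array([[eps, t * f],
                     [np.conj(t * f), eps]])

def bands(k):
    """Band energies at k, sorted ascending."""
    return np.linalg.eigvalsh(hamiltonian(k))

print(bands(np.zeros(2)))  # at Gamma: eps ± 3|t| → [-8.1, 8.1]
```

An ab initio parametrization would replace the fixed `eps` and `t` by values (or functions of the environment, see Section 3) optimized against DFT reference bands.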

2.2. Reference Data Generation

Reference target data are generated as follows:

  • High-quality DFT calculations in an appropriate basis (e.g., LCAO DZP, PW, MLWF) with stringent energy cutoffs and dense k-sampling.
  • For environmental robustness, both 0 K relaxed and finite-T (AIMD or advanced force-field MD) structures are included in the training set (Wang et al., 8 Sep 2025).

2.3. Descriptor Engineering and Parameterization Strategy

  • Slater–Koster formalism: All on-site and hopping integrals are expanded in the standard SK channels, with direction cosines encapsulating bond orientation (Tan et al., 2015, Silva-Guillén et al., 2016).
  • Environment descriptors: Recent frameworks (GPUTB, DeePTB) use invariant node (atom) and edge (bond) descriptors—such as Chebyshev expansions of neighbor distances, smooth cutoff functions, or local neural-network embeddings—allowing explicit mapping of TB parameters to the atomic environment (Gu et al., 2023, Wang et al., 8 Sep 2025).
  • Parameter networks: Corrections to reference atomic-type integrals are obtained from dedicated neural-network modules (MLPs), which consume the environment descriptors and produce systematic deviations \delta\varepsilon, \delta t from tabulated values. Directional dependence is built in through the direction cosines of the SK channels.
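The correction scheme in the last bullet can be sketched as a small MLP mapping a descriptor to (\delta\varepsilon, \delta t). The layer sizes, random weights, and reference values below are illustrative stand-ins, not the actual GPUTB/DeePTB architectures:

```python
import numpy as np

# Sketch of a parameter network: an MLP that maps an environment
# descriptor to corrections (delta_eps, delta_t) added to tabulated
# reference integrals. All sizes and values here are illustrative.
rng = np.random.default_rng(0)

def silu(x):
    """SiLU activation, x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

K = 8  # descriptor dimension
W1, b1 = rng.normal(scale=0.1, size=(K, 16)), np.zeros(16)
W2, b2 = rng.normal(scale=0.1, size=(16, 2)), np.zeros(2)

def parameter_network(descriptor):
    """Return (delta_eps, delta_t) for one atom/bond environment."""
    h = silu(descriptor @ W1 + b1)
    return h @ W2 + b2

eps_ref, t_ref = -13.7, -2.7           # tabulated reference integrals (eV)
v = rng.normal(size=K)                 # stand-in environment descriptor v_i
d_eps, d_t = parameter_network(v)
eps_eff, t_eff = eps_ref + d_eps, t_ref + d_t
print(eps_eff, t_eff)                  # environment-corrected integrals
```

In a real framework the weights are trained against DFT bands (Section 2.4), and the output corrections are resolved per SK channel with the direction cosines applied afterwards.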

2.4. Optimization and Training

  • Loss function: The training objective combines (a) the mean-squared error between predicted and DFT band eigenvalues over all bands and k-points (with band-dependent weights), and (b) explicit regularization (e.g., an L_2 norm) to control overfitting:

L = \sum_{b,k} w_b \left[ E_{b,k}^{\mathrm{pred}} - E_{b,k}^{\mathrm{ref}} \right]^2 + \lambda \|\Theta\|^2

with early stopping based on MAE plateaus (Wang et al., 8 Sep 2025).

  • Optimization: Training uses modern stochastic gradient methods (e.g., AdamW) together with batch normalization and SiLU activations. State-of-the-art ML-TB frameworks strictly enforce symmetry-preserving, environment-invariant descriptors throughout training.
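The training objective above translates directly into code; the toy band arrays below are illustrative:

```python
import numpy as np

def tb_loss(E_pred, E_ref, band_weights, params, lam=1e-4):
    """Band-weighted quadratic loss with L2 regularization.

    E_pred, E_ref : band energies, shape (n_bands, n_kpoints)
    band_weights  : shape (n_bands,), e.g. emphasizing bands near E_F
    params        : flat vector of all trainable parameters Theta
    lam           : regularization strength lambda
    """
    sq_err = band_weights[:, None] * (E_pred - E_ref) ** 2
    return sq_err.sum() + lam * np.dot(params, params)

# Toy example: two bands, three k-points; band 1 is off by 0.1 eV
E_ref = np.array([[0.0, 1.0, 2.0],
                  [3.0, 4.0, 5.0]])
E_pred = E_ref + np.array([[0.0, 0.0, 0.0],
                           [0.1, 0.1, 0.1]])
w = np.array([1.0, 0.5])
print(tb_loss(E_pred, E_ref, w, params=np.zeros(4)))  # → 0.5 * 3 * 0.1^2 = 0.015
```

In practice the gradient of this loss with respect to the network weights is obtained by automatic differentiation, and `lam` and the band weights are treated as hyperparameters.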

2.5. Post-fitting Validation

  • Spectral comparison: Direct overlays of DFT and TB band structures, densities of states (DOS), and—when available—planar-averaged charge densities |\psi(z)|^2 validate the model's accuracy (Tan et al., 2015, Wang et al., 8 Sep 2025).
  • Transport and finite-temperature physics: Quantum transport coefficients are benchmarked against experiment using O(N) scaling quantum transport codes (e.g., LSQT), and the model's ability to generalize to large or thermally fluctuating supercells is tested (Wang et al., 8 Sep 2025, Gu et al., 2023).

3. Atomic-Environment Dependence and Machine-Learning TB

A central development in modern ab initio-parametrized TB models is the systematic incorporation of atomic-environment dependence. This is accomplished through multi-level, symmetry-preserving descriptors and machine-learning parameter networks:

  • Node descriptor v_i: Encodes the local chemical neighborhood of each atom via a Chebyshev expansion of the radial distribution of neighbor distances, weighted by a smooth cutoff function.
  • Edge descriptor e_{ij}: Encodes bond lengths (and optionally angles), ensuring representations remain robust under local distortions.
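A minimal version of such a Chebyshev node descriptor might look as follows; the cosine cutoff form and the expansion order are assumptions for illustration, not the published GPUTB/DeePTB definitions:

```python
import numpy as np

# Sketch of a Chebyshev node descriptor v_i: each neighbor distance is
# mapped to [-1, 1], expanded in Chebyshev polynomials T_k, weighted by
# a smooth cutoff, and summed over neighbors. Cutoff form and order K
# are illustrative assumptions.

def smooth_cutoff(r, r_cut):
    """Goes smoothly from 1 at r = 0 to 0 at r >= r_cut."""
    r = np.asarray(r, dtype=float)
    return np.where(r < r_cut, 0.5 * (np.cos(np.pi * r / r_cut) + 1.0), 0.0)

def node_descriptor(neighbor_distances, r_cut=5.0, K=8):
    """K-component descriptor of one atom's radial environment."""
    r = np.asarray(neighbor_distances, dtype=float)
    x = 2.0 * r / r_cut - 1.0            # map [0, r_cut] -> [-1, 1]
    fc = smooth_cutoff(r, r_cut)
    # Chebyshev recurrence: T_0 = 1, T_1 = x, T_k = 2x T_{k-1} - T_{k-2}
    T = [np.ones_like(x), x]
    for _ in range(2, K):
        T.append(2.0 * x * T[-1] - T[-2])
    return np.array([(fc * Tk).sum() for Tk in T[:K]])

v_i = node_descriptor([1.42, 1.42, 1.42, 2.46])  # e.g. a carbon atom's shells
print(v_i.shape)  # → (8,)
```

By construction the descriptor is invariant under permutation of neighbors and vanishes smoothly for atoms leaving the cutoff sphere, which is what makes the learned parameters continuous under MD distortions.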

These descriptors are concatenated, fed through environment neural networks, and output environment-dependent corrections to on-site and hopping SK parameters. This framework allows the same parameterization to transfer across different structural phases, XC functionals, or chemical environments (e.g., graphene, diamond, h-BN, and their heterostructures), a property demonstrated explicitly in GPUTB (Wang et al., 8 Sep 2025) and DeePTB (Gu et al., 2023).

Table: Core Elements of GPUTB/TB-ML Protocol

| Element | Description | Source/Notes |
| --- | --- | --- |
| Hamiltonian structure | Orthogonal LCAO basis with on-site/hopping SK integrals | (Wang et al., 8 Sep 2025) |
| Node/edge descriptors | Chebyshev expansion of distances with smooth cutoff; v_i, e_{ij} as K-component vectors | (Wang et al., 8 Sep 2025) |
| Parameter networks | Three MLPs (on-site, hopping, environment); SiLU activation, BatchNorm, AdamW optimizer | (Wang et al., 8 Sep 2025) |
| Training protocol | Band-weighted quadratic loss, L_2 regularization, early stopping, MAE ≲ 20 meV | (Wang et al., 8 Sep 2025) |
| Validation | DFT bands/DOS, experiment (e.g., mobility), transfer to heterojunctions/allotropes | (Wang et al., 8 Sep 2025) |

4. Transferability, Robustness, and Physical Validation

Ab initio-parametrized TBs, when constructed with rigorous environment dependence, demonstrate high transferability:

  • Basis/bonding transfer: Single parametrizations can describe multiple crystal structures, bonding motifs (sp^2/sp^3), and even heterostructures (e.g., h-BN/graphene, diamond/graphene) without re-fitting (Wang et al., 8 Sep 2025).
  • XC functional robustness: Parametrizations are stable when the DFT reference is switched from PBE to more advanced functionals such as SCAN or HSE (Wang et al., 8 Sep 2025).
  • Thermal and structural stability: Models trained on finite-T MD snapshots accurately reproduce thermal smearing and bandgap renormalization in large-scale supercells (10^6–10^8 atoms), enabling studies of fluctuating and disordered environments (Gu et al., 2023, Wang et al., 8 Sep 2025).

Validation includes:

  • Band structure/DOS overlays at 0 K and finite T, demonstrating <20 meV MAE across hundreds of bands.
  • Device-relevant properties: e.g., computed carrier mobilities in graphene as a function of charge density agree with experimental data in the 10^{11}–10^{12} cm^{-2} range (Wang et al., 8 Sep 2025).
  • Allotrope and heterostructure benchmarks: The same model parameters apply to both graphene and diamond, or to h-BN/graphene interfaces without reparametrization.

5. Scalability and Large-Scale Applications

Machine-learning TB models exploit GPU acceleration and linear-scaling (O(N)) quantum transport solvers (e.g., LSQT) to generate device-scale Hamiltonians and simulate physical observables in systems containing up to 10^8 atoms (Wang et al., 8 Sep 2025). Key attributes include:

  • Construction speed: Massive supercell Hamiltonians are built in time linear with system size, bottlenecked only by memory bandwidth and GPU architecture.
  • Simulation efficiency: Electronic spectra, DOS, and conductivity can be computed with Chebyshev-based KPM algorithms combined with sparse-matrix–vector operations, making million-atom MD–electronic coupling feasible (Wang et al., 8 Sep 2025, Gu et al., 2023).
  • Integration with MD: The GPUTB and DeePTB frameworks allow seamless coupling with ab initio or force-field MD, providing dynamic feedback between ionic and electronic degrees of freedom at scale.
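The Chebyshev/KPM machinery behind such O(N) solvers can be sketched for the DOS. The toy below uses a dense Hamiltonian and an exact spectral bound for simplicity, whereas production codes like LSQT rely on sparse matrix–vector products, Lanczos spectral bounds, and GPU batching:

```python
import numpy as np

# Sketch of the kernel polynomial method (KPM) for the DOS: Chebyshev
# moments via a stochastic trace, Jackson-kernel damping, and spectral
# reconstruction. Dense toy H here; O(N) codes use sparse matvecs.
rng = np.random.default_rng(0)

def kpm_dos(H, n_moments=64, n_random=8, n_energies=201):
    """Estimate the DOS of a Hermitian matrix H via KPM."""
    N = H.shape[0]
    scale = 1.05 * np.abs(np.linalg.eigvalsh(H)).max()  # rescale spectrum into [-1, 1]
    Ht = H / scale
    mu = np.zeros(n_moments)                            # Chebyshev moments
    for _ in range(n_random):                           # stochastic trace
        r = rng.choice([-1.0, 1.0], size=N)
        v_prev, v_curr = r, Ht @ r
        mu[0] += r @ r
        mu[1] += r @ v_curr
        for m in range(2, n_moments):
            v_next = 2.0 * (Ht @ v_curr) - v_prev       # T_m recurrence
            mu[m] += r @ v_next
            v_prev, v_curr = v_curr, v_next
    mu /= n_random * N
    # Jackson kernel damping to suppress Gibbs oscillations
    m, M = np.arange(n_moments), n_moments
    g = ((M - m + 1) * np.cos(np.pi * m / (M + 1))
         + np.sin(np.pi * m / (M + 1)) / np.tan(np.pi / (M + 1))) / (M + 1)
    x = np.linspace(-0.99, 0.99, n_energies)
    T = np.cos(np.outer(np.arccos(x), m))               # T_m(x) = cos(m arccos x)
    dos = (g[0] * mu[0] + 2.0 * T[:, 1:] @ (g * mu)[1:]) / (np.pi * np.sqrt(1.0 - x**2))
    return x * scale, dos / scale

# Toy: 1D chain, nearest-neighbor hopping t = -1 (bandwidth 4|t|)
N, t = 400, -1.0
H = np.zeros((N, N))
idx = np.arange(N - 1)
H[idx, idx + 1] = H[idx + 1, idx] = t
energies, dos = kpm_dos(H)
print(energies.shape, dos.shape)  # → (201,) (201,)
```

Because only matrix–vector products with H appear, the cost per moment is linear in the number of nonzero Hamiltonian entries, which is what makes 10^6–10^8-atom spectra tractable.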

Table: Validation and Efficiency Metrics

| System | Atoms | Key Accuracy/Validation | Source |
| --- | --- | --- | --- |
| Graphene (pristine) | 10^8 | DOS, bands MAE ≲ 20 meV | (Wang et al., 8 Sep 2025) |
| Graphene/h-BN | 10^6 | DOS, transport, heterostructure transfer | (Wang et al., 8 Sep 2025) |
| Si/GaAs/III-V | 10^6 | Finite-T bands/DOS, mobility | (Gu et al., 2023) |

6. Connection to Traditional and Modern Fitting Strategies

Classical ab initio-parametrized TB models constructed via direct projection or mapping procedures—such as Löwdin orthogonalization, maximally localized Wannier function (MLWF) projection, and numerical optimization of on-site/hopping matrices—are foundational but limited in heterostructure transferability and environmental dependence (Tan et al., 2015, Agapito et al., 2016, Dick et al., 6 Aug 2025). The essence of the modern ML-TB approach is that, rather than a fixed parameter set deduced by least-squares fit, the TB parameters become functional outputs of symmetry-respecting environmental descriptors, enabling superior transferability and adaptability with fewer manual heuristics.

A plausible implication is that, as TB models increasingly incorporate ML-based parametrization and systematically generated reference data, the distinction between “empirical”, “environment-dependent semi-empirical” (Hegde et al., 2013), and “ab initio-parametrized” TB will continue to blur, with high-level model selection governed by the target system's size, complexity, and required electronic property accuracy.

7. Future Prospects and Open Directions

State-of-the-art ab initio-parametrized TB models, especially those leveraging deep neural networks and O(N) quantum transport codes, now permit routine, high-accuracy calculation of electronic, optical, and even transport properties in systems far beyond the realistic reach of either pure DFT or classical empirical TB. Work such as GPUTB (Wang et al., 8 Sep 2025) and DeePTB (Gu et al., 2023) establishes a scalable, transparent, and generalizable framework.

Open research directions include:

  • Extension to strongly correlated and magnetic systems (including spin-orbit, exchange splittings, and beyond-DFT effects).
  • Automated TB parameter pipelines for heterostructures, interfaces, and complex device-scale geometries.
  • Coupling with many-body methods for accurate low-energy effective models.
  • Integration with high-throughput workflows for materials discovery and machine-learned property prediction.

Ab initio-parametrized TB thus bridges atomistic accuracy and device-scale applicability, and is central to the ongoing convergence of first-principles electronic structure theory and large-scale materials simulation.
