FMQA: Factorization Machine with Quadratic Optimization Annealing
- The paper introduces FMQA, a framework that uses a trained factorization machine as a quadratic (QUBO) surrogate to reduce the number of expensive objective evaluations.
- It combines gradient-trained FM regression with annealing optimizers, such as quantum and simulated annealers, to solve complex combinatorial and binary variable problems.
- FMQA demonstrates significant improvements in applications like materials discovery, genetic studies, and nanophotonic design by reducing computational costs and accelerating convergence.
Factorization Machine with Quadratic Optimization Annealing (FMQA) is a surrogate-model-based black-box optimization framework integrating quadratic factorization machine (FM) regression with quantum or classical Ising machine optimization. The central principle is to use a trained FM as a quadratic surrogate (QUBO) for an expensive objective, then rapidly propose low-cost candidates using annealing to optimize the surrogate. FMQA is well-suited to combinatorial, binary, integer, and discrete variables, is scalable to high dimensions, and underpins a wide range of applications in black-box optimization across physics, chemistry, biology, and engineering (Tamura et al., 24 Jul 2025).
1. Mathematical Framework and Surrogate Construction
The FMQA method is anchored on the second-order factorization machine, parameterized as follows for binary inputs $x \in \{0,1\}^N$:

$$f(x) = w_0 + \sum_{i=1}^{N} w_i x_i + \sum_{i<j} \langle v_i, v_j \rangle x_i x_j,$$

with $v_i \in \mathbb{R}^K$ the latent factor vectors, $w_0$ the bias, $w_i$ the linear weights, and $\langle v_i, v_j \rangle$ the factorized quadratic couplings. The FM is fitted to a dataset $\mathcal{D} = \{(x^{(d)}, y^{(d)})\}$ by minimizing squared error or a regularized loss (Tamura et al., 24 Jul 2025, Couzinie et al., 2024, Nakano et al., 28 Jul 2025):

$$L = \sum_{d} \left( f(x^{(d)}) - y^{(d)} \right)^2 + \text{(regularization)}.$$
Hyperparameters include the FM rank $K$, the regularization coefficients, and the learning rate of the gradient-based optimizer.
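The FM output above admits the standard $O(NK)$ evaluation identity $\sum_{i<j} \langle v_i, v_j \rangle x_i x_j = \tfrac{1}{2}\big(\|V^\top x\|^2 - \sum_i \|v_i\|^2 x_i^2\big)$. A minimal NumPy sketch, where `w0`, `w`, and `V` stand for the bias, linear weights, and latent factor matrix (names chosen here for illustration):

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order FM: f(x) = w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j.

    Uses the O(NK) identity:
    sum_{i<j} <v_i,v_j> x_i x_j = 0.5 * (||V^T x||^2 - sum_i ||v_i||^2 x_i^2).
    """
    linear = w0 + w @ x
    s = V.T @ x                       # shape (K,): sum_i v_i x_i
    quad = 0.5 * (s @ s - np.sum((V ** 2).sum(axis=1) * x ** 2))
    return linear + quad
```

The identity is what makes gradient-based FM training scale linearly in both the number of features and the rank.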
After training, the surrogate output is in QUBO (quadratic unconstrained binary optimization) form:

$$f(x) = \sum_i Q_{ii} x_i + \sum_{i<j} Q_{ij} x_i x_j + \text{const},$$

where $Q_{ii} = w_i$, $Q_{ij} = \langle v_i, v_j \rangle$, and the constant is $w_0$ (typically omitted in the QUBO for optimization) (Tamura et al., 24 Jul 2025).
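Reading off the QUBO matrix from trained FM parameters is a pure bookkeeping step. A minimal sketch, assuming an upper-triangular convention for $Q$ (function name `fm_to_qubo` is illustrative):

```python
import numpy as np

def fm_to_qubo(w, V):
    """Read off the QUBO matrix from trained FM parameters:
    Q[i, i] = w_i and Q[i, j] = <v_i, v_j> for i < j (upper-triangular)."""
    G = V @ V.T                       # Gram matrix of latent vectors
    Q = np.triu(G, k=1)               # keep only i < j couplings
    Q[np.diag_indices_from(Q)] = w    # linear terms sit on the diagonal
    return Q
```

For binary $x$ the diagonal absorbs the linear terms because $x_i^2 = x_i$, so $x^\top Q x$ reproduces the surrogate up to the constant $w_0$.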
2. QUBO Mapping and Annealing Optimization
Optimization is performed over the FM surrogate by solving the QUBO problem:

$$x^* = \arg\min_{x \in \{0,1\}^N} \; \sum_i Q_{ii} x_i + \sum_{i<j} Q_{ij} x_i x_j,$$

using either physical quantum annealers (e.g., D-Wave), simulated annealing, or digital Ising machines (Tamura et al., 24 Jul 2025, Nakano et al., 28 Jul 2025, Couzinie et al., 2024). Where the problem encodes constraint structure (such as fixed Hamming weight or one-hot blocks), quadratic penalty terms of the form $\lambda \big( \sum_i x_i - k \big)^2$ or $\lambda \big( \sum_{i \in B} x_i - 1 \big)^2$ are added to the QUBO objective (Endo et al., 2024, Kikuchi et al., 5 Jan 2026). For quantum annealing, the QUBO is mapped to the Ising Hamiltonian through the change of variables $x_i = (1 + s_i)/2$ with $s_i \in \{-1, +1\}$, giving couplings $J_{ij}$, fields $h_i$, and a constant offset that are linear in the QUBO coefficients (Tamura et al., 24 Jul 2025, Wang et al., 2 Jul 2025).
The solver typically performs many sampling runs with a progressively decreasing temperature (classical or quantum schedule), each proposing a candidate $x^*$ minimizing the surrogate. Constraints are either strictly enforced (hard penalty terms, feasible-region encoding) or softly imposed (Couzinie et al., 2024, Kikuchi et al., 5 Jan 2026).
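The QUBO-to-Ising change of variables can be written out mechanically. A minimal sketch, assuming the same upper-triangular $Q$ convention; `J`, `h`, and `offset` are the resulting couplings, fields, and constant:

```python
import numpy as np

def qubo_to_ising(Q):
    """Map x^T Q x (upper-triangular Q, binary x) to an Ising energy
    E(s) = sum_{i<j} J_ij s_i s_j + sum_i h_i s_i + offset via x_i = (1 + s_i)/2."""
    Qu = np.triu(Q, k=1)              # off-diagonal couplings only
    diag = np.diag(Q)
    J = Qu / 4.0
    h = diag / 2.0 + (Qu.sum(axis=1) + Qu.sum(axis=0)) / 4.0
    offset = diag.sum() / 2.0 + Qu.sum() / 4.0
    return J, h, offset
```

Libraries such as dimod perform this conversion internally; the sketch only makes the algebra explicit.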
3. Iterative Optimization Loop and Data Management
The FMQA procedure is iterative, combining surrogate updates with annealing-based candidate proposals and true function evaluations. A typical workflow is as follows (Tamura et al., 24 Jul 2025, Nakano et al., 28 Jul 2025, Kikuchi et al., 5 Jan 2026):
- Initialization: Generate an initial set of random valid samples $\{x^{(d)}\}_{d=1}^{N_0}$, evaluate the black-box objective $y^{(d)}$ at each, and construct the initial dataset $\mathcal{D}_0$.
- Training: Fit the FM parameters $(w_0, w, V)$ to the current dataset $\mathcal{D}$.
- QUBO Extraction: Form the updated QUBO matrix $Q$ from the FM coefficients.
- Annealing Step: Solve QUBO (Ising/annealer) to generate low-cost candidate solutions.
- Evaluation & Data Update: Evaluate the black-box function at the new candidates; augment $\mathcal{D}$ with the new $(x, y)$ pairs.
- Dataset Management: Optionally, restrict $\mathcal{D}$ to the most recent samples to avoid dilution of new information—a limited-memory FIFO update improves convergence speed versus training on all data (Nakano et al., 28 Jul 2025).
- Convergence Check: Continue until a stopping criterion is met (maximum iterations, stagnation, or target attained).
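The steps above can be sketched end to end. This is a toy stand-in, not the authors' implementation: the FM fit is replaced by a least-squares quadratic regression and the annealer by exhaustive enumeration (feasible only for tiny bit counts), so only the loop structure is faithful:

```python
import itertools
import numpy as np

def fmqa_loop(blackbox, n_bits, n_init=5, n_iter=10, rng=None):
    """Skeleton of the FMQA loop with toy stand-ins for the FM and the annealer."""
    rng = rng or np.random.default_rng(0)
    pairs = list(itertools.combinations(range(n_bits), 2))

    def features(x):                  # quadratic feature map: [x_i], [x_i x_j], bias
        return np.concatenate([x, [x[i] * x[j] for i, j in pairs], [1.0]])

    # Initialization: random samples plus expensive evaluations.
    X = [rng.integers(0, 2, n_bits).astype(float) for _ in range(n_init)]
    y = [blackbox(x) for x in X]
    all_states = [np.array(b, dtype=float)
                  for b in itertools.product([0, 1], repeat=n_bits)]

    for _ in range(n_iter):
        # Training: fit a quadratic surrogate (stand-in for gradient-trained FM).
        A = np.array([features(x) for x in X])
        coef, *_ = np.linalg.lstsq(A, np.array(y), rcond=None)
        # Annealing step (stand-in): exhaustive minimization of the surrogate.
        cand = min(all_states, key=lambda s: features(s) @ coef)
        # Evaluation & data update.
        X.append(cand)
        y.append(blackbox(cand))
    best = int(np.argmin(y))
    return X[best], y[best]
```

In a real deployment the surrogate fit would be an FM of rank $K \ll N$ and the inner minimization a call to an annealing sampler; the black-box call count ($N_0$ plus one per iteration) is the quantity FMQA economizes.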
This loop is summarized in the following schematic table:
| Step | Action | Computational Aspect |
|---|---|---|
| Initialization | Generate, evaluate initial points | $N_0$ black-box calls |
| Training | FM fit via gradient-based optimizer | $O(NKD)$ per epoch |
| QUBO Extraction | Read off quadratic coefficients | $O(KN^2)$ |
| Annealing | Optimize surrogate on Ising/annealer | Annealing time per solve |
| Evaluation | Evaluate black-box at new solutions | Costly; main bottleneck |
| Data Update | Add new data (optionally FIFO/truncate) | $O(1)$ per step |
By carefully managing the dataset size, FMQA avoids stagnation arising from the diminishing influence of new data on the surrogate’s loss landscape (Nakano et al., 28 Jul 2025).
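The limited-memory update is simple to implement. A minimal sketch (class name `FifoDataset` is illustrative, not from the cited work):

```python
from collections import deque

class FifoDataset:
    """Limited-memory dataset: keeps only the most recent `maxlen` (x, y) pairs,
    so new evaluations retain influence on the surrogate's training loss."""
    def __init__(self, maxlen):
        self.pairs = deque(maxlen=maxlen)   # deque drops the oldest pair automatically

    def add(self, x, y):
        self.pairs.append((x, y))

    def training_set(self):
        xs, ys = zip(*self.pairs)
        return list(xs), list(ys)
```

Training the FM on `training_set()` each iteration realizes the FIFO scheme described above.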
4. Extensions: Initialization, Smoothing, and Higher-Order Interactions
Low-Rank Warm-Start Initialization
Systematic initialization of the FM parameters via a low-rank approximation of a known or approximated Ising model dramatically improves early convergence. Given a symmetric coupling matrix $J$, the procedure shifts $J \to J - \lambda_{\min} I$ (where $\lambda_{\min}$ is the smallest eigenvalue), truncates the eigendecomposition to rank $K$, and assigns the rows of $U_K \Lambda_K^{1/2}$ to the FM's factor vectors $v_i$. Random-matrix theory predicts the FM rank $K$ required for a target accuracy, and the method yields a surrogate whose error rapidly vanishes as $K$ grows (Seki et al., 2024).
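The shift-and-truncate construction can be sketched directly from an eigendecomposition. A minimal sketch (function name `warm_start_fm` is illustrative):

```python
import numpy as np

def warm_start_fm(J, K):
    """Initialize FM factor vectors from a rank-K approximation of a known
    symmetric coupling matrix J: shift by the smallest eigenvalue so all kept
    modes are nonnegative, then set v_i to the rows of U_K * sqrt(Lambda_K)."""
    lam, U = np.linalg.eigh(J)            # eigenvalues in ascending order
    shifted = lam - lam[0]                # spectrum of J - lambda_min * I  (>= 0)
    top = np.argsort(shifted)[::-1][:K]   # keep the K largest shifted modes
    V = U[:, top] * np.sqrt(shifted[top]) # rows of V become the FM factor vectors
    return V                              # V @ V.T approximates J - lambda_min * I
```

With $K = N$ the reconstruction is exact; smaller $K$ trades accuracy for a cheaper surrogate, which is the regime the random-matrix analysis quantifies.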
Function Smoothing for Continuous/Binary Encodings
When continuous variables are discretized to binary via one-hot encoding, FM training can leave "dead" features (bits never activated in the data) with random coefficients, introducing significant noise into the surrogate surface. Adding a function-smoothing regularizer that penalizes differences in $w_i$ and $v_i$ between neighboring bits restores smoothness and regularizes the surrogate.
This dramatically improves sample efficiency, especially in low-data regimes for parameter-fitting tasks (Endo et al., 2024).
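One concrete form such a regularizer can take (an assumption here; the exact penalty in Endo et al., 2024 may differ) is a squared-difference penalty over adjacent bits of the same discretized variable:

```python
import numpy as np

def smoothing_penalty(w, V, gamma=1.0):
    """Illustrative smoothing regularizer: penalize squared differences of
    linear weights w and latent rows of V between adjacent one-hot bits,
    so 'dead' bits inherit coefficients close to their neighbors'."""
    dw = np.diff(w)                 # w_{i+1} - w_i
    dV = np.diff(V, axis=0)         # v_{i+1} - v_i, row-wise
    return gamma * (np.sum(dw ** 2) + np.sum(dV ** 2))
```

Added to the squared-error loss, this term pulls unobserved bits toward their neighbors' coefficients instead of leaving them at random values.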
Higher-Order Interactions via Slack Variables
To efficiently capture higher-order (beyond quadratic) interactions, FMQA can be extended with auxiliary binary slack variables $z$, so the surrogate is trained and optimized over the concatenated input $(x, z)$. These variables, introduced as extra input features and optimized jointly via annealing, enhance the FM surrogate's expressiveness, effectively approximating higher-order terms while keeping the optimization quadratic. Empirical results confirm improved surrogate accuracy and data efficiency for moderate numbers of slack variables (Wang et al., 2 Jul 2025).
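The principle that auxiliary bits let a quadratic model express higher-order terms is classical. The Rosenberg reduction below illustrates it for a single cubic monomial (this gadget is offered as intuition; the slack-variable construction in Wang et al., 2 Jul 2025 may differ):

```python
def cubic_to_quadratic(x1, x2, x3, z, lam=2.0):
    """Rosenberg reduction: x1*x2*x3 == z*x3 whenever z == x1*x2.  The penalty
    lam * (x1*x2 - 2*z*(x1 + x2) + 3*z) is 0 iff z == x1*x2 and positive
    otherwise, so minimizing over z recovers the cubic term quadratically."""
    penalty = lam * (x1 * x2 - 2 * z * (x1 + x2) + 3 * z)
    return z * x3 + penalty
```

Minimizing over the slack bit $z$ reproduces the cubic monomial exactly, which is why a quadratic QUBO over $(x, z)$ can encode higher-order structure.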
5. Applications in Scientific Computing and Engineering
FMQA supports optimization in domains with expensive, black-box objective functions and high-dimensional, combinatorial search spaces:
- Materials discovery and crystal structure prediction: Encodes atomic arrangements as binary variables; efficiently samples ground and metastable states for Lennard-Jones, Stillinger–Weber, and EAM potentials with sub-linear numbers of energy evaluations. Empirically, FMQA finds ground states within tens of iterations where brute-force search would incur orders-of-magnitude higher cost, and the ranking of local minima by the FM surrogate's energy correlates well with the true energetics (high Kendall rank correlation) (Couzinie et al., 2024).
- Combinatorial genetic studies: High-order epistasis detection is recast as subset selection under MDR-based CER evaluation. FMQA identifies true epistatic sets of multiple loci within combinatorially large search spaces using only a modest number of expensive MDR calls, a reduction of many orders of magnitude compared to exhaustive search (Kikuchi et al., 5 Jan 2026).
- Nanophotonic and metasurface design: A binary VAE maps continuous designs to latent codes; the FM surrogate is trained over the latent space, rapidly proposing out-of-sample device designs that outperform the training-set best for efficiency and diffraction, gaining several percentage points after 30–50 iterations (Wilson et al., 2021).
- Drug combination effect prediction: Augmented FMQA with slack variables achieves higher accuracy and lower variance in high-sparsity dose-response matrix reconstruction and pair prediction (Wang et al., 2 Jul 2025).
- Hyperparameter tuning, circuit design, traffic signal optimization, molecule and peptide generation: FMQA’s applicability is broad, covering discrete or binary-encoded scientific and engineering optimization (Tamura et al., 24 Jul 2025).
Key performance metrics across domains include success rate for ground-truth identification, mean number of expensive evaluations required, solution ranking accuracy (Spearman/Kendall), and convergence to known minima.
6. Algorithmic Variants, Complexity, and Solver Landscape
The core FMQA loop is agnostic to the underlying annealing solver: quantum annealers, simulated annealing, digital annealers, and classical Ising machines are all supported. Solver-specific workflows involve:
- QUBO assembly: $O(KN^2)$ time to build $Q$ from the trained FM (pairwise inner products of $K$-dimensional latent vectors).
- Annealing/sampling: Hardware-dependent; D-Wave/Ocean samplers, Fixstars Amplify AE, Fujitsu Digital Annealer are typical (Tamura et al., 24 Jul 2025).
- FM training: $O(NKD)$ per epoch; efficiency scales with the latent dimension $K$ and the dataset size $D$.
- Data handling: Memory-limited updates (retaining only the most recent samples) improve convergence speed and reduce stagnation (Nakano et al., 28 Jul 2025).
Python packages such as fmqa, Amplify SDK (Amplify-BBOpt), D-Wave’s dimod, and PyTorch-based FM libraries are available for practical implementation; these enable domain experts to encode variables, initialize the loop, and interface with various annealing engines (Tamura et al., 24 Jul 2025).
7. Limitations and Open Directions
While FMQA achieves notable data efficiency and scales to very large binary variable spaces (up to hundreds of thousands of bits), several challenges remain:
- Surrogate accuracy: The ranking and smoothness of the FM's QUBO can be sensitive to training-data coverage and the rank $K$; smoothing regularization and low-rank initialization partially mitigate these effects (Seki et al., 2024, Endo et al., 2024). Dead features in high-dimensional or sparse encodings can still degrade surrogate quality.
- Higher-order interactions: The standard FM models only quadratic (pairwise) feature interactions. The addition of slack bits enables compact approximate higher-order modeling, at the expense of increased QUBO size and mixed-integer complexity (Wang et al., 2 Jul 2025).
- Annealer and solver limits: Physical quantum annealers and digital Ising machines have finite embedding capacities, imposing upper bounds on the problem size $N$; solution quality can depend on chain lengths (minor embedding), noise, and annealing-schedule specifics.
- Convergence guarantees: The nonconvexity of the surrogate landscape and finite QUBO solver fidelity can lead to suboptimal convergence; no formal convergence guarantees are provided for arbitrary black-box functions.
Continued advances are anticipated by integrating more powerful surrogate models (e.g., hybrid deep-factored FM architectures), further optimizing data selection and regularization, and scaling up hardware and constrained sampler capabilities. Extensions are likely in domains requiring efficient black-box optimization under combinatorial constraints, such as quantum device layout, network design, and synthetic biology (Tamura et al., 24 Jul 2025, Kikuchi et al., 5 Jan 2026).