EnzyControl: Adding Functional and Substrate-Specific Control for Enzyme Backbone Generation

Published 29 Oct 2025 in q-bio.BM and cs.LG | (2510.25132v1)

Abstract: Designing enzyme backbones with substrate-specific functionality is a critical challenge in computational protein engineering. Current generative models excel in protein design but face limitations in binding data, substrate-specific control, and flexibility for de novo enzyme backbone generation. To address this, we introduce EnzyBind, a dataset with 11,100 experimentally validated enzyme-substrate pairs specifically curated from PDBbind. Building on this, we propose EnzyControl, a method that enables functional and substrate-specific control in enzyme backbone generation. Our approach generates enzyme backbones conditioned on MSA-annotated catalytic sites and their corresponding substrates, which are automatically extracted from curated enzyme-substrate data. At the core of EnzyControl is EnzyAdapter, a lightweight, modular component integrated into a pretrained motif-scaffolding model, allowing it to become substrate-aware. A two-stage training paradigm further refines the model's ability to generate accurate and functional enzyme structures. Experiments show that our EnzyControl achieves the best performance across structural and functional metrics on EnzyBind and EnzyBench benchmarks, with particularly notable improvements of 13% in designability and 13% in catalytic efficiency compared to the baseline models. The code is released at https://github.com/Vecteur-libre/EnzyControl.

Summary

  • The paper presents EnzyControl, a framework for substrate-conditioned enzyme backbone generation that reports 13% improvements in both designability and catalytic efficiency over baseline models.
  • It combines a pretrained motif-scaffolding base network with EnzyAdapter, which injects substrate information through a cross-modal projector, and trains the result in two stages.
  • Benchmarks on EnzyBind and EnzyBench show improved catalytic efficiency and substrate specificity, with zero-shot generalization to unseen substrates and enzyme categories.

EnzyControl is a framework that adds functional and substrate-specific control to enzyme backbone generation, addressing a gap in current computational protein engineering methods. It generates enzyme structures tailored to a given substrate by building functional site conservation and substrate awareness into the generative process.

Introduction

Designing enzyme backbones with substrate-specific functionality is difficult because an enzyme must bind its substrate, preserve its functional sites, and adopt a catalytic conformation that is sensitive to small structural changes. General-purpose protein design methods often neglect these requirements, which limits their usefulness for enzymes. EnzyControl draws on EnzyBind, a curated dataset of 11,100 enzyme-substrate pairs. It uses multiple sequence alignments (MSA) to annotate functional sites and incorporates substrate information through a modular component called EnzyAdapter.

Figure 1: Dataset collection and preprocessing.
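The summary does not say how catalytic sites are derived from the MSA. As an illustration only, one common way to flag conserved positions is a per-column Shannon entropy score. The sketch below assumes that scoring rule, a hypothetical 0.9 threshold, and a toy alignment; none of these come from the paper.

```python
import math
from collections import Counter

def column_conservation(msa, gap="-"):
    """Return one conservation score in [0, 1] per alignment column.

    Score = 1 - H / log2(20), where H is the Shannon entropy (in bits) of the
    non-gap residues in the column. A score of 1.0 means fully conserved.
    This scoring rule is an assumption, not the paper's documented procedure.
    """
    if not msa or len({len(row) for row in msa}) != 1:
        raise ValueError("msa must be non-empty with equal-length rows")
    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for column in zip(*msa):
        residues = [c for c in column if c != gap]
        if not residues:
            scores.append(0.0)  # all-gap column carries no signal
            continue
        n = len(residues)
        entropy = -sum((k / n) * math.log2(k / n) for k in Counter(residues).values())
        scores.append(1.0 - entropy / max_entropy)
    return scores

def conserved_sites(msa, threshold=0.9):
    """Indices of columns whose conservation score reaches the (hypothetical) threshold."""
    return [i for i, s in enumerate(column_conservation(msa)) if s >= threshold]

# Toy alignment: columns 0, 1 and 3 are identical in every sequence.
toy_msa = ["HDSGK", "HDAGR", "HDSGE", "HD-GK"]
print(conserved_sites(toy_msa))
```

Positions flagged this way could serve as the motif that a motif-scaffolding model is asked to preserve.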

Methodology

EnzyControl Architecture

EnzyControl consists of three key components:

  1. Base Network: Pretrained for motif-scaffolding, integrating functional site conservation using MSA.
  2. EnzyAdapter: A modular addition that injects substrate information into the network, employing a cross-modal projector to bridge the modality gap between substrates and enzyme backbones.
  3. Two-Stage Training Strategy: The first stage aligns substrate features with enzyme structures while the base network parameters stay unchanged; the second stage fine-tunes the model using Low-Rank Adaptation (LoRA).

    Figure 2: EnzyControl is a flexible approach for the conditional backbone generation of enzymes.
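To make the components above concrete, here is a minimal NumPy sketch of how a substrate-aware adapter and a LoRA update could look. It rests on several assumptions not confirmed by the summary: the projector is a single linear map, substrate information enters through cross-attention with a residual connection, and the output projection starts at zero. All names (`init_adapter`, `adapter_forward`, `lora_linear`) and dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_adapter(d_sub, d_res, rng, scale=0.02):
    """Hypothetical adapter parameters. W_out starts at zero, so the adapter
    leaves the pretrained features untouched before training (an assumption)."""
    return {
        "W_proj": rng.normal(0, scale, (d_sub, d_res)),  # cross-modal projector
        "W_q": rng.normal(0, scale, (d_res, d_res)),
        "W_k": rng.normal(0, scale, (d_res, d_res)),
        "W_v": rng.normal(0, scale, (d_res, d_res)),
        "W_out": np.zeros((d_res, d_res)),
    }

def adapter_forward(res_feats, sub_feats, p):
    """res_feats: (L, d_res) residue features; sub_feats: (M, d_sub) substrate
    atom features. Returns (L, d_res) substrate-conditioned residue features."""
    sub = sub_feats @ p["W_proj"]                    # map substrate into residue space
    q, k, v = res_feats @ p["W_q"], sub @ p["W_k"], sub @ p["W_v"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))   # (L, M): residues attend to atoms
    return res_feats + (attn @ v) @ p["W_out"]       # residual injection

def lora_linear(x, W, A, B, alpha=16.0):
    """LoRA: y = x @ (W + (alpha / r) * A @ B), with W frozen and only
    A (d_in, r) and B (r, d_out) trained. B = 0 reproduces the base layer."""
    r = A.shape[1]
    return x @ W + (alpha / r) * (x @ A) @ B

rng = np.random.default_rng(0)
res = rng.normal(size=(50, 32))   # 50 residues, 32 features (toy sizes)
sub = rng.normal(size=(12, 8))    # 12 substrate atoms, 8 features (toy sizes)
params = init_adapter(d_sub=8, d_res=32, rng=rng)
out = adapter_forward(res, sub, params)
print(out.shape)
```

In this reading, stage one would update only the adapter parameters while the base weights stay fixed, and stage two would additionally train the low-rank factors `A` and `B` on selected base layers. Zero-initializing `W_out` and `B` means both stages start from the pretrained model's behavior.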

Flow Matching

EnzyControl uses Flow Matching (FM), a generative modeling technique that offers efficient and stable sampling. FM learns a vector field that describes how samples evolve between a noise distribution and the data distribution. Generating an enzyme backbone then amounts to solving an ordinary differential equation (ODE) that carries a noise sample to the data distribution.
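The idea can be sketched in a few lines for ordinary Euclidean vectors. This is a simplification: backbone generators typically work with residue frames (rotations and translations) rather than flat vectors, and the straight-line path, the convention that t = 0 is noise and t = 1 is data, and the Euler integrator are all assumptions made for illustration.

```python
import numpy as np

def interpolate(x0, x1, t):
    """Straight-line path x_t = (1 - t) * x0 + t * x1 (t=0 noise, t=1 data)."""
    return (1.0 - t) * x0 + t * x1

def cfm_target(x0, x1):
    """Velocity of the straight-line path: d x_t / dt = x1 - x0."""
    return x1 - x0

def cfm_loss(v_pred, x0, x1):
    """Mean squared error between a predicted velocity and the path velocity."""
    return float(np.mean(np.sum((v_pred - cfm_target(x0, x1)) ** 2, axis=-1)))

def euler_sample(v_field, x0, n_steps=50):
    """Integrate dx/dt = v_field(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * v_field(x, k * dt)
    return x

# Toy check: if every data point equals `target`, the exact marginal vector
# field is (target - x) / (1 - t), so sampling should land on `target`.
target = np.array([1.0, -2.0, 0.5])

def toy_field(x, t):
    return (target - x) / (1.0 - t)

rng = np.random.default_rng(0)
noise = rng.normal(size=(4, 3))
samples = euler_sample(toy_field, noise)
print(np.allclose(samples, target))
```

In a trained model, `toy_field` would be replaced by a neural network fitted with a loss like `cfm_loss` at randomly drawn times t, and conditioning signals such as the functional sites and the substrate would be passed to that network as extra inputs.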

Experimental Results

EnzyControl was benchmarked on EnzyBind and EnzyBench across multiple structural and functional metrics. Key findings:

  • Designability: A 13% improvement over baseline models, indicating closer agreement with native enzyme structures.
  • Functional Performance: Improvements of 13% in catalytic efficiency ($k_{\text{cat}}$) and 10% in EC match rate, showing that function is preserved during backbone generation.
  • Substrate Affinity: Enhanced binding affinity and substrate specificity scores compared to other models.

EnzyControl also exhibits zero-shot generalization, maintaining strong binding affinities on previously unseen substrates and enzyme categories.

Figure 3: Zero-shot generalization.

Case Study

A case study on enzyme 2cv3 found that EnzyControl-generated backbones achieve better substrate specificity and interaction characteristics than existing models such as RFDiffusion.

Figure 4: Comparison of docking results between EnzyControl and RFDiffusion on the 2cv3 enzyme.

Conclusion

EnzyControl integrates functional site conservation and substrate-specific control into enzyme backbone generation, extending current motif-scaffolding models. The framework improves both structural accuracy and functional relevance of generated backbones, with potential applications in pharmaceuticals, specialty chemicals, and biotechnology. Future directions include refining substrate-conformation modeling and supporting multi-substrate or multi-chain enzyme systems.
