FoldSAE: Learning to Steer Protein Folding Through Sparse Representations
Abstract: RFdiffusion is a popular and well-established model for generating protein structures. However, its generative process offers limited insight into its internal representations and how they contribute to the final protein structure. Concurrently, recent work in mechanistic interpretability has successfully used Sparse Autoencoders (SAEs) to discover interpretable features within neural networks. We combine these ideas by applying SAEs to the internal representations of RFdiffusion to uncover features specific to secondary structure and to establish a relationship between those features and the generated protein structures. Building on these insights, we introduce a novel steering mechanism that enables precise control of secondary structure formation through a tunable hyperparameter, while simultaneously revealing interpretable block-level and neuron-level representations within RFdiffusion. Our work pioneers a new framework for making RFdiffusion more interpretable, demonstrating how understanding internal features can be directly translated into precise control over the protein design process.
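The abstract does not say how the SAE or the steering mechanism is implemented, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a single-layer ReLU sparse autoencoder and assumes steering is done by shifting an activation along one feature's decoder direction, scaled by a tunable coefficient. The names `sae_encode`, `sae_decode`, `steer`, and `alpha` are hypothetical, and random arrays stand in for both the trained SAE weights and RFdiffusion's internal activations.

```python
import numpy as np

def sae_encode(x, W_enc, b_enc):
    """ReLU encoder: map activations (n, d) to non-negative features (n, k)."""
    return np.maximum(x @ W_enc + b_enc, 0.0)

def sae_decode(f, W_dec, b_dec):
    """Linear decoder: map features (n, k) back to activation space (n, d)."""
    return f @ W_dec + b_dec

def steer(x, W_dec, feature_idx, alpha):
    """Shift activations along one feature's unit-norm decoder direction.

    alpha is the tunable steering strength (an assumption of this sketch);
    alpha = 0 leaves the activations unchanged.
    """
    direction = W_dec[feature_idx]
    direction = direction / np.linalg.norm(direction)
    return x + alpha * direction

# Random stand-ins: NOT real RFdiffusion activations or trained SAE weights.
rng = np.random.default_rng(0)
d, k = 8, 32                      # activation width, number of SAE features
W_enc = rng.normal(size=(d, k))
b_enc = np.zeros(k)
W_dec = rng.normal(size=(k, d))
b_dec = np.zeros(d)
x = rng.normal(size=(4, d))       # four hypothetical activation vectors

features = sae_encode(x, W_enc, b_enc)
recon = sae_decode(features, W_dec, b_dec)
steered = steer(x, W_dec, feature_idx=3, alpha=2.0)
```

In a real setting, the feature index would be chosen from features found to correlate with a given secondary structure, and the steered activations would be passed on through the rest of the model in place of the originals.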