Assessing the Extrapolation Capability of Template-Free Retrosynthesis Models
Abstract: Template-free models are widely credited with exploring unseen reaction space better than template-based models for retrosynthesis prediction, yet how far they can actually extrapolate beyond the reactions they were trained on remains largely untested. In this study, we empirically assess the extrapolation capability of state-of-the-art template-free models on a large, carefully assembled set of out-of-distribution (OOD) reactions. We find that although template-free models can predict precursors that follow novel synthesis rules, their top-10 exact-match accuracy on OOD reactions is strikingly modest (< 1%). Moreover, more than half of the novel reactions these models predict are chemically implausible. We therefore advocate developing template-free models that account for chemical feasibility when navigating unexplored regions of reaction space.
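For readers unfamiliar with the metric, the following is a minimal sketch of how a top-k exact-match accuracy could be computed. It is not the authors' evaluation code. The function names `normalize_precursors` and `top_k_exact_match` are hypothetical, and the sketch assumes each precursor SMILES string has already been canonicalized by a cheminformatics toolkit, so only the order of the dot-separated components is normalized here.

```python
from typing import Callable, Sequence


def normalize_precursors(smiles: str) -> str:
    """Return an order-insensitive key for a dot-separated precursor set.

    Assumption: every component is already a canonical SMILES string.
    A real evaluation would canonicalize each molecule with a toolkit first.
    """
    return ".".join(sorted(part for part in smiles.split(".") if part))


def top_k_exact_match(
    predictions: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int = 10,
    normalize: Callable[[str], str] = normalize_precursors,
) -> float:
    """Fraction of targets whose reference precursors occur in the top-k predictions."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = 0
    for ranked, truth in zip(predictions, references):
        truth_key = normalize(truth)
        # A hit requires an exact string match after normalization.
        if any(normalize(candidate) == truth_key for candidate in ranked[:k]):
            hits += 1
    return hits / len(references)
```

Each entry of `predictions` is a ranked list of candidate precursor sets for one target product, and the matching entry of `references` is the recorded precursor set. Passing a different `normalize` callable is where toolkit-based canonicalization would plug in.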