
Towards a more inductive world for drug repurposing approaches

Published 21 Nov 2023 in cs.LG and q-bio.QM | (2311.12670v2)

Abstract: Drug-target interaction (DTI) prediction is a challenging, albeit essential, task in drug repurposing. Graph-based learning models have drawn special attention because they can significantly reduce the cost and time of drug repurposing. However, many current approaches require demanding additional information besides DTIs, which complicates their evaluation and limits their usability. Additionally, structural differences in the learning architectures of current models hinder fair benchmarking. In this work, we first perform an in-depth evaluation of current DTI datasets and prediction models through a robust benchmarking process, and show that DTI prediction methods based on transductive models lack generalization and yield inflated performance when evaluated as previously done in the literature, which makes them unsuitable for drug repurposing. We then propose a novel biologically-driven strategy for negative edge subsampling and show through in vitro validation that newly discovered interactions are indeed true. We envision this work as the underpinning for future fair benchmarking and robust model design. All generated resources and tools are publicly available as a Python package.
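The contrast between transductive and inductive evaluation mentioned in the abstract can be illustrated with a small sketch. This is not the paper's benchmarking code or its Python package: the function names, the toy drug and target identifiers, and the split fractions are all assumptions made for illustration. The first function splits interaction pairs at random, so a test drug usually also appears in training. The second holds out whole drugs, so every test drug is unseen during training.

```python
import random

# Toy DTI pairs (drug, target); identifiers are hypothetical placeholders.
pairs = [
    ("D1", "T1"), ("D1", "T2"), ("D2", "T1"), ("D2", "T3"),
    ("D3", "T2"), ("D3", "T4"), ("D4", "T1"), ("D4", "T4"),
    ("D5", "T3"), ("D5", "T5"),
]


def random_edge_split(pairs, test_frac=0.2, seed=0):
    """Split interaction pairs at random (transductive-style setting).

    Drugs and targets in the test pairs will typically also occur in the
    training pairs, which can make evaluation look easier than it is.
    """
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]


def drug_holdout_split(pairs, test_frac=0.2, seed=0):
    """Hold out whole drugs (inductive-style setting).

    Every pair involving a held-out drug goes to the test set, so no test
    drug is seen during training.
    """
    rng = random.Random(seed)
    drugs = sorted({d for d, _ in pairs})
    rng.shuffle(drugs)
    n_test = max(1, int(len(drugs) * test_frac))
    test_drugs = set(drugs[:n_test])
    train = [(d, t) for d, t in pairs if d not in test_drugs]
    test = [(d, t) for d, t in pairs if d in test_drugs]
    return train, test


edge_train, edge_test = random_edge_split(pairs)
train, test = drug_holdout_split(pairs)

# Drugs shared between train and test under each scheme.
edge_overlap = {d for d, _ in edge_train} & {d for d, _ in edge_test}
holdout_overlap = {d for d, _ in train} & {d for d, _ in test}
print("random edge split, shared drugs:", sorted(edge_overlap))
print("drug holdout split, shared drugs:", sorted(holdout_overlap))
```

In the drug holdout split the set of shared drugs is empty by construction, which is the property an inductive evaluation relies on. The same idea can be applied to targets, or to both drugs and targets at once.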

