Papers
Topics
Authors
Recent
Search
2000 character limit reached

Empowering Biomedical Discovery with AI Agents

Published 3 Apr 2024 in cs.AI | (2404.02831v2)

Abstract: We envision "AI scientists" as systems capable of skeptical learning and reasoning that empower biomedical research through collaborative agents that integrate AI models and biomedical tools with experimental platforms. Rather than taking humans out of the discovery process, biomedical AI agents combine human creativity and expertise with AI's ability to analyze large datasets, navigate hypothesis spaces, and execute repetitive tasks. AI agents are poised to be proficient in various tasks, planning discovery workflows and performing self-assessment to identify and mitigate gaps in their knowledge. These agents use LLMs and generative models to feature structured memory for continual learning and use machine learning tools to incorporate scientific knowledge, biological principles, and theories. AI agents can impact areas ranging from virtual cell simulation, programmable control of phenotypes, and the design of cellular circuits to developing new therapies.

Definition Search Book Streamline Icon: https://streamlinehq.com
References (207)
  1. Hiroaki Kitano. Nobel turing challenge: creating the engine for scientific discovery. NPJ systems biology and applications, 7(1):29, 2021.
  2. Autonomous chemical research with large language models. Nature, 624(7992):570–578, 2023.
  3. Augmenting large language models with chemistry tools. In NeurIPS 2023 AI for Science Workshop, 2023. URL https://openreview.net/forum?id=wdGIL6lx3l.
  4. The rise and potential of large language model based agents: A survey. arXiv preprint arXiv:2309.07864, 2023.
  5. Large language model based multi-agents: A survey of progress and challenges. arXiv preprint arXiv:2402.01680, 2024.
  6. Scientific discovery in the age of artificial intelligence. Nature, 620(7972):47–60, 2023a.
  7. Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023.
  8. Gemini: a family of highly capable multimodal models. arXiv preprint arXiv:2312.11805, 2023.
  9. Improving language understanding by generative pre-training. 2018.
  10. Chatgpt for robotics: Design principles and model abilities. Microsoft Auton. Syst. Robot. Res, 2:20, 2023.
  11. Significant Gravitas. Autogpt, 2023. URL https://agpt.co.
  12. React: Synergizing reasoning and acting in language models. In The Eleventh International Conference on Learning Representations, 2023a. URL https://openreview.net/forum?id=WE_vluYUL-X.
  13. Reflexion: Language agents with verbal reinforcement learning. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
  14. Autogen: Enabling next-gen llm applications via multi-agent conversation framework. arXiv preprint arXiv:2308.08155, 2023a.
  15. Progprompt: Generating situated robot task plans using large language models. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 11523–11530, 2023. doi: 10.1109/ICRA48891.2023.10161317.
  16. Language models as zero-shot planners: Extracting actionable knowledge for embodied agents. In International Conference on Machine Learning, pages 9118–9147. PMLR, 2022a.
  17. On scientific understanding with artificial intelligence. Nature Reviews Physics, 4(12):761–769, 2022.
  18. Trustllm: Trustworthiness in large language models. arXiv:2401.05561, 2024.
  19. Understanding catastrophic forgetting in language models via implicit inference. arXiv:2309.10105, 2023.
  20. Ethics of large language models in medicine and medical research. The Lancet Digital Health, 5(6):e333–e335, 2023a.
  21. Unreliable llm bioethics assistants: Ethical and pedagogical risks. The American Journal of Bioethics, 23(10):89–91, 2023.
  22. The ethics of interaction: Mitigating security threats in llms. arXiv:2401.12273, 2024.
  23. Causal protein-signaling networks derived from multiparameter single-cell data. Science, 308(5721):523–529, 2005.
  24. Msa transformer. In International Conference on Machine Learning, pages 8844–8856. PMLR, 2021.
  25. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379(6637):1123–1130, 2023.
  26. Accurate prediction of protein structures and interactions using a three-track neural network. Science, 373(6557):871–876, 2021.
  27. Predicting the sequence specificities of dna-and rna-binding proteins by deep learning. Nature biotechnology, 33(8):831–838, 2015.
  28. Transfer learning enables predictions in network biology. Nature, pages 1–9, 2023.
  29. Translation of genotype to phenotype by a hierarchy of cell subsystems. Cell systems, 2(2):77–88, 2016.
  30. Gene expression correlates of clinical prostate cancer behavior. Cancer cell, 1(2):203–209, 2002.
  31. Diffuse large b-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nature medicine, 8(1):68–74, 2002.
  32. Predicting drug response and synergy using a deep learning model of human cancer cells. Cancer cell, 38(5):672–684, 2020.
  33. A small-molecule tnik inhibitor targets fibrosis in preclinical and clinical models. Nature Biotechnology, March 2024. ISSN 1087-0156, 1546-1696. doi: 10.1038/s41587-024-02143-0. URL https://www.nature.com/articles/s41587-024-02143-0.
  34. A deep learning approach to antibiotic discovery. Cell, 180(4):688–702, 2020.
  35. The protein data bank. Nucleic acids research, 28(1):235–242, 2000.
  36. 1000 Genomes Project Consortium et al. An integrated map of genetic variation from 1,092 human genomes. Nature, 491(7422):56, 2012.
  37. Drugbank: a comprehensive resource for in silico drug discovery and exploration. Nucleic acids research, 34(suppl_1):D668–D672, 2006.
  38. Alphafold protein structure database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic acids research, 50(D1):D439–D444, 2022.
  39. Highly accurate protein structure prediction with alphafold. Nature, 596(7873):583–589, 2021.
  40. The anatomy of a large-scale hypertextual web search engine. Computer networks and ISDN systems, 30(1-7):107–117, 1998.
  41. Basic local alignment search tool. Journal of molecular biology, 215(3):403–410, 1990.
  42. The chembl database in 2017. Nucleic acids research, 45(D1):D945–D954, 2017.
  43. Fast and accurate protein structure search with foldseek. Nature Biotechnology, pages 1–4, 2023.
  44. Siren’s song in the ai ocean: a survey on hallucination in large language models. arXiv preprint arXiv:2309.01219, 2023a.
  45. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25, 2012.
  46. Deep learning. MIT press, 2016.
  47. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
  48. Attention is all you need. Advances in neural information processing systems, 30, 2017.
  49. Multi-fidelity active learning with GFlownets, 2024. URL https://openreview.net/forum?id=3QR230r11w.
  50. Training language models to follow instructions with human feedback. In Alice H. Oh, Alekh Agarwal, Danielle Belgrave, and Kyunghyun Cho, editors, Advances in Neural Information Processing Systems, 2022. URL https://openreview.net/forum?id=TG8KACxEON.
  51. Deep learning enables rapid identification of potent ddr1 kinase inhibitors. Nature biotechnology, 37(9):1038–1040, 2019.
  52. Adaptive machine learning for protein engineering. Current Opinion in Structural Biology, 72:145–152, February 2022. ISSN 0959440X. doi: 10.1016/j.sbi.2021.11.002. URL https://linkinghub.elsevier.com/retrieve/pii/S0959440X21001457.
  53. Top-down design of protein architectures with reinforcement learning. Science, 380(6642):266–273, April 2023. ISSN 0036-8075, 1095-9203. doi: 10.1126/science.adf6591.
  54. Deep batch active learning for drug discovery. bioRxiv, pages 2023–07, 2023.
  55. Evidential deep learning for guided molecular property prediction and discovery. ACS Central Science, 7(8):1356–1367, August 2021. ISSN 2374-7943, 2374-7951. doi: 10.1021/acscentsci.1c00546.
  56. Active learning for optimal intervention design in causal models. Nature Machine Intelligence, 5(10):1066–1075, October 2023b. ISSN 2522-5839. doi: 10.1038/s42256-023-00719-0.
  57. Optimizing risk-based breast cancer screening policies with reinforcement learning. Nature Medicine, 28(1):136–143, January 2022. ISSN 1078-8956, 1546-170X. doi: 10.1038/s41591-021-01599-w. URL https://www.nature.com/articles/s41591-021-01599-w.
  58. Cognitive architectures for language agents. Transactions on Machine Learning Research, 2024. ISSN 2835-8856. URL https://openreview.net/forum?id=1i6ZCvflQJ. Survey Certification.
  59. A survey on large language model based autonomous agents. Frontiers of Computer Science, 18(6):1–26, 2024.
  60. Finetuned language models are zero-shot learners. In International Conference on Learning Representations, 2022a. URL https://openreview.net/forum?id=gEZrGCozdqR.
  61. A visual–language foundation model for pathology image analysis using medical twitter. Nature medicine, 29(9):2307–2316, 2023a.
  62. Can generalist foundation models outcompete special-purpose tuning? case study in medicine. arXiv preprint arXiv:2311.16452, 2023.
  63. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, pages 1–22, 2023.
  64. Biogpt: generative pre-trained transformer for biomedical text generation and mining. Briefings in Bioinformatics, 23(6):bbac409, 2022.
  65. Health system-scale language models are all-purpose prediction engines. Nature, pages 1–6, 2023.
  66. Large language models encode clinical knowledge. Nature, 620(7972):172–180, 2023a.
  67. Towards expert-level medical question answering with large language models. arXiv preprint arXiv:2305.09617, 2023b.
  68. Language models are few-shot learners. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 1877–1901. Curran Associates, Inc., 2020.
  69. Voyager: An open-ended embodied agent with large language models. In Intrinsically-Motivated and Open-Ended Learning Workshop @NeurIPS2023, 2023b. URL https://openreview.net/forum?id=nfx5IutEed.
  70. Promptbreeder: Self-referential self-improvement via prompt evolution, 2024. URL https://openreview.net/forum?id=HKkiX32Zw1.
  71. Large language models as optimizers. arXiv preprint arXiv:2309.03409, 2023.
  72. Yann LeCun. A path towards autonomous machine intelligence. Open Review, 62(1), 2022.
  73. Can large language models provide useful feedback on research papers? a large-scale empirical analysis. arXiv preprint arXiv:2310.01783, 2023a.
  74. Reconcile: Round-table conference improves reasoning via consensus among diverse llms. arXiv preprint arXiv:2309.13007, 2023.
  75. Biological research and self-driving labs in deep space supported by artificial intelligence. Nature Machine Intelligence, 5(3):208–219, 2023.
  76. Advancing mathematics by guiding human intuition with ai. Nature, 600(7887):70–74, 2021.
  77. A brief history of the hypothesis. Cell, 134(3):378–381, 2008.
  78. In silico protein interaction screening uncovers donson’s role in replication initiation. Science, 381(6664):eadi3448, 2023.
  79. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35:24824–24837, 2022b.
  80. Automated bioinformatics analysis via autoba. arXiv preprint arXiv:2309.03242, 2023a.
  81. Medagents: Large language models as collaborators for zero-shot medical reasoning. arXiv preprint arXiv:2311.10537, 2023.
  82. De novo drug design using reinforcement learning with multiple gpt agents. In Thirty-seventh Conference on Neural Information Processing Systems, 2023a.
  83. Levels of agi: Operationalizing progress on the path to agi. arXiv preprint arXiv:2311.02462, 2023.
  84. Dual use of artificial-intelligence-powered drug discovery. Nature Machine Intelligence, 4(3):189–191, March 2022. ISSN 2522-5839. doi: 10.1038/s42256-022-00465-9. URL https://www.nature.com/articles/s42256-022-00465-9.
  85. Prioritizing safeguarding over autonomy: Risks of llm agents for science. arXiv preprint arXiv:2402.04247, 2024.
  86. Protein design meets biosecurity, 2024.
  87. A tutorial on conducting genome-wide association studies: Quality control and statistical analysis. Int. J. Methods Psychiatr. Res., 27(2):e1608, June 2018.
  88. Genome-wide association studies. Nature Reviews Methods Primers, 1(1):59, Aug 2021. ISSN 2662-8449. doi: 10.1038/s43586-021-00056-9. URL https://doi.org/10.1038/s43586-021-00056-9.
  89. Felix W Frueh. Real-world clinical effectiveness, regulatory transparency and payer coverage: three ingredients for translating pharmacogenomics into clinical practice. Pharmacogenomics, 11(5):657–660, May 2010.
  90. CP Panayiotopoulos. The Epilepsies: Seizures, Syndromes and Management. Bladon Medical Publishing, Oxfordshire, UK, 2005.
  91. International League Against Epilepsy Consortium on Complex Epilepsies. Gwas meta-analysis of over 29,000 people with epilepsy identifies 26 risk loci and subtype-specific genetic architecture. Nature Communications, 9(1):1–12, 2018.
  92. Uk biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS medicine, 12(3):e1001779, 2015.
  93. Identification of potential disease-associated variants in idiopathic generalized epilepsy using targeted sequencing. Journal of Human Genetics, 69(2):59–67, Feb 2024. ISSN 1435-232X. doi: 10.1038/s10038-023-01208-3. URL https://doi.org/10.1038/s10038-023-01208-3.
  94. Genes4Epilepsy: An epilepsy gene resource. Epilepsia, 64(5):1368–1375, May 2023.
  95. Recruitment strategies and lessons learned from a large genetic study of African Americans. PLOS Glob. Public Health, 2(8):e0000416, August 2022.
  96. Brahim Aissani. Confounding by linkage disequilibrium. Journal of Human Genetics, 59(2):110–115, 2014. ISSN 1435-232X. doi: 10.1038/jhg.2013.130. URL https://doi.org/10.1038/jhg.2013.130.
  97. The human cell atlas. eLife, 6:e27041, December 2017. ISSN 2050-084X. doi: 10.7554/eLife.27041. URL https://elifesciences.org/articles/27041.
  98. A next generation connectivity map: L1000 platform and the first 1,000,000 profiles. Cell, 171(6):1437–1452.e17, November 2017. ISSN 00928674. doi: 10.1016/j.cell.2017.10.049. URL https://linkinghub.elsevier.com/retrieve/pii/S0092867417313090.
  99. A proteome-wide atlas of drug mechanism of action. Nature Biotechnology, 41(6):845–857, June 2023. ISSN 1087-0156, 1546-1696. doi: 10.1038/s41587-022-01539-0. URL https://www.nature.com/articles/s41587-022-01539-0.
  100. Next-generation characterization of the cancer cell line encyclopedia. Nature, 569(7757):503–508, May 2019. ISSN 0028-0836, 1476-4687. doi: 10.1038/s41586-019-1186-3. URL https://www.nature.com/articles/s41586-019-1186-3.
  101. Three million images and morphological profiles of cells treated with matched chemical and genetic perturbations. Biorxiv, pages 2022–01, 2022.
  102. Rxrx3: Phenomics map of biology. bioRxiv, pages 2023–02, 2023.
  103. Convolutional networks for supervised mining of molecular patterns within cellular context. Nature Methods, 20(2):284–294, February 2023. ISSN 1548-7091, 1548-7105. doi: 10.1038/s41592-022-01746-2. URL https://www.nature.com/articles/s41592-022-01746-2.
  104. Serial lift-out: sampling the molecular anatomy of whole organisms. Nature Methods, pages 1–9, 2023.
  105. Emma Lundberg and Georg H. H. Borner. Spatial proteomics: a powerful discovery tool for cell biology. Nature Reviews Molecular Cell Biology, 20(5):285–302, May 2019. ISSN 1471-0072, 1471-0080. doi: 10.1038/s41580-018-0094-y. URL https://www.nature.com/articles/s41580-018-0094-y.
  106. Opencell: Endogenous tagging for the cartography of human cellular organization. Science, 375(6585):eabi6983, March 2022. ISSN 0036-8075, 1095-9203. doi: 10.1126/science.abi6983. URL https://www.science.org/doi/10.1126/science.abi6983.
  107. Building the next generation of virtual cells to understand cellular biology. Biophysical Journal, 122(18):3560–3569, September 2023. ISSN 00063495. doi: 10.1016/j.bpj.2023.04.006. URL https://linkinghub.elsevier.com/retrieve/pii/S0006349523002369.
  108. Contextualizing protein representations using deep learning on protein networks and single-cell data. Nature Methods, 2024.
  109. Digital pipette: open hardware for liquid transfer in self-driving laboratories. Digital Discovery, 2(6):1745–1751, 2023. doi: 10.1039/D3DD00115F. URL https://pubs.rsc.org/en/content/articlelanding/2023/dd/d3dd00115f.
  110. Systematic profiling of conditional pathway activation identifies context-dependent synthetic lethalities. Nature Genetics, 55(10):1709–1720, October 2023. ISSN 1061-4036, 1546-1718. doi: 10.1038/s41588-023-01515-7. URL https://www.nature.com/articles/s41588-023-01515-7.
  111. Slide-tags enables single-nucleus barcoding for multimodal spatial genomics. Nature, 625(7993):101–109, January 2024. ISSN 0028-0836, 1476-4687. doi: 10.1038/s41586-023-06837-4. URL https://www.nature.com/articles/s41586-023-06837-4.
  112. Proximity extension assay in combination with next-generation sequencing for high-throughput proteome-wide analysis. Molecular and Cellular Proteomics, 20:100168, 2021. ISSN 15359476. doi: 10.1016/j.mcpro.2021.100168. URL https://linkinghub.elsevier.com/retrieve/pii/S1535947621001407.
  113. Spatial single-cell mass spectrometry defines zonation of the hepatocyte proteome. Nature Methods, 20(10):1530–1536, October 2023. ISSN 1548-7091, 1548-7105. doi: 10.1038/s41592-023-02007-6. URL https://www.nature.com/articles/s41592-023-02007-6.
  114. High-plex protein and whole transcriptome co-mapping at cellular resolution with spatial cite-seq. Nature Biotechnology, 41(10):1405–1409, October 2023a. ISSN 1087-0156, 1546-1696. doi: 10.1038/s41587-023-01676-0. URL https://www.nature.com/articles/s41587-023-01676-0.
  115. Perturb-seq: Dissecting molecular circuits with scalable single-cell rna profiling of pooled genetic screens. Cell, 167(7):1853–1866.e17, December 2016. ISSN 00928674. doi: 10.1016/j.cell.2016.11.038. URL https://linkinghub.elsevier.com/retrieve/pii/S0092867416316105.
  116. Simultaneous crispr screening and spatial transcriptomics reveals intracellular, intercellular, and functional transcriptional circuits. Biorxiv, 2023.
  117. Diffdock: Diffusion steps, twists, and turns for molecular docking. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=kKF8_K-mBbS.
  118. Robust deep learning–based protein sequence design using proteinmpnn. Science, 378(6615):49–56, 2022.
  119. De novo design of protein structure and function with rfdiffusion. Nature, 620(7976):1089–1100, 2023.
  120. Discovery of a structural class of antibiotics with explainable deep learning. Nature, pages 1–9, 2023.
  121. Scaling deep learning for materials discovery. Nature, pages 1–6, 2023.
  122. Drugging the’undruggable’cancer targets. Nature Reviews Cancer, 17(8):502–508, 2017.
  123. Mitochondrial fragmentation drives selective removal of deleterious mtdna in the germline. Nature, 570(7761):380–384, 2019.
  124. Camel: Communicative agents for" mind" exploration of large language model society. In Thirty-seventh Conference on Neural Information Processing Systems, 2023b.
  125. Visual instruction tuning. In Thirty-seventh Conference on Neural Information Processing Systems, 2023b. URL https://openreview.net/forum?id=w0H2xGHlkw.
  126. Video-LLaMA: An instruction-tuned audio-visual language model for video understanding. In Yansong Feng and Els Lefever, editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 543–553, Singapore, December 2023c. Association for Computational Linguistics. doi: 10.18653/v1/2023.emnlp-demo.49. URL https://aclanthology.org/2023.emnlp-demo.49.
  127. Deep learning tools for the measurement of animal behavior in neuroscience. Current Opinion in Neurobiology, 60:1–11, February 2020. ISSN 09594388. doi: 10.1016/j.conb.2019.10.008. URL https://linkinghub.elsevier.com/retrieve/pii/S0959438819301151.
  128. Live-imaging of astrocyte morphogenesis and function in zebrafish neural circuits. Nature Neuroscience, 23(10):1297–1306, October 2020. ISSN 1097-6256, 1546-1726. doi: 10.1038/s41593-020-0703-x. URL https://www.nature.com/articles/s41593-020-0703-x.
  129. PaLM-e: An embodied multimodal language model. In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett, editors, Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 8469–8488. PMLR, 23–29 Jul 2023. URL https://proceedings.mlr.press/v202/driess23a.html.
  130. Proteome-wide mapping of short-lived proteins in human cells. Molecular cell, 81(22):4722–4735, 2021.
  131. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021.
  132. MiniGPT-4: Enhancing vision-language understanding with advanced large language models. In The Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=1tZbq88f27.
  133. Flamingo: a visual language model for few-shot learning. Advances in Neural Information Processing Systems, 35:23716–23736, 2022.
  134. InstructBLIP: Towards general-purpose vision-language models with instruction tuning. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=vvoWPYqZJA.
  135. Next-gpt: Any-to-any multimodal llm. arXiv preprint arXiv:2309.05519, 2023b.
  136. Introducing our multimodal models, 2023. URL https://www.adept.ai/blog/fuyu-8b.
  137. Revisiting human-agent communication: The importance of joint co-construction and understanding mental states. Frontiers in Psychology, 12:580955, 2021.
  138. Inner monologue: Embodied reasoning through planning with language models. In 6th Annual Conference on Robot Learning, 2022b. URL https://openreview.net/forum?id=3R3Pz5i0tye.
  139. Direct preference optimization: Your language model is secretly a reward model. arXiv preprint arXiv:2305.18290, 2023.
  140. Self-adaptive large language model (llm)-based multiagent systems. In 2023 IEEE International Conference on Autonomic Computing and Self-Organizing Systems Companion (ACSOS-C), pages 104–109. IEEE, 2023.
  141. Multi-agent actor-critic for mixed cooperative-competitive environments. Advances in neural information processing systems, 30, 2017.
  142. MetaGPT: Meta programming for multi-agent collaborative framework. In The Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=VtmBAGCN7o.
  143. Building cooperative embodied agents modularly with large language models. In The Twelfth International Conference on Learning Representations, 2024a. URL https://openreview.net/forum?id=EnXJfQqy0K.
  144. Encouraging divergent thinking in large language models through multi-agent debate. arXiv preprint arXiv:2305.19118, 2023b.
  145. Improving language model negotiation with self-play and in-context learning from ai feedback. arXiv preprint arXiv:2305.10142, 2023.
  146. Roco: Dialectic multi-robot collaboration with large language models. arXiv preprint arXiv:2307.04738, 2023.
  147. Can language models teach weaker agents? teacher explanations improve students via theory of mind. arXiv preprint arXiv:2306.09299, 2023.
  148. Epidemic modeling with generative agents. arXiv preprint arXiv:2307.04986, 2023.
  149. Social simulacra: Creating populated prototypes for social computing systems. In Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology, pages 1–18, 2022.
  150. Talm: Tool augmented language models. arXiv preprint arXiv:2205.12255, 2022.
  151. Toolformer: Language models can teach themselves to use tools. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=Yacmpz84TH.
  152. Webgpt: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332, 2021.
  153. HuggingGPT: Solving AI tasks with chatGPT and its friends in hugging face. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=yHdTscY6Ci.
  154. Chatdb: Augmenting llms with databases as their symbolic memory. arXiv preprint arXiv:2306.03901, 2023b.
  155. A robotic platform for flow synthesis of organic compounds informed by ai planning. Science, 365(6453):eaax1566, 2019.
  156. Do as i can, not as i say: Grounding language in robotic affordances. arXiv preprint arXiv:2204.01691, 2022.
  157. Zero-shot text-to-image generation. In International Conference on Machine Learning, pages 8821–8831. PMLR, 2021.
  158. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=nZeVKeeFYf9.
  159. Communicative agents for software development. arXiv preprint arXiv:2307.07924, 2023.
  160. Llm as dba. arXiv preprint arXiv:2308.05481, 2023b.
  161. Ghost in the minecraft: Generally capable agents for open-world enviroments via large language models with text-based knowledge and memory. arXiv preprint arXiv:2305.17144, 2023.
  162. Text and code embeddings by contrastive pre-training. arXiv preprint arXiv:2201.10005, 2022.
  163. Memorybank: Enhancing large language models with long-term memory. Proceedings of the AAAI Conference on Artificial Intelligence, 38(17):19724–19731, Mar. 2024a. doi: 10.1609/aaai.v38i17.29946. URL https://ojs.aaai.org/index.php/AAAI/article/view/29946.
  164. QLoRA: Efficient finetuning of quantized LLMs. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=OUIFPHEgJU.
  165. Locating and editing factual associations in gpt. In Advances in Neural Information Processing Systems, 2022.
  166. A comprehensive study of knowledge editing for large language models. arXiv preprint arXiv:2401.01286, 2024b.
  167. Sayplan: Grounding large language models using 3d scene graphs for scalable robot task planning. In 7th Annual Conference on Robot Learning, 2023. URL https://openreview.net/forum?id=wMpOMO0Ss7a.
  168. Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality. See https://vicuna. lmsys. org (accessed 14 April 2023), 2023.
  169. Large language models with controllable working memory. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, editors, Findings of the Association for Computational Linguistics: ACL 2023, pages 1774–1793, Toronto, Canada, July 2023c. Association for Computational Linguistics. doi: 10.18653/v1/2023.findings-acl.112. URL https://aclanthology.org/2023.findings-acl.112.
  170. Large language models are zero-shot reasoners. Advances in neural information processing systems, 35:22199–22213, 2022.
  171. Llm+ p: Empowering large language models with optimal planning proficiency. arXiv preprint arXiv:2304.11477, 2023c.
  172. Dynamic planning with a llm. arXiv preprint arXiv:2308.06391, 2023.
  173. Igniting language intelligence: The hitchhiker’s guide from chain-of-thought reasoning to language agents. arXiv preprint arXiv:2311.11797, 2023d.
  174. Let’s think outside the box: Exploring leap-of-thought in large language models with creative humor generation. In The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2024b.
  175. Planning with large language models via corrective re-prompting. In NeurIPS 2022 Foundation Models for Decision Making Workshop, 2022.
  176. Tree of thoughts: Deliberate problem solving with large language models. In Thirty-seventh Conference on Neural Information Processing Systems, 2023b. URL https://openreview.net/forum?id=5Xc1ecxO1h.
  177. Recmind: Large language model powered agent for recommendation. arXiv preprint arXiv:2308.14296, 2023c.
  178. Least-to-most prompting enables complex reasoning in large language models. In The Eleventh International Conference on Learning Representations, 2023c. URL https://openreview.net/forum?id=WZH7099tgfM.
  179. Self-consistency improves chain of thought reasoning in language models. In The Eleventh International Conference on Learning Representations, 2023d. URL https://openreview.net/forum?id=1PL1NIMMrw.
  180. Graph of thoughts: Solving elaborate problems with large language models. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 17682–17690, 2024.
  181. Reasoning with language model is planning with world model. In The 2023 Conference on Empirical Methods in Natural Language Processing, 2023. URL https://openreview.net/forum?id=VTWWvYtF1R.
  182. Self-refine: Iterative refinement with self-feedback. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=S37hOerQLB.
  183. Llm-planner: Few-shot grounded planning for embodied agents with large language models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2998–3009, 2023.
  184. Teaching large language models to self-debug. In The Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=KuPixIqPiq.
  185. Embers of autoregression: Understanding large language models through the problem they are trained to solve. arXiv preprint arXiv:2309.13638, 2023.
  186. Reasoning or reciting? exploring the capabilities and limitations of language models through counterfactual tasks. arXiv preprint arXiv:2307.02477, 2023c.
  187. Show your work: Scratchpads for intermediate computation with language models. arXiv preprint arXiv:2112.00114, 2021.
  188. Quantifying uncertainty in answers from any language model via intrinsic and extrinsic confidence assessment. arXiv preprint arXiv:2308.16175, 2023.
  189. Just ask for calibration: Strategies for eliciting calibrated confidence scores from language models fine-tuned with human feedback. In The 2023 Conference on Empirical Methods in Natural Language Processing, 2023. URL https://openreview.net/forum?id=g3faCfrwm7.
  190. Semantic uncertainty: Linguistic invariances for uncertainty estimation in natural language generation. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=VD-AYtP0dve.
  191. A tutorial on conformal prediction. Journal of Machine Learning Research, 9(3), 2008a.
  192. A tutorial on conformal prediction. Journal of Machine Learning Research, 9(12):371–421, 2008b. URL http://jmlr.org/papers/v9/shafer08a.html.
  193. Discovering language model behaviors with model-written evaluations. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, editors, Findings of the Association for Computational Linguistics: ACL 2023, pages 13387–13434, Toronto, Canada, July 2023. Association for Computational Linguistics. doi: 10.18653/v1/2023.findings-acl.847. URL https://aclanthology.org/2023.findings-acl.847.
  194. Fine-tuning aligned language models compromises safety, even when users do not intend to! In The Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=hTEGyKf0dZ.
  195. Jailbroken: How does LLM safety training fail? In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=jA235JGM09.
  196. Holistic evaluation of language models. Annals of the New York Academy of Sciences, 1525(1):140–146, 2023. doi: https://doi.org/10.1111/nyas.15007. URL https://nyaspubs.onlinelibrary.wiley.com/doi/abs/10.1111/nyas.15007.
  197. GAIA: a benchmark for general AI assistants. In The Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=fibxvahvs3.
  198. Beyond the imitation game: Quantifying and extrapolating the capabilities of language models. Transactions on Machine Learning Research, 2023. ISSN 2835-8856. URL https://openreview.net/forum?id=uyTL5Bvosj.
  199. Benchmarking large language models as ai research agents. ArXiv, abs/2310.03302, 2023b. URL https://api.semanticscholar.org/CorpusID:263671541.
  200. Agentbench: Evaluating LLMs as agents. In The Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=zAdUB0aCTQ.
  201. Machine learning for integrating data in biology and medicine: Principles, practice, and opportunities. Information Fusion, 50:71–91, 2019. ISSN 1566-2535. doi: https://doi.org/10.1016/j.inffus.2018.09.012. URL https://www.sciencedirect.com/science/article/pii/S1566253518304482.
  202. Office of Science and Technology Policy. Ai bill of rights. https://www.whitehouse.gov/ostp/ai-bill-of-rights/, 2023. Accessed: 02/2024.
  203. Ai regulation has its own alignment problem: The technical and institutional feasibility of disclosure, registration, licensing, and auditing. George Washington Law Review, 11 2023. Forthcoming, Available at SSRN: https://ssrn.com/abstract=4634443.
  204. Plug in the safety chip: Enforcing constraints for llm-driven robot agents. In International Conference on Robotics and Automation, 2024.
  205. Evaluating large language models trained on code. arXiv:2107.03374, 2021.
  206. Code as policies: Language model programs for embodied control. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 9493–9500. IEEE, 2023c.
  207. Enhancing trust in llm-based ai automation agents: New considerations and future challenges. In International Joint Conference on Artificial Intelligence, 2023.
Citations (23)

Summary

  • The paper introduces AI agents, or 'AI scientists', that autonomously generate hypotheses and support experimental design in biomedical research.
  • The paper details the evolution from data-driven models to multi-agent systems, enabling robust data integration and collaborative analyses.
  • The paper identifies key challenges such as ensuring reliability, quality datasets, comprehensive evaluation protocols, and ethical governance in AI deployment.

Empowering Biomedical Discovery with AI Agents

Introduction

The paper "Empowering Biomedical Discovery with AI Agents" outlines a vision where AI agents, driven by LLMs and other machine learning tools, collaborate with human scientists to advance biomedical research. These agents, envisioned as "AI scientists," are capable of skeptical learning and reasoning, coordinating a range of tasks from hypothesis generation to experimental execution. The integration of LLMs allows these agents to leverage vast datasets, make predictions, and generate hypotheses, potentially leading to significant advancements in areas such as hybrid cell simulation and the development of new therapies. Figure 1

Figure 1: Empowering biomedical research with AI agents. AI agents pave the way for "AI scientists" capable of skeptical learning and reasoning.

Data-Driven Models in Biomedical Research

The paper highlights the evolution of data-driven models in biomedical research, detailing how databases, search engines, machine learning, and interactive learning models have reshaped the field. The transition to advanced AI agent systems marks a significant shift, enabling more robust data integration and analysis. Figure 2

Figure 2: Evolving use of data-driven models in research. Data-driven approaches from databases and search engines to advanced agent systems have reshaped biomedical research.

Types and Configurations of AI Agents

AI agents in biomedical research can be categorized into LLM-based agents and multi-agent systems. LLM-based agents use large pre-trained models fine-tuned for specific tasks, offering conversational interfaces and tool access. Multi-agent systems, on the other hand, incorporate agents with specialized capabilities, enabling collaborative schemes like brainstorming, expert consultation, and self-driving labs. Figure 3

Figure 3: Diverse configurations of AI agents in biology, from LLM-based AI agents to multi-agent systems.

Levels of Autonomy

AI agents are classified into four levels based on their autonomy in hypothesis generation, experimentation, and reasoning. At Level 0, agents serve merely as tools, while Level 3 agents, considered "AI scientists," can generate creative hypotheses and develop novel experimental methods. Current advancements predominantly encompass Levels 1 and 2, where AI acts as an assistant or collaborator, respectively.

Challenges in Implementation

Several challenges impede the realization of autonomous AI agents in scientific discovery:

  • Robustness and Reliability: Ensuring the reliability of AI agents in generating accurate predictions and reasoning without hallucinating or generating biases remains a critical concern.
  • Evaluation Protocols: Developing comprehensive evaluation frameworks for these agents that encompass both theoretical capabilities and practical implications is necessary for widespread adoption.
  • Dataset Generation: High-quality, multi-modal datasets are essential for training agents, especially given the variability in biological data.
  • Governance and Ethics: Establishing effective governance frameworks that balance innovation with ethical considerations and safety is paramount for responsible deployment.
  • Risks and Safeguards: Addressing the potential misuse of AI agents, whether through malfunctions or malicious intents, is a significant concern. Figure 4

    Figure 4: Challenges for AI agents in scientific discovery, including robustness, evaluation protocols, and governance.

Conclusion

The paper envisions a transformative era in biomedical research, facilitated by AI agents that integrate advanced computational models with human creativity. While challenges remain, particularly in ensuring the robustness and ethical governance of these agents, their potential to revolutionize scientific discovery is substantial. Continued advancements in AI, coupled with rigorous development of safety protocols and evaluation criteria, are essential to achieving the vision of AI scientists in biomedicine.

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 9 tweets with 279 likes about this paper.