REINFORCE-ING Chemical Language Models in Drug Design
Abstract: Chemical language models, combined with reinforcement learning (RL), have shown significant promise for efficiently traversing large chemical spaces in drug design. However, the relative performance of different RL algorithms, and the best practices for applying them to practical drug design, remain unclear. Here, starting from the principles of the REINFORCE algorithm, we investigate the effect of several components from RL theory: experience replay, hill-climbing, baselines to reduce variance, and alternative reward shaping. Additionally, we show how RL hyperparameters can be tuned for effectiveness, efficiency, or chemical regularization, using the MolOpt benchmark.
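To make the components named in the abstract concrete, the following is a minimal sketch of a REINFORCE-style surrogate loss with an optional baseline and an optional hill-climbing step. It is not the paper's implementation. The function name `reinforce_loss`, the batch-mean baseline, and the top-k selection rule are illustrative assumptions, and experience replay and reward shaping are left out.

```python
from typing import Optional, Sequence


def reinforce_loss(
    log_probs: Sequence[float],
    rewards: Sequence[float],
    use_baseline: bool = True,
    top_k: Optional[int] = None,
) -> float:
    """Surrogate loss: -(1/N) * sum_i (R_i - b) * log p(x_i).

    log_probs[i] is the model's log-likelihood of sampled molecule i,
    rewards[i] is its score. All names here are hypothetical.
    """
    if len(log_probs) != len(rewards) or len(rewards) == 0:
        raise ValueError("log_probs and rewards must be non-empty and equal in length")
    if top_k is not None and top_k < 1:
        raise ValueError("top_k must be at least 1")

    # Assumption: the baseline is the mean reward of the full batch.
    # Subtracting it reduces the variance of the gradient estimate.
    baseline = sum(rewards) / len(rewards) if use_baseline else 0.0

    # Assumption: hill-climbing keeps only the top_k highest-reward samples.
    indices = list(range(len(rewards)))
    if top_k is not None:
        indices = sorted(indices, key=lambda i: rewards[i], reverse=True)[:top_k]

    total = sum((rewards[i] - baseline) * log_probs[i] for i in indices)
    return -total / len(indices)
```

In practice the log-probabilities would come from an autodiff framework, so that minimizing this loss raises the likelihood of molecules scoring above the baseline.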