
Graph Reinforcement Learning for Combinatorial Optimization: A Survey and Unifying Perspective

Published 9 Apr 2024 in cs.LG and cs.AI (arXiv:2404.06492v2)

Abstract: Graphs are a natural representation for systems based on relations between connected entities. Combinatorial optimization problems, which arise when considering an objective function related to a process of interest on discrete structures, are often challenging due to the rapid growth of the solution space. The trial-and-error paradigm of Reinforcement Learning has recently emerged as a promising alternative to traditional methods, such as exact algorithms and (meta)heuristics, for discovering better decision-making strategies in a variety of disciplines including chemistry, computer science, and statistics. Despite the fact that they arose in markedly different fields, these techniques share significant commonalities. Therefore, we set out to synthesize this work in a unifying perspective that we term Graph Reinforcement Learning, interpreting it as a constructive decision-making method for graph problems. After covering the relevant technical background, we review works along the dividing line of whether the goal is to optimize graph structure given a process of interest, or to optimize the outcome of the process itself under fixed graph structure. Finally, we discuss the common challenges facing the field and open research questions. In contrast with other surveys, the present work focuses on non-canonical graph problems for which performant algorithms are typically not known and Reinforcement Learning is able to provide efficient and effective solutions.

References (211)
  1. Bruce Abramson. Expected-outcome: a general model of static evaluation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(2):182–193, 1990.
  2. A fast and scalable radiation hybrid map construction and integration strategy. Genome Research, 10(3):350–364, 2000.
  3. Learning what to defer for maximum independent sets. In ICML, 2020.
  4. Ravindra K. Ahuja. Network Flows: Theory, Algorithms, and Applications. Prentice Hall, Englewood Cliffs, NJ, 1993.
  5. Chapter 1 Applications of network optimization. In Handbooks in Operations Research and Management Science, volume 7 of Network Models, pp.  1–83. Elsevier, 1995.
  6. Error and attack tolerance of complex networks. Nature, 406(6794):378–382, 2000.
  7. Towards real-time routing optimization with deep reinforcement learning: Open challenges. In HPSR, 2021.
  8. Thinking Fast and Slow with Deep Learning and Tree Search. In NeurIPS, 2017.
  9. Certification of an optimal TSP tour through 85,900 cities. Operations Research Letters, 37(1):11–15, 2009.
  10. Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, 47(2):235–256, 2002.
  11. An overview of evolutionary algorithms for parameter optimization. Evolutionary Computation, 1(1):1–23, 1993.
  12. Neural Machine Translation by Jointly Learning to Align and Translate. In ICLR, 2016.
  13. A framework for behavioural cloning. In Machine Intelligence 15, pp.  103–129, 1999.
  14. Egon Balas. Machine sequencing via disjunctive graphs: an implicit enumeration algorithm. Operations Research, 17(6):941–957, 1969.
  15. Albert-László Barabási. Network Science. Cambridge University Press, 2016.
  16. Emergence of Scaling in Random Networks. Science, 286(5439):509–512, 1999.
  17. Dynamical Processes on Complex Networks. Cambridge University Press, 2008.
  18. Marc Barthélemy. Spatial networks. Physics Reports, 499(1-3), 2011.
  19. Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261, 2018.
  20. Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering. In NeurIPS, 2002.
  21. A distributional perspective on reinforcement learning. In ICML, 2017.
  22. Richard A. Bellman. Dynamic Programming. Princeton University Press, 1957.
  23. Machine Learning for Combinatorial Optimization: a Methodological Tour d’Horizon. European Journal of Operational Research, 290:405–421, 2021.
  24. Dimitri P. Bertsekas. Dynamic Programming and Optimal Control. Athena Scientific, 1995.
  25. Improving Network Robustness by Edge Modification. Physica A, 357:593–612, 2005.
  26. Learning heuristic search via imitation. In CoRL, 2017.
  27. A survey on metaheuristics for stochastic combinatorial optimization. Natural Computing, 8(2):239–287, 2009.
  28. Evasion attacks against machine learning at test time. In ECML-PKDD, 2013.
  29. Graph Theory, 1736-1936. Oxford University Press, 1986.
  30. Metaheuristics in combinatorial optimization: Overview and conceptual comparison. ACM Computing Surveys (CSUR), 35(3):268–308, 2003.
  31. Translating embeddings for modeling multi-relational data. In NeurIPS, 2013.
  32. What’s wrong with deep learning in tree search for combinatorial optimization. In ICLR, 2022.
  33. Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach. In NeurIPS, 1994.
  34. Geometric Deep Learning: Going beyond Euclidean data. IEEE Signal Processing Magazine, 34(4):18–42, 2017.
  35. A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games, 4(1):1–43, 2012.
  36. Spectral Networks and Locally Connected Networks on Graphs. In ICLR, 2014.
  37. Localization of the maximal entropy random walk. Physical review letters, 102(16):160602, 2009.
  38. Machine learning for molecular and materials science. Nature, 559(7715):547–555, 2018.
  39. Deep Blue. Artificial Intelligence, 134(1-2):57–83, 2002.
  40. Combinatorial optimization and reasoning with graph neural networks. In IJCAI, 2021.
  41. IC insertion: an application of the travelling salesman problem. The International Journal of Production Research, 27(10):1837–1841, 1989.
  42. Progressive Strategies for Monte-Carlo Tree Search. New Mathematics and Natural Computation, 04(03):343–357, 2008.
  43. Contingency-aware influence maximization: A reinforcement learning approach. In UAI, 2021.
  44. Learning to perform local rewriting for combinatorial optimization. In NeurIPS, 2019.
  45. Large-sample learning of Bayesian networks is NP-hard. Journal of Machine Learning Research, 5:1287–1330, 2004.
  46. End-to-end driving via conditional imitation learning. In ICRA, 2018.
  47. Resilience of the Internet to Random Breakdowns. Physical Review Letters, 85(21):4626–4628, 2000.
  48. Breakdown of the Internet under Intentional Attack. Physical Review Letters, 86(16):3682–3685, 2001.
  49. Stephen A. Cook. The complexity of theorem-proving procedures. In STOC, 1971.
  50. Introduction to Algorithms. MIT Press, Fourth edition, 2022.
  51. Discriminative embeddings of latent variable models for structured data. In ICML, 2016.
  52. Adversarial attack on graph structured data. In ICML, 2018.
  53. Linear Programming, 1: Introduction. Springer, 1997.
  54. Solution of a large-scale traveling-salesman problem. Journal of the Operations Research Society of America, 2(4):393–410, 1954.
  55. Goal-directed graph construction using reinforcement learning. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 477(2254):20210168, 2021a.
  56. Solving Graph-based Public Goods Games with Tree Search and Imitation Learning. In NeurIPS, 2021b.
  57. Graph Neural Modeling of Network Flows. arXiv preprint arXiv:2209.05208, 2022.
  58. Planning spatial networks with Monte Carlo tree search. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 479(2269):20220383, 2023a.
  59. Tree search in DAG space with model-based reinforcement learning for causal discovery. arXiv preprint arXiv:2310.13576, 2023b.
  60. Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning. In ICLR, 2018.
  61. MolGAN: An implicit generative model for small molecular graphs. In ICML Deep Generative Models Workshop, 2018.
  62. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering. In NeurIPS, 2016.
  63. Learning to control a low-cost manipulator using data-efficient reinforcement learning. In RSS, 2011.
  64. Learning Structural Node Embeddings Via Diffusion Wavelets. In KDD, 2018.
  65. Dynamic Network Reconfiguration for Entropy Maximization using Deep Reinforcement Learning. In LoG, 2022.
  66. Experiments with the graph traverser program. Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences, 294(1437):235–259, 1966.
  67. Ant colony optimization. IEEE Computational Intelligence Magazine, 1(4):28–39, 2006.
  68. BQ-NCO: Bisimulation quotienting for generalizable neural combinatorial optimization. In NeurIPS, 2023.
  69. Deep Reinforcement Learning in Large Discrete Action Spaces. In ICML, 2015.
  70. Convolutional Networks on Graphs for Learning Molecular Fingerprints. In NeurIPS, 2015.
  71. Shimon Even. Graph Algorithms. Cambridge University Press, 2011.
  72. Why (and how) networks should run themselves. arXiv preprint arXiv:1710.11583, 2017.
  73. Increasing internet capacity using local search. Computational Optimization and Applications, 29(1):13–48, 2004.
  74. Addressing function approximation error in actor-critic methods. In ICML, 2018.
  75. Computers and Intractability. A Guide to the Theory of NP-Completeness. W. H. Freeman and Co, 1979.
  76. Combining online and offline knowledge in UCT. In ICML, 2007.
  77. Algorithmic concept-based explainable reasoning. In AAAI, 2022.
  78. Neural Message Passing for Quantum Chemistry. In ICML, 2017.
  79. Oded Goldreich. Computational Complexity: A Conceptual Perspective. Cambridge University Press, 2008.
  80. Sanjeev Goyal. Connections: An Introduction to the Economics of Networks. Princeton University Press, 2012.
  81. node2vec: Scalable Feature Learning for Networks. In KDD, 2016.
  82. Continuous deep q-learning with model-based acceleration. In ICML, 2016.
  83. Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models. arXiv preprint arXiv:1705.10843, 2018.
  84. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning. In NeurIPS, 2014.
  85. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In ICML, 2018.
  86. Representation learning on graphs: Methods and applications. IEEE Data Engineering Bulletin, 40(3):52–74, 2017a.
  87. Inductive Representation Learning on Large Graphs. In NeurIPS, 2017b.
  88. Graphical Enumeration. Academic Press, New York, 1973.
  89. Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads. In EMNLP, 2016.
  90. Keld Helsgaun. An effective implementation of the Lin–Kernighan traveling salesman heuristic. European Journal of Operational Research, 126(1):106–130, 2000.
  91. Deep Convolutional Networks on Graph-Structured Data. arXiv preprint arXiv:1506.05163, 2015.
  92. Rainbow: Combining Improvements in Deep Reinforcement Learning. In AAAI, 2018.
  93. Generative adversarial imitation learning. In NeurIPS, 2016.
  94. GDDR: GNN-based Data-Driven Routing. In ICDCS, 2021.
  95. Chapter 3 - Games on Networks. In Handbook of Game Theory with Economic Applications, volume 4, pp.  95–163. Elsevier, 2015.
  96. Graph Coloring Problems. Wiley, New York, 1995.
  97. Unleashing the potential of data-driven networking. In COMSNETS, 2017.
  98. Junction Tree Variational Autoencoder for Molecular Graph Generation. In ICML, 2018.
  99. Richard M. Karp. Reducibility among combinatorial problems. In Complexity of Computer Computations, pp.  85–103. Springer, 1972.
  100. A contribution to the mathematical theory of epidemics. Proceedings of the Royal Society of London Series A, 115(772):700–721, 1927.
  101. Learning combinatorial optimization algorithms over graphs. In NeurIPS, 2017.
  102. MIP-GNN: A data-driven framework for guiding combinatorial solvers. In AAAI, 2022.
  103. Learning collaborative policies to solve NP-hard routing problems. In NeurIPS, 2021.
  104. Sym-NCO: Leveraging Symmetricity for Neural Combinatorial Optimization. In NeurIPS, 2022.
  105. Semi-Supervised Classification with Graph Convolutional Networks. In ICLR, 2017.
  106. Optimization by simulated annealing. Science, 220(4598):671–680, 1983.
  107. Bandit based Monte-Carlo planning. In ECML, 2006.
  108. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
  109. Attention, learn to solve routing problems! In ICLR, 2019.
  110. Grammar Variational Autoencoder. In ICML, 2017.
  111. POMO: Policy optimization with multiple optima for reinforcement learning. In NeurIPS, 2020.
  112. An automatic method of solving discrete programming problems. Econometrica, 28(3):497–520, 1960.
  113. Ranked reward: Enabling self-play reinforcement learning for combinatorial optimization. arXiv preprint arXiv:1807.01672, 2018.
  114. Efficient Behavior of Small-World Networks. Physical Review Letters, 87(19):198701, 2001.
  115. Leonid Anatolevich Levin. Universal sequential search problems. Problemy Peredachi Informatsii, 9(3):115–116, 1973.
  116. Guided policy search. In ICML, 2013.
  117. Gated Graph Sequence Neural Networks. In ICLR, 2017.
  118. Learning deep generative models of graphs. In ICML, 2018.
  119. Efficient Graph Generation with Graph Recurrent Attention Networks. In NeurIPS, 2019.
  120. Continuous control with deep reinforcement learning. In ICLR, 2016.
  121. An Effective Heuristic Algorithm for the Traveling-Salesman Problem. Operations Research, 21(2):498–516, 1973.
  122. Graph adversarial attack via rewiring. In KDD, 2021.
  123. GCOMB: Learning budget-constrained combinatorial algorithms over billion-sized graphs. In NeurIPS, 2020.
  124. Simple random search provides a competitive approach to reinforcement learning. arXiv preprint arXiv:1803.07055, 2018.
  125. Abraham H. Maslow. The Psychology of Science: a Reconnaissance. Harper & Row, 1966.
  126. Reinforcement learning for combinatorial optimization: A survey. Computers & Operations Research, 134:105400, 2021.
  127. Controlling graph dynamics with reinforcement learning and graph neural networks. In ICML, 2021.
  128. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.
  129. Asynchronous Methods for Deep Reinforcement Learning. In ICML, 2016.
  130. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. The Journal of Neuroscience, 16(5):1936–1947, 1996.
  131. Machine-learning–based column selection for column generation. Transportation Science, 55(4):815–831, 2021.
  132. Damon Mosk-Aoyama. Maximum algebraic connectivity augmentation is NP-hard. Operations Research Letters, 36(6):677–679, 2008.
  133. Towards interpretable reinforcement learning using attention augmented agents. In NeurIPS, 2019.
  134. John Nash. Some games and machines for playing them. Technical Report D-1164, Rand Corporation, 1952.
  135. M. E. J. Newman. Networks. Oxford University Press, 2018.
  136. Action-conditional video prediction using deep networks in atari games. In NeurIPS, 2015.
  137. Asymmetric Transitivity Preserving Graph Embedding. In KDD, 2016.
  138. Learning graph search heuristics. In LoG, 2022.
  139. Intrinsically motivated graph exploration using network theories of human curiosity. In Proceedings of the Second Learning on Graphs (LoG) Conference, 2023.
  140. Judea Pearl. Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley Longman Publishing Co., Inc., 1984.
  141. Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.
  142. DeepWalk: Online Learning of Social Representations. In KDD, 2014.
  143. Reinforcement learning for adaptive routing. In IJCNN, 2002.
  144. Elements of Causal Inference: Foundations and Learning Algorithms. MIT Press, 2017.
  145. Learning Partial Policies to Speedup MDP Tree Search via Reduction to IID Learning. The Journal of Machine Learning Research, 18(1):2179–2213, 2017.
  146. Dean A. Pomerleau. ALVINN: An autonomous land vehicle in a neural network. In NeurIPS, 1988.
  147. Dean A. Pomerleau. Efficient training of artificial neural networks for autonomous navigation. Neural Computation, 3(1):88–97, 1991.
  148. struc2vec: Learning Node Representations from Structural Identity. In KDD, 2017.
  149. Martin Riedmiller. Neural Fitted Q Iteration – First Experiences with a Data Efficient Neural Reinforcement Learning Method. In ECML, 2005.
  150. A Survey of Multi-Objective Sequential Decision-Making. Journal of Artificial Intelligence Research, 48:67–113, 2013.
  151. An adaptive large neighborhood search heuristic for the pickup and delivery problem with time windows. Transportation Science, 40(4):455–472, 2006.
  152. Christopher D. Rosin. Nested Rollout Policy Adaptation for Monte Carlo Tree Search. In IJCAI, 2011.
  153. A reduction of imitation learning and structured prediction to no-regret online learning. In AISTATS, 2011.
  154. Artificial Intelligence: a Modern Approach. Prentice Hall, Fourth edition, 2020.
  155. Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint arXiv:1703.03864, 2017.
  156. Variational annealing on graphs for combinatorial optimization. In NeurIPS, 2023.
  157. The Graph Neural Network Model. IEEE Transactions on Neural Networks, 20(1):61–80, 2009.
  158. Prioritized experience replay. In ICLR, 2016.
  159. Mitigation of malicious attacks on networks. PNAS, 108(10):3838–3841, 2011.
  160. Mastering Atari, Go, chess and shogi by planning with a learned model. Nature, 588(7839):604–609, 2020.
  161. Trust region policy optimization. In ICML, 2015.
  162. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.
  163. Gideon Schwarz. Estimating the dimension of a model. The Annals of Statistics, 6(2):461–464, 1978.
  164. M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search. In NeurIPS, 2018.
  165. David Silver. Reinforcement Learning of Local Shape in the Game of Go. In IJCAI, 2007.
  166. High performance outdoor navigation from overhead data using imitation learning. In RSS, 2008.
  167. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484–489, 2016.
  168. Mastering the game of Go without human knowledge. Nature, 550(7676):354–359, 2017.
  169. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, 362(6419):1140–1144, 2018.
  170. Information theory of complex networks: on evolution and architectural constraints. In Complex Networks, pp.  189–207. Springer, 2004.
  171. Supervised neural networks for the classification of structures. IEEE Transactions on Neural Networks, 8(3):714–735, 1997.
  172. Equibind: Geometric deep learning for drug binding structure prediction. In ICML, 2022.
  173. Peter Stone. TPOT-RL Applied to Network Routing. In ICML, 2000.
  174. Adversarial attacks on graph neural networks via node injections: A hierarchical reinforcement learning approach. In WWW, 2020.
  175. Reinforcement Learning: An Introduction. MIT Press, 2018.
  176. Intriguing properties of neural networks. In ICLR, 2014.
  177. A Multi-Agent, Policy-Gradient approach to Network Routing. In ICML, 2001.
  178. On-line Policy Improvement using Monte-Carlo Search. In NeurIPS, 1997.
  179. An Introduction to Linear Programming and Game Theory. John Wiley & Sons, 2011.
  180. Learning strategic network emergence games. In NeurIPS, 2020.
  181. GraphOpt: Learning Optimization Models of Graph Formation. In ICML, 2020.
  182. Learning to Route with Deep RL. In NeurIPS Deep Reinforcement Learning Symposium, 2017.
  183. Deep reinforcement learning with double Q-learning. In AAAI, 2016.
  184. Graph attention networks. In ICLR, 2018.
  185. Programmatically interpretable reinforcement learning. In ICML, 2018.
  186. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1–2):1–305, 2008.
  187. Scientific discovery in the age of artificial intelligence. Nature, 620(7972):47–60, 2023.
  188. Neural Architecture Search using Deep Neural Networks and Monte Carlo Tree Search. In AAAI, 2020.
  189. Qi Wang and Chunlei Tang. Deep reinforcement learning for transportation network combinatorial optimization: A survey. Knowledge-Based Systems, 233:107526, 2021.
  190. Improving robustness of complex networks via the effective graph resistance. The European Physical Journal B, 87(9):221, 2014.
  191. Ordering-based causal discovery with reinforcement learning. In IJCAI, 2021.
  192. Christopher J. C. H. Watkins and Peter Dayan. Technical note: Q-learning. Machine Learning, 8(3-4):279–292, 1992.
  193. Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3-4):229–256, 1992.
  194. The Design of Approximation Algorithms. Cambridge University Press, 2011.
  195. No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1):67–82, 1997.
  196. How powerful are graph neural networks? In ICLR, 2018a.
  197. What can neural networks reason about? In ICLR, 2020.
  198. Experience-driven networking: A deep reinforcement learning based approach. In IEEE INFOCOM, 2018b.
  199. Reinforcement causal structure learning on order graph. In AAAI, 2023a.
  200. Learning to boost resilience of complex networks via neural edge rewiring. Transactions on Machine Learning Research, 2023b.
  201. Hierarchical Graph Representation Learning with Differentiable Pooling. In NeurIPS, 2018.
  202. GNNExplainer: Generating explanations for graph neural networks. In NeurIPS, 2019.
  203. Graph convolutional policy network for goal-directed molecular graph generation. In NeurIPS, 2018a.
  204. GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models. In ICML, 2018b.
  205. Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning. In NeurIPS, 2018.
  206. Learning to walk with dual agents for knowledge graph reasoning. In AAAI, 2022.
  207. CFR-RL: Traffic engineering with reinforcement learning in SDN. IEEE Journal on Selected Areas in Communications, 38(10):2249–2259, 2020.
  208. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nature Biotechnology, 37(9):1038–1040, 2019.
  209. DAGs with no tears: continuous optimization for structure learning. In NeurIPS, 2018.
  210. Optimization of molecules via deep reinforcement learning. Scientific reports, 9(1):10752, 2019.
  211. Causal discovery with reinforcement learning. In ICLR, 2020.

Summary

  • The paper presents a unifying perspective on applying reinforcement learning to combinatorial optimization on graphs via a Markov Decision Process framework.
  • It categorizes methodologies into Graph Structure Optimization, which alters graph topologies, and Graph Process Optimization, which refines control actions on fixed graphs.
  • The study highlights challenges like scalability, generalization, and interpretability, paving the way for hybrid RL models and future research directions.


Introduction

The paper "Graph Reinforcement Learning for Combinatorial Optimization: A Survey and Unifying Perspective" (2404.06492) provides a comprehensive examination of the application of Reinforcement Learning (RL) techniques to combinatorial optimization problems on graph structures. The authors aim to unify disparate approaches from various fields under the term Graph Reinforcement Learning (Graph RL), emphasizing its utility in addressing complex decision-making challenges that are not easily tackled by traditional methods.

Graphs provide a natural representation for systems where entities are connected by relationships, which are prevalent in domains such as chemistry, computer science, and the social sciences. Combinatorial optimization on graphs involves finding a solution that maximizes or minimizes an objective function defined over these structures. The solution space for such problems typically grows exponentially (or worse) with instance size, making exhaustive search intractable.
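To make this growth concrete, consider the symmetric travelling salesman problem, which admits (n−1)!/2 distinct tours on n cities. A short calculation (the helper name is illustrative, not from the paper) shows how quickly enumeration becomes hopeless:

```python
from math import factorial

def tsp_tour_count(n: int) -> int:
    """Distinct tours in a symmetric TSP on n >= 3 cities:
    fix the starting city and divide out the two traversal directions."""
    return factorial(n - 1) // 2

for n in (5, 10, 20):
    print(n, tsp_tour_count(n))
# 5 cities: 12 tours; 10 cities: 181,440; 20 cities: ~6.1e16
```

Already at 20 cities, enumerating every tour is far beyond practical reach, which is why heuristic and learned search strategies are of interest.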

Technical Background and Methodologies

The paper begins by elucidating the foundational aspects of graphs and combinatorial optimization. It proceeds to discuss how RL, particularly through the Markov Decision Process (MDP) framework, offers a viable alternative to classical approaches such as exact algorithms and (meta)heuristics. RL's trial-and-error paradigm, which enables the automatic discovery of decision-making strategies, is particularly advantageous for non-canonical graph problems where established algorithms are inadequate.
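As a sketch of this MDP framing (all class and method names here are illustrative, not taken from the paper), incremental graph construction can be cast as: state = the current edge set, action = adding one edge, reward = the change in a simple objective, here the number of connected node pairs:

```python
from itertools import combinations

class GraphConstructionMDP:
    """Toy MDP: start from an empty graph on n nodes and add edges one
    at a time; the reward is the change in the number of connected pairs."""

    def __init__(self, n: int):
        self.n = n
        self.adj = {v: set() for v in range(n)}

    def actions(self):
        """Candidate edges not yet present in the graph."""
        return [(u, v) for u, v in combinations(range(self.n), 2)
                if v not in self.adj[u]]

    def _connected_pairs(self) -> int:
        """Count node pairs joined by some path (via DFS over components)."""
        seen, total = set(), 0
        for s in range(self.n):
            if s in seen:
                continue
            comp, stack = set(), [s]
            while stack:
                v = stack.pop()
                if v in comp:
                    continue
                comp.add(v)
                stack.extend(self.adj[v])
            seen |= comp
            total += len(comp) * (len(comp) - 1) // 2
        return total

    def step(self, action):
        before = self._connected_pairs()
        u, v = action
        self.adj[u].add(v)
        self.adj[v].add(u)
        reward = self._connected_pairs() - before
        return self.adj, reward

env = GraphConstructionMDP(4)
for edge in [(0, 1), (1, 2), (2, 3)]:
    _, reward = env.step(edge)
    print(edge, reward)
# rewards 1, 2, 3 as the growing component connects more pairs
```

An RL agent interacting with such an environment learns which edge to add at each step; the specific objective is problem-dependent, and this connected-pairs reward is only a stand-in.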

Graph RL problems are categorized into two primary types: Graph Structure Optimization and Graph Process Optimization. Graph Structure Optimization focuses on modifying the topology of a graph to improve an objective, while Graph Process Optimization involves optimizing outcomes of processes defined over a fixed graph structure.

Figure 1: Visual summary of the structure and topics of the present survey.

Graph Structure Optimization

In Graph Structure Optimization, the focus is on altering the graph’s topology—through edge additions, removals, or rewiring—to enhance an objective function. This approach is relevant in various contexts, including:

  • Attacking Graph Neural Networks: Modifying graph structures to induce GNNs to make classification errors [dai_adversarial_2018].
  • Network Design: Constructing or modifying network topologies to improve resilience, efficiency, or other network metrics [darvariu2021goal].
  • Causal Discovery: Identifying causal relationships by constraining graph structures to be acyclic and maximizing explanatory power [zhu2020causal].
  • Molecular Optimization: Devising molecular structures with optimal properties, such as drug-likeness [you_graph_2018].

    Figure 2: High-level illustration of how Graph Structure Optimization problems are approached with RL.
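A common non-learned baseline in this setting, against which the surveyed RL methods are typically compared, is one-step greedy lookahead over candidate modifications. The sketch below (the efficiency objective and all function names are illustrative choices, not the paper's method) picks the single edge addition that most improves global efficiency, the mean inverse shortest-path distance over node pairs:

```python
from itertools import combinations
from collections import deque

def efficiency(adj):
    """Global efficiency: mean of 1/d(u, v) over ordered node pairs
    (contribution 0 for disconnected pairs), via BFS from each node."""
    nodes = list(adj)
    total, pairs = 0.0, 0
    for s in nodes:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        for t in nodes:
            if t != s:
                pairs += 1
                if t in dist:
                    total += 1.0 / dist[t]
    return total / pairs

def best_edge_addition(adj):
    """One-step greedy lookahead: the absent edge whose addition
    yields the largest global efficiency."""
    candidates = [(u, v) for u, v in combinations(adj, 2) if v not in adj[u]]

    def gain(edge):
        u, v = edge
        adj[u].add(v); adj[v].add(u)       # tentatively add the edge
        score = efficiency(adj)
        adj[u].discard(v); adj[v].discard(u)  # roll back
        return score

    return max(candidates, key=gain)

# Path graph 0-1-2-3-4: the best shortcut closes the longest distance.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(best_edge_addition(path))  # (0, 4)
```

Greedy lookahead is myopic: an RL agent trained over whole construction trajectories can, in principle, accept a locally suboptimal edge now for a better final topology, which is precisely the appeal of the MDP formulation.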

Graph Process Optimization

Graph Process Optimization is concerned with optimizing a process over a static graph structure by choosing the best set of control actions. Significant applications include:

  • Routing on Networks: Optimizing flow distribution in networks to minimize congestion or maximize throughput [valadarsky2017learning].
  • Network Games: Identifying equilibrium states in network-based games that optimize social welfare or fairness [darvariu2021solvingshort].
  • Spreading Processes: Controlling processes such as epidemic spreading to minimize infections by optimally selecting influenced nodes [meirom2021controlling].
  • Search and Navigation: Developing strategies for effective exploration and pathfinding in graph structures, with applications in knowledge graph completion and robotic motion planning [shenMWalkLearningWalk2018].

    Figure 3: High-level illustration of how Graph Process Optimization problems are approached with RL.
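A classical instance of this category is learning to route on a fixed topology. The sketch below (hyperparameters and names are illustrative, not from any specific surveyed work) applies tabular Q-learning with state = current node, action = next-hop neighbor, and a per-hop cost, then rolls out the learned greedy policy:

```python
import random

def q_learning_route(adj, source, target, episodes=2000,
                     alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning on a fixed graph. Each hop costs -1, except
    the hop that reaches the target, which costs 0; shorter routes
    therefore accumulate higher return."""
    rng = random.Random(seed)
    Q = {(v, w): 0.0 for v in adj for w in adj[v]}
    for _ in range(episodes):
        v = source
        for _ in range(4 * len(adj)):  # cap the episode length
            if v == target:
                break
            nbrs = list(adj[v])
            if rng.random() < eps:     # epsilon-greedy exploration
                w = rng.choice(nbrs)
            else:
                w = max(nbrs, key=lambda u: Q[(v, u)])
            r = 0.0 if w == target else -1.0
            nxt = 0.0 if w == target else max(Q[(w, u)] for u in adj[w])
            Q[(v, w)] += alpha * (r + gamma * nxt - Q[(v, w)])
            v = w
    # Greedy rollout of the learned policy.
    route, v = [source], source
    while v != target and len(route) <= len(adj):
        v = max(adj[v], key=lambda u: Q[(v, u)])
        route.append(v)
    return route

adj = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2, 4}, 4: {3}}
print(q_learning_route(adj, 0, 4))
```

On this small graph the learned policy recovers a shortest route (0 → 1 or 2 → 3 → 4); the surveyed deep RL routing methods replace the Q-table with graph neural network function approximators to generalize across topologies and traffic conditions.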

Challenges and Implications

The paper outlines several key challenges in applying Graph RL:

  • Scalability: Managing the computational demands of large-scale graph problems.
  • Generalization: Ensuring models perform well across various graph instances and do not overfit to specific scenarios.
  • Interpretability: Extracting human-interpretable insights from learned RL models to enhance understanding and further optimization.

Addressing these challenges involves developing more efficient RL algorithms, leveraging domain-specific insights, and integrating RL with hybrid models that combine traditional algorithms with machine learning techniques.

Conclusion

Graph RL emerges as a powerful framework for tackling combinatorial optimization problems that are intractable with classical methods. By synthesizing diverse approaches under a unifying perspective, this work sets the stage for further advancements in both the theory and application of Graph RL. The implications extend across numerous domains, offering new paradigms for optimizing complex systems structured as graphs. As RL techniques continue to evolve, their integration with graph-based models promises significant breakthroughs in addressing both longstanding and emerging optimization challenges.
