
Fine-grained Graph Rationalization

Published 13 Dec 2023 in cs.LG and cs.SI | arXiv:2312.07859v3

Abstract: Rationale discovery is the task of finding the subset of the input data that maximally supports the prediction of downstream tasks. In graph machine learning, the graph rationale is defined as the critical subgraph within the given graph topology; the remaining subgraph is called the environment subgraph. Graph rationalization can enhance model performance because, by assumption, the mapping between the graph rationale and the prediction label is invariant. To ensure the discriminative power of the extracted rationale subgraphs, a key technique named "intervention" is applied; its core idea is that, as the environment subgraph changes, the semantics of the rationale subgraph should remain invariant, guaranteeing the correct prediction. However, most, if not all, existing graph rationalization methods develop their intervention strategies at the graph level, which is coarse-grained. In this paper, we propose fine-grained graph rationalization (FIG). Our idea is driven by the self-attention mechanism, which provides rich interactions between input nodes; building on it, FIG achieves node-level and virtual node-level intervention. Our experiments involve 7 real-world datasets, and the proposed FIG shows significant performance advantages over 13 baseline methods.
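The invariance assumption behind intervention can be made concrete with a small numeric sketch. The function below is illustrative only (the name `invariance_penalty` and the exact loss form are assumptions, not the paper's objective): it scores a set of per-environment predictions produced from the same rationale, combining the average utility loss with an IRM-flavoured penalty on the spread of losses across environments, so that a rationale whose predictions drift when the environment changes is penalized.

```python
import numpy as np

def invariance_penalty(preds_per_env, target):
    """Sketch of the intervention idea: predictions made from the same
    rationale under different environment subgraphs should agree.
    Combines mean utility loss with the variance of per-environment losses."""
    losses = [np.mean((p - target) ** 2) for p in preds_per_env]
    return np.mean(losses) + np.var(losses)  # utility term + invariance term

target = np.array([1.0, 0.0])
# predictions from the same rationale under two different environments
preds = [np.array([0.9, 0.1]), np.array([0.8, 0.2])]
print(round(invariance_penalty(preds, target), 4))  # 0.0252
```

A perfectly invariant rationale would make the variance term vanish, leaving only the ordinary prediction loss.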


Summary

  • The paper introduces FIG, a fine-grained graph rationalization approach that performs node-level and virtual node-level interventions through Transformer self-attention.
  • It combines an encoder, augmenter, intervener, and predictor in a synergistic architecture to identify informative subgraphs while ensuring robustness across varying environments.
  • Experiments on 7 real-world datasets show that FIG consistently outperforms or matches 13 baseline methods, with gains in both predictive accuracy and interpretability.

Introduction to Fine-grained Graph Rationalization

Graphs are a versatile data structure, widely used to model relationships and interactions in fields such as chemistry, social networks, and biology. A critical task in graph machine learning is to identify substructures within graphs, termed "graph rationales", that are most informative for the predictions of a given task. Graph rationales can enhance model performance and improve explainability by isolating the most relevant features within a complex network.
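As a concrete illustration of what rationale discovery produces, one simple (hypothetical) realization is to score each node's importance and keep the top-scoring fraction as the rationale subgraph, with the remainder forming the environment subgraph. The function name and the fixed ratio below are assumptions for illustration, not the paper's actual augmenter.

```python
import numpy as np

rng = np.random.default_rng(0)

def split_rationale(node_feats, scores, ratio=0.5):
    """Partition nodes into a rationale subset (top-scoring nodes) and an
    environment subset (the remaining nodes)."""
    k = max(1, int(ratio * len(scores)))
    order = np.argsort(-scores)           # indices in descending score order
    rationale_idx = np.sort(order[:k])    # most informative nodes
    environment_idx = np.sort(order[k:])  # remaining nodes
    return node_feats[rationale_idx], node_feats[environment_idx]

# toy graph: 6 nodes with 4-dim features and learned importance scores
X = rng.normal(size=(6, 4))
s = np.array([0.9, 0.1, 0.8, 0.2, 0.7, 0.3])
R, E = split_rationale(X, s, ratio=0.5)
print(R.shape, E.shape)  # (3, 4) (3, 4)
```

In practice the scores would come from a learned module rather than being given, but the rationale/environment split itself has this shape.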

Methodology

The proposed method, fine-grained graph rationalization (FIG), is a Transformer-based architecture for fine-grained graph rationalization. Unlike existing methods that intervene at the graph level, FIG operates at the more precise node level or virtual node level, leveraging the self-attention mechanism of Transformer models. FIG is composed of an encoder, augmenter, intervener, and predictor, working in synergy to discover and exploit the pivotal subgraph while ensuring its predictive robustness against varying environments.
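The node-level intervention idea can be sketched as a single attention step in which rationale node embeddings attend over the node embeddings of a (changeable) environment. This is a minimal NumPy approximation of the mechanism described, not the paper's exact intervener; the function names and the residual mixing are assumptions.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def node_level_intervention(rationale, environment):
    """One attention step mixing rationale node embeddings with an
    environment's node embeddings: each rationale node attends over the
    environment nodes, and the attended context is added back."""
    d = rationale.shape[1]
    attn = softmax(rationale @ environment.T / np.sqrt(d))  # (n_r, n_e)
    return rationale + attn @ environment

rng = np.random.default_rng(1)
R = rng.normal(size=(3, 8))   # rationale node embeddings
E1 = rng.normal(size=(5, 8))  # environment nodes from one graph
E2 = rng.normal(size=(4, 8))  # environment nodes from another ("changed") graph

# the intervened rationale should stay predictive under either environment
z1 = node_level_intervention(R, E1).mean(axis=0)  # pooled representation
z2 = node_level_intervention(R, E2).mean(axis=0)
print(z1.shape, z2.shape)  # (8,) (8,)
```

The invariant-learning objective would then push the predictor's outputs on `z1` and `z2` toward agreement, since both derive from the same rationale.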

Experimental Results

FIG has been rigorously tested across 7 real-world datasets against 13 baseline methods. The experiments demonstrate that both the node-level (FIG-N) and virtual node-level (FIG-VN) variants consistently outperform or match the competing methods, indicating that FIG's fine-grained intervention, combined with its invariant learning process, is highly effective for graph rationalization tasks.

Conclusion

The research brings a new perspective to the graph rationale discovery problem, proposing a Transformer-inspired model that intervenes at a granular level. FIG not only identifies crucial subgraphs more effectively than coarse-grained approaches but also preserves their utility across variable conditions, resulting in strong performance. The findings lay the groundwork for future research on optimizing graph-learning models for both predictive accuracy and interpretability.
