Learning Successor Features with Distributed Hebbian Temporal Memory

Published 20 Oct 2023 in cs.LG, cs.AI, and cs.NE | arXiv:2310.13391v4

Abstract: This paper presents a novel approach to address the challenge of online sequence learning for decision making under uncertainty in non-stationary, partially observable environments. The proposed algorithm, Distributed Hebbian Temporal Memory (DHTM), is based on the factor graph formalism and a multi-component neuron model. DHTM aims to capture sequential data relationships and make cumulative predictions about future observations, forming Successor Features (SFs). Inspired by neurophysiological models of the neocortex, the algorithm uses distributed representations, sparse transition matrices, and local Hebbian-like learning rules to overcome the instability and slow learning of traditional temporal memory algorithms such as RNN and HMM. Experimental results show that DHTM outperforms LSTM, RWKV and a biologically inspired HMM-like algorithm, CSCG, on non-stationary data sets. Our results suggest that DHTM is a promising approach to address the challenges of online sequence learning and planning in dynamic environments.

Summary

  • The paper demonstrates that DHTM accelerates the learning of successor features compared to conventional RNN and LSTM approaches.
  • The paper introduces a Hebbian-inspired local learning mechanism that uses sparse transition matrices for efficient online adaptation.
  • The paper shows that DHTM outperforms state-of-the-art models in reinforcement learning tasks by quickly adapting to environmental changes.

Introduction

In Artificial Intelligence, temporal memory (TM) algorithms occupy a crucial place: they enable models to leverage past experience for future predictions, a mechanism pivotal for tasks such as reinforcement learning (RL), natural language processing, and world modeling. Traditional TM tools such as Recurrent Neural Networks (RNNs) and Hidden Markov Models (HMMs) have long struggled with learning stable representations and adapting online in non-stationary environments. In response, this paper introduces Distributed Hebbian Temporal Memory (DHTM), a novel algorithm with a distinctive approach to sequential data modeling.

Theoretical Background

DHTM departs from classic backpropagation-driven approaches, circumventing their typical instabilities with learning mechanisms inspired by neurophysiological models of cortical networks. Anchored in the factor graph formalism and a multicomponent neuron model that encapsulates sparse transition matrices, DHTM performs online learning guided strictly by local Hebbian-like rules. This lets the model adapt in real time to changes in partially observable environments, conditions that reflect many real-world scenarios where data arrive sequentially and contain stochastic elements.
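A Hebbian-like local rule can be pictured as strengthening a connection whenever pre- and post-synaptic activity coincide, with an activity-gated decay keeping weights bounded. The sketch below is purely illustrative (the variable names, learning rate, and decay term are assumptions, not the paper's actual update):

```python
import numpy as np

def hebbian_update(w, pre, post, lr=0.01):
    """One local Hebbian-like step on a weight matrix w (post x pre).

    Strengthens entries where pre- and post-synaptic activity coincide,
    and applies a small activity-gated decay so weights stay bounded.
    Illustrative sketch only; not the paper's exact rule.
    """
    w = w + lr * np.outer(post, pre)        # coincidence strengthening
    w = w - lr * 0.1 * w * post[:, None]    # decay gated by post activity
    return np.clip(w, 0.0, 1.0)             # keep weights in [0, 1]
```

Because the update depends only on the activity of the two cells a weight connects, it needs no global error signal and can run fully online.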

The Successor Representation (SR) is leveraged within the DHTM framework, allowing the agent to disentangle the representation of environment states from the given reward function. The novel contribution of DHTM is that it accumulates the predictive statistics needed for the SR faster than LSTM and performs competitively against advanced RNN-like models such as RWKV.

Methodology

DHTM adopts a graphical structure reminiscent of a Factorial HMM but distinguishes itself by forming the graph online during training. A significant advance of DHTM is its efficient storage mechanism for transition matrices, mirroring biological dendritic-segment computations. The sparse matrices both reduce the number of trainable parameters and enable faster learning.
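One way to picture segment-style sparse storage: each cell keeps a small list of "segments", each recording a set of previously active presynaptic cells and a weight, so only observed transitions are ever materialized. The class below is a hypothetical sketch of that idea, not the paper's data structure:

```python
from collections import defaultdict

class SparseTransitions:
    """Segment-style sparse transition storage (illustrative sketch).

    Each cell holds a list of segments; a segment pairs a frozen set of
    presynaptic cell ids with a weight.  Memory grows with experience
    rather than with the square of the state-space size.
    """
    def __init__(self):
        # cell id -> list of (frozenset(presynaptic ids), weight)
        self.segments = defaultdict(list)

    def learn(self, prev_active, cell, lr=0.1):
        """Reinforce the segment matching prev_active, or grow a new one."""
        key = frozenset(prev_active)
        for i, (pre, w) in enumerate(self.segments[cell]):
            if pre == key:
                self.segments[cell][i] = (pre, w + lr * (1.0 - w))
                return
        self.segments[cell].append((key, lr))

    def predict(self, prev_active, cell):
        """Best overlap-weighted segment activation for a cell."""
        prev = set(prev_active)
        best = 0.0
        for pre, w in self.segments[cell]:
            overlap = len(pre & prev) / max(len(pre), 1)
            best = max(best, overlap * w)
        return best
```

Because a cell only stores contexts it has actually seen, the effective transition matrix stays sparse, which is what keeps the number of trainable parameters low.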

The model computes segment likelihoods and the excitation they transmit to neural cells. Estimating distributions over hidden states, and learning emission and transition factors without constructing them explicitly, reduces computational overhead. The authors also outline an agent architecture suited for reinforcement learning tasks, incorporating the memory model, SR representations, and an observation reward function to test the viability of DHTM in an applied context.
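The flow from segment likelihoods to a belief over hidden states can be sketched as: each cell's excitation is taken from its best-matching segment, and normalizing the excitations yields an approximate posterior without ever forming the dense transition matrix. This is a stand-in for the paper's message passing, with all names assumed for illustration:

```python
import numpy as np

def excite_and_normalize(segment_likelihoods):
    """Turn per-cell segment likelihoods into a hidden-state belief.

    segment_likelihoods: one list per cell, holding the likelihoods of
    that cell's segments (possibly empty).  Each cell's excitation is
    the max over its segments; normalizing gives an approximate
    categorical posterior over hidden states.  Illustrative only.
    """
    exc = np.array([max(s) if len(s) else 0.0 for s in segment_likelihoods])
    total = exc.sum()
    if total == 0.0:
        return np.full(len(exc), 1.0 / len(exc))  # uninformative belief
    return exc / total
```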

Experimental Findings

DHTM's efficacy was tested in an RL setup using a pinball-like environment, with performance measured against LSTM, RWKV, and a factorial CHMM. DHTM not only outperformed LSTM and RWKV in SR formation but also adapted swiftly to environmental changes: when an agent's optimal strategy became obsolete due to introduced perturbations, the DHTM-powered agent recovered in fewer episodes than its counterparts.

Final Reflections

The Distributed Hebbian Temporal Memory algorithm shows promise in addressing the challenges of online hidden-representation learning. By integrating the sequence-modeling abilities of temporal memory directly into the learning process, DHTM offers a pathway to more robust performance in dynamic, stochastic environments. Its grounding in cortical neuronal architectures not only bridges machine learning with biological plausibility but could also pave the way for more efficient, adaptable AI systems.
