Auto-encoding GPS data to reveal individual and collective behaviour

Published 1 Dec 2023 in stat.AP, cs.LG, and cs.SI | (2312.00456v1)

Abstract: We propose an innovative and generic methodology to analyse individual and collective behaviour through individual trajectory data. The work is motivated by the analysis of GPS trajectories of fishing vessels collected from regulatory tracking data in the context of marine biodiversity conservation and ecosystem-based fisheries management. We build a low-dimensional latent representation of trajectories using convolutional neural networks as non-linear mapping. This is done by training a conditional variational auto-encoder taking into account covariates. The posterior distributions of the latent representations can be linked to the characteristics of the actual trajectories. The latent distributions of the trajectories are compared with the Bhattacharyya coefficient, which is well-suited for comparing distributions. Using this coefficient, we analyse the variation of the individual behaviour of each vessel during time. For collective behaviour analysis, we build proximity graphs and use an extension of the stochastic block model for multiple networks. This model results in a clustering of the individuals based on their set of trajectories. The application to French fishing vessels enables us to obtain groups of vessels whose individual and collective behaviours exhibit spatio-temporal patterns over the period 2014-2018.

Summary

The paper introduces a convolutional CVAE that leverages covariates to isolate intrinsic vessel behaviour from temporal effects.
It employs the Bhattacharyya coefficient and network analysis to quantify and group individual and collective action from GPS trajectories.
Empirical findings show that incorporating seasonality reduces latent dimensionality and maps spatial exploration to distinct latent axes.

This work introduces and evaluates a deep probabilistic modeling pipeline for extracting low-dimensional representations from large-scale trajectory data, applied to regulatory GPS tracks from French fishing vessels. The main methodological contribution is a convolutional conditional variational auto-encoder (CVAE) that takes covariates (seasonality) as input, so that intrinsic behavioural variability is separated from exogenous factors. Downstream, the latent space is used to quantify and compare individual and group-level behaviour, using a distributional similarity measure (the Bhattacharyya coefficient) and network analysis of proximity graphs derived from these similarities.

Methodology Overview and Implementation Details

The model posits that each observed trajectory (a series of temporally aligned 2D positions) is generated by a nonlinear function of a low-dimensional latent variable, conditioned on exogenous covariates. The CVAE is chosen because conditioning on covariates keeps known time-dependent effects from being absorbed into the latent behavioural structure, a limitation observed with unconditioned VAEs.

Probabilistic Model and Training

Each trajectory $y_m \in \mathbb{R}^{24 \times 2}$ (24 hourly positions, longitude/latitude) is associated with a latent vector $z_m \in \mathbb{R}^{3}$ (the dimension retained in the conditional setting) with a standard normal prior, $z_m \sim \mathcal{N}(0, I_3)$.
The decoder $g(z_m, x_m;\theta)$ maps the latent embedding and covariate (encoded as cyclical day-of-year) onto trajectory space, parameterizing the mean of a Gaussian likelihood.
The encoder $h(y_m, x_m; \phi)$ computes the mean and diagonal log-variance for the approximate posterior $q(z_m | y_m, x_m; \phi)$.
Objective: Maximize the ELBO over the dataset, trading off trajectory reconstruction fidelity and KL-regularization towards the prior.
Optimization uses the Adam stochastic gradient method; the reparameterization trick lets gradients flow through the sampling of $z_m$.
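The pieces of the objective listed above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names, the default `sigma2=1.0`, and the convention of dropping additive constants are assumptions, and a real training loop would use an autodiff framework so that gradients can be taken.

```python
import numpy as np

def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), one value per trajectory."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def negative_elbo(y, y_hat, mu, logvar, sigma2=1.0):
    """Per-trajectory negative ELBO, up to an additive constant.

    y, y_hat   : [batch, 24, 2] observed and decoded trajectories
    mu, logvar : [batch, d_z] parameters of the approximate posterior
    sigma2     : fixed decoder variance (hypothetical default value)
    """
    # Gaussian reconstruction term with fixed variance sigma2
    recon = np.sum((y - y_hat) ** 2, axis=(1, 2)) / (2.0 * sigma2)
    return recon + kl_to_standard_normal(mu, logvar)
```

In this form, `y_hat` would be the decoder output for a latent sample produced by `reparameterize`, which gives a single-sample Monte Carlo estimate of the reconstruction term. A smaller `sigma2` weights reconstruction more heavily relative to the KL term.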

CNN Architecture

A key implementation detail is the use of 1D convolutions over the temporal dimension with channels corresponding to spatial coordinates and covariates. Encoder and decoder architectures are symmetric for stability. Covariates are broadcast as additional channels across the trajectory time sequence.

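A PyTorch sketch of the encoder structure (layer sizes are illustrative; the strides are set here so that a 24-step input reduces to a single position):

import torch.nn as nn
import torch.nn.functional as F

class TrajectoryEncoder(nn.Module):
    """1D-convolutional encoder for inputs of shape [batch, 4, 24]
    (longitude, latitude, and two covariate channels over 24 hourly steps)."""

    def __init__(self, latent_dim=3):
        super().__init__()
        self.conv1 = nn.Conv1d(4, 8, kernel_size=4, stride=2)     # length 24 -> 11
        self.bn1 = nn.BatchNorm1d(8)
        self.conv2 = nn.Conv1d(8, 32, kernel_size=4, stride=2)    # length 11 -> 4
        self.bn2 = nn.BatchNorm1d(32)
        # Stride 1 here: with stride 2 only one position would remain,
        # which is too short for the final kernel of size 2.
        self.conv3 = nn.Conv1d(32, 128, kernel_size=3, stride=1)  # length 4 -> 2
        self.bn3 = nn.BatchNorm1d(128)
        # 2 * latent_dim output channels: posterior means and log-variances
        self.conv4 = nn.Conv1d(128, 2 * latent_dim, kernel_size=2, stride=1)  # length 2 -> 1

    def forward(self, x):
        x = F.leaky_relu(self.bn1(self.conv1(x)), 0.2)
        x = F.leaky_relu(self.bn2(self.conv2(x)), 0.2)
        x = F.leaky_relu(self.bn3(self.conv3(x)), 0.2)
        out = self.conv4(x)[..., 0]       # [batch, 2 * latent_dim]
        mu, logvar = out.chunk(2, dim=1)  # each [batch, latent_dim]
        return mu, logvar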

The decoder mirrors this structure, with kernel sizes and channel dimensions adjusted analogously. The day-of-year covariates are concatenated to the trajectory coordinates as extra channels, repeated at every timestep.
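One way to build the four-channel encoder input is sketched below. The cosine/sine pair is an assumed form of the "cyclical day-of-year" encoding (it is consistent with two covariate channels but is not confirmed by the source), and the function names and the `period` value are hypothetical.

```python
import numpy as np

def cyclical_day_of_year(day, period=365.25):
    """Map day-of-year to a point on the unit circle: [..., (cos, sin)]."""
    angle = 2.0 * np.pi * np.asarray(day, dtype=float) / period
    return np.stack([np.cos(angle), np.sin(angle)], axis=-1)

def build_encoder_input(traj, day):
    """Stack coordinates and broadcast covariates as channels.

    traj : [batch, 24, 2] hourly (longitude, latitude) positions
    day  : [batch] day-of-year of each trajectory
    returns [batch, 4, 24] in channels-first layout for 1D convolutions
    """
    steps = traj.shape[1]
    coords = np.transpose(traj, (0, 2, 1))           # [batch, 2, steps]
    cov = cyclical_day_of_year(day)                  # [batch, 2]
    cov = np.repeat(cov[:, :, None], steps, axis=2)  # [batch, 2, steps]
    return np.concatenate([coords, cov], axis=1)
```

The circular encoding places 31 December next to 1 January, which a raw day index would not.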

Latent Representation Analysis

Selection of latent space dimensionality is based on ELBO plateaus and KL regularization saturation, converging to $d_Z=3$ for the CVAE. Per-dimension analysis of Bhattacharyya coefficients and marginal KL divergences demonstrates that only the first two dimensions are informative for most trajectories; the third dimension captures rare or atypical behaviours.
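A per-dimension KL diagnostic of the kind mentioned above could look like the following sketch. The function name is hypothetical, and averaging over trajectories is an assumed way of summarising the marginal divergences.

```python
import numpy as np

def per_dimension_kl(mu, logvar):
    """Average KL divergence to N(0, 1) for each latent dimension.

    mu, logvar : [n_trajectories, d_z] posterior means and log-variances
    returns    : [d_z]; values near zero indicate a dimension whose
                 posterior stays close to the prior for most trajectories
    """
    kl = 0.5 * (np.exp(logvar) + mu**2 - 1.0 - logvar)
    return kl.mean(axis=0)
```

A dimension with a small average value but a few large per-trajectory values would match the description of a dimension that only captures rare behaviours.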

To relate latent variables to interpretable trajectory features, grid sampling and decoding are combined with a random forest-based Sobol-MDA variable importance analysis. This links the first latent dimension to the east-west gradient of exploration and the second to north-south activity, and shows that temporal covariates have minimal effect on maximal vessel range.

Similarity Quantification and Network Construction

Unlike point embedding comparisons (Euclidean, cosine), the approach computes the Bhattacharyya coefficient (BC) between full posterior distributions in latent space, so that posterior uncertainty enters the comparison. For Gaussian posteriors with diagonal covariance the BC has a closed form, and it gives a well-defined overlap measure between 0 (no overlap) and 1 (identical distributions).
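For two Gaussians the coefficient is $\mathrm{BC} = \exp(-D_B)$ with $D_B = \tfrac{1}{8}(\mu_1-\mu_2)^\top \Sigma^{-1} (\mu_1-\mu_2) + \tfrac{1}{2}\ln\frac{\det \Sigma}{\sqrt{\det\Sigma_1 \det\Sigma_2}}$ and $\Sigma = (\Sigma_1+\Sigma_2)/2$. A minimal sketch for the diagonal case follows; the function name and the log-variance parameterization of the inputs are assumptions.

```python
import numpy as np

def bhattacharyya_coefficient(mu1, logvar1, mu2, logvar2):
    """BC between two diagonal Gaussians given means and log-variances.

    Inputs have shape [..., d_z]; the result has shape [...].
    """
    var1, var2 = np.exp(logvar1), np.exp(logvar2)
    var_avg = 0.5 * (var1 + var2)
    # Mahalanobis-type term between the means
    mean_term = 0.125 * np.sum((mu1 - mu2) ** 2 / var_avg, axis=-1)
    # Log-determinant term, summed over the diagonal entries
    cov_term = 0.5 * np.sum(np.log(var_avg) - 0.5 * (logvar1 + logvar2), axis=-1)
    return np.exp(-(mean_term + cov_term))
```

Two posteriors with the same mean but very different variances get a coefficient below 1, which a distance between means alone would miss.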

Individual stability is encapsulated via the average (over day pairs) binarized BC within a vessel.
Collective behaviour is summarised by proximity graphs (edges between vessels exceeding BC threshold), aggregated over time spans (e.g., quarters).
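The two summaries above can be sketched as follows. The function names and the threshold argument are hypothetical, and the way BC values are aggregated into one vessel-by-vessel matrix per period is not specified here, so the second function simply assumes such a matrix is given.

```python
import numpy as np

def stability_index(bc_within, threshold):
    """Share of distinct day pairs of one vessel whose BC exceeds the threshold.

    bc_within : [n_days, n_days] symmetric matrix of BC values
    """
    upper = np.triu_indices(bc_within.shape[0], k=1)  # each day pair once
    return float(np.mean(bc_within[upper] > threshold))

def proximity_graph(bc_between, threshold):
    """Boolean adjacency matrix with an edge where BC exceeds the threshold.

    bc_between : [n_vessels, n_vessels] symmetric matrix of BC values for
                 one period (assumed to be aggregated beforehand)
    """
    adjacency = bc_between > threshold
    np.fill_diagonal(adjacency, False)  # no self-loops
    return adjacency
```

Applying `proximity_graph` once per quarter would give the collection of networks that a model such as colSBM takes as input.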

The colSBM (Stochastic Block Model for collections of networks) clusters vessels by connection patterns in these proximity graphs, yielding robust mesoscale summaries across quarters.

Empirical Results

Applying the method to 20,512 daily trajectories from 33 vessels over five years yields the following findings:

The CVAE with seasonality as a covariate removes spurious temporal modes in the latent representation, yielding more behaviourally meaningful clusters than unconditioned VAEs.
Latent dimensions empirically align with major axes of spatial exploitation (east-west, north-south), facilitating interpretation.
Individual-level analysis shows high day-to-day stability of behaviour within vessels, but substantial diversity across the fleet over the full period.
Network analysis reveals two strongly connected, nearly non-interacting communities of vessels, along with peripheral groups; this structure is stable across quarters.
Vessels with shorter, port-proximal trips constitute one group, associated empirically with lower gross revenue per trip, indicative of possible cost-revenue trade-off strategies.

A notable quantitative result is the reduction of latent dimensionality (from VAE: 4, to CVAE: 3) attributable to explicit covariate modeling, confirming that part of the trajectory variance is explained by temporal exogenous factors.

Implementation Considerations and Trade-Offs

The convolutional approach, as opposed to LSTM-based encoders, is justified by the requirement to capture medium/large scale structures and preserve trajectory cyclical boundaries.
Regularization via the KL term (weighting via fixed $\sigma^2$ in the decoder) is critical; overly low variance leads to overfitting, while high values oversmooth behaviour.
The Bhattacharyya coefficient is attractive as a similarity metric for its interpretability and closed-form evaluation, but alternative distances between Gaussian distributions (e.g., Wasserstein, Hellinger) may be more robust in other use cases.
Binary edge definition in the proximity graph via empirical BC thresholding is straightforward but could mask nuanced partial similarities; modeling edge strengths as continuous-valued or probabilistic (e.g., via Beta or continuous Bernoulli) is a potential extension.

Practical Implications and Future Directions

The approach directly supports construction of data-driven "reference fleets" for stratified stock abundance estimation, with an explicit mechanism for integrating both space and inter-vessel behaviour. The generative nature of the CVAE allows, in principle, hypothesis testing for independence of trajectories and simulation of alternative behavioural strategies under counterfactual conditions.

Prospects for extending the framework include:

Incorporation of additional covariates (environmental, vessel characteristics)
Attention-based or other context-aware sequence models for richer temporal relationships
Non-binary, probabilistically weighted proximity graphs and corresponding generalizations of SBM
Supervised or semi-supervised extensions to align latent codes with economic or ecological outcomes

Theoretical Implications

Combining low-dimensional probabilistic embeddings with distributional similarity metrics and statistical network models (colSBM) offers a general framework for analysing collective behaviour in complex agent systems. The methodology supports hypothesis-driven and interpretable clustering, accounting for both the variance explained by exogenous context and the intrinsic latent structure.

Conclusion

This study provides an interpretable pipeline, demonstrated at fleet scale, with three components: conditional variational auto-encoding of spatio-temporal sequence data, uncertainty-aware trajectory similarity via the Bhattacharyya coefficient, and network-based group detection. The design choices (convolutional encoder/decoder, inclusion of covariates, use of distributional similarity) improve interpretability and practical utility for marine resource management, and the approach can in principle be generalised to other domains of collective animal or agent movement analysis.
