Data Augmentation of Multivariate Sensor Time Series using Autoregressive Models and Application to Failure Prognostics

Published 21 Oct 2024 in stat.ML, cs.LG, math.ST, and stat.ME | (2410.16419v2)

Abstract: This work presents a novel data augmentation solution for non-stationary multivariate time series and its application to failure prognostics. The method extends previous work from the authors which is based on time-varying autoregressive processes. It can be employed to extract key information from a limited number of samples and generate new synthetic samples in a way that potentially improves the performance of PHM solutions. This is especially valuable in situations of data scarcity which are very usual in PHM, especially for failure prognostics. The proposed approach is tested based on the CMAPSS dataset, commonly employed for prognostics experiments and benchmarks. An AutoML approach from PHM literature is employed for automating the design of the prognostics solution. The empirical evaluation provides evidence that the proposed method can substantially improve the performance of PHM solutions.


Summary

The paper introduces a TVAR-based approach that augments multivariate sensor time series to alleviate data scarcity in failure prognostics.
It employs time-varying parameters to model both the mean and covariance, providing an interpretable analytical representation of dynamic data.
Empirical results on the C-MAPSS dataset show improvements in both RMSE and the PHM scoring metric.

Introduction

The paper addresses data scarcity in Prognostics and Health Management (PHM), especially in failure prognostics, by introducing a data augmentation method for non-stationary multivariate time series. The technique extends the authors' previous work on time-varying autoregressive (TVAR) models: it extracts key information from a limited number of samples and generates new synthetic samples, which can potentially improve the performance of PHM solutions.

Methodology

The TVAR model extends traditional AR and ARMA models by letting the model parameters vary over time. This allows it to capture the dynamics of non-stationary multivariate time series directly, without first transforming the data to make it stationary. The model is formulated so that it describes the mean and the covariance of the time series simultaneously, so generated samples can reflect both the trend of each sensor and the dependence between sensors.
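For orientation, a generic time-varying vector autoregression of order p can be written as below. This is the general textbook form, given here as an assumption; the symbols are chosen for illustration and the paper's exact parameterization of the time-varying terms may differ.

```latex
\begin{aligned}
\mathbf{x}_t &= \mathbf{c}(t) + \sum_{i=1}^{p} \mathbf{A}_i(t)\,\mathbf{x}_{t-i} + \boldsymbol{\varepsilon}_t, \\
\boldsymbol{\varepsilon}_t &\sim \mathcal{N}\!\bigl(\mathbf{0},\, \boldsymbol{\Sigma}(t)\bigr)
\end{aligned}
```

Here x_t is the vector of sensor readings at time t, c(t) and A_i(t) shape the time-varying mean, and Σ(t) is the time-varying noise covariance. Holding c, A_i, and Σ constant recovers an ordinary stationary VAR(p).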

The TVAR model's key feature is its ability to provide an analytical representation, useful in various engineering applications where interpretable solutions are crucial. This method contrasts with deep learning approaches, which generally require extensive amounts of data and lack analytical representations.

The data augmentation method is evaluated on the C-MAPSS dataset, which is commonly used for failure prognostics experiments and benchmarks. An AutoML approach from the PHM literature automates the design of the prognostics solution, so that the evaluation depends less on hand-tuned model choices.

Empirical Evaluation

The paper validates the augmentation method empirically on the C-MAPSS dataset, where the TVAR model is shown to generate realistic augmented data. Only five samples from the historical dataset are used to create the augmented data, reflecting real-world data scarcity.
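The idea of fitting a time-varying autoregression to a handful of runs and then sampling new ones can be sketched as follows. This is a minimal illustration under strong assumptions, not the authors' estimator: it uses a TVAR(1) whose coefficients are linear in normalized time, ordinary least squares, and a constant noise covariance. The function names (`design`, `fit_tvar1`, `sample_tvar1`, `toy_run`) and the toy data are hypothetical.

```python
import numpy as np

def design(x_prev, tau):
    # Regressors for x_t = (A0 + tau*A1) x_{t-1} + c0 + tau*c1 + e_t
    return np.concatenate([x_prev, tau * x_prev, [1.0, tau]])

def fit_tvar1(runs):
    """Least-squares fit of a TVAR(1) with coefficients linear in normalized time."""
    rows, targets = [], []
    for x in runs:                       # x has shape (T_k, d); lengths may differ
        T = len(x)
        for t in range(1, T):
            tau = t / (T - 1)            # normalized time in (0, 1]
            rows.append(design(x[t - 1], tau))
            targets.append(x[t])
    Z, Y = np.asarray(rows), np.asarray(targets)
    theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)        # shape (2d + 2, d)
    resid = Y - Z @ theta
    # Simplification: one constant noise covariance instead of a time-varying one.
    sigma = np.atleast_2d(np.cov(resid, rowvar=False))
    return theta, sigma

def sample_tvar1(theta, sigma, x0, length, rng):
    """Generate one synthetic run by iterating the fitted recursion."""
    out = np.empty((length, len(x0)))
    out[0] = x0
    for t in range(1, length):
        tau = t / (length - 1)
        mean = design(out[t - 1], tau) @ theta
        out[t] = rng.multivariate_normal(mean, sigma)
    return out

# Toy stand-in for five run-to-failure units with two sensors (not C-MAPSS data).
rng = np.random.default_rng(0)

def toy_run(T):
    tau = np.linspace(0.0, 1.0, T)
    trend = np.stack([tau ** 2, 0.5 * tau], axis=1)
    return trend + 0.05 * rng.standard_normal((T, 2))

runs = [toy_run(T) for T in (150, 180, 200, 170, 190)]
theta, sigma = fit_tvar1(runs)
synthetic = sample_tvar1(theta, sigma, runs[0][0], 175, rng)
print(synthetic.shape)
```

Pooling all five runs on a shared normalized time axis is what lets so few samples constrain the coefficients; a richer basis than linear-in-time, or a time-varying covariance, would be natural refinements.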

The AutoML framework facilitates the design and testing of ML models for degradation trend prediction. The performance metrics used for assessment include RMSE and a scoring function from the PHM Society data challenge, both demonstrating substantial improvements in prognostic capabilities when the augmented data is employed.
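The two metrics can be sketched as below, assuming the scoring function is the asymmetric exponential score from the 2008 PHM Society data challenge, which penalizes late remaining-useful-life (RUL) predictions more heavily than early ones. The function names and the example values are illustrative, not taken from the paper.

```python
import numpy as np

def rmse(rul_true, rul_pred):
    """Root mean squared error between true and predicted RUL."""
    err = np.asarray(rul_pred, dtype=float) - np.asarray(rul_true, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def phm08_score(rul_true, rul_pred):
    """Asymmetric score (assumed PHM08 form); lower is better.

    d = predicted - true. Early predictions (d < 0) cost exp(-d/13) - 1,
    late predictions (d >= 0) cost exp(d/10) - 1.
    """
    d = np.asarray(rul_pred, dtype=float) - np.asarray(rul_true, dtype=float)
    penalties = np.where(d < 0, np.exp(-d / 13.0) - 1.0, np.exp(d / 10.0) - 1.0)
    return float(np.sum(penalties))

# Hypothetical RUL values in cycles, for illustration only.
true_rul = [100, 50, 20]
early = [90, 40, 10]    # each prediction 10 cycles too early
late = [110, 60, 30]    # each prediction 10 cycles too late

print(rmse(true_rul, early), rmse(true_rul, late))           # identical RMSE
print(phm08_score(true_rul, early), phm08_score(true_rul, late))
```

Both prediction sets have the same RMSE, but the late one receives the higher (worse) score, which is why the two metrics are usually reported together.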

Figure 1: Run to failure data from four sensors (11, 29, 21, and 25) for dataset #1 (FD001). Each plot presents data from 3 real units (47, 60, and 68) and one time series resulting from data augmentation (Aug. Unit).

Results

The experimental results indicate improved prognostics accuracy when the proposed augmentation method is used. On average, RMSE improved by 2% for dataset #1 and by 6% for dataset #3. The scoring metric also improved, by 20% for dataset #3, which supports the efficacy of the method across different operational conditions.

Conclusion

The proposed TVAR-based data augmentation method targets failure prognostics in situations of data scarcity. It provides an analytical and practical framework for enhancing PHM solutions, and the experiments show improvements in prediction accuracy. Future work will explore additional datasets and further refine the augmentation strategies, particularly under varying operational conditions. The method's scalability and adaptability suggest potential for industrial settings where failure prognostics is critical.
