ArrhythmiaVision: Resource-Conscious Deep Learning Models with Visual Explanations for ECG Arrhythmia Classification

Published 30 Apr 2025 in cs.LG, cs.AI, and eess.SP | (2505.03787v1)

Abstract: Cardiac arrhythmias are a leading cause of life-threatening cardiac events, highlighting the urgent need for accurate and timely detection. Electrocardiography (ECG) remains the clinical gold standard for arrhythmia diagnosis; however, manual interpretation is time-consuming, dependent on clinical expertise, and prone to human error. Although deep learning has advanced automated ECG analysis, many existing models abstract away the signal's intrinsic temporal and morphological features, lack interpretability, and are computationally intensive, hindering their deployment on resource-constrained platforms. In this work, we propose two novel lightweight 1D convolutional neural networks, ArrhythmiNet V1 and V2, optimized for efficient, real-time arrhythmia classification on edge devices. Inspired by MobileNet's depthwise separable convolutional design, these models maintain memory footprints of just 302.18 KB and 157.76 KB, respectively, while achieving classification accuracies of 0.99 (V1) and 0.98 (V2) on the MIT-BIH Arrhythmia Dataset across five classes: Normal Sinus Rhythm, Left Bundle Branch Block, Right Bundle Branch Block, Atrial Premature Contraction, and Premature Ventricular Contraction. In order to ensure clinical transparency and relevance, we integrate Shapley Additive Explanations and Gradient-weighted Class Activation Mapping, enabling both local and global interpretability. These techniques highlight physiologically meaningful patterns such as the QRS complex and T-wave that contribute to the model's predictions. We also discuss performance-efficiency trade-offs and address current limitations related to dataset diversity and generalizability. Overall, our findings demonstrate the feasibility of combining interpretability, predictive accuracy, and computational efficiency in practical, wearable, and embedded ECG monitoring systems.

Authors (4)

Summary

The paper introduces two novel 1D CNN architectures, ArrhythmiNet V1 and V2, achieving 99% and 98% accuracy respectively on the MIT-BIH dataset.
It integrates explainable AI techniques like SHAP and Grad-CAM to visually highlight key ECG features such as the QRS complex and T-wave.
The lightweight design enables real-time, edge-device deployment, facilitating timely diagnosis in remote healthcare monitoring systems.

Introduction

The paper "ArrhythmiaVision: Resource-Conscious Deep Learning Models with Visual Explanations for ECG Arrhythmia Classification" (2505.03787) addresses the challenge of accurately classifying cardiac arrhythmias from electrocardiogram (ECG) signals with resource-efficient deep learning models. Early diagnosis of arrhythmias is critical for preventing life-threatening cardiac events, but manual ECG interpretation is time-consuming, depends on clinical expertise, and is prone to human error. The study introduces two 1D convolutional neural network (CNN) architectures, ArrhythmiNet V1 and V2, designed for deployment on edge devices with limited computational resources. The models reach classification accuracies of 0.99 (V1) and 0.98 (V2) on the MIT-BIH Arrhythmia Dataset with memory footprints of 302.18 KB and 157.76 KB, respectively, which makes them candidates for real-time monitoring in resource-constrained environments.

Methodology

Dataset and Preprocessing

The research uses the MIT-BIH Arrhythmia Database, a standard benchmark for arrhythmia classification. It includes data from approximately 1.1 million ECG beats sampled at 360 Hz across five classes: Normal Sinus Rhythm, Left Bundle Branch Block, Right Bundle Branch Block, Atrial Premature Contraction, and Premature Ventricular Contraction.

Figure 1: (Left) Distribution of the unbalanced raw MIT-BIH dataset. (Right) Distribution after applying balancing techniques.

The dataset's inherent class imbalance was addressed through oversampling minority classes and undersampling majority classes, ensuring that every class was well-represented during training. The ECG signals were denoised using a wavelet transform and subsequently normalized before model training.
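
The preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the summary does not name the wavelet family, the threshold, the target class size, or the normalization scheme, so the one-level Haar transform with soft thresholding, the `target` parameter, and z-score normalization are all assumptions, and every function name is hypothetical.

```python
import math
import random
from collections import defaultdict


def balance_classes(beats, labels, target, seed=0):
    """Resample every class to exactly `target` beats (assumed strategy).

    Classes with at least `target` beats are undersampled without
    replacement; smaller classes keep all beats and are topped up by
    sampling with replacement.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for beat, label in zip(beats, labels):
        by_class[label].append(beat)

    out_beats, out_labels = [], []
    for label in sorted(by_class):
        items = by_class[label]
        if len(items) >= target:
            chosen = rng.sample(items, target)
        else:
            chosen = items + rng.choices(items, k=target - len(items))
        out_beats.extend(chosen)
        out_labels.extend([label] * target)
    return out_beats, out_labels


def haar_denoise(signal, threshold):
    """One-level Haar wavelet denoising with soft thresholding (assumed wavelet)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    # Soft threshold: shrink each detail coefficient toward zero.
    detail = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])  # inverse Haar step
    return out


def zscore(signal):
    """Normalize one beat to zero mean and unit variance (assumed scheme)."""
    mean = sum(signal) / len(signal)
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / len(signal))
    if std == 0:
        return [0.0] * len(signal)
    return [(x - mean) / std for x in signal]
```

A typical order would be to denoise and normalize each beat, then call `balance_classes` on the training split only, so that duplicated beats do not leak into the evaluation data.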

Figure 2: (Top Left) Normal Sinus Rhythm, the standard ECG signal exhibiting all regular morphological features. (Top Right) Premature Ventricular Contraction, characterized by an early contraction of the ventricles. (Middle Left) Right Bundle Branch Block. (Middle Right) Left Bundle Branch Block. (Bottom) Atrial Premature Contraction.

Architectural Design

ArrhythmiNet V1 and V2 use depthwise separable convolutions, inspired by MobileNet, to reduce computational load while preserving the temporal and morphological ECG features that arrhythmia detection depends on. ArrhythmiNet V1 consists of five depthwise separable convolution blocks and achieves high classification accuracy with a model size of 302.18 KB.

Figure 3: ArrhythmiNet V1 architecture featuring five depthwise separable convolution blocks.
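
To make the saving from depthwise separable convolutions concrete, the sketch below counts the parameters of a standard 1D convolution and of its depthwise separable counterpart. The layer sizes (64 input channels, 128 output channels, kernel size 5) are hypothetical and are not taken from the paper; the function names are illustrative only.

```python
def standard_conv1d_params(c_in, c_out, kernel_size, bias=True):
    """Parameters of an ordinary 1D convolution."""
    return kernel_size * c_in * c_out + (c_out if bias else 0)


def separable_conv1d_params(c_in, c_out, kernel_size, bias=True):
    """Parameters of a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel_size * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 5
std = standard_conv1d_params(c_in, c_out, k)
sep = separable_conv1d_params(c_in, c_out, k)
print(std, sep, round(std / sep, 2))  # 41088 8704 4.72
```

Ignoring biases, the separable layer needs a fraction 1/c_out + 1/k of the standard layer's parameters, so the saving grows with both the kernel size and the number of output channels.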

ArrhythmiNet V2 incorporates residual connections and bottleneck blocks, further enhancing feature extraction and convergence speed, while reducing the model size to 157.76 KB.

Figure 4: ArrhythmiNet V2 architecture showing residual connections and seven bottleneck blocks.
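
The idea of a bottleneck block with a residual connection can be illustrated with a small forward pass in plain Python. This is a sketch under assumptions: the summary does not give the block's layer order, widths, or activation, so the pointwise, depthwise, pointwise layout with ReLU below follows the common MobileNet-style pattern rather than the paper's exact design, and all names are hypothetical. Signals are represented as lists of channels, each a list of samples.

```python
def pointwise(x, weights):
    """1x1 convolution: mixes channels at each time step. weights is [c_out][c_in]."""
    steps = len(x[0])
    return [[sum(w * ch[t] for w, ch in zip(row, x)) for t in range(steps)]
            for row in weights]


def depthwise(x, kernels):
    """Per-channel 1D convolution with zero 'same' padding (odd kernel sizes)."""
    out = []
    for channel, kernel in zip(x, kernels):
        half = len(kernel) // 2
        out.append([
            sum(k * channel[t + j - half]
                for j, k in enumerate(kernel)
                if 0 <= t + j - half < len(channel))
            for t in range(len(channel))
        ])
    return out


def relu(x):
    return [[max(v, 0.0) for v in channel] for channel in x]


def residual_bottleneck(x, w_in, dw_kernels, w_out):
    """y = x + F(x), where F is pointwise -> depthwise -> pointwise."""
    h = relu(pointwise(x, w_in))        # change channel width
    h = relu(depthwise(h, dw_kernels))  # filter each channel along time
    h = pointwise(h, w_out)             # project back to the input width
    # Skip connection: add the block input to the transformed signal.
    return [[a + b for a, b in zip(cx, ch)] for cx, ch in zip(x, h)]


x = [[1.0, 2.0, 3.0]]  # one channel, three samples
y = residual_bottleneck(x, w_in=[[1.0]], dw_kernels=[[0.0, 1.0, 0.0]], w_out=[[1.0]])
print(y)  # [[2.0, 4.0, 6.0]]
```

Because the skip path carries the input through unchanged, the block reduces to the identity when the final projection weights are zero, which is one common explanation for why residual connections ease optimization.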

Explainable AI Techniques

The models are paired with two explainable AI (XAI) methods, SHAP (Shapley Additive Explanations) and Grad-CAM (Gradient-weighted Class Activation Mapping), which together provide both local and global interpretability of the predictions. These techniques show which parts of the ECG signal, such as the QRS complex and T-wave, contribute most to the model's decisions.

Figure 5: (Left) Randomly sampled original ECG signal. (Middle) SHAP output for ArrhythmiNet V1. (Right) SHAP output for ArrhythmiNet V2.
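
SHAP attributions approximate Shapley values, which can be computed exactly when the number of features is small. The sketch below does this by enumerating feature subsets for a toy scoring function over three hypothetical beat segments (P, QRS, T). The toy model, its weights, and the zero baseline are assumptions made purely for illustration; this is not the SHAP library and not the paper's model.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x`, relative to `baseline`."""
    n = len(x)

    def value(subset):
        # Features in `subset` keep their real value; the rest use the baseline.
        masked = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(masked)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(v):
    """Hypothetical score over [P, QRS, T] segment amplitudes."""
    p, qrs, t = v
    return 0.2 * p + 1.5 * qrs + 0.5 * t + qrs * t


x = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print([round(p, 6) for p in phi])  # [0.2, 4.0, 1.5]
```

The attributions sum to `toy_model(x) - toy_model(baseline)`, and the interaction term `qrs * t` is split evenly between the two features involved. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on approximations for full-length ECG inputs.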

Figure 6: (Left) Grad-CAM output for ArrhythmiNet V1. (Right) Grad-CAM results for ArrhythmiNet V2.
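
For a 1D CNN, Grad-CAM reduces to a short computation once the last convolutional layer's activations and the gradients of the target class score with respect to them are available. The sketch below assumes both are given as `[channels][time]` lists; obtaining them from a real framework, and upsampling the map to the input length, are left out. The final scaling to [0, 1] is a common visualization convention and an assumption here, and the function name is hypothetical.

```python
def grad_cam_1d(activations, gradients):
    """Grad-CAM over the time axis.

    activations, gradients: [channels][time] lists taken from the last
    convolutional layer for the class of interest.
    """
    # Channel importance: global average of the gradients over time.
    weights = [sum(g) / len(g) for g in gradients]
    steps = len(activations[0])
    # Weighted sum of activation maps, followed by ReLU.
    cam = [
        max(sum(w * a[t] for w, a in zip(weights, activations)), 0.0)
        for t in range(steps)
    ]
    peak = max(cam)
    # Scale to [0, 1] for plotting; an all-zero map is returned unchanged.
    return [c / peak for c in cam] if peak > 0 else cam


activations = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
gradients = [[2.0, 2.0, 2.0], [-1.0, -1.0, -1.0]]
print(grad_cam_1d(activations, gradients))  # [0.0, 1.0, 0.0]
```

In this toy case the first channel has a positive average gradient and fires at the middle time step, so the map peaks there; the second channel has a negative weight and its contribution is removed by the ReLU.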

Results

On the MIT-BIH dataset, ArrhythmiNet V1 achieved a classification accuracy of 99% and ArrhythmiNet V2 reached 98%. Both models showed high precision across all five classes. V1 slightly outperformed V2, possibly because of better feature localization.

Figure 7: (Left) Confusion matrix for ArrhythmiNet V1, showing minimal misclassifications across all classes. (Right) Confusion matrix for ArrhythmiNet V2, exhibiting slightly lower accuracy than the former but still yielding promising results.
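
Accuracy and per-class precision and recall follow directly from a confusion matrix. The sketch below uses a small three-class matrix with made-up counts, not the paper's results, and assumes that rows are true classes and columns are predicted classes; the function names are hypothetical.

```python
def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def precision_recall(cm):
    """Per-class (precision, recall); rows are true labels, columns are predictions."""
    n = len(cm)
    metrics = []
    for c in range(n):
        tp = cm[c][c]
        predicted = sum(cm[r][c] for r in range(n))  # column sum
        actual = sum(cm[c])                          # row sum
        metrics.append((
            tp / predicted if predicted else 0.0,
            tp / actual if actual else 0.0,
        ))
    return metrics


# Made-up counts for illustration only.
toy_cm = [
    [50, 2, 0],
    [1, 45, 4],
    [0, 3, 47],
]
print(round(accuracy(toy_cm), 4))
for precision, recall in precision_recall(toy_cm):
    print(round(precision, 4), round(recall, 4))
```

Reporting per-class values alongside overall accuracy matters for arrhythmia data, because a model can score well overall while still missing a rare class.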

Conclusion

The study shows that ECG arrhythmia classifiers can be accurate, resource-efficient, and interpretable at the same time. The two lightweight CNN architectures are candidates for integration into wearable and real-time monitoring systems, offering a feasible route to remote healthcare monitoring and timely diagnosis of cardiac events. The authors note limitations related to dataset diversity and generalizability. Future research may explore multi-lead ECG configurations and validation on additional real-world datasets.
