- The paper presents xDNN, a novel prototype-based architecture that enhances model explainability while reducing computational requirements.
- Experimental evaluations on iRoads and Caltech-256 demonstrate significant gains, with accuracies of 99.59% and 75.41% respectively.
- The methodology employs density, typicality, and prototype layers to build interpretable representations through a non-iterative training process, improving transparency and trust.
Towards Explainable Deep Neural Networks (xDNN)
The paper "Towards Explainable Deep Neural Networks (xDNN)" introduces an innovative approach to deep learning by emphasizing the explainability and efficiency of neural networks. The authors, Plamen Angelov and Eduardo Soares, present a methodology that directly addresses the limitations of traditional deep learning models, notably their 'black-box' nature and high computational demands. The proposed xDNN model stands out for its non-iterative and non-parametric design, an architecture that not only streamlines the learning process but also paves the way for human-interpretable models.
Core Contributions
The xDNN model deviates from conventional deep learning architectures by employing a prototype-based approach. Prototypes in this context are specific data samples from the training set that serve as local peaks in the empirical data distribution, which is identified through a concept termed 'typicality'. Typicality acts as a closed-form representation akin to a probability distribution function (pdf), derived empirically from the training data. This strategy enables xDNN to achieve a high degree of accuracy without the need for large-scale computational resources such as GPUs, thus reducing both the time and energy costs typically associated with deep learning models.
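The density-then-typicality idea can be illustrated with a minimal sketch. This is not the authors' implementation (the paper uses recursive, sample-by-sample updates; the function names and the toy data below are illustrative assumptions), but it shows the core mechanic: a Cauchy-form density computed from the empirical mean and scatter of the data, normalized into a typicality score that behaves like an empirically derived pdf.

```python
import numpy as np

def cauchy_density(X):
    """Cauchy-form empirical density for each sample (a sketch of the
    density concept; the paper's recursive online update is omitted)."""
    mu = X.mean(axis=0)                      # empirical mean
    var = X.var(axis=0).sum() + 1e-12       # scalar scatter of the data
    sq_dist = ((X - mu) ** 2).sum(axis=1)   # squared distance to the mean
    return 1.0 / (1.0 + sq_dist / var)

def typicality(X):
    """Normalize density so it sums to 1, giving a pdf-like score."""
    d = cauchy_density(X)
    return d / d.sum()

# Toy 2-D data: a tight cluster plus one distant outlier.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
tau = typicality(X)
assert tau.argmax() != 3              # the outlier is never the most typical
assert abs(tau.sum() - 1.0) < 1e-9    # typicality integrates to 1
```

Samples near the bulk of the data receive high typicality; local peaks of this score are exactly the candidates xDNN promotes to prototypes, which is why training requires no gradient iterations.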
Experimental Results
xDNN was evaluated on benchmark datasets, including iRoads and Caltech-256, demonstrating significant performance improvements over existing methods. On the iRoads dataset, xDNN achieved an accuracy of 99.59%, surpassing other methodologies including the VGG-VD-16 architecture while requiring substantially less training time and no dedicated GPUs. On the challenging Caltech-256 dataset, which features 257 classes, xDNN set a new performance record with an accuracy of 75.41%, surpassing previous approaches such as those employing SVM and Softmax classifiers.
Methodological Innovations
At a technical level, xDNN's architecture is structured in layers that include:
- Features Descriptor Layer: Extracts high-level features from images using pre-trained models (e.g., a VGG network) or traditional handcrafted descriptors such as GIST or HOG.
- Density Layer: Computes data density using a Cauchy form, facilitating the identification of mutual proximity among data points in the feature space.
- Typicality Layer: Develops an empirically derived pdf representing typicality, crucial for prototype identification.
- Prototypes Layer: Provides the core transparency in the model by forming prototype-based rules, enabling straightforward visualization and interpretation.
- MegaClouds Layer: Aggregates the data clouds formed around prototypes into larger groups (MegaClouds), visualized via Voronoi tessellation, for easier interpretation by users.
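The prototype and classification stages above can be sketched end to end. The following is a simplified stand-in for the paper's pipeline, assuming one prototype per class chosen as the per-class density peak (the paper allows many prototypes per class and recursive updates); function names and the toy data are illustrative.

```python
import numpy as np

def fit_prototypes(X, y):
    """Pick one prototype per class as the most 'typical' training sample,
    i.e. the local peak of a per-class Cauchy-form density (simplified:
    the paper permits multiple prototypes per class)."""
    prototypes, labels = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        var = Xc.var(axis=0).sum() + 1e-12
        density = 1.0 / (1.0 + ((Xc - mu) ** 2).sum(axis=1) / var)
        prototypes.append(Xc[density.argmax()])   # density peak = prototype
        labels.append(c)
    return np.array(prototypes), np.array(labels)

def predict(x, prototypes, labels):
    """Winner-takes-all over similarity to prototypes. Each prediction maps
    to a readable rule: IF x resembles prototype_k THEN class_k."""
    sims = 1.0 / (1.0 + ((prototypes - x) ** 2).sum(axis=1))
    return labels[sims.argmax()]

# Toy data: two well-separated 2-D classes.
X = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9]])
y = np.array([0, 0, 1, 1])
P, L = fit_prototypes(X, y)
assert predict(np.array([0.1, 0.1]), P, L) == 0
assert predict(np.array([2.9, 3.0]), P, L) == 1
```

Because every prediction reduces to "closest prototype wins", the decision can be explained by showing the user the actual training image behind the winning prototype, which is the source of xDNN's transparency.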
Implications and Future Directions
By emphasizing explainability and efficiency, xDNN aligns with the growing demand for transparent AI systems, particularly in high-stakes applications such as autonomous vehicles and healthcare. The interpretability of xDNN models is a pivotal advantage, allowing stakeholders to understand and trust decisions made by the network, a vital step forward compared to opaque deep learning models.
From a theoretical perspective, the reliance on prototypes and empirically derived pdfs is a significant shift towards more flexible, data-driven model configurations. The implications for real-time adaptability and scalability in dynamic environments are profound.
Moving forward, xDNN's design principles could inspire further research into hybrid models that synergize learning and reasoning. Future work could expand the xDNN framework to handle diverse data modalities or integrate it with other AI techniques to further enhance adaptability and performance. The prospect of using xDNN to build lightweight AI solutions with robust interpretability opens exciting avenues for both academic exploration and industrial application.