- The paper demonstrates that high recognition accuracy is achievable with lower computational complexity, as exemplified by SE-ResNeXt-50.
- It reveals that increased model complexity does not guarantee better accuracy, with examples like VGG-13 highlighting inefficiencies.
- It highlights trade-offs in throughput and memory usage, guiding optimal DNN selection for both high-end and embedded systems.
Analysis of Benchmarking Deep Neural Networks for Image Recognition
This paper presents a comprehensive evaluation of a wide array of deep neural networks (DNNs) applied to image recognition tasks, notably using the ImageNet-1k dataset. The research focuses on a multifaceted analysis of over 40 architectures, emphasizing performance indices such as recognition accuracy, computational and model complexity, memory usage, and inference time. By conducting experiments on two distinct hardware configurations (an NVIDIA Titan X Pascal and an NVIDIA Jetson TX1), the study contrasts DNN operation across high-end and embedded systems, offering a nuanced view of architecture efficiency.
Key Findings
- Accuracy and Computational Complexity: The study refutes the assumption of a direct correlation between computational complexity (measured in FLOPs) and recognition accuracy. For instance, the SE-ResNeXt-50 (32x4d) achieves high accuracy with relatively low computational demands, highlighting efficiency advancements in network architecture design.
- Model Complexity: Accuracy does not scale linearly with model complexity (i.e., the number of learnable parameters). This finding is exemplified by models like VGG-13, which carry high complexity without a commensurate gain in accuracy.
- Parameter Utilization: Efficiency in parameter usage varies among DNNs. Models such as MobileNet-v2 and NASNet-A-Mobile achieve high accuracy density (accuracy relative to parameter count), implying more adept parameter use than larger, less efficient networks.
- Throughput Constraints: Desired inference throughput imposes constraints on maximal accuracy. This trade-off is crucial for application scenarios requiring real-time processing, where models like Xception offer favorable accuracy and processing speed.
- Memory Utilization: Analyzing memory usage reveals a linear dependency between model complexity and memory footprint, albeit with distinct slopes for different architecture families. This insight assists in estimating resource needs across varying deployment scenarios.
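The accuracy-density comparison above can be sketched as a simple calculation. Note that the model figures below are illustrative placeholders chosen for the example, not measured values from the paper:

```python
# Accuracy density: Top-1 accuracy points per million parameters.
# A higher value indicates more efficient use of parameters.

def accuracy_density(top1_accuracy: float, num_params: int) -> float:
    """Return Top-1 accuracy (%) per million parameters."""
    return top1_accuracy / (num_params / 1e6)

# Hypothetical (accuracy %, parameter count) pairs for illustration only.
models = {
    "CompactNet": (71.8, 3_500_000),
    "HeavyNet": (69.9, 133_000_000),
}

for name, (acc, params) in models.items():
    print(f"{name}: {accuracy_density(acc, params):.2f} acc-%/M-params")
```

Under this metric, a small network that nearly matches a much larger one in accuracy scores dramatically higher, which is the efficiency argument the paper makes for mobile-oriented architectures.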
Experimental Setup
The experiments use an easily replicable setup, with all models trained in or converted to PyTorch. Performance indices are measured consistently, including Top-1 and Top-5 accuracy, model size, and the resource demands of processing different batch sizes. Inference times are averaged over multiple runs to obtain statistically robust estimates.
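The multiple-run timing protocol can be sketched with a generic harness like the one below. The warm-up and run counts are illustrative choices, and the stand-in workload is a placeholder: a real measurement would call the model's forward pass on a fixed batch (with GPU synchronization before each timestamp when measuring on CUDA):

```python
import statistics
import time

def benchmark(fn, warmup: int = 10, runs: int = 50):
    """Time a callable: discard warm-up iterations (caches, lazy
    initialization), then report mean and standard deviation in
    seconds over the measured runs."""
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.stdev(timings)

# Stand-in workload; a real benchmark would call model(input_batch).
mean_s, std_s = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"mean {mean_s * 1e3:.3f} ms +/- {std_s * 1e3:.3f} ms")
```

Reporting both the mean and the spread is what lets the paper's throughput numbers be compared fairly across architectures and batch sizes.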
Implications and Future Directions
The findings of this paper guide researchers and practitioners in selecting DNN architectures that align with specific resource constraints and application requirements. For practitioners deploying models on resource-constrained devices, insights into memory footprint and throughput become valuable for decision-making. Further, the study suggests the potential of lightweight models like MobileNet and SqueezeNet in achieving efficient performance without significant trade-offs in accuracy.
For future work, extending this benchmark to encompass newer DNN architectures and considering additional performance metrics such as energy efficiency could offer deeper insights. Moreover, as AI continues evolving, understanding how these models adapt to various real-world constraints will be increasingly pertinent. The paper's repository provides an invaluable resource for ongoing research, offering pre-trained models and analysis tools that are readily accessible to the research community.