Adaptive Workload Distribution for Accuracy-aware DNN Inference on Collaborative Edge Platforms
Abstract: DNN inference can be accelerated by distributing the workload across a cluster of collaborative edge nodes. Heterogeneity among edge devices and the accuracy-performance trade-offs of DNN models create a complex exploration space when catering to inference performance requirements. In this work, we propose adaptive workload distribution for DNN inference that jointly considers the node-level heterogeneity of edge devices and application-specific accuracy and performance requirements. Our approach combinatorially optimizes heterogeneity-aware workload partitioning and dynamic accuracy configuration of DNN models to provide performance and accuracy guarantees. We evaluated our approach on an edge cluster of Odroid XU4, Raspberry Pi 4, and Jetson Nano boards and achieved average gains of 41.52% in performance and 5.2% in output accuracy compared with state-of-the-art workload distribution strategies.
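The core idea of the abstract — jointly choosing a DNN accuracy configuration and a heterogeneity-aware workload split — can be illustrated with a minimal sketch. All names, throughput figures, and model configurations below are hypothetical placeholders, not the paper's actual method or measurements; the sketch assumes workload is split proportionally to per-node throughput and the most accurate configuration meeting a latency target is selected.

```python
# Hypothetical sketch: jointly pick a DNN accuracy configuration and a
# heterogeneity-aware workload split so a latency target is met at
# maximal accuracy. Numbers and names below are illustrative only.

# Candidate model configurations: name -> (output accuracy, work in GFLOPs).
CONFIGS = {"full": (0.76, 1.2), "reduced": (0.72, 0.6), "tiny": (0.65, 0.3)}

# Per-node throughput in GFLOP/s for a three-board cluster (made-up values).
NODES = {"xu4": 8.0, "pi4": 5.0, "nano": 12.0}

def partition(work, nodes):
    """Split work proportionally to node throughput (heterogeneity-aware)."""
    total = sum(nodes.values())
    return {name: work * tp / total for name, tp in nodes.items()}

def plan(latency_target, nodes=NODES, configs=CONFIGS):
    """Return the most accurate (config, accuracy, split) meeting the target."""
    best = None
    for name, (acc, work) in configs.items():
        shares = partition(work, nodes)
        # With a throughput-proportional split, all nodes finish together,
        # so cluster latency is any node's share divided by its throughput.
        latency = max(shares[n] / nodes[n] for n in nodes)
        if latency <= latency_target and (best is None or acc > best[1]):
            best = (name, acc, shares)
    return best
```

For example, a loose latency budget admits the full-accuracy model, while a tighter budget forces a lower-accuracy configuration; the paper's combinatorial optimization explores this trade-off systematically rather than by this greedy enumeration.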