Smart at what cost? Characterising Mobile Deep Neural Networks in the wild

Published 28 Sep 2021 in cs.LG and cs.PF | (2109.13963v1)

Abstract: With smartphones' omnipresence in people's pockets, Machine Learning (ML) on mobile is gaining traction as devices become more powerful. With applications ranging from visual filters to voice assistants, intelligence on mobile comes in many forms and facets. However, Deep Neural Network (DNN) inference remains a compute intensive workload, with devices struggling to support intelligence at the cost of responsiveness. On the one hand, there is significant research on reducing model runtime requirements and supporting deployment on embedded devices. On the other hand, the strive to maximise the accuracy of a task is supported by deeper and wider neural networks, making mobile deployment of state-of-the-art DNNs a moving target. In this paper, we perform the first holistic study of DNN usage in the wild in an attempt to track deployed models and match how these run on widely deployed devices. To this end, we analyse over 16k of the most popular apps in the Google Play Store to characterise their DNN usage and performance across devices of different capabilities, both across tiers and generations. Simultaneously, we measure the models' energy footprint, as a core cost dimension of any mobile deployment. To streamline the process, we have developed gaugeNN, a tool that automates the deployment, measurement and analysis of DNNs on devices, with support for different frameworks and platforms. Results from our experience study paint the landscape of deep learning deployments on smartphones and indicate their popularity across app developers. Furthermore, our study shows the gap between bespoke techniques and real-world deployments and the need for optimised deployment of deep learning models in a highly dynamic and heterogeneous ecosystem.

Citations (43)

Summary

  • The paper provides an empirical study of mobile DNN deployments in over 16,000 Android apps, emphasizing the widespread use of off-the-shelf models like TensorFlow Lite.
  • It finds that approximately 80.9% of the models are reused without modifications, highlighting a gap between cutting-edge research and practical mobile implementations.
  • The study reveals significant performance variability across devices, underscoring the need for adaptive, hardware-aware optimization techniques in mobile AI.

Characterization of Mobile Deep Neural Networks in Real-World Applications

The paper presents a comprehensive study on the deployment and performance of Deep Neural Networks (DNNs) across popular mobile applications, primarily focusing on Android devices. As DNNs are increasingly integrated into smartphones, understanding their real-world usage and performance remains vital for both academic and commercial interests in mobile AI.

The research analyzes over 16,000 prominent apps from the Google Play Store, identifying and characterizing the DNN models they deploy. The study documents a prevalent use of off-the-shelf models, with a strong reliance on frameworks like TensorFlow Lite (TFLite), which accounts for 86.19% of the models found. Caffe and ncnn follow, though their lower adoption rates suggest that developers prioritize well-supported frameworks like TFLite despite Caffe's historical significance in the field. This reveals a decided preference for frameworks that offer robust, scalable deployment options, emphasizing ease of use over experimental, cutting-edge alternatives.
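Framework identification of this kind can be sketched with a simple static scan: since APKs are ordinary zip archives, the entry names can be checked for framework-specific model extensions and native libraries. This is a minimal illustration, not the paper's gaugeNN tooling; the marker table and function name below are hypothetical.

```python
import zipfile

# Hypothetical mapping from DNN frameworks to the artefacts each one
# typically ships inside an APK: model-file extensions and native libraries.
FRAMEWORK_MARKERS = {
    "TFLite": (".tflite", "libtensorflowlite"),
    "Caffe": (".caffemodel", "libcaffe"),
    "ncnn": (".param", "libncnn"),
}

def detect_frameworks(apk):
    """Return the set of DNN frameworks whose artefacts appear in an APK.

    APKs are zip archives, so entry names can be inspected without
    extracting any files. `apk` may be a path or a file-like object.
    """
    found = set()
    with zipfile.ZipFile(apk) as archive:
        for name in archive.namelist():
            for framework, markers in FRAMEWORK_MARKERS.items():
                if any(marker in name for marker in markers):
                    found.add(framework)
    return found
```

In practice a scan like this also has to handle obfuscated or encrypted model files, which is one reason a dedicated tool is needed for a study at this scale.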

The analysis identifies a notable gap between research advancements and actual deployment in mobile systems. While state-of-the-art models involve bespoke architectures and optimizations, much of the real-world deployment hinges on existing pre-trained models with minimal customization. Notably, about 80.9% of DNNs are reused without modifications, and only a small fraction engage in fine-tuning strategies such as transfer learning.
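Verbatim model reuse of this kind can be measured by content-hashing the model files shipped in each app: identical digests across apps mean the same off-the-shelf file was deployed unmodified. A minimal sketch, with helper names that are illustrative rather than from the paper:

```python
import hashlib

def model_fingerprint(model_bytes):
    """Content hash of a serialised model file; matching fingerprints
    across apps indicate an off-the-shelf model reused without change."""
    return hashlib.sha256(model_bytes).hexdigest()

def reuse_groups(models):
    """Group apps by the exact model files they ship.

    `models` is an iterable of (app_id, model_bytes) pairs; the result maps
    each fingerprint to the list of apps shipping that identical file.
    """
    groups = {}
    for app_id, blob in models:
        groups.setdefault(model_fingerprint(blob), []).append(app_id)
    return groups
```

Fine-tuned or re-trained variants would hash differently, so a hash-based count is a conservative lower bound on reuse.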

The tasks powered by DNNs remain largely in the domain of vision, encompassing object detection, recognition, and segmentation. NLP and audio-processing tasks are represented, but with fewer models, reinforcing the continued dominance of vision-based applications in real-world deployments.

The research also highlights the technical and operational challenges faced by developers when deploying these models on a wide range of hardware with varying capabilities. The reported latency discrepancies across devices underscore the significant variability in executing the same model on different hardware, where low-tier phones lag substantially in performance compared to their high-end counterparts. These findings suggest urgent requirements for adaptive and hardware-aware deployment strategies to ensure uniform user experiences.
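Cross-device latency comparisons of this kind rest on a simple benchmarking loop: discard warm-up runs, then time repeated inferences and report robust statistics. The framework-agnostic harness below is an illustrative sketch, not a reproduction of gaugeNN; `run_inference` stands in for one forward pass through a deployed model (e.g. invoking a TFLite interpreter).

```python
import statistics
import time

def benchmark(run_inference, warmup=5, iterations=50):
    """Time an inference callable and report latency statistics in ms.

    Warm-up runs are discarded so that one-off costs (model loading,
    accelerator-delegate initialisation) do not skew the measurements.
    """
    for _ in range(warmup):
        run_inference()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_inference()
        samples.append((time.perf_counter() - start) * 1e3)
    return {
        "median_ms": statistics.median(samples),
        "p90_ms": sorted(samples)[int(0.9 * len(samples))],
        "stdev_ms": statistics.stdev(samples),
    }
```

Running the same harness on a low-tier and a flagship phone is what surfaces the latency gaps the study reports; the median and tail (p90) matter more than the mean for user-perceived responsiveness.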

The paper explores optimization techniques, revealing that popular methods such as pruning and clustering are underutilized, often because of their minimal runtime performance benefits or the extensive data and computational resources they demand during training. Quantization holds notable promise, yet its application remains limited in field deployments.
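As a concrete illustration of why quantization is attractive, 8-bit affine quantization maps float weights onto int8 values via a scale and zero point, cutting storage from 4 bytes to 1 byte per weight at a small accuracy cost. The sketch below is self-contained and not tied to any particular framework's implementation:

```python
def quantize_int8(weights):
    """Affine quantization of float weights to the int8 range [-128, 127].

    Uses the scale/zero-point scheme underlying 8-bit formats such as
    TFLite's: q = round(w / scale) + zero_point.
    """
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0 or 1.0  # guard against constant tensors
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from their int8 encoding."""
    return [(qi - zero_point) * scale for qi in q]
```

The round-trip error per weight is bounded by roughly one quantization step (the scale), which is why post-training quantization usually costs little accuracy while quartering model size.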

The implications of this study are manifold: as DNN usage surges, driven by easier access to models and frameworks, the need for efficient mobile execution grows. Energy consumption also remains a critical concern; as models become more complex, battery endurance becomes a limiting factor, underscoring the need for more energy-efficient models.

In conclusion, the study not only presents an empirical view of mobile DNN deployments but also conveys the gap between research and practice. It articulates a strong case for more sophisticated end-to-end solutions that simplify deployment and adaptation across heterogeneous devices. Future research and development should aim to harmonize state-of-the-art DNN designs with the practicalities of mobile AI deployment, fostering seamless, efficient, and impactful AI applications.
