LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Embodied Intelligence System
Abstract: Embodied intelligence (EI) enables manufacturing systems to flexibly perceive, reason, adapt, and operate within dynamic shop floor environments. In smart manufacturing, a representative EI scenario is robotic visual inspection, where industrial robots must accurately inspect components on rapidly changing, heterogeneous production lines. This task requires both high inference accuracy, especially for uncommon defects, and low latency to match production speeds, despite evolving lighting, part geometries, and surface conditions. To meet these needs, we propose LAECIPS, a large vision model-assisted adaptive edge-cloud collaboration framework for IoT-based embodied intelligence systems. LAECIPS decouples the large vision models in the cloud from the lightweight models on the edge, enabling plug-and-play model adaptation and continual learning. Through a hard input mining-based inference strategy, LAECIPS routes complex and uncertain inspection cases to the cloud while handling routine cases at the edge, achieving both high accuracy and low latency. Experiments on a real-world robotic semantic segmentation system for visual inspection demonstrate significant improvements in accuracy, processing latency, and communication overhead compared to state-of-the-art methods. LAECIPS provides a practical and scalable foundation for embodied intelligence in smart manufacturing, especially in adaptive robotic inspection and quality control scenarios.
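The routing idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how hard inputs are mined, so the criterion below (mean per-pixel maximum softmax probability compared against a fixed threshold) is an assumption chosen for illustration, not the LAECIPS method itself. The names `mean_confidence`, `route`, and the `threshold` value are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_confidence(pixel_logits: Sequence[Sequence[float]]) -> float:
    """Average of the top softmax probability across all pixels.

    Assumption: low average confidence of the edge model marks a hard input.
    """
    return sum(max(softmax(l)) for l in pixel_logits) / len(pixel_logits)


def route(pixel_logits: Sequence[Sequence[float]], threshold: float = 0.8) -> str:
    """Return "edge" for routine inputs and "cloud" for hard ones.

    The threshold is a hypothetical tuning knob, not a value from the paper.
    """
    return "edge" if mean_confidence(pixel_logits) >= threshold else "cloud"


# Edge-model logits for two tiny "images" with two pixels and two classes each.
confident_image = [[5.0, 0.0], [4.0, 0.0]]
uncertain_image = [[0.1, 0.0], [0.0, 0.2]]
print(route(confident_image), route(uncertain_image))
```

In a sketch like this, raising the threshold sends more inputs to the cloud, trading latency and communication overhead for accuracy; lowering it does the opposite.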