FetaFix: Automatic Fault Localization and Repair of Deep Learning Model Conversions

Published 22 Dec 2023 in cs.SE, cs.AI, cs.CV, and cs.LG | (2312.15101v4)

Abstract: Converting deep learning models between frameworks is a common step to maximize model compatibility across devices and leverage optimization features that may be exclusively provided in one deep learning framework. However, this conversion process may be riddled with bugs, making the converted models either undeployable or problematic, considerably degrading their prediction correctness. In this paper, we propose an automated approach for fault localization and repair, FetaFix, during model conversion between deep learning frameworks. FetaFix is capable of detecting and fixing faults introduced in model input, parameters, hyperparameters, and the model graph during conversion. FetaFix uses a set of fault types (mined from surveying common conversion issues reported in code repositories and forums) to localize potential conversion faults in the converted target model and then repair them appropriately, e.g., replacing the parameters of the target model with those from the source model. This is done iteratively for every image in the dataset, comparing output label differences between the source model and the converted target model until all differences are resolved. We evaluate the effectiveness of FetaFix in fixing model conversion bugs of three widely used image recognition models converted across four different deep learning frameworks. Overall, FetaFix was able to fix $462$ out of $755$ detected conversion faults, either completely repairing or significantly improving the performance of $14$ out of the $15$ erroneous conversion cases.