FetaFix: Automatic Fault Localization and Repair of Deep Learning Model Conversions
Abstract: Converting deep learning models between frameworks is a common step to maximize model compatibility across devices and to leverage optimization features that may be exclusive to one deep learning framework. However, the conversion process may be riddled with bugs, making the converted models either undeployable or problematic, with considerably degraded prediction correctness. In this paper, we propose FetaFix, an automated approach for localizing and repairing faults introduced during model conversion between deep learning frameworks. FetaFix is capable of detecting and fixing faults introduced in model input, parameters, hyperparameters, and the model graph during conversion. FetaFix uses a set of fault types (mined from a survey of common conversion issues reported in code repositories and forums) to localize potential conversion faults in the converted target model and then repair them appropriately, e.g., by replacing the parameters of the target model with those from the source model. This is done iteratively for every image in the dataset, comparing output labels between the source model and the converted target model until all differences are resolved. We evaluate the effectiveness of FetaFix in fixing model conversion bugs in three widely used image recognition models converted across four different deep learning frameworks. Overall, FetaFix was able to fix 462 out of 755 detected conversion faults, either completely repairing or significantly improving the performance of 14 out of the 15 erroneous conversion cases.
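The compare-and-repair loop described in the abstract can be illustrated with a minimal sketch. It is not FetaFix's implementation or API: the toy single-layer classifier, the dictionary-based model representation, and the names `predict`, `label_mismatches`, and `repair_parameters` are all assumptions made for illustration. It shows only one of the repair strategies the abstract mentions (copying parameters from the source model into the target model) and the label comparison that drives it.

```python
from typing import Dict, List, Sequence

# Hypothetical model representation: layer name -> weight matrix (one row per class).
Params = Dict[str, List[List[float]]]


def predict(params: Params, x: Sequence[float]) -> int:
    """Toy single-layer classifier: index of the row with the largest dot product."""
    scores = [sum(w * v for w, v in zip(row, x)) for row in params["dense"]]
    return scores.index(max(scores))


def label_mismatches(source: Params, target: Params,
                     inputs: Sequence[Sequence[float]]) -> List[int]:
    """Indices of inputs on which source and target predict different labels."""
    return [i for i, x in enumerate(inputs)
            if predict(source, x) != predict(target, x)]


def repair_parameters(source: Params, target: Params) -> List[str]:
    """Overwrite differing target layers with source weights; return repaired layer names."""
    repaired = []
    for name, weights in source.items():
        if target.get(name) != weights:
            target[name] = [list(row) for row in weights]  # copy, do not alias
            repaired.append(name)
    return repaired


# Source model and a "converted" target whose weight rows were swapped (assumed fault).
source = {"dense": [[1.0, 0.0], [0.0, 1.0]]}
target = {"dense": [[0.0, 1.0], [1.0, 0.0]]}
inputs = [[2.0, 1.0], [0.5, 3.0], [1.0, 1.0]]

before = label_mismatches(source, target, inputs)
repaired = repair_parameters(source, target) if before else []
after = label_mismatches(source, target, inputs)
print(before, repaired, after)
```

In this toy setting a single parameter copy resolves every label difference. The abstract describes a broader process that also considers faults in model input, hyperparameters, and the model graph, and repeats until no differences remain.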