Transfer Learning with Reconstruction Loss
Abstract: In most applications of neural networks to mathematical optimization, a dedicated model is trained for each optimization objective. In many scenarios, however, several distinct yet correlated objectives or tasks must be optimized over the same set of problem inputs. Rather than independently training a separate neural network for each task, it is more efficient to exploit the correlations among the tasks and to train multiple models with shared parameters and feature representations. To this end, this paper first establishes the concept of common information: the shared knowledge required for solving the correlated tasks. It then proposes a novel training approach that augments the model with an additional reconstruction stage and an associated reconstruction loss, which reconstructs the common information from a selected hidden layer of the model. The proposed approach encourages the learned features to be general and transferable, so they can be readily reused for efficient transfer learning. Three applications are studied in numerical simulations: transfer learning on MNIST handwritten digit classification, device-to-device wireless network power allocation, and multiple-input-single-output network downlink beamforming and localization. Simulation results suggest that the proposed approach is efficient in data and model complexity, resilient to over-fitting, and achieves competitive performance.
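The training objective described in the abstract, a per-task loss combined with a reconstruction loss computed from a selected hidden layer, can be sketched as follows. This is a minimal illustration only: the dimensions, the two-layer architecture, the choice of the input itself as the common information, and the weighting factor `lam` are all assumptions for the sketch, not the paper's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Illustrative dimensions: input x stands in for the "common information",
# a shared hidden layer feeds both a task head and a reconstruction head.
d_in, d_hid, d_out = 8, 16, 4
W_enc = rng.normal(0.0, 0.1, (d_in, d_hid))    # shared (transferable) encoder
W_task = rng.normal(0.0, 0.1, (d_hid, d_out))  # task-specific head
W_rec = rng.normal(0.0, 0.1, (d_hid, d_in))    # reconstruction head

x = rng.normal(size=(32, d_in))   # batch of problem inputs
y = rng.normal(size=(32, d_out))  # placeholder task targets

h = relu(x @ W_enc)                            # selected hidden layer
task_loss = np.mean((h @ W_task - y) ** 2)     # task-specific objective
rec_loss = np.mean((h @ W_rec - x) ** 2)       # reconstruct common info from h
lam = 0.5                                      # assumed loss weighting
total_loss = task_loss + lam * rec_loss        # combined training objective
```

In a transfer-learning setting, the shared encoder trained with this combined loss would be kept, and only a new task head would be trained for each additional correlated task.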