MD-GAN: Multi-Discriminator Generative Adversarial Networks for Distributed Datasets

Published 9 Nov 2018 in cs.LG and stat.ML | arXiv:1811.03850v2

Abstract: A recent technical breakthrough in the domain of machine learning is the discovery and the multiple applications of Generative Adversarial Networks (GANs). Those generative models are computationally demanding, as a GAN is composed of two deep neural networks, and because it trains on large datasets. A GAN is generally trained on a single server. In this paper, we address the problem of distributing GANs so that they are able to train over datasets that are spread on multiple workers. MD-GAN is exposed as the first solution for this problem: we propose a novel learning procedure for GANs so that they fit this distributed setup. We then compare the performance of MD-GAN to an adapted version of Federated Learning to GANs, using the MNIST and CIFAR10 datasets. MD-GAN exhibits a reduction by a factor of two of the learning complexity on each worker node, while providing better performances than federated learning on both datasets. We finally discuss the practical implications of distributing GANs.

Citations (171)

Summary

The paper introduces MD-GAN, a novel approach to training Generative Adversarial Networks (GANs) over distributed datasets. Unlike traditional GANs, which operate on centralized data, MD-GAN allows GAN models to leverage datasets spread across multiple worker nodes without moving the data to a central location. This setup is relevant in scenarios where data raises privacy concerns or where its sheer volume and geographic distribution make centralization impractical.

The core innovation of MD-GAN is its structured approach to distributing a GAN: a single generator runs on a central server while multiple discriminators reside on worker nodes. Each worker trains on its local share of the dataset and sends feedback on generated samples back to the central generator. This design significantly reduces the computational load on worker nodes by concentrating generator-related work on the server, keeping the distributed setup efficient. In addition, the algorithm adopts a novel peer-to-peer swap mechanism that moves discriminators between workers to counteract the risk of overfitting to local datasets.
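
To make the message flow concrete, here is a minimal, hypothetical sketch of one MD-GAN-style global round, with a toy linear generator and per-worker logistic discriminators standing in for deep networks. Names such as `global_round` and the exact gradient bookkeeping are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

N_WORKERS, LATENT_DIM, DATA_DIM, BATCH = 3, 4, 8, 16
LR = 0.1

# Toy linear generator on the server: x = z @ W (stand-in for a deep net).
W_gen = rng.normal(size=(LATENT_DIM, DATA_DIM)) * 0.1

# One toy logistic discriminator per worker: D(x) = sigmoid(x @ w).
w_disc = [rng.normal(size=DATA_DIM) * 0.1 for _ in range(N_WORKERS)]

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def global_round(W_gen, w_disc, local_data):
    """One MD-GAN-style round: the server broadcasts generated batches,
    each worker updates its discriminator on its local shard and returns
    feedback (the error signal w.r.t. the generated samples), and the
    server averages that feedback to update the single generator."""
    z = rng.normal(size=(BATCH, LATENT_DIM))
    x_fake = z @ W_gen                      # server-side generation
    feedbacks = []
    for k in range(N_WORKERS):
        x_real = local_data[k]
        d_real = sigmoid(x_real @ w_disc[k])
        d_fake = sigmoid(x_fake @ w_disc[k])
        # Discriminator step on worker k (gradient ascent on log-likelihood).
        grad_w = x_real.T @ (1 - d_real) - x_fake.T @ d_fake
        w_disc[k] += LR * grad_w / BATCH
        # Feedback for the generator: d(-log D(x_fake))/d(x_fake), per sample.
        feedbacks.append(np.outer(d_fake - 1, w_disc[k]))  # (BATCH, DATA_DIM)
    # Server aggregates worker feedback and backpropagates through G.
    g_feedback = np.mean(feedbacks, axis=0)
    W_gen -= LR * z.T @ g_feedback / BATCH
    return W_gen

local_data = [rng.normal(loc=2.0, size=(BATCH, DATA_DIM))
              for _ in range(N_WORKERS)]
for _ in range(5):
    W_gen = global_round(W_gen, w_disc, local_data)
```

Note that only generated samples and feedback gradients cross the network; the real data never leaves its worker, which is the property that makes the scheme attractive for privacy-sensitive deployments.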

Key Contributions

  • Single Generator Design: MD-GAN centralizes the generator, effectively reducing the complexity on worker nodes.
  • Peer-to-Peer Discriminator Swap: By swapping discriminators between nodes, the model mitigates overfitting, maintaining high generalization capabilities.
  • Competitive Learning Strategy: Compared to a standalone GAN and a federated-learning adaptation of GAN training, MD-GAN demonstrates improved performance and convergence, as indicated by better FID and Inception scores in experiments on the MNIST and CIFAR10 datasets.
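
The discriminator swap in the second bullet can be sketched as follows; this is an assumed ring-rotation policy (the paper describes peer-to-peer swaps, but the exact pairing is an implementation choice here):

```python
import random

def swap_discriminators(disc_params, rng=random.Random(0)):
    """Peer-to-peer swap sketch: each worker hands its discriminator to
    another worker, so no discriminator keeps training on the same local
    shard. Here we rotate along a randomly shuffled ring; every worker
    is guaranteed to receive a discriminator different from its own."""
    n = len(disc_params)
    order = list(range(n))
    rng.shuffle(order)
    # Worker order[i] receives the discriminator held by worker order[i-1].
    source = dict(zip(order, [order[-1]] + order[:-1]))
    return [disc_params[source[k]] for k in range(n)]

params = ["D0", "D1", "D2", "D3"]
swapped = swap_discriminators(params)
```

Because the rotation is a single cycle over all workers, the swap is a derangement: every discriminator moves, which is what prevents any one of them from overfitting to a single local dataset.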

Comparative Analysis

The paper evaluates MD-GAN against a standalone GAN and an adapted federated learning approach termed FL-GAN. Experiments show that MD-GAN consistently outperforms FL-GAN across metrics, including the Fréchet Inception Distance (FID) and Inception Score, over multiple configurations. Moreover, MD-GAN achieves performance comparable to the standalone GAN without requiring centralized data aggregation or extensive computational resources on worker nodes.
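
For reference, the FID used in these comparisons measures the distance between two Gaussians fitted to Inception-v3 activations of real and generated images. A simplified sketch, under the assumption of diagonal covariances (so no matrix square root is needed; real FID uses full covariance matrices):

```python
import numpy as np

def fid_diag(mu1, var1, mu2, var2):
    """Fréchet distance between two Gaussians with diagonal covariances:
    FID = ||mu1 - mu2||^2 + sum(var1 + var2 - 2*sqrt(var1*var2)).
    Lower is better; identical distributions score 0."""
    mu1, var1 = np.asarray(mu1, float), np.asarray(var1, float)
    mu2, var2 = np.asarray(mu2, float), np.asarray(var2, float)
    return float(np.sum((mu1 - mu2) ** 2)
                 + np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2)))

# Identical statistics give FID 0; a unit shift in one mean adds 1.
same = fid_diag([0, 0], [1, 1], [0, 0], [1, 1])   # 0.0
diff = fid_diag([0, 0], [1, 1], [1, 0], [1, 1])   # 1.0
```

Since lower FID is better, MD-GAN's lower scores relative to FL-GAN indicate that its generated samples lie statistically closer to the real data.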

Implications and Future Directions

The proposed MD-GAN framework establishes a paradigm conducive to large-scale and privacy-sensitive applications, where data is inherently distributed across numerous devices or datacenters. Its approach holds potential for optimizing computational efficiency and reducing communication overhead in real-world scenarios.

Future research can explore asynchronous updates to further optimize server-worker interactions, bandwidth-efficient communication schemes for low-bandwidth environments, and fault-tolerance mechanisms for improved reliability in distributed deployments. Scalability will also be crucial: supporting larger numbers of worker nodes while ensuring that model performance does not deteriorate.

In conclusion, MD-GAN's architecture offers a feasible pathway for deploying GANs in distributed settings, paving the way for practical applications in domains such as edge computing and federated learning, and it constitutes a promising step toward scalable and efficient distributed machine learning systems.
