Evolving Deep Neural Networks

Published 1 Mar 2017 in cs.NE and cs.AI (arXiv:1703.00548v2)

Abstract: The success of deep learning depends on finding an architecture to fit the task. As deep learning has scaled up to more challenging tasks, the architectures have become difficult to design by hand. This paper proposes an automated method, CoDeepNEAT, for optimizing deep learning architectures through evolution. By extending existing neuroevolution methods to topology, components, and hyperparameters, this method achieves results comparable to best human designs in standard benchmarks in object recognition and language modeling. It also supports building a real-world application of automated image captioning on a magazine website. Given the anticipated increases in available computing power, evolution of deep networks is a promising approach to constructing deep learning applications in the future.

Citations (860)

Summary

  • The paper introduces CoDeepNEAT, an evolutionary algorithm that coevolves network modules and blueprints to automatically design complex deep neural networks.
  • It extends NEAT by treating network layers as nodes, evolving both structural topologies and hyperparameters for enhanced architecture optimization.
  • Empirical results show competitive CIFAR-10 accuracy and a 5% improvement in LSTM performance, demonstrating faster convergence and real-world adaptability.

Evolving Deep Neural Networks: An Essay on Automated Architecture Optimization

The paper "Evolving Deep Neural Networks" by Miikkulainen et al. addresses the increasingly complex challenge of designing optimal neural network architectures as tasks in deep learning grow in complexity. The authors propose CoDeepNEAT, an automated evolutionary method for optimizing deep learning architectures, extending traditional neuroevolution techniques to include topology, components, and hyperparameters.

Overview of CoDeepNEAT

The primary contribution of this research centers on CoDeepNEAT, a coevolutionary algorithm that simultaneously evolves both network modules and blueprints. This approach allows for the discovery of architectures that exhibit the repetitive and deep structures often found in manually designed state-of-the-art DNNs. Unlike standard NEAT, where each node in the chromosome represents an individual neuron, DeepNEAT generalizes each node to represent an entire network layer with its associated hyperparameters.
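The two-level module/blueprint idea can be sketched with plain Python data structures. This is a deliberately simplified illustration, not the paper's actual genome encoding: the species names, layer dictionaries, and `assemble` function are all hypothetical, and a real blueprint is a graph rather than a flat list.

```python
# Hypothetical, simplified sketch of CoDeepNEAT's two-level genome.
# A module is a small stack of layer genes, each carrying local
# hyperparameters; a blueprint is a sequence of nodes, each pointing
# at a module species. (Illustrative names and structure only.)

module_conv = [
    {"layer": "conv", "filters": 32, "kernel": 3, "dropout": 0.2},
    {"layer": "conv", "filters": 64, "kernel": 3, "dropout": 0.0},
]
module_dense = [
    {"layer": "dense", "units": 128, "dropout": 0.5},
]

modules = {"conv_species": module_conv, "dense_species": module_dense}

# A blueprint referencing module species; reusing "conv_species" twice
# yields the repetitive structure the paper highlights.
blueprint = ["conv_species", "conv_species", "dense_species"]

def assemble(blueprint, modules):
    """Replace each blueprint node with the layers of its module."""
    network = []
    for species in blueprint:
        network.extend(modules[species])
    return network

network = assemble(blueprint, modules)
```

Because blueprints and modules evolve in separate populations, a single improved module automatically improves every blueprint node that references its species.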

Key Methodological Components

  1. Extension of NEAT: DeepNEAT modifies NEAT by treating each node as a neural network layer, thus evolving complex DNN topologies. Each chromosome includes both local hyperparameters for the layers and global hyperparameters for the entire network.
  2. CoEvolution Strategy: CoDeepNEAT evolves two distinct populations—modules representing network components and blueprints dictating how these modules are combined. This composite approach permits sophisticated, multi-layered architectures by leveraging redundancy and structural deepening.
  3. Fitness Evaluation: The performance of the evolved networks is assessed through training and validation on benchmark tasks, with fitness determined by classification accuracy or other relevant metrics.
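The evaluate-select-reproduce cycle behind these three components can be sketched as a minimal evolutionary loop. This is a toy sketch under stated assumptions: the "genome" is reduced to a short list of layer widths and the fitness to a cheap surrogate function, whereas CoDeepNEAT assembles each candidate into a full network, trains it briefly, and uses validation accuracy as fitness. All names here are illustrative.

```python
import random

random.seed(0)

CHOICES = [16, 32, 64, 128]  # illustrative per-layer hyperparameter values

def random_genome():
    # One gene per layer: an illustrative width hyperparameter.
    return [random.choice(CHOICES) for _ in range(3)]

def fitness(genome):
    # Toy surrogate standing in for train-then-validate; in CoDeepNEAT
    # proper this is the measured accuracy of the assembled network.
    return sum(genome) / (max(CHOICES) * len(genome))

def evolve(pop_size=10, generations=5, elite=2, mut_rate=0.3):
    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate and select: keep the top `elite` genomes as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[:elite]
        # Reproduce: copy a parent, mutating each gene with some probability.
        children = []
        while len(children) < pop_size - elite:
            parent = random.choice(parents)
            child = [g if random.random() > mut_rate else random.choice(CHOICES)
                     for g in parent]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

best = evolve()
```

The expensive step omitted here, partially training every candidate, is exactly why the paper emphasizes available computing power as the limiting factor.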

Results and Implications

The paper presents several strong empirical results. For instance, in the CIFAR-10 benchmark, CoDeepNEAT achieved classification errors comparable to state-of-the-art techniques with a notable enhancement in training convergence speed. This underlines the ability of the method to identify architectures that not only perform well but also train efficiently.

In the domain of language modeling using LSTM architectures, CoDeepNEAT discovered innovative network topologies incorporating skip connections, resulting in a performance improvement of 5% over traditional LSTMs. This highlights the method's potential in evolving recurrent network structures for sequential data processing tasks.
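As a toy illustration of what a skip connection contributes (this is not the paper's evolved LSTM variant), the sketch below uses plain floats in place of tensors and placeholder weights, routing each step's input around a tanh recurrent cell so it reaches the output undamped:

```python
import math

def recurrent_step(x, h_prev, w_x=0.5, w_h=0.3):
    # Placeholder recurrent cell: a tanh of weighted input and prior state.
    return math.tanh(w_x * x + w_h * h_prev)

def step_with_skip(x, h_prev):
    h = recurrent_step(x, h_prev)
    return h + x  # skip connection: the input bypasses the recurrent cell

h = 0.0
outputs = []
for x in [1.0, 0.5, -0.2]:
    h = step_with_skip(x, h)
    outputs.append(h)
```

In the evolved topologies, such connections give gradients and signals a shorter path through the recurrent structure, which is one plausible reason for the reported improvement over the standard LSTM.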

Practical Applications

A notable example of CoDeepNEAT's application is the development of an online image captioning system for a magazine website. By training on the MSCOCO dataset and a custom iconic image dataset, the evolved models produced compelling captions, especially for iconic images. This real-world implementation demonstrates the practical utility and adaptability of evolutionarily designed networks in tasks that require integrating visual and textual data.

Future Directions

The work posits significant implications for the future of automated neural network design. As computational power becomes more accessible through cloud services and distributed computing (e.g., Sentient's DarkCycle), the evolutionary approach can scale to explore even larger and more complex search spaces. Optimization goals can be diversified to include metrics beyond mere accuracy—such as training speed, memory efficiency, or runtime—which could lead to the discovery of highly efficient and specialized architectures.

Conclusion

The evolutionary optimization of deep neural networks, as demonstrated by CoDeepNEAT, provides a robust framework for automatically designing sophisticated architectures that perform competitively with human-designed models. With advancements in computational resources, this technique holds the promise to exceed human capabilities in architecture design, broadening the horizons of deep learning applications across various domains. This research underscores the potential of evolutionary algorithms to streamline and enhance the process of neural network design, heralding a new era of automated artificial intelligence development.
