- The paper’s main contribution is its comprehensive evaluation of one-shot and zero-shot estimators in NAS, highlighting key biases and training dependencies.
- It demonstrates that extended training improves ranking stability for OSEs, while some ZSEs, like relu_logdet, show improper biases in topological search spaces.
- The authors propose mitigation strategies such as temporal ensembles and sampling fairness adjustments, charting a course for more robust future NAS methodologies.
The paper "Evaluating Efficient Performance Estimators of Neural Architectures" by Xuefei Ning et al. presents a comprehensive evaluation of one-shot estimators (OSEs) and zero-shot estimators (ZSEs) for neural architecture search (NAS). The evaluation spans multiple NAS benchmarks, namely NAS-Bench-101, NAS-Bench-201, and NAS-Bench-301, as well as two non-topological search spaces, NDS ResNet and ResNeXt-A. The motivation behind this work is to critically examine the efficiency and efficacy of these estimators, especially given the substantial resource demands of traditional NAS methods.
The paper makes significant strides in dissecting the performance estimates produced by OSEs and ZSEs. It leverages a set of NAS-oriented criteria to evaluate their ranking quality, bias, and variance, providing a robust analysis of their strengths and limitations. The study highlights inherent biases within these estimators, such as the complexity-level and operation-level biases of OSEs and the improper biases of ZSEs, emphasizing the nuanced challenge of accurately predicting architecture performance without extensive training or data.
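Ranking quality of this kind is typically measured with a rank correlation, such as Kendall's tau, between an estimator's scores and the architectures' ground-truth accuracies. A minimal stdlib sketch of that comparison (the scores below are invented for illustration, not taken from the paper):

```python
from itertools import combinations

def kendall_tau(est_scores, true_scores):
    """Kendall's tau: fraction of concordant minus discordant pairs.

    est_scores:  estimator's score per architecture (e.g., one-shot accuracy)
    true_scores: ground-truth accuracy per architecture
    """
    assert len(est_scores) == len(true_scores)
    n = len(est_scores)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        # A pair is concordant when both rankings order i and j the same way
        product = (est_scores[i] - est_scores[j]) * (true_scores[i] - true_scores[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Illustrative scores for three architectures: the estimator swaps the top two.
tau = kendall_tau([0.90, 0.92, 0.70], [0.93, 0.91, 0.72])
```

A tau of 1.0 means the estimator ranks every pair of architectures correctly; values near 0 mean the estimated ranking is close to uninformative for search.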
Major Findings
- OSE Analysis: The paper reveals noticeable biases and variances in OSEs. Training sufficiency is pivotal: longer training mitigates the bias toward simpler architectures and improves the ranking stability of the OSEs. Notably, OSEs exhibited a significant correlation gap on NAS-Bench-301 compared to the other search spaces. The efficacy of OSEs is thus markedly contingent on sufficient training, yet trade-offs between training time and estimation quality are inevitable, especially in large search spaces such as NAS-Bench-301.
- ZSE Analysis: ZSEs, designed to function without any training, were evaluated in various contexts. The results reveal that ZSEs generally underperform relative to OSEs. Interestingly, architecture-level ZSEs such as jacob_cov and relu_logdet performed relatively better than parameter-level ZSEs such as synflow and plain. However, the paper also cautions about the improper biases of ZSEs: relu_logdet preferred architectures with a smaller receptive field, which can erroneously skew the performance ranking, particularly in topological search spaces.
- Mitigation Strategies: The authors propose several strategies to address the inherent biases of OSEs, such as reducing parameter sharing and improving sampling fairness. Techniques including temporal ensembling, Monte Carlo sampling, and de-isomorphism adjustments are discussed as ways to improve prediction accuracy and ranking stability. The research underscores that merely training longer does not resolve all biases, particularly those stemming from representation inequities.
- General Insights and Suggestions: The paper advocates using loss values rather than accuracy for OSEs in certain search spaces, as evidenced by the improvement observed on NAS-Bench-301. It also stresses using sufficient validation data to avoid unreliable architecture comparisons, a crucial consideration for future NAS applications.
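To make the architecture-level ZSEs discussed above concrete, the relu_logdet proxy (in the style of NASWOT) scores an untrained network by how distinctly a small batch of inputs activates its ReLU units. Below is a minimal sketch, assuming the binary activation matrix has already been extracted (one row per input, one column per ReLU unit); the function name and data layout are illustrative, not the paper's code:

```python
import numpy as np

def relu_logdet_score(activations: np.ndarray) -> float:
    """NASWOT-style relu_logdet proxy (sketch).

    activations: (num_inputs, num_relu_units) pre-activation values recorded
    for a small batch passed through an untrained network.
    """
    codes = (activations > 0).astype(float)       # binary ReLU codes per input
    num_units = codes.shape[1]
    # Hamming distance between every pair of input codes
    hamming = codes @ (1 - codes).T + (1 - codes) @ codes.T
    kernel = num_units - hamming                  # similarity kernel K_H
    _, logdet = np.linalg.slogdet(kernel)
    return float(logdet)                          # higher = inputs more distinguishable
```

Networks whose untrained activations carve inputs into distinct linear regions score higher. Note the proxy says nothing about receptive field, which relates to the improper bias toward small-receptive-field architectures noted above.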
Implications for Future Research
This paper sets a strong foundation for future studies of NAS performance estimators. Several open avenues are identified, notably the development of more sophisticated ZSEs capable of distinguishing top-performing architectures across different spaces. Shifting from parameter-level to architecture-level analysis, incorporating receptive-field considerations into ZSEs, and finding a ZSE that generalizes across multiple types of search spaces are pertinent future directions.
For the domain of NAS, this paper not only illuminates the nuanced behavior of existing estimators but also aligns evaluation methodology more closely with how these estimators are used in practice. Researchers are encouraged to leverage the proposed analysis framework to devise and assess new performance estimators suited to the increasingly complex landscape of NAS.
In conclusion, the paper by Ning et al. provides an expansive inquiry into efficient performance estimation, revealing substantial insights while highlighting critical areas for further innovation. It serves as a vital resource for researchers aiming to refine NAS methodologies and enhance the efficiency of neural architecture evaluations.