The Fundamentals of Neural Architecture Search (NAS)

Machine Learning

Neural Architecture Search (NAS) has become a popular topic in machine learning research. Commercial services such as Google's AutoML and open-source libraries such as Auto-Keras [1] make NAS accessible to the broader machine learning community. In this blog post we explore the ideas and approaches of NAS to help readers understand the field better and discover possibilities for real-world applications.
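
As a taste of how accessible NAS has become, here is a minimal sketch of an image-classification search with Auto-Keras. It assumes autokeras 1.x and TensorFlow are installed and uses MNIST from tf.keras as the reference dataset; max_trials is kept tiny purely for illustration.

```python
import autokeras as ak
import tensorflow as tf

# MNIST as a small reference dataset.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Search over at most 3 candidate architectures (tiny budget for illustration).
clf = ak.ImageClassifier(max_trials=3, overwrite=True)
clf.fit(x_train, y_train, epochs=2)

# Evaluate the best architecture found by the search.
print(clf.evaluate(x_test, y_test))
```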

What is Neural Architecture Search (NAS)?

Modern deep neural networks often contain many layers of various types [2]. Skip connections [2] and sub-modules [3] are also used to promote model convergence, and there is no limit to the space of possible model architectures. Most deep neural network structures are currently designed from human experience, which requires a long and tedious trial-and-error process. NAS tries to discover effective architectures for a specific deep learning problem without human intervention.

Generally, NAS can be described along three dimensions: a search space, a search strategy, and a performance estimation strategy [4].

Search Space

The search space determines which neural architectures can be assessed. A well-designed search space reduces the complexity of finding suitable neural architectures. In general, the search space needs to be both constrained and flexible: constraints eliminate non-intuitive neural architectures and create a finite space to search. The search space contains every architecture design (often an infinite number) that the NAS approach can generate. It may consist of all configurations of layers stacked on top of each other (Figure 2a) or more complicated architectures that include skip connections (Figure 2b). To reduce the dimensionality of the search space, it may also be defined over sub-modules, which are later stacked together to generate the model architecture (Figure 2c).

Figure 2: Illustration of different architecture spaces. Image taken from [4].
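
To make the notion of a constrained, finite search space concrete, here is a small hypothetical example: a chain-structured space (as in Figure 2a) defined over a few layer types and depths. The layer names are illustrative placeholders, not tied to any library.

```python
import itertools

# Hypothetical building blocks and allowed depths for a chain-structured space.
layer_types = ["conv3x3", "conv5x5", "maxpool"]
depths = [2, 3]

# Enumerate every architecture in this constrained, finite space.
architectures = [
    list(chain)
    for depth in depths
    for chain in itertools.product(layer_types, repeat=depth)
]
print(len(architectures))  # 3**2 + 3**3 = 36 candidate chains
```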

Performance Estimation Strategy

The performance estimation strategy provides a number that reflects the quality of each architecture in the search space. It is usually the accuracy of a model architecture after training it on a reference dataset for a predefined number of epochs and then testing it. The performance estimation strategy can also take into account factors such as the computational cost of training or inference. In any case, assessing the performance of an architecture is computationally expensive.
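
The sketch below shows one simple, assumed (not canonical) performance estimation strategy in tf.keras: given a list of hidden-layer widths describing a candidate MLP, train it briefly on MNIST and return the validation accuracy as the score.

```python
import tensorflow as tf

def estimate_performance(widths, epochs=2):
    """Train a small MLP described by `widths` for a few epochs on MNIST
    and return its validation accuracy as the performance estimate."""
    (x_tr, y_tr), (x_va, y_va) = tf.keras.datasets.mnist.load_data()
    x_tr, x_va = x_tr / 255.0, x_va / 255.0
    layers = [tf.keras.layers.Flatten()]
    layers += [tf.keras.layers.Dense(w, activation="relu") for w in widths]
    layers.append(tf.keras.layers.Dense(10, activation="softmax"))
    model = tf.keras.Sequential(layers)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_tr, y_tr, epochs=epochs, verbose=0)
    _, accuracy = model.evaluate(x_va, y_va, verbose=0)
    return accuracy

print(estimate_performance([64, 32]))  # score for one candidate architecture
```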

Search Strategy

The search strategy is what NAS actually relies on: it should identify promising architectures for performance estimation and avoid testing bad ones. In the remainder of this article, we discuss several search strategies, including random and grid search, gradient-based strategies, evolutionary algorithms, and reinforcement learning strategies.

Grid search performs a systematic search, while random search randomly picks architectures from the search space and then tests the accuracy of each one through the performance estimation strategy. Both are feasible only for small search spaces, especially when the problem at hand involves tuning a small number of hyper-parameters (with random search usually superior to grid search).
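
A minimal sketch of both strategies, assuming a toy hyper-parameter space and a synthetic scoring function that stands in for the performance estimation strategy:

```python
import itertools
import random

# Toy search space; in practice each configuration would define a network.
space = {"layers": [2, 4, 8], "width": [32, 64, 128], "lr": [1e-2, 1e-3]}

def toy_score(cfg):
    # Synthetic stand-in for training and validating the candidate.
    return cfg["width"] / (cfg["layers"] * 100) - abs(cfg["lr"] - 1e-3)

# Grid search: systematically evaluate every combination.
grid = [dict(zip(space, values)) for values in itertools.product(*space.values())]
best_grid = max(grid, key=toy_score)

# Random search: evaluate a fixed budget of randomly sampled configurations.
samples = [{k: random.choice(v) for k, v in space.items()} for _ in range(8)]
best_random = max(samples, key=toy_score)

print(best_grid, best_random)
```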

Formulated as an optimization problem, NAS can also be tackled with gradient-based search [5]. Typically, the NAS optimization target is to maximize validation accuracy. Because NAS uses a discrete search space, it is challenging to obtain gradients, so the discrete architectural space is transformed into a continuous one, and architectures are derived from their continuous representations. NAS can then compute gradients of the optimization target over the transformed continuous space. Theoretical foundations for gradient-based search in NAS are scarce, and it is difficult to certify convergence to the global optimum. However, this method shows excellent search efficiency in practical applications.
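
The core trick of the continuous relaxation, as used in DARTS [5], can be sketched in a few lines of NumPy: instead of committing to one discrete operation, the output is a softmax-weighted mixture of all candidate operations, which makes the architecture parameters differentiable. The candidate operations below are simplified illustrative placeholders, not the DARTS implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Illustrative candidate operations for a single edge of the network.
ops = [
    lambda x: x,                 # identity / skip connection
    lambda x: np.maximum(x, 0),  # ReLU-style transform
    lambda x: 0.5 * x,           # scaled linear transform
]

alpha = np.zeros(len(ops))  # continuous architecture parameters (learnable)

def mixed_op(x, alpha):
    # Continuous relaxation: a softmax-weighted sum over all candidates
    # replaces the discrete choice, so d(output)/d(alpha) exists.
    weights = softmax(alpha)
    return sum(w * op(x) for w, op in zip(weights, ops))

x = np.array([1.0, -2.0, 3.0])
print(mixed_op(x, alpha))

# After optimizing alpha by gradient descent on the validation loss, the
# discrete architecture keeps the operation with the largest weight.
chosen = ops[int(np.argmax(alpha))]
```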

Evolutionary algorithms are motivated by biological evolution: each model architecture corresponds to an individual that can produce offspring (other architectures) or die and be excluded from the population. An evolutionary NAS algorithm (e.g., for the NASNet architecture [6]) proceeds through the following steps (Figure 3); a minimal code sketch follows the figure.

I. An initial population of N models is created from random architectures. Each individual (i.e., each architecture) is assessed according to the performance estimation strategy.

II. Individuals with maximum performance are chosen as parents. The new generation of architectures may be copies of the respective parents with an induced "mutation", or they may come from combinations of parents. The performance estimation strategy then evaluates the offspring's performance. Possible mutations include operations such as adding or removing a layer, adding or removing a connection, or changing the size of a layer or another hyper-parameter.

III. N architectures are selected for removal; these could be the worst or the oldest individuals in the population. The offspring substitute the removed architectures, and the cycle restarts.

Figure 3: Illustration of the evolutionary algorithm for NAS.
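
Here is a compact, self-contained sketch of that loop. Everything in it is a toy assumption: an "architecture" is just a list of hidden-layer widths, and the scoring function is synthetic; in a real system it would be the performance estimation strategy described above.

```python
import random

WIDTHS = [16, 32, 64, 128]

def random_architecture():
    return [random.choice(WIDTHS) for _ in range(random.randint(1, 4))]

def toy_score(arch):
    # Synthetic stand-in for training and validating the candidate network.
    return sum(arch) / (1 + 10 * len(arch)) + random.random()

def mutate(arch):
    child = list(arch)
    move = random.choice(["add", "remove", "resize"])
    if move == "remove" and len(child) > 1:
        child.pop(random.randrange(len(child)))       # remove a layer
    elif move == "resize":
        child[random.randrange(len(child))] = random.choice(WIDTHS)
    else:
        child.insert(random.randrange(len(child) + 1), random.choice(WIDTHS))
    return child

N, GENERATIONS = 20, 50
population = [random_architecture() for _ in range(N)]   # step I
scores = [toy_score(a) for a in population]

for _ in range(GENERATIONS):
    parent = population[max(range(N), key=scores.__getitem__)]  # step II
    child = mutate(parent)
    worst = min(range(N), key=scores.__getitem__)               # step III
    population[worst], scores[worst] = child, toy_score(child)

print(max(zip(scores, population)))  # best (score, architecture) found
```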

Evolutionary algorithms have shown promising results and have generated state-of-the-art models [7].

NAS methods based on reinforcement learning [8] have gained popularity in recent years. A controller network, typically a recurrent neural network (RNN), samples architectures from the search space with a specific probability distribution. The sampled architecture is built and evaluated using the performance estimation strategy, and the resulting performance is used as a reward to update the parameters of the controller network (Figure 4). This cycle is iterated until a timeout or convergence occurs.

Figure 4: Basic design of a reinforcement learning approach for NAS. Image taken from [8].
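
A toy sketch of this loop with a deliberately simplified "controller": instead of an RNN, a table of logits samples one width per layer, and a REINFORCE update with a moving-average baseline nudges the logits toward higher-reward architectures. The reward function is synthetic and stands in for the performance estimation strategy.

```python
import numpy as np

rng = np.random.default_rng(0)
choices = [16, 32, 64, 128]           # width options per layer (assumed)
logits = np.zeros((3, len(choices)))  # toy controller: 3 independent decisions
lr, baseline = 0.1, 0.0

def toy_reward(arch):
    # Synthetic stand-in for trained-and-validated accuracy.
    return 1.0 / (1.0 + abs(sum(arch) - 160))

for step in range(200):
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    picks = [rng.choice(len(choices), p=p) for p in probs]   # sample architecture
    reward = toy_reward([choices[i] for i in picks])
    baseline = 0.9 * baseline + 0.1 * reward                 # variance reduction
    # REINFORCE: raise the log-probability of the sampled choices in
    # proportion to the advantage (reward minus baseline).
    for row, i in enumerate(picks):
        grad = -probs[row]
        grad[i] += 1.0
        logits[row] += lr * (reward - baseline) * grad

print([choices[i] for i in logits.argmax(axis=1)])  # most likely architecture
```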

Similar to evolutionary algorithms, reinforcement learning is able to build network architectures that surpass hand-crafted models on popular benchmark datasets.

Conclusion

NAS has successfully produced deep neural network architectures that surpass the accuracy of manually constructed ones. The state-of-the-art architectures generated by NAS have been found using evolutionary algorithms and reinforcement learning, specifically for image classification tasks. NAS is expensive because hundreds or thousands of candidate deep neural networks need to be trained and tested before it produces a successful result, which makes current NAS methods too costly for most realistic applications. Therefore, further research is required to make NAS more generic.

[1] H. Jin, Q. Song and X. Hu, Auto-Keras: Efficient Neural Architecture Search with Network Morphism, arXiv, 2018.

[2] K. He, X. Zhang, S. Ren and J. Sun, Deep Residual Learning for Image Recognition, arXiv, 2015.

[3] C. Szegedy et al., Going Deeper with Convolutions, arXiv, 2014.

[4] T. Elsken, J.H. Metzen and F. Hutter, Neural Architecture Search: A Survey, Journal of Machine Learning Research, 2019.

[5] H. Liu, K. Simonyan and Y. Yang, DARTS: Differentiable Architecture Search, arXiv, 2019.

[6] B. Zoph, V. Vasudevan, J. Shlens and Q.V. Le, Learning Transferable Architectures for Scalable Image Recognition, Proceedings of the Conference on Computer Vision and Pattern Recognition, 2018.

[7] E. Real et al., Large-scale evolution of image classifiers, Proceedings of the 34th International Conference on Machine Learning, 2017.

[8] B. Zoph and Q.V. Le, Neural Architecture Search with Reinforcement Learning, arXiv, 2016.

Translated from: https://medium.com/towards-artificial-intelligence/the-fundamentals-of-neural-architecture-search-nas-9bb25c0b75e2
