The Development of Meta-Learning

From Wikipedia

Some approaches which have been viewed as instances of meta learning:

  • Recurrent neural networks (RNNs) are universal computers. In 1993, Jürgen Schmidhuber showed how "self-referential" RNNs can in principle learn by backpropagation to run their own weight-change algorithm, which may be quite different from backpropagation.[7] In 2001, Sepp Hochreiter, A. S. Younger, and P. R. Conwell built a successful supervised meta-learner based on long short-term memory (LSTM) RNNs. It learned through backpropagation a learning algorithm for quadratic functions that is much faster than backpropagation.[8][2] Researchers at DeepMind (Marcin Andrychowicz et al.) extended this approach to optimization in 2017.[9]
  • In the 1990s, meta reinforcement learning (meta-RL) was achieved in Schmidhuber's research group through self-modifying policies written in a universal programming language that contains special instructions for changing the policy itself. There is a single lifelong trial, and the goal of the RL agent is to maximize reward. The agent learns to accelerate reward intake by continually improving its own learning algorithm, which is part of the "self-referential" policy.[10][11]
  • An extreme type of meta reinforcement learning is embodied by the Gödel machine, a theoretical construct that can inspect and modify any part of its own software, which also contains a general theorem prover. It can achieve recursive self-improvement in a provably optimal way.[12][2]
  • Model-Agnostic Meta-Learning (MAML) was introduced in 2017 by Chelsea Finn et al.[13] Given a sequence of tasks, the parameters of a model are trained such that a few iterations of gradient descent on a small amount of training data from a new task lead to good generalization performance on that task. MAML "trains the model to be easy to fine-tune."[13] It was successfully applied to few-shot image classification benchmarks and to policy-gradient-based reinforcement learning.[13]
  • Discovering meta-knowledge works by inducing knowledge (e.g. rules) that expresses how each learning method will perform on different learning problems. The metadata is formed by characteristics of the data in the learning problem (general, statistical, information-theoretic, ...) and characteristics of the learning algorithm (type, parameter settings, performance measures, ...). Another learning algorithm then learns how the data characteristics relate to the algorithm characteristics. Given a new learning problem, the data characteristics are measured, and the performance of different learning algorithms is predicted. Hence, one can predict the algorithms best suited for the new problem.
  • Stacked generalisation works by combining multiple (different) learning algorithms. The metadata is formed by the predictions of those different algorithms. Another learning algorithm learns from this metadata to predict which combinations of algorithms give generally good results. Given a new learning problem, the predictions of the selected set of algorithms are combined (e.g. by (weighted) voting) to provide the final prediction. Since each algorithm is deemed to work on a subset of problems, a combination is hoped to be more flexible and able to make good predictions.
  • Boosting is related to stacked generalisation, but uses the same algorithm multiple times, where the examples in the training data get different weights over each run. This yields different predictions, each focused on correctly predicting a subset of the data, and combining those predictions leads to better (but more expensive) results.
  • Dynamic bias selection works by altering the inductive bias of a learning algorithm to match the given problem. This is done by altering key aspects of the learning algorithm, such as the hypothesis representation, heuristic formulae, or parameters. Many different approaches exist.
  • Inductive transfer studies how the learning process can be improved over time. Metadata consists of knowledge about previous learning episodes and is used to efficiently develop an effective hypothesis for a new task. A related approach is called learning to learn, in which the goal is to use acquired knowledge from one domain to help learning in other domains.
  • Other approaches using metadata to improve automatic learning are learning classifier systems, case-based reasoning and constraint satisfaction.
  • Some initial theoretical work has been undertaken to use Applied Behavioral Analysis as a foundation for agent-mediated meta-learning about the performance of human learners, and to adjust the instructional course of an artificial agent.[14]
  • AutoML, such as Google Brain's "AI building AI" project, which according to Google briefly exceeded existing ImageNet benchmarks in 2017.[15][16]
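The inner/outer-loop structure of MAML described above can be illustrated with a deliberately tiny sketch. Everything here is invented for illustration: a scalar model parameter theta and per-task losses L_c(theta) = (theta - c)^2, not the few-shot neural-network setup of the paper. The point is only the two nested gradient steps — adapt to each task, then update the shared initialization through the adaptation step.

```python
import numpy as np

# Toy MAML sketch (illustrative assumptions): each task c has loss
# L_c(theta) = (theta - c)^2, and we meta-learn an initialization theta
# from which one inner gradient step adapts well to any task.
alpha = 0.1   # inner-loop (adaptation) step size
beta = 0.05   # outer-loop (meta) step size
tasks = [-1.0, 0.5, 1.0]  # hypothetical task parameters

def grad(theta, c):
    # d/dtheta (theta - c)^2
    return 2.0 * (theta - c)

theta = 3.0  # meta-parameter (the shared initialization)
for _ in range(200):
    meta_grad = 0.0
    for c in tasks:
        # inner step: adapt theta to the task
        theta_adapted = theta - alpha * grad(theta, c)
        # exact meta-gradient d/dtheta L_c(theta_adapted); here the
        # Jacobian d(theta_adapted)/d(theta) is simply (1 - 2*alpha)
        meta_grad += (1 - 2 * alpha) * grad(theta_adapted, c)
    theta -= beta * meta_grad / len(tasks)
```

With these quadratic losses the meta-optimum is the mean of the task parameters, so theta converges toward 1/6, and a single inner step from there moves the parameter closer to any individual task's optimum — the "easy to fine-tune" property.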
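The meta-knowledge bullet above — measure the data's characteristics, then predict which algorithm suits the problem — can be sketched in a few lines. The meta-features and the "past problems" below are entirely invented for illustration; real meta-learning systems use many more meta-features and measured performance data.

```python
import numpy as np

# Illustrative sketch (invented meta-features and problems): each past
# learning problem is described by meta-features (n_samples, n_features,
# class entropy) together with the algorithm that performed best on it.
meta_features = np.array([
    [100,    5,   0.9],   # small, low-dimensional, balanced
    [120,    8,   0.8],
    [50000,  300, 0.4],   # large, high-dimensional, skewed
    [80000,  500, 0.3],
], dtype=float)
best_algo = np.array(["knn", "knn", "linear", "linear"])

# normalise each meta-feature so no single scale dominates the distance
mu, sigma = meta_features.mean(axis=0), meta_features.std(axis=0)
normed = (meta_features - mu) / sigma

def recommend(new_problem):
    """1-nearest-neighbour over meta-feature space: measure the new
    problem's characteristics, find the most similar past problem,
    and predict that its best algorithm will transfer."""
    q = (np.asarray(new_problem, dtype=float) - mu) / sigma
    return best_algo[np.argmin(np.linalg.norm(normed - q, axis=1))]
```

Here the "meta-learner" is just a nearest-neighbour lookup; in practice any learner (rules, regression, ranking) can map data characteristics to predicted algorithm performance.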
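Stacked generalisation as described above — base learners' held-out predictions become the metadata from which a combiner is learned — can be sketched minimally. The data and the two base "classifiers" are invented; a simple held-out-accuracy weighting stands in for the second-level learner.

```python
import numpy as np

# Toy stacking sketch (invented data): combine two base classifiers by
# weighted voting, with weights learned from their predictions on
# held-out data -- the "metadata" of stacked generalisation.
rng = np.random.default_rng(0)
X = rng.normal(size=200)
y = (X > 0).astype(int)                # ground truth: sign of x
X_val, y_val = X[100:], y[100:]        # held-out split

def clf_exact(x):                      # base learner 1: the true rule
    return (x > 0).astype(int)

def clf_noisy(x):                      # base learner 2: a weaker rule
    return (x > 0.5).astype(int)

base = [clf_exact, clf_noisy]
# metadata: each base learner's predictions on the held-out data
val_preds = np.stack([c(X_val) for c in base])
weights = (val_preds == y_val).mean(axis=1)   # held-out accuracy as weight

def combined(x):
    votes = np.stack([c(x) for c in base])    # shape (n_base, n_points)
    return (weights @ votes / weights.sum() > 0.5).astype(int)
</imports-placeholder>
```

The combiner never sees the raw features, only the base learners' outputs; because the weaker learner gets a smaller weight, the ensemble is at least as accurate as that learner on the held-out data.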
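The boosting bullet — one weak algorithm run repeatedly, with misclassified examples re-weighted each round — can be made concrete with an AdaBoost-style sketch. The six-point dataset and the 1-D threshold "stump" are invented for illustration; note that no single stump can classify this data, but the re-weighted ensemble can.

```python
import numpy as np

# Minimal AdaBoost-style sketch (illustrative, not a library version):
# the same weak learner (a 1-D threshold stump) is fit repeatedly, and
# examples it misclassifies are up-weighted for the next round.
X = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([-1, -1, 1, 1, 1, -1])    # not separable by any single stump

def fit_stump(X, y, w):
    """Pick the (threshold, sign) pair minimising weighted error."""
    best = (np.inf, None, None)
    for t in np.unique(X):
        for s in (1, -1):
            pred = s * np.where(X >= t, 1, -1)
            err = w[pred != y].sum()
            if err < best[0]:
                best = (err, t, s)
    return best

w = np.ones(len(X)) / len(X)           # uniform example weights to start
stumps = []
for _ in range(5):
    err, t, s = fit_stump(X, y, w)
    err = max(err, 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)   # stump's vote weight
    pred = s * np.where(X >= t, 1, -1)
    w *= np.exp(-alpha * y * pred)          # up-weight the mistakes
    w /= w.sum()
    stumps.append((alpha, t, s))

def ensemble(X):
    agg = sum(a * s * np.where(X >= t, 1, -1) for a, t, s in stumps)
    return np.sign(agg)
```

Each round's predictions concentrate on the examples the previous rounds got wrong, which is exactly the re-weighting behaviour the bullet describes; the cost is fitting the weak learner once per round.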


References

  1. Schmidhuber, Jürgen (1987). "Evolutionary principles in self-referential learning, or on learning how to learn: the meta-meta-... hook" (PDF). Diploma Thesis, Tech. Univ. Munich.
  2. Schaul, Tom; Schmidhuber, Jürgen (2010). "Metalearning". Scholarpedia 5 (6): 4650. doi:10.4249/scholarpedia.4650.
  3. Utgoff, P. E. (1986). "Shift of bias for inductive concept learning". In R. Michalski, J. Carbonell, & T. Mitchell: Machine Learning: 163–190.
  4. Bengio, Yoshua; Bengio, Samy; Cloutier, Jocelyn (1991). Learning to learn a synaptic rule (PDF). IJCNN'91.
  5. Lemke, Christiane; Budka, Marcin; Gabrys, Bogdan (2013-07-20). "Metalearning: a survey of trends and technologies". Artificial Intelligence Review 44 (1): 117–130. doi:10.1007/s10462-013-9406-y. ISSN 0269-2821. PMC 4459543. PMID 26069389.
  6. Brazdil, Pavel; Carrier, Christophe Giraud; Soares, Carlos; Vilalta, Ricardo (2009). Metalearning. Cognitive Technologies. Springer. doi:10.1007/978-3-540-73263-1. ISBN 978-3-540-73262-4.
  7. Schmidhuber, Jürgen (1993). "A self-referential weight matrix". Proceedings of ICANN'93, Amsterdam: 446–451.
  8. Hochreiter, Sepp; Younger, A. S.; Conwell, P. R. (2001). "Learning to Learn Using Gradient Descent". Proceedings of ICANN'01: 87–94.
  9. Andrychowicz, Marcin; Denil, Misha; Gomez, Sergio; Hoffmann, Matthew; Pfau, David; Schaul, Tom; Shillingford, Brendan; de Freitas, Nando (2017). "Learning to learn by gradient descent by gradient descent". Proceedings of ICML'17, Sydney, Australia.
  10. Schmidhuber, Jürgen (1994). "On learning how to learn learning strategies". Technical Report FKI-198-94, Tech. Univ. Munich.
  11. Schmidhuber, Jürgen; Zhao, J.; Wiering, M. (1997). "Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement". Machine Learning 28: 105–130.
  12. Schmidhuber, Jürgen (2006). "Gödel machines: Fully Self-Referential Optimal Universal Self-Improvers". In B. Goertzel & C. Pennachin, eds.: Artificial General Intelligence: 199–226.
  13. Finn, Chelsea; Abbeel, Pieter; Levine, Sergey (2017). "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks". arXiv:1703.03400 [cs.LG].
  14. Begoli, Edmon (May 2014). Procedural-Reasoning Architecture for Applied Behavior Analysis-based Instructions. Knoxville, Tennessee, USA: University of Tennessee, Knoxville. pp. 44–79. Retrieved 14 October 2017.
  15. "Robots Are Now 'Creating New Robots,' Tech Reporter Says". NPR.org. 2018. Retrieved 29 March 2018.
  16. "AutoML for large scale image classification and object detection". Google Research Blog. November 2017. Retrieved 29 March 2018.

External links

  • Metalearning article in Scholarpedia
  • Vilalta R. and Drissi Y. (2002). A perspective view and survey of meta-learning, Artificial Intelligence Review, 18(2), 77–95.
  • Giraud-Carrier, C., & Keller, J. (2002). Dealing with the data flood, J. Meij (ed), chapter Meta-Learning. STT/Beweton, The Hague.
  • Brazdil P., Giraud-Carrier C., Soares C., Vilalta R. (2009) Metalearning: applications to data mining, chapter Metalearning: Concepts and Systems, Springer
