FASTER OPTIMIZATION-BASED META-LEARNING ADAPTATION PHASE
DOI: https://doi.org/10.15588/1607-3274-2022-1-10

Keywords: few-shot learning, meta-learning, Model-Agnostic Meta-Learning, MAML, adaptation time, adaptation speed, optimization-based meta-learning

Abstract
Context. Neural networks require a large amount of annotated data to learn. Meta-learning algorithms propose a way to decrease the number of training samples to only a few. One of the most prominent optimization-based meta-learning algorithms is MAML. However, its adaptation to new tasks is quite slow. The object of study is the meta-learning process and the adaptation phase as defined by the MAML algorithm.
Objective. The goal of this work is to create an approach that makes it possible to: 1) increase the execution speed of the MAML adaptation phase; 2) improve MAML accuracy in certain cases. The results are reported on CIFAR-FS, a publicly available few-shot learning dataset.
Method. In this work an improvement to the MAML meta-learning algorithm is proposed. The meta-learning procedure is defined in terms of tasks. In the case of image classification, each task is to learn to classify images of new classes given only a few training examples. MAML defines two stages for the learning procedure: 1) adaptation to the new task; 2) meta-weights update. The full training procedure requires Hessian computation, which makes the method computationally expensive. After being trained, the network will typically be used for adaptation to new tasks and subsequent prediction on them. Thus, improving adaptation time is an important problem, and it is the one we focus on in this work. We introduce a lambda pattern, which restricts which weights in the network are updated during the adaptation phase. This approach allows us to skip certain gradient computations. The pattern is selected given an allowed quality degradation threshold parameter; among the patterns that satisfy the criterion, the fastest one is then selected. However, as discussed later, quality improvement is also possible in certain cases through careful pattern selection.
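As an illustration of the adaptation scheme described above, the following is a minimal PyTorch-style sketch, not the authors' implementation. It shows how a per-layer lambda pattern could restrict the adaptation phase to a subset of layers, and how a pattern could be chosen under an allowed accuracy-drop threshold. The names adapt_with_pattern, select_pattern, the model.layers attribute, and the max_drop parameter are illustrative assumptions introduced here.

```python
# Hedged sketch of lambda-pattern adaptation: the pattern is assumed to be a
# list of 0/1 flags, one per layer; only flagged layers are updated during the
# adaptation (inner-loop) phase, so parameter-gradient computations for the
# remaining layers are skipped.

import torch
import torch.nn.functional as F


def adapt_with_pattern(model, support_x, support_y,
                       lambda_pattern, inner_lr=0.01, steps=1):
    """Adaptation phase: update only the layers flagged by the lambda pattern.

    Assumes the model exposes its layers as `model.layers` and that at least
    one layer is selected by the pattern.
    """
    # Disable gradients for layers excluded by the pattern.
    for flag, layer in zip(lambda_pattern, model.layers):
        for p in layer.parameters():
            p.requires_grad_(bool(flag))

    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(trainable, lr=inner_lr)

    for _ in range(steps):
        optimizer.zero_grad()
        loss = F.cross_entropy(model(support_x), support_y)
        loss.backward()   # gradients are computed only for the selected layers
        optimizer.step()
    return model


def select_pattern(candidate_patterns, evaluate, baseline_accuracy, max_drop=0.01):
    """Among patterns whose accuracy drop stays within `max_drop`,
    return the one with the smallest adaptation time."""
    admissible = []
    for pattern in candidate_patterns:
        # `evaluate` is assumed to return (accuracy, adaptation time)
        # measured on validation tasks for the given pattern.
        accuracy, adaptation_time = evaluate(pattern)
        if accuracy >= baseline_accuracy - max_drop:
            admissible.append((adaptation_time, pattern))
    return min(admissible, key=lambda t: t[0])[1] if admissible else None
```

In such a setup the layers excluded by the pattern keep their meta-learned weights unchanged, and the skipped gradient computations are the source of the adaptation-time savings discussed above.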
Results. The MAML algorithm with lambda pattern adaptation has been implemented, trained and tested on the open CIFAR-FS dataset. This makes our results easily reproducible.
Conclusions. The experiments conducted have shown that lambda adaptation pattern selection significantly improves the MAML method: adaptation time has been decreased by a factor of 3 with minimal accuracy loss. Interestingly, accuracy for one-step adaptation has also been substantially improved by using lambda patterns. A prospect for further research is to investigate a more robust automatic pattern selection scheme.
References
He K., Zhang X., Ren S. et al. Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016. Las Vegas, NV, USA, June 27–30, 2016. IEEE Computer Society, 2016, pp. 770–778. DOI: 10.1109/CVPR.2016.90.
Deng J., Dong W., Socher R. et al. ImageNet: A large-scale hierarchical image database, 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 20–25 June 2009. Miami, Florida, USA, IEEE Computer Society, 2009, pp. 248–255. DOI: 10.1109/CVPR.2009.5206848.
Huang G., Liu Z., Maaten L. et al. Densely Connected Convolutional Networks, 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017. Honolulu, HI, USA, July 21–26, 2017, IEEE Computer Society, 2017, pp. 2261–2269. DOI: 10.1109/CVPR.2017.243.
Zagoruyko S., Komodakis N. Wide Residual Networks, Proceedings of the British Machine Vision Conference (BMVC). BMVA Press, 2016, pp. 87.1–87.12. DOI: 10.5244/C.30.87.
Finn C., Abbeel P., Levine S. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6–11 August 2017, Proceedings of Machine Learning Research. PMLR, 2017, Vol. 70, pp. 1126–1135.
Rajeswaran A., Finn C., Kakade S. et al. Meta-Learning with Implicit Gradients, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8–14, 2019. Vancouver, BC, Canada, 2019, pp. 113–124.
Khabarlak K., Koriashkina L. Fast Facial Landmark Detection and Applications: A Survey [Electronic resource], arXiv:2101.10808 [cs], 2021. Access mode: https://arxiv.org/abs/2101.10808
Bertinetto L., Henriques J., Torr P. et al. Meta-learning with differentiable closed-form solvers [Electronic resource], 7th International Conference on Learning Representations, ICLR 2019. New Orleans, LA, USA, May 6–9, 2019. Access mode: https://openreview.net/forum?id=HyxnZh0ct7.
Snell J., Swersky K., Zemel R. S. Prototypical Networks for Few-shot Learning, Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4–9, 2017. Long Beach, CA, USA, 2017, pp. 4077–4087.
Ravi S., Larochelle H. Optimization as a Model for Few-Shot Learning [Electronic resource], 5th International Conference on Learning Representations, ICLR 2017. Toulon, France, April 24–26, 2017, Conference Track Proceedings. Access mode: https://openreview.net/forum?id=rJY0-Kcll.
Antoniou A., Edwards H., Storkey A. J. How to train your MAML [Electronic resource], 7th International Conference on Learning Representations, ICLR 2019. New Orleans, LA, USA, May 6–9, 2019. Access mode: https://openreview.net/forum?id=HJGven05Y7.
Weng L. Meta-Learning: Learning to Learn Fast [Electronic resource]. Access mode: https://lilianweng.github.io/lil-log/2018/11/30/meta-learning.html.
Yin W. Meta-learning for Few-shot Natural Language Processing: A Survey [Electronic resource], CoRR, 2020, Vol. abs/2007.09604. Access mode: https://arxiv.org/abs/2007.09604.
Wang Y., Yao Q., Kwok J. et al. Generalizing from a Few Examples: A Survey on Few-shot Learning, ACM Comput. Surv., 2020, Vol. 53, No. 3, pp. 63:1–63:34. DOI: 10.1145/3386252.
Guo Y., Zhang L. One-shot Face Recognition by Promoting Underrepresented Classes [Electronic resource], CoRR, 2017, Vol. abs/1707.05574. Access mode: http://arxiv.org/abs/1707.05574.
Koch G., Zemel R., Salakhutdinov R. Siamese neural networks for one-shot image recognition, ICML deep learning workshop. Lille, 2015, Vol. 2.
Vinyals O., Blundell C., Lillicrap T. et al. Matching Networks for One Shot Learning, Advances in Neural Information Processing Systems 29, Annual Conference on Neural Information Processing Systems 2016, December 5–10, 2016. Barcelona, Spain, 2016, pp. 3630–3638.
Santoro A., Bartunov S., Botvinick M. et al. Meta-Learning with Memory-Augmented Neural Networks, Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19–24, 2016, JMLR Workshop and Conference Proceedings. JMLR.org, 2016, Vol. 48, pp. 1842–1850.
Lake B. M., Salakhutdinov R., Tenenbaum J. B. Human-level concept learning through probabilistic program induction, Science, 2015, Vol. 350, No. 6266, pp. 1332–1338. DOI: 10.1126/science.aab3050.
Nichol A., Achiam J., Schulman J. On First-Order Meta-Learning Algorithms [Electronic resource], CoRR, 2018, Vol. abs/1803.02999. Access mode: http://arxiv.org/abs/1803.02999.
Li Z., Zhou F., Chen F. et al. Meta-SGD: Learning to Learn Quickly for Few Shot Learning [Electronic resource], CoRR, 2017, Vol. abs/1707.09835. Access mode: http://arxiv.org/abs/1707.09835.
Zeiler M.D., Fergus R. Visualizing and Understanding Convolutional Networks, Computer Vision, ECCV 2014, 13th European Conference, Zurich, Switzerland, September 6–12, 2014, Proceedings, Part I, Lecture Notes in Computer Science. Springer, 2014, Vol. 8689, pp. 818–833. DOI: 10.1007/978-3-319-10590-1_53.
Ioffe S., Szegedy C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6–11 July 2015, JMLR Workshop and Conference Proceedings. JMLR.org, 2015, Vol. 37, pp. 448–456.
Kingma D. P., Ba J. Adam: A Method for Stochastic Optimization, 3rd International Conference on Learning Representations, ICLR 2015. San Diego, CA, USA, May 7–9, 2015, Conference Track Proceedings, 2015.
Krizhevsky A. Learning multiple layers of features from tiny images [Electronic resource], University of Toronto, 2009, Access mode: https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf
License
Copyright (c) 2022 K. S. Khabarlak
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.