CONVOLUTIONAL NEURAL NETWORK SCALING METHODS IN SEMANTIC SEGMENTATION
DOI:
https://doi.org/10.15588/1607-3274-2024-2-6
Keywords:
convolutional neural network, scaling method, asymmetric scaling, semantic segmentation, encoder-decoder, image
Abstract
Context. Designing a new architecture is a difficult and time-consuming process that in some cases can be replaced by scaling an existing model. In this paper we examine convolutional neural network scaling methods and aim to develop a method that scales an original network for the segmentation task into a more accurate network.
Objective. The goal of the work is to develop a method of scaling a convolutional neural network that matches or outperforms existing scaling methods, and to verify its effectiveness in solving the semantic segmentation task.
Method. The proposed asymmetric method combines the advantages of other methods: the resulting network is as accurate as the one produced by the combined method, and it can outperform the other methods. The method is designed to be applicable to convolutional neural networks that follow the encoder-decoder architecture and solve the semantic segmentation task. It enhances the feature extraction potential of the encoder part while preserving the decoder part of the architecture. Because of its asymmetric nature, the proposed method is more efficient, since it results in a smaller increase in the number of parameters.
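The parameter argument can be illustrated with a rough sketch. The abstract does not say which dimension is scaled or by how much, so the code below assumes width (channel) scaling by a factor of 1.5 on a simplified U-Net-style layout; the function names, channel widths, and factor are all hypothetical and are not taken from the paper. It compares the parameter count when both encoder and decoder are widened with the count when only the encoder is widened.

```python
# Illustrative sketch only: parameter counts for a simplified U-Net-style
# encoder-decoder. Widths, the 1.5 factor, and all names are assumptions.

def conv_params(c_in, c_out, k=3):
    """Weights plus biases of one k x k (transposed) convolution."""
    return k * k * c_in * c_out + c_out

def double_conv_params(c_in, c_out):
    """Two stacked 3x3 convolutions, as in a typical U-Net stage."""
    return conv_params(c_in, c_out) + conv_params(c_out, c_out)

def unet_params(enc_widths, dec_widths, in_ch=3, n_classes=2):
    """Count parameters; the last entry of enc_widths is the bottleneck."""
    total, c = 0, in_ch
    for w in enc_widths:                      # encoder stages
        total += double_conv_params(c, w)
        c = w
    skips = enc_widths[-2::-1]                # skip connections, deepest first
    for skip, w in zip(skips, dec_widths):    # decoder stages
        total += conv_params(c, w, k=2)       # 2x2 up-convolution
        total += double_conv_params(w + skip, w)  # after concatenating the skip
        c = w
    return total + conv_params(c, n_classes, k=1)  # 1x1 classifier head

def scale(widths, factor):
    """Multiply every channel width by factor, rounded to an integer."""
    return [int(round(w * factor)) for w in widths]

ENC = [64, 128, 256, 512, 1024]   # assumed baseline encoder widths
DEC = [512, 256, 128, 64]         # assumed baseline decoder widths

base = unet_params(ENC, DEC)
symmetric = unet_params(scale(ENC, 1.5), scale(DEC, 1.5))
asymmetric = unet_params(scale(ENC, 1.5), DEC)  # decoder widths preserved

print(f"baseline:   {base:,}")
print(f"symmetric:  {symmetric:,}")
print(f"asymmetric: {asymmetric:,}")
```

Under these assumptions the encoder-only variant adds fewer parameters than the symmetric one, although the decoder still grows somewhat because its inputs include the wider skip connections.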
Results. The proposed method was implemented on the U-Net architecture applied to the semantic segmentation task. The proposed method and the other methods were evaluated on a semantic segmentation dataset. The asymmetric scaling method proved efficient: it matched or outperformed the results of the other scaling methods while having fewer parameters.
Conclusions. Scaling techniques can be beneficial when extra computational resources are available. The proposed method was evaluated on the semantic segmentation task, where it showed its efficiency. Although scaling methods improve the accuracy of the original network, they greatly increase its resource requirements, which the proposed asymmetric method is intended to reduce. Prospects for further research include the optimization process and an investigation of the tradeoff between accuracy gain and resource requirements, as well as an experiment covering several different architectures.
License
Copyright (c) 2024 I. O. Hmyria, N. S. Kravets
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Creative Commons Licensing Notifications in the Copyright Notices
The journal allows the authors to hold the copyright without restrictions and to retain publishing rights without restrictions.
The journal allows readers to read, download, copy, distribute, print, search, or link to the full texts of its articles.
The journal allows reuse and remixing of its content in accordance with the Creative Commons license CC BY-SA.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License CC BY-SA that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.