URBAN SCENE SEGMENTATION USING HOMOGENEOUS U-NET ENSEMBLE: A STUDY ON THE CITYSCAPES DATASET

I. O. Hmyria; N. S. Kravets

doi:10.15588/1607-3274-2025-3-7

Authors

I. O. Hmyria, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine
N. S. Kravets, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

Keywords:

convolutional neural network, semantic segmentation, U-Net, ensemble learning, data augmentation techniques, model initialization, Cityscapes, urban scenes

Abstract

Context. Semantic segmentation plays a critical role in computer vision tasks such as autonomous driving and urban scene understanding. While designing new model architectures can be complex, improving performance through ensemble techniques applied to existing models has shown promising potential. This paper investigates ensemble learning as a strategy to enhance segmentation accuracy without modifying the underlying U-Net architecture.
Objective. The aim of this work is to develop and evaluate a homogeneous ensemble of U-Net models trained with distinct initialization and data augmentation techniques, and to assess the effectiveness of various ensemble aggregation strategies in improving segmentation performance on a complex urban dataset.
Method. The proposed approach constructs an ensemble of five structurally identical U-Net models, each trained with unique weight initialization and augmentation schemes to ensure prediction diversity. Several ensemble strategies are examined, including softmax averaging, max voting, proportional weighting, exponential weighting, and optimized weighted voting. Evaluation is conducted on the Cityscapes dataset using a range of segmentation metrics.
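The aggregation strategies named in the Method paragraph can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. All function names are hypothetical, and several definitions are assumptions: "max voting" is read here as plurality voting over each model's argmax label, "proportional weighting" as weights proportional to a per-model validation score, and "exponential weighting" as a temperature-scaled softmax over those scores. The abstract does not say how the optimized weights are obtained, so any weight vector found by an external search can simply be passed to `weighted_average`.

```python
import numpy as np


def softmax_average(probs):
    """Average per-model softmax maps, then take the most likely class.

    probs: array of shape (M, H, W, C) holding M models' class probabilities.
    Returns an (H, W) array of class labels.
    """
    return probs.mean(axis=0).argmax(axis=-1)


def max_voting(probs):
    """Plurality vote over each model's argmax label (assumed definition).

    Ties are resolved in favour of the lowest class index.
    """
    n_classes = probs.shape[-1]
    labels = probs.argmax(axis=-1)                      # (M, H, W)
    votes = np.eye(n_classes, dtype=np.int64)[labels]   # one-hot, (M, H, W, C)
    return votes.sum(axis=0).argmax(axis=-1)


def weighted_average(probs, weights):
    """Weighted mean of the softmax maps; weights are normalised to sum to 1."""
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()
    return np.tensordot(w, probs, axes=1).argmax(axis=-1)


def proportional_weights(scores):
    """Weights proportional to per-model validation scores (assumption)."""
    s = np.asarray(scores, dtype=np.float64)
    return s / s.sum()


def exponential_weights(scores, temperature=0.05):
    """Softmax over validation scores; `temperature` is a hypothetical knob."""
    s = np.asarray(scores, dtype=np.float64)
    e = np.exp((s - s.max()) / temperature)
    return e / e.sum()


# Toy example: 3 models, a 1x2 image, 2 classes.
demo = np.array([
    [[[0.90, 0.10], [0.40, 0.60]]],
    [[[0.45, 0.55], [0.40, 0.60]]],
    [[[0.45, 0.55], [0.90, 0.10]]],
])
print(softmax_average(demo))
print(max_voting(demo))
print(weighted_average(demo, exponential_weights([0.60, 0.62, 0.65])))
```

In the toy example, averaging and voting disagree on both pixels: one confident model dominates the average, while two weakly confident models win the vote. This is the kind of difference between strategies that such a comparison is meant to expose.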
Results. Experimental findings demonstrate that ensemble models outperform individual U-Net instances and the baseline in terms of accuracy, mean IoU, and specificity. The optimized weighted ensemble achieved the highest accuracy (87.56%) and mean IoU (0.6504), exceeding the best individual model by approximately 3%. However, these improvements come with a notable increase in inference time, highlighting a trade-off between accuracy and computational efficiency.
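For readers who want to reproduce this kind of comparison, the metrics named above can be derived from a single confusion matrix. The sketch below is illustrative only: the function names are hypothetical, macro-averaging of IoU and specificity over classes is an assumption (the abstract does not state the averaging scheme), and the `ignore_index=255` default follows the common Cityscapes convention for void pixels rather than anything stated by the authors.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes, ignore_index=255):
    """Build an (n_classes, n_classes) matrix with rows = ground truth.

    Pixels whose ground-truth label equals `ignore_index` are skipped.
    """
    y_true = np.asarray(y_true, dtype=np.int64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.int64).ravel()
    keep = y_true != ignore_index
    idx = y_true[keep] * n_classes + y_pred[keep]
    counts = np.bincount(idx, minlength=n_classes ** 2)
    return counts.reshape(n_classes, n_classes)


def segmentation_metrics(cm):
    """Pixel accuracy plus macro-averaged IoU and specificity."""
    cm = cm.astype(np.float64)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp   # predicted as the class, but wrong
    fn = cm.sum(axis=1) - tp   # belongs to the class, but missed
    tn = cm.sum() - tp - fp - fn
    # Classes with an empty denominator yield NaN and are skipped by nanmean.
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = tp / (tp + fp + fn)
        specificity = tn / (tn + fp)
    return {
        "accuracy": tp.sum() / cm.sum(),
        "mean_iou": np.nanmean(iou),
        "specificity": np.nanmean(specificity),
    }


# Toy example: five pixels, two classes, last pixel is void.
cm = confusion_matrix([0, 0, 1, 1, 255], [0, 1, 1, 1, 0], n_classes=2)
print(segmentation_metrics(cm))
```

Accumulating one confusion matrix over the whole validation set and computing the metrics once at the end avoids averaging per-image scores, which would weight small and large objects differently.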
Conclusions. The ensemble-based approach effectively enhances segmentation accuracy while leveraging existing model architectures. Although the increased computational cost presents a limitation for real-time applications, the method is well-suited for high-precision tasks. Future research will focus on reducing inference time and extending the ensemble methodology to other architectures and datasets.

Author Biographies

I. O. Hmyria, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

Post-graduate student of the Department of Software Engineering

N. S. Kravets, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

Associate Professor, Associate Professor of the Department of Software Engineering

