IDENTIFICATION AND LOCALIZATION OF VULNERABILITIES IN SMART CONTRACTS USING ATTENTION VECTORS ANALYSIS IN A BERT-BASED MODEL
DOI: https://doi.org/10.15588/1607-3274-2024-3-15
Keywords: smart contracts, vulnerabilities, blockchain, machine learning, attention vector analysis, transformers, code security, code audit
Abstract
Context. With the development of blockchain technology and the growing use of smart contracts, which are executed automatically in blockchain networks, securing these contracts has become critically important. Traditional code auditing methods often fail to identify complex vulnerabilities, which can lead to significant financial losses. For example, the reentrancy vulnerability exploited in the DAO attack in 2016 resulted in the loss of 3.6 million ether and the split of the Ethereum blockchain network. This underscores the need for early detection of vulnerabilities.
Objective. The objective of this work is to develop and test an innovative approach for identifying and localizing vulnerabilities in smart contracts based on the analysis of attention vectors in a model built on the BERT architecture.
Method. The methodology described includes data preparation and training a transformer-based model for analyzing smart contract code. The proposed attention vector analysis method allows for the precise identification of vulnerable code segments. The use of the CodeBERT model significantly improves the accuracy of vulnerability identification compared to traditional methods. Specifically, three types of vulnerabilities are considered: reentrancy, timestamp dependence, and tx.origin vulnerability. The data is preprocessed, which includes the standardization of variables and the simplification of functions.
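The localization idea can be illustrated with a minimal sketch. It assumes that each code token carries a single attention weight and that weights are summed per source line; the abstract does not specify how the attention vectors are aggregated, so the function name, the aggregation rule, and the toy values below are all hypothetical.

```python
from collections import defaultdict

def rank_lines_by_attention(token_lines, attention):
    """Sum per-token attention weights by source line and rank lines, highest first."""
    if len(token_lines) != len(attention):
        raise ValueError("token_lines and attention must have the same length")
    totals = defaultdict(float)
    for line_no, weight in zip(token_lines, attention):
        totals[line_no] += weight
    # Sort (line number, total weight) pairs by total weight, descending.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy input: the source line of each token and the attention
# weight assigned to that token (values are made up for illustration).
token_lines = [1, 1, 2, 2, 2, 3]
attention = [0.05, 0.05, 0.30, 0.25, 0.15, 0.20]

ranking = rank_lines_by_attention(token_lines, attention)
top_line = ranking[0][0]  # candidate location of the vulnerable segment
print(top_line)
```

In a real pipeline the weights would come from the attention layers of the trained model rather than from a hand-written list.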
Results. The developed model demonstrated a high F-score of 95.51%, which significantly exceeds the results of contemporary approaches, such as the BGRU-ATT model with an F-score of 91.41%. The accuracy of the method in the task of localizing reentrancy vulnerabilities was 82%.
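For readers unfamiliar with the metric, the sketch below shows how an F-score is computed, assuming the reported value is the standard F1 measure (the harmonic mean of precision and recall). The counts used are hypothetical and are not taken from the experiments.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score from true positives, false positives, and false negatives."""
    if tp == 0:
        # With no correct detections, F1 is conventionally reported as zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical confusion counts, chosen only to illustrate the arithmetic.
score = f1_score(tp=90, fp=10, fn=10)
print(f"{score:.2%}")
```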
Conclusions. The experiments conducted confirmed the effectiveness of the proposed solution. Prospects for further research include integrating more advanced deep learning models, such as GPT-4 or T5, to improve the accuracy and reliability of vulnerability detection. They also include expanding the dataset to cover other smart contract languages, such as Vyper or LLL, to enhance the applicability and efficiency of the model across various blockchain platforms.
Thus, the developed CodeBERT-based model demonstrates high results in detecting and localizing vulnerabilities in smart contracts, which opens new opportunities for research in the field of blockchain platform security.
References
Komleva N. O., Tereshchenko O. I. Requirements for the development of smart contracts and an overview of smart contract vulnerabilities at the Solidity code level on the Ethereum platform, Herald of Advanced Information Technology, 2023, Vol. 6, No. 1, pp. 54-68. DOI: 10.15276/hait.06.2023.4
Huang T. H.-D. Hunting the ethereum smart contract: Color-inspired inspection of potential attacks [Electronic resource]. Access mode: https://arxiv.org/abs/1807.01868
Liao J.-W., Tsai T.-T., He C.-K. et al. Soliaudit: Smart contract vulnerability assessment based on machine learning and fuzz testing, IOTSMS 2019 : Sixth International Conference on Internet of Things: Systems, Management and Security, Granada, 22-25 October 2019 : proceedings. New York, NY, IEEE Press, 2019, pp. 458-465. DOI: 10.1109/IOTSMS48152.2019.8939256
Yu X., Hou B., Ying Z. et al. Deep learning-based solution for smart contract vulnerabilities detection, Scientific Reports, 2023, Vol. 13, P. 20106. DOI: 10.1038/s41598-023-47219-0
Gao Z., Jiang L., Xia X. et al. Checking Smart Contracts With Structural Code Embedding, IEEE Transactions on Software Engineering, 2021, Vol. 47, No. 12, pp. 2874-2891. DOI: 10.1109/TSE.2020.2971482
Zhang L., Wang J., Wang W. et al. A Novel Smart Contract Vulnerability Detection Method Based on Information Graph and Ensemble Learning, Sensors, 2022, Vol. 22, P. 3581. DOI: 10.3390/s22093581
Sendner C., Chen H., Fereidooni H. et al. Smarter Contracts: Detecting Vulnerabilities in Smart Contracts with Deep Transfer Learning, Network and Distributed System Security : Symposium 2023, San Diego, 27 February - 3 March 2023 : proceedings. Reston, VA, The Internet Society, 2023.
Zhuang Y., Liu Z., Qian P. et al. Smart Contract Vulnerability Detection using Graph Neural Network [Electronic resource], IJCAI'20: Twenty-Ninth International Joint Conference on Artificial Intelligence, 07-15 January 2021 : proceedings. Electronic resource, IJCAI, 2021, pp. 3283-3290. DOI: 10.24963/ijcai.2020/454
Park D., Zhang Y., Saxena M. et al. A formal verification tool for ethereum vm bytecode, ESEC/FSE '18: 26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Lake Buena Vista FL, 04-09 November 2018 : proceedings. New York, Association for Computing Machinery, 2018, pp. 912-915. DOI: 10.1145/3236024.3264591
Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A. N., Kaiser L., Polosukhin I. Attention Is All You Need [Electronic resource]. Access mode: https://arxiv.org/abs/1706.03762. DOI: 10.48550/arXiv.1706.03762
Tereshchenko O. I., Komleva N. O. Vulnerability Detection of Smart Contracts Based on Bidirectional GRU and Attention Mechanism, Information and Communication Technologies in Education, Research, and Industrial Applications 2023: 18th International Conference, Ivano-Frankivsk, 18-22 September 2023 : proceedings. Berlin, Springer, 2023, Vol. 1980, pp. 276-287. DOI: 10.1007/978-3-031-48325-7_21
Liu Y., Ott M., Goyal N., Du J., Joshi M., Chen D., Levy O., Lewis M., Zettlemoyer L., Stoyanov V. Roberta: A robustly optimized bert pretraining approach [Electronic resource]. Access mode: https://arxiv.org/abs/1907.11692
Yu X., Zhao H., Hou B. et al. DeeSCVHunter: A deep learning-based framework for smart contract vulnerability detection, IJCNN '21 : 2021 International Joint Conference on Neural Networks, Shenzhen, 18-22 July 2021 : proceedings. New York, NY, IEEE Press, 2021, pp. 1-8. DOI: 10.1109/IJCNN52387.2021.9534324
Harer J. A., Ozdemir O., Lazovich T. et al. Learning to repair software vulnerabilities with generative adversarial networks, NeurIPS 2018 : 32nd Conference on Neural Information Processing Systems, Montreal, 03-08 December 2018 : proceedings. Red Hook, NY, Curran Associates Inc., 2018, pp. 7933-7943. DOI: 10.48550/arXiv.1805.07475
Zhou Y., Liu S., Siow J. et al. Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks, NeurIPS 2019 : 33rd Conference on Neural Information Processing Systems, Vancouver, 08-14 December 2019 : proceedings. Red Hook, NY: Curran Associates Inc., 2019, Vol. 32. DOI: 10.48550/arXiv.1909.03496
Tsankov P., Dan A., Drachsler-Cohen D. et al. Securify: Practical security analysis of smart contracts, CCS '18 : 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, 15-19 October 2018 : proceedings. New York, NY, ACM, 2018, pp. 67-82. DOI: 10.1145/3243734.3243780
Huang J., Han S., You W. et al. Hunting vulnerable smart contracts via graph embedding based bytecode matching, IEEE Transactions on Information Forensics and Security, 2021, Vol. 16, pp. 2144-2156. DOI: 10.1109/TIFS.2021.3050051
Yuan X., Lin G., Tai Y. et al. Deep neural embedding for software vulnerability discovery: Comparison and optimization, Security and Communication Networks, 2022, pp. 1-12. DOI: 10.1155/2022/5203217
Feist J., Grieco G., Groce A. Slither: A static analysis framework for smart contracts, WETSEB '19: 2nd International Workshop on Emerging Trends in Software Engineering for Blockchain, Montreal, Quebec, 27 May 2019 : proceedings. New York, NY, IEEE Press, 2019, pp. 8-15. DOI: 10.48550/arXiv.1908.09878
Lutz O., Chen H., Fereidooni H., Sendner C. ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [Electronic resource]. Access mode: https://arxiv.org/abs/2103.12607. DOI: 10.48550/arXiv.2103.12607
Rodler M., Li W., Karame G. O. et al. Sereum: Protecting Existing Smart Contracts Against Re-Entrancy Attacks, Network and Distributed System Security : Symposium 2019, San Diego, 24-27 February 2019 : proceedings. Reston, VA, The Internet Society, 2019. DOI: 10.14722/ndss.2019.23413
License
Copyright (c) 2024 О. І. Терещенко, Н. О. Комлева
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.