DEVELOPMENT OF METHOD TO IDENTIFY THE COMPUTER SYSTEM STATE BASED ON THE «ISOLATION FOREST» ALGORITHM
DOI:
https://doi.org/10.15588/1607-3274-2021-1-11
Keywords:
computer system, operating system events, abnormal state, identification, machine learning, Isolation Forest algorithm.
Abstract
Context. The problem of identifying the state of a computer system is investigated. The object of the research is the process of identifying the computer system state. The subject of the research is the means and methods of identifying the computer system state.
Objective. The purpose of the work is to develop a method for identifying the computer system state.
Method. A method has been developed for identifying the computer system state. It combines a procedure for grouping unlabeled initial data with machine learning based on the «Isolation Forest» algorithm, which makes it possible to identify the computer system state and to determine the name of the process that initiated the abnormal state. To collect statistical data in the form of operating system functioning events, a data collection method has been proposed and developed along with software. Analysis of the functioning events showed that read and write operations are the most informative. To form a single dataset, read and write operations are matched with the process name and combined into one array of event groups, so that the process causing the abnormal state of the computer system can be singled out. As a result of the research, the «Isolation Forest» algorithm was selected as a component of the method for identifying the computer system state. The accuracy and efficiency of the developed method have been assessed.
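The grouping and scoring steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the process names, event counts, number of trees, and the choice of total reads and writes per process as features are all assumptions made for illustration, and the tree construction follows the general Isolation Forest idea (random splits, shorter average path means more anomalous) without the subsampling a real deployment would use.

```python
import math
import random
from collections import defaultdict

def group_events(events):
    """Sum read and write counts per process name: {name: (reads, writes)}."""
    totals = defaultdict(lambda: [0, 0])
    for name, op, count in events:
        totals[name][0 if op == "read" else 1] += count
    return {name: (r, w) for name, (r, w) in totals.items()}

def avg_path(n):
    """Average path length of an unsuccessful binary search tree lookup."""
    if n <= 1:
        return 0.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Recursively split the points on a random feature at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, point, depth=0):
    """Depth at which the point lands, plus a correction for unsplit leaves."""
    if tree[0] == "leaf":
        return depth + avg_path(tree[1])
    _, dim, split, left, right = tree
    child = left if point[dim] < split else right
    return path_length(child, point, depth + 1)

def anomaly_scores(groups, n_trees=200, seed=0):
    """Score each process in (0, 1); higher means easier to isolate."""
    rng = random.Random(seed)
    points = list(groups.values())
    max_depth = math.ceil(math.log2(len(points)))
    trees = [build_tree(points, 0, max_depth, rng) for _ in range(n_trees)]
    norm = avg_path(len(points))
    return {
        name: 2.0 ** (-sum(path_length(t, p) for t in trees) / (n_trees * norm))
        for name, p in groups.items()
    }

# Hypothetical event log: (process name, operation, count). Not real data.
events = [
    ("chrome.exe", "read", 120), ("chrome.exe", "write", 80),
    ("chrome.exe", "read", 30),
    ("explorer.exe", "read", 90), ("explorer.exe", "write", 60),
    ("svchost.exe", "read", 110), ("svchost.exe", "write", 95),
    ("notepad.exe", "read", 40), ("notepad.exe", "write", 35),
    ("python.exe", "read", 130), ("python.exe", "write", 100),
    ("winword.exe", "read", 85), ("winword.exe", "write", 70),
    ("outlook.exe", "read", 105), ("outlook.exe", "write", 75),
    ("suspicious.exe", "read", 5000), ("suspicious.exe", "write", 9000),
]

groups = group_events(events)
scores = anomaly_scores(groups)
top = max(scores, key=scores.get)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name:16s} {scores[name]:.3f}")
```

Because each group is keyed by process name, the highest-scoring entry directly names the process that stands out, which is the property the method relies on to single out the source of an abnormal state.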
Results. The developed method has been implemented and studied in solving the problem of identifying anomalies in the functioning of computer systems.
Conclusions. The experiments confirmed the efficiency of the proposed method. This allows the method to be recommended for practical use, both to improve the efficiency of identifying the computer system state and as an express method. Further research may address the creation of an ensemble of fuzzy trees based on the proposed method and the optimization of its software implementation.
License
Copyright (c) 2021 С. Ю. Гавриленко, І. В. Шевердін
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.