SIMPLE, FAST AND SCALABLE RECOMMENDATION SYSTEMS VIA EXTERNAL KNOWLEDGE DISTILLATION
DOI: https://doi.org/10.15588/1607-3274-2025-3-12
Keywords: knowledge distillation, knowledge graphs, decoder-only models, node embeddings, transformer models, attention mechanism, recurrent neural networks, long short-term memory networks, deep neural networks, personalized sequential recommendations, predicting the next most relevant product, user modeling
Abstract
Context. Recommendation systems are important tools that help modern businesses generate additional revenue by proposing relevant goods to clients and by building customer loyalty. With the emergence of deep learning and the evolution of hardware capabilities, it became possible to capture customer behavioral patterns in a data-driven way. However, prediction accuracy depends on system complexity, and more complex models respond more slowly. The object of the study is the task of issuing sequential recommendations, namely predicting the next most relevant product, subject to restrictions on system response time.
Objective. The goal of the research is to synthesize a deep neural network that can retrieve relevant items for large numbers of users with minimal delay.
Method. The proposed method for building recommendation systems combines attention-based deep learning architectures with knowledge graphs, which enhance prediction quality by explicitly enriching the pool of recommendation candidates. The method demonstrates the benefits of decoder-only models and the knowledge distillation framework: the distilled student model shows strong performance on the recommendation retrieval task while responding quickly when processing large batches of users.
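The knowledge distillation objective mentioned above can be sketched as follows. This is an illustrative minimal version of the standard soft-target distillation loss (Hinton et al.), not the paper's actual implementation; the temperature, blending weight `alpha`, and variable names are assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over item logits."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, true_item,
                      temperature=2.0, alpha=0.5):
    """Blend soft-target cross-entropy (teacher -> student) with
    hard-target cross-entropy against the observed next item.

    student_logits, teacher_logits: (num_items,) scores for one user sequence.
    true_item: index of the item the user actually interacted with next.
    """
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature) + 1e-12)
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    soft_loss = -np.sum(p_teacher * log_p_student) * temperature ** 2
    hard_loss = -np.log(softmax(student_logits)[true_item] + 1e-12)
    return alpha * soft_loss + (1 - alpha) * hard_loss
```

A heavy teacher (e.g., a graph-enriched attention model) scores the candidate pool offline; the lightweight student is then trained on this blended loss so that it can serve retrieval requests alone at inference time.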
Results. A recommender system model and a method for its training are proposed, combining the knowledge distillation paradigm with learning on knowledge graphs. The proposed method was implemented as a two-tower deep neural network that solves the recommendation retrieval problem. A system for predicting the most relevant proposals for the user has been built; it includes the proposed model and its training method, as well as the ranking metrics MAP@k and NDCG@k for assessing model quality. A program implementing the proposed recommendation system architecture has been developed and used to study the problem of issuing the most relevant proposals. Experiments on a large volume of real data from user visits to an online retail store showed that the proposed method yields highly relevant recommendations while remaining fast and undemanding of computing resources at inference time.
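The evaluation metrics named above, MAP@k and NDCG@k, are standard ranking measures and can be computed per user as sketched below (binary relevance is assumed here for simplicity; the paper's exact evaluation protocol may differ).

```python
import numpy as np

def ndcg_at_k(recommended, relevant, k=10):
    """NDCG@k for one user.

    recommended: ranked list of item ids produced by the model.
    relevant: set of item ids the user actually interacted with.
    """
    dcg = sum(1.0 / np.log2(i + 2)
              for i, item in enumerate(recommended[:k]) if item in relevant)
    ideal_hits = min(len(relevant), k)  # best case: all hits ranked first
    idcg = sum(1.0 / np.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

def map_at_k(recommended, relevant, k=10):
    """Average precision@k for one user; averaging over users gives MAP@k."""
    hits, score = 0, 0.0
    for i, item in enumerate(recommended[:k]):
        if item in relevant:
            hits += 1
            score += hits / (i + 1)  # precision at each hit position
    return score / min(len(relevant), k) if relevant else 0.0
```

Both metrics reward placing relevant items near the top of the list, which matches the retrieval setting where only the first few recommendations are shown to the user.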
Conclusions. The series of conducted experiments confirmed that the proposed system solves the problem effectively and in a short time, which is a strong argument in favor of its use in real conditions by large businesses that handle millions of visits per month and thousands of products. Prospects for further research on this topic include the use of other knowledge distillation methods, such as internal distillation or self-distillation, the use of deep learning architectures other than the attention mechanism, and the optimization of embedding vector storage.
Copyright (c) 2025 D. V. Androsov, N. I. Nedashkovskaya

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.