SONGS CONTINUATION GENERATION TECHNOLOGY BASED ON TEXT GENERATION STRATEGIES, TEXT MINING AND LANGUAGE MODEL T5
DOI: https://doi.org/10.15588/1607-3274-2023-4-15
Keywords: text generation, T5 language model, Transformers, author’s style, Contrastive search, Top-p sampling, Top-k sampling, Multinomial sampling, Beam search, Diverse beam search, Greedy search, Beam-search multinomial sampling
Abstract
Context. Pre-trained large language models are currently the driving force behind progress not only in NLP but in deep learning systems in general. Transformer models can address a very wide range of tasks, provided that the relevant data requirements and training practices are met. Words, sentences and texts remain the basic and most important means of communication between people, and speech and text are used to convey emotions, events and experiences. Songs with lyrics are one of the main ways language is used to express emotion. However, constraints such as rhyme, the metre of verse lines and song structure often force artists to repeat lines in the lyrics, and the process of writing lyrics can be lengthy.
Objective of the study is to develop an information technology for generating continuations of song lyrics based on the T5 machine learning model, both with (SA, specific author) and without (NSA, non-specific author) consideration of the author's style.
Method. The choice of decoding strategy is important for the generation process. Instead of favoring a single strategy, the system supports multiple strategies, namely the following eight: Contrastive search, Top-p sampling, Top-k sampling, Multinomial sampling, Beam search, Diverse beam search, Greedy search, and Beam-search multinomial sampling.
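The paper itself contains no code, but the eight strategies listed above correspond to standard argument combinations of the generate() method in the Hugging Face transformers library. The sketch below is illustrative only: the model name "t5-small", the prompt, and all numeric values (top_k, top_p, penalty_alpha, beam counts) are assumptions standing in for the authors' fine-tuned SA/NSA checkpoints and their actual settings.

```python
# Minimal sketch: mapping the eight decoding strategies onto generate() arguments.
# "t5-small" is a placeholder for the fine-tuned lyrics-continuation model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "t5-small"  # assumption: stand-in for the SA/NSA checkpoints
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

prompt = "continue the song: I walked alone along the empty street"  # invented example
inputs = tokenizer(prompt, return_tensors="pt")

# One keyword-argument set per decoding strategy named in the Method section.
strategies = {
    "greedy_search":        dict(do_sample=False, num_beams=1),
    "contrastive_search":   dict(penalty_alpha=0.6, top_k=4),
    "multinomial_sampling": dict(do_sample=True, num_beams=1, top_k=0),
    "top_k_sampling":       dict(do_sample=True, top_k=50),
    "top_p_sampling":       dict(do_sample=True, top_p=0.92, top_k=0),
    "beam_search":          dict(do_sample=False, num_beams=5),
    "beam_multinomial":     dict(do_sample=True, num_beams=5),
    "diverse_beam_search":  dict(num_beams=6, num_beam_groups=3,
                                 diversity_penalty=1.0, do_sample=False),
}

for name, kwargs in strategies.items():
    output_ids = model.generate(**inputs, max_new_tokens=48, **kwargs)
    print(name, "->", tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

With an untuned base checkpoint the continuations will be of low quality; the point of the sketch is only to show that all eight strategies can be switched by changing generate() arguments while the model and prompt stay fixed.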
Results. A machine learning model was developed to generate the continuation of song lyrics using large language models, in particular the T5 model, to accelerate, complement and increase the flexibility of the songwriting process.
Conclusions. The created model shows excellent results in generating continuations of song lyrics on test data. Analysis of the raw data showed that the NSA model produces results that degrade less, while the SA model requires balancing the amount of text per author. Several text metrics, namely BLEU, ROUGE-L and ROUGE-N, are calculated to quantitatively compare the results of the models and generation strategies. The BLEU metric is the most variable, changing significantly depending on the strategy, whereas the ROUGE metrics show less variability and a narrower range of values. For comparison, 8 different decoding methods for text generation, supported by the transformers library, were used. Across all the text-comparison results, the metrically best method of song lyric generation is beam search and its variations, in particular beam sampling. Contrastive search usually outperformed the conventional greedy approach. The top-p and top-k methods are not clearly superior to each other and gave different results in different situations.
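The abstract does not state which BLEU/ROUGE implementation was used for the metric comparison. A minimal sketch, assuming the Hugging Face evaluate package and invented prediction/reference strings, of how the reported metrics can be computed:

```python
# Minimal sketch: computing BLEU, ROUGE-N and ROUGE-L for a generated
# continuation against the reference lyrics (strings are invented examples).
import evaluate

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

predictions = ["and the night keeps calling out my name"]
references = [["and the night keeps calling me by name"]]

bleu_score = bleu.compute(predictions=predictions, references=references)
rouge_score = rouge.compute(predictions=predictions, references=references)

print("BLEU:", bleu_score["bleu"])
print("ROUGE-1:", rouge_score["rouge1"])  # ROUGE-N with N = 1
print("ROUGE-2:", rouge_score["rouge2"])  # ROUGE-N with N = 2
print("ROUGE-L:", rouge_score["rougeL"])
```

Averaging these scores over the test set for each of the eight decoding strategies yields the per-strategy comparison described above.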
References
Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A. N., Kaiser Ł., Polosukhin I. Attention Is All You Need, arXiv. Access mode: https://arxiv.org/abs/1706.03762
Raffel C., Shazeer N., Roberts A., Lee K., Narang S., Matena M., Zhou Y., Li W., Liu P. J. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, arXiv. Access mode: https://arxiv.org/abs/1910.10683
Hugging Face community. Text generation strategies. Access mode: https://huggingface.co/docs/transformers/v4.29.0/en/generation_strategies
von Platen P. How to generate text: using different decoding methods for language generation with Transformers. Access mode: https://huggingface.co/blog/how-to-generate
Hugging Face community. T5. Access mode: https://huggingface.co/docs/transformers/model_doc/t5
Vijayakumar A. K., Cogswell M., Selvaraju R. R., Sun Q., Lee S., Crandall D., Batra D. Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models, arXiv. Access mode: https://arxiv.org/abs/1610.02424
Hugging Face community. T5v1.1. Access mode: https://huggingface.co/docs/transformers/model_doc/t5v1.1
Lacoste A., Luccioni A., Schmidt V., Dandres T. Quantifying the Carbon Emissions of Machine Learning, arXiv. Access mode: https://arxiv.org/abs/1910.09700
Google. SentencePiece. Access mode: https://github.com/google/sentencepiece
Sennrich R. Subword Neural Machine Translation. Access mode: https://github.com/rsennrich/subword-nmt
Wu Y., Schuster M., Chen Z., Le Q. V., Norouzi M., Macherey W., Krikun M., Cao Y., Gao Q., Macherey K., Klingner J., Shah A., Johnson M., Liu X., Kaiser Ł., Gouws S., Kato Y., Kudo T., Kazawa H., Stevens K., Kurian G., Patil N., Wang W. , Young C., Smith J., Riesa J., Rudnick A., Vinyals O. , Corrado G., Hughes M., Dean J. Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, arXiv. Access mode: https://arxiv.org/abs/1609.08144
Hugging Face community. Transformers. State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX. Access mode: https://huggingface.co/docs/transformers/index
Abadi M., Agarwal A., Barham P., Brevdo E., Chen Z., Citro C., Corrado G. S., Davis A. , Dean J., Devin M., Ghemawat S., Goodfellow I., Harp A., Irving G., Isard M., Jia Y., Jozefowicz R., Kaiser L., Kudlur M., Levenberg J., Mane D., Monga R., Moore S., Murray D., Olah C., Schuster M., Shlens J., Steiner B., Sutskever I., Talwar K., Tucker P., Vanhoucke V., Vasudevan V., Viegas F., Vinyals O., Warden P., Wattenberg M., Wicke M., Yu Y., Zheng X. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems, arXiv. Access mode: https://arxiv.org/abs/1603.04467
Shah D. Song Lyrics Dataset, Kaggle. Access mode: https://www.kaggle.com/datasets/deepshah16/song-lyrics-dataset
Swift T., Dessner A. Long story short, Genius. Taylor Swift Music. Access mode: https://genius.com/Taylor-swift-long-story-short-lyrics
Fan A., Lewis M., Dauphin Y. Hierarchical Neural Story Generation, Association for Computational Linguistics : 56th Annual Meeting, Melbourne, Australia, July 2018 : proceedings. Melbourne, ACL, 2018, pp. 889–898. DOI: 10.18653/v1/p18-1082
Chiang T.-R., Chen Y.-N. Relating Neural Text Degeneration to Exposure Bias, Analyzing and Interpreting Neural Networks for NLP : the Fourth BlackboxNLP Workshop, Punta Cana, Dominican Republic, November 2021 : proceedings. Punta Cana, ACL, 2021, pp. 228–239. DOI: 10.18653/v1/2021.blackboxnlp-1.16
Su Y., Lan T., Wang Y., Yogatama D., Kong L., Collier N. A Contrastive Framework for Neural Text Generation, arXiv. Access mode: https://arxiv.org/abs/2202.06417
Paulus R., Xiong C., Socher R. A Deep Reinforced Model for Abstractive Summarization, arXiv. Access mode: https://arxiv.org/abs/1705.04304
Klein G., Kim Y., Deng Y., Senellart J., Rush A. OpenNMT: Open-Source Toolkit for Neural Machine Translation, System Demonstrations : Association for Computational Linguistics, Vancouver, Canada, July 2017 : proceedings. Vancouver, ACL, 2017, pp. 67–72. DOI: 10.18653/v1/p17-4012
Murray K., Chiang D. Correcting Length Bias in Neural Machine Translation, arXiv. Access mode: https://arxiv.org/abs/1808.10006
Mathur N., Baldwin T., Cohn T. Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics, Association for Computational Linguistics : 58th Annual Meeting, Online, July 2020 : proceedings. Online, ACL, 2020, pp. 4984–4997. DOI: 10.18653/v1/2020.acl-main.448
Lin C.-Y. ROUGE: A Package for Automatic Evaluation of Summaries, Text Summarization Branches Out : Association for Computational Linguistics, Barcelona, Spain, July 2004 : proceedings. Barcelona, ACL, 2004, pp. 74–81. Access mode: https://aclanthology.org/W04-1013
KerasNLP. Access mode: https://keras.io/keras_nlp/
Prokipchuk O., Vysotska V. Ukrainian Language Tweets Analysis Technology for Public Opinion Dynamics Change Prediction Based on Machine Learning, Radio Electronics, Computer Science, Control, 2023, No. 2(63), pp. 103–116. DOI: 10.15588/1607-3274-2023-2-11
License
Copyright (c) 2024 О. О. Медяков, В. А. Висоцька
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Creative Commons Licensing Notifications in the Copyright Notices
The journal allows the authors to hold the copyright without restrictions and to retain publishing rights without restrictions.
The journal allows readers to read, download, copy, distribute, print, search, or link to the full texts of its articles.
The journal allows reuse and remixing of its content, in accordance with a Creative Commons license CC BY-SA.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License CC BY-SA that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.