UDC 004.032.26
DEVELOPMENT AND APPLICATION OF METHODS FOR RECOGNIZING NOISY AUDIO FILES USING NEURAL NETWORK TECHNIQUES
Yu. L. Leokhin, Dr. in technical sciences, full professor, deputy rector for scientific work, MTUCI, Moscow, Russia;
orcid.org/0000-0003-3321-4497
T. D. Fatkhulin, PhD (in technical sciences), associate professor, department of MC and IT, Moscow, Russia; orcid.org/0000-0003-0998-1055
M. V. Mentus, student, MTUCI, Moscow, Russia;
orcid.org/0009-0005-8300-6954
The problem of speech recognition in the presence of extraneous noise of various origins is considered. The aim of the work is to develop methods for recognizing noisy speech using neural network techniques and to evaluate their effectiveness. The work is relevant because the development of neural network techniques has made speech recognition much simpler and more efficient across a significantly wider range of industries. The software solutions “Whisper” and “Vosk”, which transcribe (recognize) speech, are considered. A classification of audio noise is given, and existing methods of dealing with it are described. The influence of noise on the training of a speech recognition system is shown. Methods for training a speech recognition system on a synthetically generated noisy dataset have been developed. A module for adding noise to data was designed and implemented, and a test bench was assembled. The results of testing the developed methods are given. Finally, the results of analyzing the experimental data are presented and conclusions are drawn.
Key words: speech recognition systems, speech synthesis systems, neural network training, noise, efficiency, Word Error Rate, dataset generation, noisy data.