UDC 007:681.512.2
EXTRACTING FACTS FROM NATURAL LANGUAGE TEXTS BY THE METHOD OF SEMANTIC PATTERN UNIFICATION
I. Yu. Kashirin, Dr. Sc. (Tech), Professor, Department of Computational and Applied Mathematics, RSREU, Ryazan, Russia;
orcid.org/0000-0003-1694-7410
An original technology for designing and applying semantic patterns to process natural language constructions is considered. The method of unifying semantic patterns, called i-patterns, is described constructively. The technology uses tuples of words formed from various knowledge base relationships and extracts concise facts from complex mass media sentences. An end-to-end example of a software implementation in the Python v.3.10 and Anaconda v.2.1 environments is considered. The implementation uses the external software libraries SpaCy, WordNet, RuWordNet, Wiki-ru-WordNet, FrameNet, stanza, and Yargy, as well as a search retriever and Python i-patterns with an original unification algorithm developed by the author of the article. The effectiveness of the presented technology is confirmed by a series of practical experiments on the problem of accumulating a training corpus for BERT neural network language models. The results of the study will be useful in classifying media materials as reliable or false. The aim of the work is to present to experts in the AI field a new intelligent method of unifying semantic patterns for extracting concise facts from complex political articles.
Key words: BERT models, fact extraction, semantic patterns, retrievers, political news, natural language analysis, deep learning models.
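As an illustration of the general idea of unifying a pattern with a tuple of words, a minimal Python sketch is given below. It is not the author's i-pattern algorithm, which the abstract does not specify; the function `unify`, the `?name` variable convention, the use of a set of alternatives as a stand-in for a knowledge base relationship (for example, a synonym set), and the sample pattern are all assumptions made for this example only.

```python
from typing import Optional

Bindings = dict[str, str]


def unify(pattern: tuple, words: tuple[str, ...]) -> Optional[Bindings]:
    """Match a pattern tuple against a word tuple of the same length.

    Returns the variable bindings on success and None on failure.
    """
    if len(pattern) != len(words):
        return None
    bindings: Bindings = {}
    for slot, word in zip(pattern, words):
        if isinstance(slot, str) and slot.startswith("?"):
            # Variable slot: bind on first use, require consistency afterwards.
            if bindings.setdefault(slot, word) != word:
                return None
        elif isinstance(slot, (set, frozenset)):
            # Alternative slot: any listed word is accepted
            # (hypothetical stand-in for a synonym set from a knowledge base).
            if word not in slot:
                return None
        elif slot != word:
            # Literal slot: must match exactly.
            return None
    return bindings


# Hypothetical pattern: <subject> {signed | approved | ratified} <object>
PATTERN = ("?subject", frozenset({"signed", "approved", "ratified"}), "?object")

fact = unify(PATTERN, ("parliament", "ratified", "treaty"))
no_fact = unify(PATTERN, ("parliament", "rejected", "treaty"))
```

Under these assumptions, a successful match yields the bindings of the variable slots, which can be read as a concise fact, while a word tuple that does not fit the pattern yields `None`.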