UDC 004.8
DATA MINING AND DATABASE REENGINEERING: THE ALGORITHM OF RELATIONAL DATABASES ATTRIBUTES CLUSTERING
A. I. Baranchikov, Dr. in technical sciences, Professor at the Department of computing machines RSREU, Ryazan, Russia;
orcid.org/0000-0003-4133-7489
E. B. Fedosova, postgraduate student, RSREU, Ryazan, Russia;
orcid.org/0009-0006-1413-9910
The article proposes a solution to one of the problems of relational database reengineering, namely combining specialized attributes into semantic groups (clusters). Data Mining methods are proposed for solving this problem. The aim of the work is to develop an algorithm for clustering relational database attributes. The Cluster_Define algorithm divides the existing attributes into clusters whose attributes are similar in structure and semantics. The algorithm uses elements of cluster analysis and the k-means algorithm. The Silhouette Method is proposed for selecting the optimal number of clusters. The simplest metric that can be used in the body of the k-means clustering algorithm is the Euclidean distance.
Key words: Data Mining, relational databases, clustering, attributes, k-means, cluster, reengineering, cluster analysis.
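
The combination named in the abstract (k-means with Euclidean distance, with the number of clusters chosen by the Silhouette Method) can be illustrated with a minimal sketch. This is not the Cluster_Define algorithm itself, whose steps are not given here. The sketch assumes each attribute has already been encoded as a numeric feature vector; the toy `points` data, the function names, and the deterministic farthest-first choice of initial centroids are all assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=100):
    """Plain k-means; returns one cluster label per point.

    Assumption: centroids are seeded farthest-first (deterministic)
    instead of randomly, so that the sketch is reproducible.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(euclidean(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (0.0 if fewer than 2 clusters)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(p, q) for q in other) / len(other)
            for m, other in clusters.items()
            if m != l
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, k_values):
    """Return the k with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette_score(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores


# Hypothetical 2-D feature vectors for twelve attributes (illustrative only).
points = [
    (0.0, 0.1), (0.1, 0.0), (0.1, 0.2), (0.2, 0.1),
    (5.0, 5.1), (5.1, 5.0), (5.2, 5.2), (5.0, 4.9),
    (9.0, 0.1), (9.1, 0.0), (9.2, 0.2), (8.9, 0.1),
]
best, scores = choose_k(points, range(2, 6))
print(best, scores)
```

In practice the feature encoding of attributes (for example data type, length, or name similarity) determines how meaningful the Euclidean distance is, so features would normally be scaled to comparable ranges before clustering.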
