RSREU web portal: https://rsreu.ru

UDC 004.891

AUTOMATED PROGRAM TEXT ANALYSIS USING REPRESENTATIONS BASED ON MARKOV CHAINS AND EXTREME LEARNING MACHINES

A. V. Gorchakov, post-graduate student, Department of Corporate Information Systems, Institute of Information Technologies, MIREA – Russian Technological University, Moscow, Russia;

orcid.org/0000-0003-1977-8165

L. A. Demidova, Dr. Sc. (Tech.), Full Professor, Department of Corporate Information Systems, Institute of Information Technologies, MIREA – Russian Technological University, Moscow, Russia;

orcid.org/0000-0003-4516-3746

P. N. Sovietov, Ph.D. (Tech.), Associate Professor, Department of Corporate Information Systems, Institute of Information Technologies, MIREA – Russian Technological University, Moscow, Russia;

orcid.org/0000-0002-1039-2429

The digitalization of the economy increases the demand for software developers and, as a result, turns programming courses into massive ones. The aim of this work is to develop a module that analyzes solutions to automatically generated unique programming tasks in the Digital Teaching Assistant (DTA) system, which automates a massive Python programming course at RTU MIREA. To vectorize a program text, it is proposed to build an abstract syntax tree and then convert the resulting tree into a Markov chain. To classify the vector representations of program texts, it is proposed to use an extreme learning machine, a computationally efficient artificial neural network architecture. The data set is labeled by a hierarchical clustering algorithm. The developed module automates, in real time, the identification of the methods used to solve automatically generated tasks in programs submitted to the DTA. Programming instructors can use the obtained information about solution methods during the semester to identify gaps in students' knowledge and skills. Statistics obtained from the classifiers of vector representations of program texts are reported to students through the web interface of the DTA system.
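The vectorization step described above can be illustrated with a minimal Python sketch. It assumes that the states of the Markov chain are AST node type names and that transitions run from a parent node to each of its children; the abstract does not specify these details, so the authors' construction may differ. The function names `markov_chain_from_source` and `to_vector` are hypothetical and introduced only for illustration.

```python
import ast
from collections import Counter, defaultdict


def markov_chain_from_source(source: str) -> dict:
    """Map program text to transition probabilities between AST node types.

    Assumption: a transition goes from a parent node type to a child node type.
    """
    tree = ast.parse(source)
    counts = defaultdict(Counter)
    # Count parent -> child transitions between node type names.
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            counts[type(parent).__name__][type(child).__name__] += 1
    # Normalize each row so that outgoing probabilities sum to 1.
    chain = {}
    for src, row in counts.items():
        total = sum(row.values())
        chain[src] = {dst: n / total for dst, n in row.items()}
    return chain


def to_vector(chain: dict, states: list) -> list:
    """Flatten the chain into a fixed-length vector over a fixed state list."""
    return [chain.get(a, {}).get(b, 0.0) for a in states for b in states]


example = "def f(x):\n    return x + 1\n"
chain = markov_chain_from_source(example)
vector = to_vector(chain, ["Module", "FunctionDef"])
```

Fixing the list of states in advance gives every program a vector of the same length, which is what a classifier such as an extreme learning machine needs as input.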

Keywords: classification of program texts, code analysis, classification algorithm, artificial neural network, extreme learning machine, abstract syntax trees.
