Analysis of decompiled program code using abstract syntax trees
Authors:
Abstract:
The paper proposes a method for preprocessing fragments of binary code so that their similarity can be detected with machine learning algorithms. The method is based on the analysis of pseudocode obtained by decompiling the binary. The pseudocode is preprocessed using attributed abstract syntax trees. Evaluation of the method indicates that it is effective for binary code similarity detection, owing to the semantic vectors used to modify the abstract syntax trees.
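
As a rough illustration of the general idea only, the following sketch builds a toy abstract syntax tree, attaches a vector to every node, and compares two fragments by cosine similarity. Everything in it is an assumption made for illustration: the node kinds in `KINDS`, the `Node` class, and the choice of a subtree node-kind count as the "semantic vector" are hypothetical and are not the attributes or vectors used in the paper.

```python
import math
from dataclasses import dataclass, field

# Hypothetical node kinds for a toy pseudocode AST (not taken from the paper).
KINDS = ["block", "if", "while", "assign", "call", "binop", "var", "num", "return"]


@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)
    vector: list = field(default_factory=list)  # attribute filled in by annotate()


def annotate(node):
    """Attach to each node a vector counting the node kinds in its subtree."""
    vec = [0] * len(KINDS)
    vec[KINDS.index(node.kind)] = 1
    for child in node.children:
        child_vec = annotate(child)
        vec = [a + b for a, b in zip(vec, child_vec)]
    node.vector = vec
    return vec


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def make_loop(extra_assign=False):
    """Toy fragment: a while loop that updates a variable, then returns it."""
    body = [
        Node("binop", [Node("var"), Node("num")]),  # loop condition
        Node("assign", [Node("var"), Node("binop", [Node("var"), Node("num")])]),
    ]
    if extra_assign:
        body.append(Node("assign", [Node("var"), Node("num")]))
    return Node("block", [Node("while", body), Node("return", [Node("var")])])


loop_a = make_loop()
loop_b = make_loop(extra_assign=True)  # structurally close to loop_a
call_c = Node("block", [Node("call", [Node("var")]), Node("return", [Node("num")])])

sim_ab = cosine(annotate(loop_a), annotate(loop_b))
sim_ac = cosine(annotate(loop_a), annotate(call_c))
print(f"loop_a vs loop_b: {sim_ab:.3f}")
print(f"loop_a vs call_c: {sim_ac:.3f}")
```

In this toy setting the two loop fragments score higher than the loop and the call fragment. A realistic pipeline would take its trees from a decompiler and would likely use learned vectors rather than plain counts.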