Building a semantic space of intentions using generative pre-trained models for solving the spam filtering task

doi:10.48612/jisp/rk44-9aab-nxha

Network and telecommunication security

Authors:

Zhukov I. Yu., Balashova E. E., Mandrov A. P., Kravchenko N. D.

Abstract:

Text vectorization is one of the key elements of spam message filtering. This article proposes a vectorization approach based on matching a text to pairs of intentions. A list of intention pairs was compiled, and a synthetic dataset of text utterances was generated. A neural network was designed and trained to estimate the degree to which each intention is expressed in the text supplied at the model input. The developed method was tested on the spam filtering task using logistic regression on the Enron dataset and an SMS dataset.
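
The pipeline described in the abstract (text, then a vector of intention scores, then logistic regression) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the intention pairs, the keyword cues, and the function names `intention_vector`, `train_logreg`, and `predict` are hypothetical, and the keyword scorer is only a stand-in for the trained neural network, whose architecture and intention list are not given in the abstract.

```python
import math

# Hypothetical intention pairs; the article's actual list is not given here.
INTENTION_PAIRS = [("urge", "inform"), ("sell", "ask")]

# Toy keyword cues standing in for the trained neural network (assumption).
CUES = {
    "urge": {"now", "urgent", "hurry"},
    "inform": {"meeting", "report", "schedule"},
    "sell": {"free", "offer", "prize"},
    "ask": {"could", "please", "when"},
}


def intention_vector(text):
    """Map a text to one score in [0, 1] per intention (placeholder scorer)."""
    words = set(text.lower().split())
    vector = []
    for pair in INTENTION_PAIRS:
        for intention in pair:
            hits = len(words & CUES[intention])
            vector.append(min(1.0, hits / 2))
    return vector


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log-loss with respect to the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(w, b, text):
    """Return True if the text is classified as spam."""
    x = intention_vector(text)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5


# Tiny made-up training set: 1 = spam, 0 = legitimate message.
texts = [
    "free prize claim now",
    "urgent offer hurry now",
    "could you send the report please",
    "meeting schedule for the report",
]
labels = [1, 1, 0, 0]

w, b = train_logreg([intention_vector(t) for t in texts], labels)
print(predict(w, b, "hurry free offer"))
print(predict(w, b, "please schedule the meeting"))
```

In this sketch the classifier never sees the words themselves, only the intention scores, which is the property the proposed vectorization relies on; in the article that role is played by the trained network rather than by keyword matching.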