Sparse descriptors for whole graph embedding and dictionary based feature ranking

Publisher

Montana State University - Bozeman, College of Engineering

Abstract

Graphs have gained wide popularity as a data representation in many applications. Unfortunately, most data processing techniques cannot be applied directly to a graph structure. Therefore, graph embedding methods are frequently used to convert graphs to vectors. While such methods are essential in standard data processing pipelines, they often result in complicated, nonlinear, and high-dimensional mappings. The goal of this dissertation is to apply sparse dictionary learning techniques to graph embedding. In contrast to traditional graph embedding methods, sparse representations are linear by design. This linearity also makes the representation easier to interpret, since the building blocks of a sparse dictionary are directly related to the input space. Despite the potential advantages of sparse processing and the ubiquity of sparsity in other signal processing domains, its applications in graph embedding are not well studied. This dissertation consists of three main tasks. First, a novel sparse graph descriptor algorithm is presented, inspired by the Graph2Vec graph embedding algorithm. Second, sparse representation-based feature-ranking metrics are deployed to identify important subtree structures of the graphs that can be used to define a dictionary. The developed embedding algorithm and feature-ranking metrics are compared to existing graph embedding methods and feature-ranking algorithms on several typical benchmark graph datasets. Finally, these sparse representation-based techniques are applied to control flow graphs of binary files to detect malware, demonstrating the utility of the developed algorithms.
