On the Importance of Embeddings and Attention Mechanisms in Graph Neural Networks

Authors

Zhou, Jieming

Abstract

Many real-world applications involve graph-structured data, e.g., social networks. Unlike images, graphs cannot be represented as regular grids in Euclidean space and processed by Convolutional Neural Networks (CNNs). To solve graph-based problems, Graph Neural Networks (GNNs) have become essential for graph representation learning in tasks such as friend recommendation and solubility prediction. However, GNN embedding generation algorithms still need improvement: they often fail to capture node interactions or preserve graph structure, and they struggle to learn long-range dependencies because they lack expressive attention mechanisms. These shortcomings weaken their ability to perform tasks that require a deep understanding of node relationships. In this thesis, we tackle these issues by improving embedding generation techniques and incorporating novel attention mechanisms to boost the overall performance of GNNs.

In the first part of this thesis, we focus on improving embedding generation algorithms for GNNs. We introduce the Feature cOrrelation aGgregation (FOG) module to address the issue that most GNNs rely only on first-order information and lack higher-order information during message passing. FOG captures second-order feature correlations between nodes and their neighbours, allowing GNNs to generate more discriminative embeddings. As a result, our design significantly enhances GNN performance on node classification, edge classification, and graph regression tasks. We also design the Global Positional Encoding Network (GPEN), which provides advanced structural embeddings for GNNs. Current local-distance-aware methods (e.g., DE-GNN) only capture structural information within subgraphs, resulting in ambiguity when comparing subgraphs with identical structures.
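The abstract does not spell out FOG's exact formulation, but the general idea of aggregating second-order feature correlations, rather than only first-order neighbour summaries, can be sketched roughly as follows. The outer-product form, the mean neighbour summary, and the projection `w` are illustrative assumptions for this sketch, not the thesis's actual design:

```python
import numpy as np

def fog_style_aggregate(x_center, x_neighbors, w):
    """Second-order aggregation sketch: correlate the centre node's
    features with the mean of its neighbours' features via an outer
    product, then project the flattened correlations."""
    h_nbr = x_neighbors.mean(axis=0)      # first-order summary, shape (d,)
    corr = np.outer(x_center, h_nbr)      # second-order correlations, shape (d, d)
    return np.tanh(corr.reshape(-1) @ w)  # projected embedding, shape (d_out,)

rng = np.random.default_rng(0)
d, d_out = 4, 8
x_c = rng.normal(size=d)                  # centre-node features
x_n = rng.normal(size=(3, d))             # three neighbours' features
w = rng.normal(size=(d * d, d_out))       # learned projection (random here)
emb = fog_style_aggregate(x_c, x_n, w)
print(emb.shape)                          # (8,)
```

Unlike a plain mean or sum, the outer product exposes every pairwise interaction between centre and neighbour feature dimensions, which is one simple way to inject higher-order information into message passing.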
In contrast, our GPEN encodes global positional information: it captures relative distances between nodes and uses a referential node set to establish a global positional embedding space, improving generalisation across graphs with varying levels of homophily.

In the second part of this thesis, we delve into attention mechanisms and their application to GNNs. We begin by proposing the Cross-Correlated Attention Network (CCAN) for person re-identification in the image domain, addressing the limitation of previous attention designs that fail to capture the inherent interdependencies between attended features. To enhance feature representation in complex visual environments, we leverage the spatial relationships between attention regions in two separate feature maps. Although not directly applicable to GNNs, the concepts developed provide valuable insights into attention mechanisms that could be adapted for graph-based tasks. Finally, we develop a diffusion model, Diff3DHPE, built on a spatial-temporal transformer backbone for 3D human pose estimation (3D HPE). Transformers have proven useful in modelling GNNs; thus, current state-of-the-art approaches convert the 3D HPE task into a graph-based problem and use transformers to attend to long-range dependencies in the spatial and temporal dimensions. Although they achieve significantly higher accuracy, their performance is still limited by imperfect 2D keypoint inputs. In contrast, our design improves overall accuracy via the reverse diffusion process, which refines the 3D pose estimates. Moreover, we modify the attention module in the transformer into a band-pass filter that effectively relieves the over-smoothing issue introduced by deep GNNs in practice.

In summary, we introduce novel methods for improving embedding generation in GNNs and present advanced attention mechanisms that expand the scope of GNNs to more complex tasks.
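As a rough illustration of anchoring nodes to a shared referential set, one could encode each node by its hop distances to a few fixed anchor nodes, so that all nodes receive coordinates in the same global positional space. Everything here (`global_positional_encoding`, BFS hop counts, the anchor choice) is an illustrative assumption rather than GPEN's actual construction:

```python
from collections import deque
import numpy as np

def bfs_distances(adj, src):
    """Hop counts from src to every reachable node in an unweighted graph."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def global_positional_encoding(adj, anchors):
    """Encode each node by its hop distances to a shared referential
    (anchor) node set; unreachable nodes keep infinity."""
    enc = np.full((len(adj), len(anchors)), np.inf)
    for j, a in enumerate(anchors):
        for v, d in bfs_distances(adj, a).items():
            enc[v, j] = d
    return enc

# A five-node path graph 0-1-2-3-4 with referential set {0, 4}:
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
enc = global_positional_encoding(adj, anchors=[0, 4])
print(enc)  # row i is node i's distances to anchors 0 and 4
```

Because every node is measured against the same anchors, two structurally identical subgraphs in different parts of the graph receive different encodings, which is the kind of ambiguity a global scheme resolves relative to purely local distance encodings.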
Through comprehensive experiments, we demonstrate that our contributions provide a strong foundation for enhancing GNN designs across diverse domains and offer inspiration for future work.
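The band-pass attention idea mentioned above can be sketched generically: standard softmax attention is row-stochastic and acts as a low-pass (smoothing) operator, so subtracting a scaled identity pass-through attenuates the constant component and leaves mid-frequency variation. The specific form below (`bandpass_attention`, the `alpha` coefficient) is an illustrative assumption, not Diff3DHPE's actual filter design:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def bandpass_attention(x, wq, wk, alpha=0.5):
    """Self-attention whose low-pass (smoothing) output is re-centred by
    subtracting a scaled identity term, damping the constant component
    that drives over-smoothing in deep stacks."""
    q, k = x @ wq, x @ wk
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # row-stochastic: low-pass
    return (attn - alpha * np.eye(len(x))) @ x      # low-pass minus pass-through

rng = np.random.default_rng(1)
x = rng.normal(size=(6, 4))   # six tokens (e.g. body joints), feature dim 4
wq = rng.normal(size=(4, 4))
wk = rng.normal(size=(4, 4))
y = bandpass_attention(x, wq, wk)
print(y.shape)                # (6, 4)
```

A quick sanity check of the filtering behaviour: feeding a constant input (the "DC" component that over-smoothed layers collapse towards) yields an output scaled by `1 - alpha`, since row-stochastic attention maps a constant signal to itself.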
