Effective and Scalable Representation Learning for Graphs: From Theory to Applications
Authors
Zhu, Hao
Abstract
Graphs are ubiquitous in the real world, representing a wide range of relational data structures across numerous domains. However, developing effective and scalable machine learning models that can capture the intrinsic properties and inductive biases of graph-structured data remains a significant challenge. Traditional representation learning algorithms for graphs face several limitations, including scalability issues for large-scale graphs, oversmoothing in deep architectures, overreliance on labeled data, and limited applicability to tasks beyond graph domains. This thesis presents a comprehensive body of work that tackles the challenges of representation learning for graphs, focusing on oversmoothing, effectiveness, and scalability. In the first part, Simple Spectral Graph Convolution (S2GC), a variant of Graph Convolutional Networks (GCNs) derived from a modified Markov Diffusion Kernel, is proposed. S2GC has lower computational complexity than classic Graph Neural Network (GNN) architectures while achieving competitive performance across various tasks. Furthermore, I introduce random RangE FInder-based Network Embedding (REFINE), a highly efficient graph embedding approach that combines randomized range finders and S2GC, enabling scalable matrix factorization for large-scale graphs. In the second part, the classic Laplacian Eigenmaps (LE) framework is extended to incorporate contrastive learning principles, resulting in Contrastive Laplacian Eigenmaps (COLES). COLES offers an efficient and scalable closed-form solution for unsupervised graph representation learning on large-scale graphs. Building upon COLES, I propose Generalized Laplacian Eigenmaps (GLEN), a principled framework that learns graph representations by maximizing the rank difference between the total scatter matrix and the within-class scatter matrix, ensuring optimal class separation.
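To make the S2GC propagation rule concrete, the following is a minimal numpy sketch. It assumes the averaged-powers formulation described in the S2GC work: features are smoothed by the first K powers of the symmetrically normalized adjacency (with self-loops), averaged, and combined with a residual term weighted by a hyperparameter alpha. The function name, the dense-matrix representation, and the default values of `k` and `alpha` are illustrative choices, not the thesis implementation.

```python
import numpy as np

def s2gc_propagate(adj, features, k=16, alpha=0.05):
    """Sketch of S2GC-style propagation: average of the first k powers of
    the symmetrically normalized adjacency applied to the features, with a
    residual connection to the input weighted by alpha."""
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    t = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]   # D^-1/2 (A+I) D^-1/2

    out = np.zeros_like(features, dtype=float)
    h = features.astype(float)
    for _ in range(k):
        h = t @ h                                           # T^k X, built iteratively
        out += (1 - alpha) * h + alpha * features           # residual per power
    return out / k                                          # average over the k powers
```

Because there are no learnable weights inside the propagation, this smoothing can be precomputed once and followed by a simple linear classifier, which is the source of the method's scalability.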
These GNN-based architectures achieve substantial performance improvements over traditional methods in various tasks, including node classification and clustering. In the third part, I demonstrate the versatility of graph representation learning by presenting applications across diverse domains. Leveraging COLES, I develop unsupErvised discriminAnt Subspace lEarning (EASE), which learns a discriminative subspace tailored to the given support and query data together with a similarity matrix endowed with a block-diagonal prior, improving transductive few-shot learning performance. Additionally, I propose Prototype-based Label Propagation (ProtoLP), a novel approach that dynamically constructs graphs from features, enabling transductive inference nearly as fast as inductive methods while capturing complex interactions between prototypes and query samples. The research presented in this thesis addresses the fundamental challenges of effective and scalable graph representation learning, providing theoretical insights, novel algorithms, and practical applications that harness the power of graph-structured data across numerous domains.
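ProtoLP builds on the classic family of graph-based label propagation methods. The sketch below illustrates that underlying technique (the closed-form propagation of Zhou et al., Z = (I − αS)⁻¹Y, on a k-NN graph built from features), not ProtoLP itself: the function name, the Gaussian affinity, and the hyperparameters are illustrative assumptions.

```python
import numpy as np

def label_propagation(features, labels, num_classes, alpha=0.9, k=3):
    """Classic transductive label propagation on a feature-built k-NN graph.
    labels: array with the class index for labeled nodes and -1 for unlabeled."""
    n = features.shape[0]
    # pairwise squared distances -> sparse-ish k-NN affinity graph
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    w = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]        # k nearest neighbors, skipping self
        w[i, nbrs] = np.exp(-d2[i, nbrs])        # Gaussian affinity
    w = (w + w.T) / 2                            # symmetrize the graph
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(1) + 1e-12)
    s = d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]  # normalized affinity S

    y = np.zeros((n, num_classes))
    for i, lbl in enumerate(labels):
        if lbl >= 0:
            y[i, lbl] = 1.0                      # one-hot the labeled nodes
    z = np.linalg.solve(np.eye(n) - alpha * s, y)  # closed form Z = (I - aS)^-1 Y
    return z.argmax(1)                           # predicted class per node
```

ProtoLP departs from this baseline by propagating between learned prototypes and query samples rather than over a fixed sample-to-sample graph, which is what allows its inference speed to approach that of inductive methods.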