Exploring Shortest Paths on Large-Scale Networks
Abstract
Networks, which are gaining much popularity in various disciplines, model connec-tions between different real-world entities. Based on graph theory, we gain insightinto a series of problems for understanding networks.
Shortest path is among themost fundamental notions studied in graph theory. By exploring shortest path struc-ture, we can better understand the connections between entities. However, it is oftencomputationally expensive to compute shortest paths on large-scale networks whichare common nowadays.This thesis addresses two problems on large-scale networks in relating to shortestpaths: (1) How are two vertices connected in a graph? We first define the notion ofshortest path graph which contains exactly all shortest paths between two vertices.This notion empowers us to make full use of shortest path structures for analysingthe connection between vertices. We formalize the shortest-path-graph problem andpropose a novel method, called Query-by-Sketch, which can efficiently leverage of-fline labelling to guide online searching through a fast-sketching process for solvingthe shortest-path-graph problem on complex networks. (2) How is one source vertexconnected to other vertices in a graph? Enumerating all shortest paths between a source vertex and all other vertices is less informative due the fact that the number of shortest paths is excessive. In addition to this, coverage centrality fails to measure how influ-ential a vertex is in the information flow from a source vertex. Thus we extend thenotion of coverage centrality to relative coverage, to measure the importance a vertexis w.r.t a source vertex. Then we formalize the top-k relative coverage problem anddevelop an efficient method to answer top-k relative coverage queries in a reducedspace, and design a bit-parallel optimization to speed up the computation of relativecoverage.The thesis also theoretically discusses the complexity of proposed methods, andpresents the experimental results of proposed methods using 12 and 6 real-worlddatasets for two problems respectively, which illustrates the efficiency and scalabilityof proposed metho
Description
Keywords
Citation
Collections
Source
Type
Book Title
Entity type
Access Statement
License Rights
Restricted until
Downloads
File
Description
Thesis Material