Understanding complex systems : network approach to information filtering

Date

2012

Authors

Song, Won-Min

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Motivation: Recently, network theory has emerged as an effective tool to model complex systems by regarding the constituents as vertices, and relevant interactions as edges. Accordingly, the importance to construct representative networks from empirical data sets, which contain meaningful information of the complex systems, has become greater. However, this task is often alluded by: (a) presence of redundancy and heterogeneity, (b) simultaneous presence of two contradicting data structures: clusters and hierarchy, and (c) lack of expertise to interpret the data. In this thesis, I address these problems by utilizing a graph filtering technique called Planar Maximally Filtered Graph (PMFG), a maximal planar graph which preferentially includes edges with higher relevance between vertices, and develop a novel technique which detects meaningful data structures in unsupervised manner. Results: Specifically, I approach these problems in two-stages: (I) by validating planar graphs as adequate platforms for information filtering, and (II) by developing an effective graph-theoretic technique to extract the meaningful and relevant information. In the stage (I), I study hierarchies in maximal planar graphs by means of 3-cliques, that is a complete subgraph that links all three elements, and present a unique hierarchical structure called `bubble hierarchical tree' where bubbles are a special class of subgraphs linked by separating 3-cliques. I report bubble hierarchical trees for Apollonian networks and PMFG computed from a real-world system, and their meaningful organization patterns via the bubble topology. This is extended to a wider class of stochastic/deterministic planar graphs by building them in a bubble-by-bubble fashion. I report some of characteristic features of real-world complex networks such as scale-free degree distributions, degree correlations, and hierarchical organization of these graphs, and I show the capability of planar graphs to be used as platforms for effective information filtering procedures. In the stage (II), I utilize the bubble topology of PMFG to develop a novel clustering technique called Directed Bubble Hierarchical Tree (DBHT), a deterministic technique that extracts meaningful clustering and hierarchical structures simultaneously without any prior information. I apply the DBHT technique to a wide range of data sets: from various synthetic data, Iris classification, to various gene expression data for cancer classifications. I report that the performances of DBHT technique are competitive and/or this method outperforms some of recent state-of-art techniques in these cases. Particularly, I report significant gene clusters found by DBHT technique for gene expression data for various subtypes of Diffuse Large B-Cell Lymphoma, which play pivotal roles in discriminating and/or treating these subtypes. Lastly, I study the applicability of graph filtering techniques to data sets which possess clustering and/or hierarchical structures by utilizing synthetic data sets with adjustable parameters for noises, cluster sizes and hierarchy among the clusters. I explore some of widely used techniques such as Minimum Spanning Tree (MST), PMFG, k-Nearest Neighbors graph (kNN) and complete graph, and objectively evaluate their performances on the synthetic data sets. Interestingly, the results suggest that, filtered graphs with topological constraints (MST, PMFG) are particularly well-suited for detecting meaningful clustering and hierarchical structures. --provided by Candidate.

Description

Keywords

Citation

Source

Type

Thesis (PhD)

Book Title

Entity type

Access Statement

Open Access

License Rights

Restricted until

Downloads