Visualisation of "High p, small n" data

Date

2007

Authors

Pittelkow, Yvonne
Wilson, Susan

Journal Title

Journal ISSN

Volume Title

Publisher

Physica-Verlag GmbH

Abstract

Development of methods for visualisation of high-dimensional data where the number of observations, n, is small compared to the number of variables, p, is of increasing importance. One major application is the burgeoning field of microarray (gene expression) experiments. Because of their high cost, the number of chips (n) is O(10–10²), while the number (p) of genes (including expressed sequence tags) on each chip is O(10³–10⁴). Based on synthetic data simulated in accord with current biological interpretation of microarray data, we have adapted the biplot, which simultaneously plots the genes and the chips, to display relevant experimental information. Other ordination techniques are also useful for visually exploring microarray data. The biological information that can be revealed by applying these exploratory, visual techniques is illustrated using data from gene expression experiments. It is well known that "selection bias" can result when ordination methods, or dimension reduction methods such as PCA and its many variants, are used in association with gene selection methods. We show an application of bootstrap methodology to ordination methods that can be used to account for this bias. Such methods are invaluable when visualisation methods are used for pattern recognition, such as when identifying previously unknown sub-classes of tumours in molecular classification.
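The two ideas in the abstract, a biplot of chips and genes and a bootstrap that repeats gene selection, can be illustrated with a minimal sketch. It is not the authors' adapted biplot or their bootstrap procedure: it uses the classical PCA biplot computed from the singular value decomposition, and the function names, the mean-difference selection rule, the stratified resampling and the synthetic data sizes are all assumptions made for illustration.

```python
import numpy as np

def biplot_coordinates(X, k=2):
    """Classical PCA biplot coordinates for an n x p matrix (chips x genes)."""
    Xc = X - X.mean(axis=0)                  # centre each gene across chips
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    chip_scores = U[:, :k] * s[:k]           # n x k markers for the chips
    gene_loadings = Vt[:k].T                 # p x k markers for the genes
    explained = s[:k] ** 2 / np.sum(s ** 2)  # share of variance per axis
    return chip_scores, gene_loadings, explained

def select_genes(X, labels, m):
    """Hypothetical selection rule: the m genes with the largest absolute
    difference in mean expression between two groups of chips."""
    diff = X[labels == 1].mean(axis=0) - X[labels == 0].mean(axis=0)
    return np.argsort(-np.abs(diff))[:m]

def bootstrap_selection_frequency(X, labels, m, n_boot=100, seed=0):
    """Redo gene selection on each bootstrap sample of chips, so that the
    variability introduced by selection is visible rather than hidden."""
    rng = np.random.default_rng(seed)
    idx0 = np.flatnonzero(labels == 0)
    idx1 = np.flatnonzero(labels == 1)
    counts = np.zeros(X.shape[1])
    for _ in range(n_boot):
        # resample chips with replacement within each group (an assumption)
        b = np.concatenate([rng.choice(idx0, idx0.size),
                            rng.choice(idx1, idx1.size)])
        counts[select_genes(X[b], labels[b], m)] += 1
    return counts / n_boot

# Synthetic "high p, small n" data: 20 chips, 1000 genes, and 25 genes
# shifted upwards in the second group of chips (all values are assumptions).
rng = np.random.default_rng(1)
n, p, m = 20, 1000, 25
labels = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[labels == 1, :m] += 2.0

chip_scores, gene_loadings, explained = biplot_coordinates(X)
freq = bootstrap_selection_frequency(X, labels, m)
print("variance share of the first two axes:", np.round(explained, 3))
print("mean selection frequency, shifted genes:", freq[:m].mean())
print("mean selection frequency, other genes:", freq[m:].mean())
```

Plotting `chip_scores` as points and `gene_loadings` as arrows on the same axes gives the biplot. Genes that are selected in only a small fraction of bootstrap samples are the ones whose apparent structure in a single post-selection plot is most likely to reflect selection bias.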

Description

Keywords

Citation

Source

Computational Statistics

Type

Journal article

Book Title

Entity type

Access Statement

License Rights

Restricted until

2037-12-31