Visualisation of "High p, small n" data
Date
2007
Authors
Pittelkow, Yvonne
Wilson, Susan
Publisher
Physica-Verlag GMBH
Abstract
Development of methods for visualisation of high-dimensional data where the number of observations, n, is small compared to the number of variables, p, is of increasing importance. One major application is the burgeoning field of microarray (gene expression) experiments. Because of their high cost, the number of chips (n) is O(10 - 10^2) while the number (p) of genes (including expressed sequence tags) on each chip is O(10^3 - 10^4). Based on synthetic data simulated in accord with current biological interpretation of microarray data, we have adapted the biplot that simultaneously plots the genes and the chips to display relevant experimental information. Other ordination techniques are also useful for visually exploring microarray data. The biological information that can be revealed by applying these exploratory, visual techniques is illustrated using data from gene expression experiments. When ordination methods, or dimension reduction methods such as PCA and its many variants, are used in association with gene selection methods, it is well known that "selection bias" can result. We show an application of bootstrap methodology to ordination methods that can be used to account for this bias. Such methods are invaluable when visualisation methods are used for pattern recognition, such as when identifying previously unknown sub-classes of tumours in molecular classification.
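As a rough illustration of the biplot idea mentioned in the abstract, the sketch below computes chip (row) and gene (column) coordinates from a singular value decomposition of a column-centred expression matrix. It is a generic PCA-style biplot under stated assumptions, not the adapted biplot developed in the article: the function name `biplot_coordinates`, the synthetic data, the group sizes and the effect size are all hypothetical choices made for the example.

```python
import numpy as np

def biplot_coordinates(X, n_components=2):
    """Generic PCA biplot coordinates for an (n chips x p genes) matrix.

    Assumption: rows are chips, columns are genes. Scaling puts the
    singular values on the chip markers (a "row-principal" biplot).
    """
    Xc = X - X.mean(axis=0)                      # centre each gene (column)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = n_components
    chip_scores = U[:, :k] * s[:k]               # one marker per chip
    gene_loadings = Vt[:k].T                     # one marker per gene
    explained = s[:k] ** 2 / np.sum(s ** 2)      # share of variance per axis
    return chip_scores, gene_loadings, explained

# Hypothetical "high p, small n" data: 20 chips, 2000 genes,
# with 100 genes shifted upwards in the first 10 chips.
rng = np.random.default_rng(0)
n, p = 20, 2000
X = rng.normal(size=(n, p))
X[:10, :100] += 3.0

chips, genes, explained = biplot_coordinates(X)
```

Plotting the rows of `chips` and `genes` on the same axes gives the simultaneous display of chips and genes; the product `chips @ genes.T` is the rank-2 approximation of the centred data that the plot represents.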
Source
Computational Statistics
Type
Journal article
Restricted until
2037-12-31