Recent developments of copula-based models to handle missing data of mixed-type in multivariate analysis

Authors

Wang, Jiali

Abstract

In this thesis, we propose imputation models to handle missing data of mixed type. Our imputation models can handle 1) multilevel data sets, through random effects; 2) heterogeneity in a population, by specifying infinite mixture models; and 3) a large number of variables, using graphical lasso methods. Two clinical data sets, a randomised controlled trial of acute stroke care patients and a survey of menstrual disorder among teenagers, are used as real-data application examples, although we believe that the proposed methods can also be applied to other data sets with similar structures.

In Chapter 2, we propose a copula-based method to handle missing values in multivariate data of mixed type in multilevel data sets. Building upon the extended rank likelihood approach combined with a multinomial probit model formulation, our model is a latent variable model that captures the relationships among variables of different types while accounting for the clustering structure. The proposed method is evaluated through simulations, using both artificial data and the acute stroke data set, and compared with several conventional methods of handling missing data. We conclude that our copula-based imputation model for mixed-type variables achieves good imputation accuracy and recovery of parameters in some models of interest, and that adding random effects enhances performance when the clustering effect is strong.

In Chapter 3, we consider an infinite mixture of elliptical copulas, induced by a Dirichlet process mixture, to build a flexible copula function as the imputation model. A slice sampling algorithm is used in conjunction with a prior parallel tempering algorithm to sample from the infinite-dimensional parameter space and to overcome the mixing issue that arises when sampling from a multimodal distribution. Using simulations, we demonstrate that the infinite mixture copula model provides a better overall fit than its single-component counterpart and is better at capturing tail dependence features of the data. The application of this model is also demonstrated using the acute stroke data set.

In Chapter 4, we propose a Gaussian copula model with a graphical lasso prior to analyse the conditional associations among 100+ questions in a study of menstrual disorder among teenagers. Our data come from a large population-based study of menstrual disorder in Australian teenagers, conducted in 2005 and again in 2016. We also compare cohort differences in menstruation over the 11-year interval and use the model to predict which girls are at higher risk of developing endometriosis. The model is based on the one proposed in Chapter 2, but adds a graphical lasso prior that shrinks the elements of the precision matrix of the Gaussian distribution to encourage a sparse graphical structure. The level of shrinkage adapts to the strength of the conditional associations among the questions in the survey. We find that menstrual disturbance is reported as more pronounced in 2016 than in 2005, and that the questions in the questionnaire form several clusters with strong associations.
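
The latent-variable idea behind the copula models described above can be illustrated with a minimal sketch. This is not the thesis's method: the extended rank likelihood approach is a Bayesian procedure, whereas the code below is a simplified, rank-based stand-in that maps each variable (continuous or ordinal) to normal scores and then measures association on that latent scale. The function names and the toy data are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def normal_scores(values):
    """Map observed values to normal scores via average ranks.

    Missing entries (None) are left as None. Tied values, which are
    common for ordinal variables, share the same score.
    """
    observed = sorted((v, i) for i, v in enumerate(values) if v is not None)
    n = len(observed)
    scores = [None] * len(values)
    std_normal = NormalDist()
    start = 0
    while start < n:
        end = start
        # Extend the block over all values tied with observed[start].
        while end + 1 < n and observed[end + 1][0] == observed[start][0]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # 1-based average rank of the block
        z = std_normal.inv_cdf(avg_rank / (n + 1))
        for k in range(start, end + 1):
            scores[observed[k][1]] = z
        start = end + 1
    return scores


def latent_correlation(x, y):
    """Pearson correlation of normal scores over jointly observed pairs."""
    pairs = [(a, b) for a, b in zip(normal_scores(x), normal_scores(y))
             if a is not None and b is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two jointly observed pairs")
    mean_a = sum(a for a, _ in pairs) / len(pairs)
    mean_b = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("a variable is constant over the observed pairs")
    return cov / (var_a * var_b) ** 0.5


# Hypothetical toy data: one continuous and one ordinal variable,
# each with a missing entry.
age = [61, 70, None, 55, 82, 67]
severity = [1, 2, 2, 1, 3, None]
r = latent_correlation(age, severity)
```

Working on ranks rather than raw values is what lets variables of different types share one latent Gaussian scale; a full imputation model would go on to draw the missing latent values conditional on the observed ones, which this sketch does not attempt.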
