Generalised Linear Mixed Models: Asymptotic Theory and Novel Applications

Authors

Ning, Nickson

Abstract

Generalised linear mixed models (GLMMs) are a widely used class of models for dependent and clustered data, such as the longitudinal and spatial data that often arise in medical science, official and survey statistics, and ecology and environmental studies. This thesis investigates three theoretical and methodological aspects of GLMMs: asymptotic properties of existing point estimators used in independent cluster mixed models; asymptotic properties of a commonly used interval estimator and inferential procedure in independent cluster mixed models; and a novel application of spatial GLMMs to simultaneously correct for covariate measurement error and spatial autocorrelation.
In Chapter 2, we examine asymptotic properties of the penalised quasi-likelihood estimator for fixed and random effects in an independent cluster GLMM, under a framework where both the number of clusters and the cluster sizes tend to infinity. Despite the common usage of such models and estimators, to date there has been very limited work on the theoretical properties of the penalised quasi-likelihood estimator, particularly regarding asymptotic distributional results. The finite sample performance of our novel theoretical results is validated through a simulation study.
In Chapter 3, we reconcile the theoretical results established in Chapter 2 with the empirical performance of the standard errors provided by glmmTMB, a commonly used R package for fitting GLMMs. In doing so, we provide a formal theoretical justification for the standard error formula used in glmmTMB, with the caveat that interpretation should be made under a repeated sampling framework conditional on a finite subset of the random effects. We derive similar results for other commonly used measures of random effect prediction uncertainty, such as the unconditional mean squared error of prediction and the conditional mean squared error of prediction. Furthermore, we clarify the similarities and differences between all of these inferential procedures, as well as the distinction between conditional and unconditional inference. Finally, we highlight and explain the drawbacks of the standard error formula used by another popular R package, lme4.
In Chapter 4, we propose a method for analysing non-continuous geostatistical data that simultaneously corrects for covariate measurement error and accounts for the spatial autocorrelation of the responses (via a spatial GLMM). The measurement error correction does not require validation data to estimate the covariate measurement error variance. The method is motivated by data from the North American Breeding Bird Survey (BBS), and assumes the true covariates to be realisations of a smooth underlying spatial field. Because of the uncertainty involved in estimating the true underlying covariates, the standard error formula provided by glmmTMB is not an appropriate estimate of the uncertainty of our proposed estimator; we therefore derive an appropriate standard error for our proposed method. Numerical studies and an application to BBS data demonstrate that our proposed method is effective at correcting for covariate measurement error and spatial autocorrelation; that is, our proposed estimators no longer exhibit attenuation or bias in large samples.
The work in this thesis emphasises the need for theoretical aspects of generalised linear mixed models to be further explored, and sets up a foundation for doing so. The theoretical and empirical findings themselves provide a novel way to construct prediction intervals for random effects, and encourage practitioners to think carefully about the inferential framework they wish to work under.