Diffey, Simon Malcolm
Description
Linear mixed models and factor analytic mixed models are routinely applied to biological data arising from designed experiments. The preferred method for estimating the parameters associated with these models is residual maximum likelihood (REML). Most statistical software packages available for the REML estimation of parameters associated with linear mixed models and factor analytic mixed models implement a Newton-Raphson type algorithm such as the expected information algorithm or the average information algorithm. There are two problems with these algorithms. Firstly, successive iterations of these algorithms are not guaranteed to increase the residual log-likelihood function. Secondly, parameter updates may not remain in their parameter space. Either may result in the algorithm failing to converge to a solution. The REML expectation maximisation (REML EM) algorithm and the parameter expanded version of this algorithm (REML PX EM) are alternatives to Newton-Raphson type algorithms. Features of these two algorithms are that the residual log-likelihood cannot decrease with successive iterations and that parameter updates remain in their parameter space. Before the REML EM and REML PX EM algorithms can be considered practical alternatives to Newton-Raphson type algorithms, two issues need to be addressed. Firstly, they can be notoriously slow to converge, particularly the REML EM algorithm. Secondly, compared to the average information algorithm, current implementations of these two algorithms are computationally more expensive at each iterate. This increased computational expense relates to calculating the trace of a matrix of the same order as the length of the observed data vector. This thesis addresses the issues of speed and computational efficiency of the REML EM and REML PX EM algorithms for linear mixed models and factor analytic mixed models. The REML EM and REML PX EM algorithms require specification of the incomplete and missing data.
A new specification of the incomplete data for linear mixed models and factor analytic mixed models is introduced and shown to be computationally more efficient, and the conditions under which this new specification will have a faster rate of convergence are described. For factor analytic mixed models, model specification, a new parameter expansion, a new missing data specification, and the efficacy of using a less stringent stopping rule are considered. In an example plant breeding data set and in a simulation study it is shown that these innovations can drastically reduce the number of iterations to convergence. The improvements to the REML EM and REML PX EM algorithms presented in this thesis make these two algorithms, particularly the latter, more likely to be implemented alongside Newton-Raphson type algorithms in statistical software packages for linear mixed models and factor analytic mixed models. This would provide users of these models with a viable alternative in the event of a Newton-Raphson type algorithm failing.
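The two properties of REML EM noted above (a residual log-likelihood that does not decrease, and variance updates that stay positive) can be illustrated with a minimal sketch. It assumes the simplest linear mixed model, y = Xb + Zu + e with a single random term, and uses the textbook mixed-model-equation form of the REML EM updates. It is not the incomplete-data specification or the PX EM variant developed in the thesis, and the function names, the simulated data, and the starting values are all hypothetical choices made for illustration.

```python
import numpy as np

def reml_loglik(y, X, Z, s2u, s2e):
    """Residual log-likelihood, up to an additive constant."""
    n = len(y)
    V = s2u * Z @ Z.T + s2e * np.eye(n)
    Vi = np.linalg.inv(V)
    XtViX = X.T @ Vi @ X
    P = Vi - Vi @ X @ np.linalg.solve(XtViX, X.T @ Vi)
    _, logdet_V = np.linalg.slogdet(V)
    _, logdet_X = np.linalg.slogdet(XtViX)
    return -0.5 * (logdet_V + logdet_X + y @ P @ y)

def reml_em(y, X, Z, s2u=1.0, s2e=1.0, n_iter=50):
    """Textbook REML EM for one random term (illustrative sketch only)."""
    n, p = X.shape
    q = Z.shape[1]
    W = np.hstack([X, Z])
    WtW, Wty = W.T @ W, W.T @ y
    history = [reml_loglik(y, X, Z, s2u, s2e)]
    for _ in range(n_iter):
        lam = s2e / s2u
        # Coefficient matrix of the mixed model equations.
        C = WtW.copy()
        C[p:, p:] += lam * np.eye(q)
        Cinv = np.linalg.inv(C)
        theta = Cinv @ Wty              # fixed and random effect solutions
        u = theta[p:]
        resid = y - W @ theta
        tr_cuu = np.trace(Cinv[p:, p:])
        # Both updates are sums of non-negative terms, so they stay positive.
        s2u_new = (u @ u + s2e * tr_cuu) / q
        s2e_new = (resid @ resid + s2e * (p + q - lam * tr_cuu)) / n
        s2u, s2e = s2u_new, s2e_new
        history.append(reml_loglik(y, X, Z, s2u, s2e))
    return s2u, s2e, history

# Simulated one-way layout: 8 groups of 5 observations (assumed values).
rng = np.random.default_rng(0)
Z = np.kron(np.eye(8), np.ones((5, 1)))
X = np.ones((40, 1))
y = 10.0 + Z @ rng.normal(0.0, 2.0, size=8) + rng.normal(0.0, 1.0, size=40)

s2u_hat, s2e_hat, history = reml_em(y, X, Z)
print(s2u_hat, s2e_hat, history[-1])
```

Printing `history` shows the residual log-likelihood rising at every iterate, typically in small steps, which is consistent with the slow convergence that the abstract identifies as the main practical obstacle.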