Difference between revisions of "Algorithms"
| (36 intermediate revisions by 2 users not shown) | |||
| Line 4: | Line 4: | ||
| The iterative solver uses the [https://en.wikipedia.org/wiki/Conjugate_gradient_method#The_preconditioned_conjugate_gradient_method preconditioned conjugate gradient method] and is {{lmt}}'s default solver. It does not require the explicit construction of any mixed model equation, and is therefore less resource demanding than the direct solver. That is, many models which cannot be solved using the direct solver can still be solved using the iterative solver. Even for small models the iterative solver usually outperforms the direct solver in terms of total processing time. | The iterative solver uses the [https://en.wikipedia.org/wiki/Conjugate_gradient_method#The_preconditioned_conjugate_gradient_method preconditioned conjugate gradient method] and is {{lmt}}'s default solver. It does not require the explicit construction of any mixed model equation, and is therefore less resource demanding than the direct solver. That is, many models which cannot be solved using the direct solver can still be solved using the iterative solver. Even for small models the iterative solver usually outperforms the direct solver in terms of total processing time. | ||
| Whether the iterative solver has converged in round $$i$$ can be evaluated with convergence criterions $$log_e\left(\sqrt{\frac{||(Cx_i-b)||}{||b||}}\right)<t$$ or $$log_e\left(\sqrt{\frac{||(x_{i}-x_{i-1})||'}{||x_{i-1}||}}\right)<t$$, where $$C$$ is the mixed-model coefficient matrix, $$x_i$$ is the solution vector in round $$i$$, $$b$$ is the right-hand side and $$t$$ is the convergence threshold which defaults to -18.42, which is $$log_e(10^{-9})$$. | |||
| $$ | |||
| $$ | |||
| ===Direct solver=== | ===Direct solver=== | ||
| Line 13: | Line 10: | ||
| ==Variance component estimation== | ==Variance component estimation== | ||
| For random factors {{lmt}} supports variance of structure [https://en.wikipedia.org/wiki/Kronecker_product $$\Gamma\otimes\Sigma$$], where $$\Sigma$$ is an dense symmetric positive definite matrix to be estimated. For residuals {{lmt}} supports variance structures $$I\otimes\Sigma$$ and $$\Theta_L(I_{n_{observations}})\Theta_L^{'}$$, where $$\Theta$$ is symmetric positive definite [https://en.wikipedia.org/wiki/Block_matrix#Block_diagonal_matrices block-diagonal matrix] of $$n$$ symmetric positive definite martices $$\Sigma_i, i=1,..,n$$, $$\Theta_L$$ is the lower [https://en.wikipedia.org/wiki/Cholesky_decomposition Cholesky factor] of $$\Theta$$ and $$I_{n_{observations}}$$ is an identity matrix of dimensions equal to the total number of observations across all traits. Note that the number of records associated to a particular $$\Sigma_i$$ should be sufficient to facilitate its estimation. | |||
| ===Gibbs sampling=== | ===Gibbs sampling=== | ||
| ====Single pass Gibbs sampling==== | ====Single pass Gibbs sampling==== | ||
| {{lmt}}'s single pass Gibbs sampling algorithm is described in <ref name="Sorensen2002" />. In short, all location parameters are drawn from their joint conditional posterior distribution. Note that this requires solving the mixed model equation system once per iteration which usually leads to a substantial increase in processing time. | {{lmt}}'s single pass Gibbs sampling algorithm is described in <ref name="Sorensen2002" />. In short, all location parameters are drawn from their joint conditional posterior distribution. Note that this requires solving the mixed model equation system once per iteration which usually leads to a substantial increase in processing time. Note that ssSNBPLUP models are not supported. | ||
| ====Blocked Gibbs sampling==== | ====Blocked Gibbs sampling==== | ||
| For random factors {{lmt}}'s blocked Gibbs sampler draws correlated location parameters within factor level from their joint conditional posterior distribution. Location parameters of fixed factors are drawn in scalar mode from their fully conditional posterior. | For random factors {{lmt}}'s blocked Gibbs sampler draws correlated location parameters within factor level from their joint conditional posterior distribution. Location parameters of fixed factors are drawn in scalar mode from their fully conditional posterior. Note that ssGTBLUP and ssSNPBLUP models are not supported. | ||
| ===Restricted Maximum Likelyhood=== | ===Restricted Maximum Likelyhood=== | ||
| ====MC-EM-REML==== | ====MC-EM-REML==== | ||
| {{lmt}} provides a monte-carlo expectation-maximisation REML algorithms which uses the preconditioned gradient solver for solving the mixed model equations and a blocked Gibbs sampler to sample the necessary traces<ref name="Harville2004" />. | {{lmt}} provides a monte-carlo expectation-maximisation REML algorithms which uses the preconditioned gradient solver for solving the mixed model equations and a blocked Gibbs sampler to sample the necessary traces<ref name="Harville2004" />. Note that ssSNPBLUP and ssGTBLUP models are not supported. | ||
| The MC-EM-REML convergence criterion is $$log_e\left(\sqrt{\frac{||(p_{i}-p_{i-1})||}{||p_{i-1}||}}\right)$$ where $$p$$ is the parameter vector and $$i$$ is the iteration counter. | |||
| ====Average information (AI)-REML==== | |||
| {{lmt}} provides the calculation of variance components using average information REML <ref name="Johnson1995" />, <ref name="Gilmour1995" /> and <ref name="Jensen1997" />. | |||
| REML estimates of co-variance matrices can be derived using the phenotypic co-variance matrix $$V$$ or the mixed-model equations system coefficient matrix C. | |||
| {{lmt}} provides three different AI-REML convergence criterions: | |||
| * the relative change of the log-likelihood calculated as $$log_e\left(\sqrt{\frac{||(l_{i}-l_{i-1})||}{||l_{i-1}||}}\right)$$ where $$l$$ is the log-likelihood and $$i$$ is the iteration counter. | |||
| * $$log_e\left(\sqrt{\frac{||(p_{i}-p_{i-1})||}{||p_{i-1}||}}\right)$$ where $$p$$ is the parameter vector and $$i$$ is the iteration counter. | |||
| * $$log_e\left(\sqrt{||g_{i}||}\right)$$ where $$g$$ is the gradient vector and $$i$$ is the iteration counter. | |||
| =====AI-REML Iteration mechanism===== | |||
| For finding the next parameter vector {{lmt}} use a mixture of the [https://en.wikipedia.org/wiki/Levenberg%E2%80%93Marquardt_algorithm Levenberg–Marquardt algorithm] and ordinary step length down scaling, which can be described as $$\zeta_i=\zeta_{i-1}+\alpha(\Omega_i+I\beta)^{-1}\xi_i$$, where $$\zeta_i$$ and $$\zeta_{i-1}$$ are parameter vector is round $$i$$ and $$i-1$$, respectively, $$\alpha$$ is the step length, $$\Omega$$ is the AI matrix, $$\xi$$ is the Jacobian and $$\beta$$ is an arbitrary real number $$\geq$$0. | |||
| Once $$\Omega_i$$ and $$\xi_i$$ are derived {{lmt}} will calculate $$\zeta_i$$ using $$\alpha=1$$ and $$\beta=0$$. If $$\zeta_i$$ is not valid(i.e. the $$\Sigma$$ matrices are not positive definite), it will use the Levenberg-Marquardt algorithm to find a valid $$\zeta_i$$ by setting $$\beta$$ to an ever increasing number. {{lmt}} will try this for 10000 iterations. If $$\zeta_i$$ is still not valid {{lmt}} will return to $$\Omega_{i-1}$$ and $$\xi_{i-1}$$ and set $$\alpha=\alpha*0.5$$. | |||
| This can lead to a situation where {{lmt}} down-scales $$\alpha$$ to near zero and the convergence criterion appears to report convergence. It is therefore crucial to check the iteration output. True convergence is only achieved if $$\alpha=1$$ and $$\beta=0$$. | |||
| =====AI-REML-C===== | |||
| {{lmt}} supports AI-REML-C, which relies on the construction and factorization of the mixed-model equations system coefficient matrix C. | |||
| Note that ssSNPBLUP and ssGTBLUP models are not supported. Further, it is not advisable to use airemlc for ssGBLUP models with several correlated genetic effects. | |||
| ===Fixation of Sigma matrix elements=== | |||
| Elements of $$\Sigma$$ matrices can be exempted from re-estimation in two ways: | |||
| #providing a boolean [[Parameter_file_elements#<sigma>|mask matrix]] $$B$$ where elements set to "T" are related to elements in $$\Sigma$$ which should be regarded as fixed, or by | |||
| #setting a diagonal element in $$\Sigma$$ desired to be fixed to 1.0 to 1.0, or by | |||
| #setting an off-diagonal element in $$\Sigma$$ desired to be fixed to 0.0 to 0.0. | |||
| Note that in case exemption is communicated via options 2 and 3 the $$\Sigma$$ matrix provided at start must still be positive definite. Further note that using option 1 overrides all information contained in $$\Sigma$$. That is if $$\Sigma[1,1]$$ is set to 1.0 but $$B$$[1,1] is set to false, $$\Sigma$$[1,1] is not exempt. | |||
| == | ==Elements of the inverse of the mixed model coefficient matrix== | ||
| In principle {{lmt}} can generate any element of the inverse mixed model coefficient matrix. However, the user interface is currently limited to the diagonal elements for fixed factors and the diagonal blocks for random factors. These elements can either be sampled or obtained accurately via solving. | |||
| ===Gibbs Sampling=== | ===Gibbs Sampling=== | ||
| Following the approach of Harville(1999)<ref name="Harville1999" /> {{lmt}} can sample for fixed factors the diagonal elements of the inverse of the mixed model coefficient matrix, for random factors the diagonal blocks of the inverse of the coefficient matrix where the block size is determined by the dimension of the related $$\Sigma$$ matrix. The blocks are the prediction error co-variance matrices of the factor levels of correlated sub-factors. When sampling prediction error variances {{lmt}} can run many Gibbs chains in parallel allowing to exploit multi-core hardware architecture. However, it is recommended to specify not more chains than the number of available '''real''' cores excluding hyper-threading technology. | Following the approach of Harville(1999)<ref name="Harville1999" /> {{lmt}} can sample for fixed factors the diagonal elements of the inverse of the mixed model coefficient matrix, for random factors the diagonal blocks of the inverse of the coefficient matrix where the block size is determined by the dimension of the related $$\Sigma$$ matrix. The blocks are the prediction error co-variance matrices of the factor levels of correlated sub-factors. When sampling prediction error variances {{lmt}} can run many Gibbs chains in parallel allowing to exploit multi-core hardware architecture. However, it is recommended to specify not more chains than the number of available '''real''' cores excluding hyper-threading technology. | ||
| ===Solving=== | ===Solving=== | ||
| {{lmt}} can obtain elements of the inverse of the coefficient matrix via solving the mixed model equations. This method is currently only supported for the diagonal prediction error co-variance blocks of random factors, where the block size is determined by the dimension of the related $$\Sigma$$ matrix. For this algorithm {{lmt}} can utilize either the [[Iterative solver]] or the [[Direct solver]]. | {{lmt}} can obtain elements of the inverse of the coefficient matrix via solving the mixed model equations. This method is currently only supported for the diagonal prediction error co-variance blocks of random factors, where the block size is determined by the dimension of the related $$\Sigma$$ matrix. For this algorithm {{lmt}} can utilize either the [[#Iterative solver|iterative solver]] or the [[#Direct solver|direct solver]]. | ||
| ==Iterative inbreeding== | |||
| {{lmt}} supports the iterative calculation of inbreeding coefficients as described in VanRaden(1992)<ref name="VanRaden1992" />. | |||
| ==References== | ==References== | ||
| Line 34: | Line 72: | ||
| <ref name="Harville1999">David A. Harville; Use of the Gibbs sampler to invert large, possibly sparse, positive definite matrices; Linear Algebra and its Applications; 1999</ref> | <ref name="Harville1999">David A. Harville; Use of the Gibbs sampler to invert large, possibly sparse, positive definite matrices; Linear Algebra and its Applications; 1999</ref> | ||
| <ref name="Harville2004">David A. Harville; Making REML computationally feasible for large data sets: use of the Gibbs sampler; Journal of Statistical Computation & Simulation; 2004</ref> | <ref name="Harville2004">David A. Harville; Making REML computationally feasible for large data sets: use of the Gibbs sampler; Journal of Statistical Computation & Simulation; 2004</ref> | ||
| <ref name="Jensen1997">J. Jensen et. al.; Residual maximum likelihood estimation of (co) variance components in multivariate mixed linear models using average information; Indian Society of Agricultural Statistics; 1997</ref> | |||
| <ref name="Gilmour1995">A. Gilmour et. al.; Average information REML: an efficient algorithm for variance parameter estimation in linear mixed models; Biometrics; 1995</ref> | |||
| <ref name="Johnson1995">D.L. Johnson and R. Thompson; Restricted maximum likelihood estimation of variance components for univariate animal models using sparse matrix techniques and average information; Journal of Dairy Science; 1995</ref> | |||
| <ref name="VanRaden1992">PM VanRanden; Accounting for Inbreeding and Crossbreeding in Genetic Evaluation of Large Populations; Journal of Dairy Science; 1992</ref> | |||
| </references> | </references> | ||
Latest revision as of 00:35, 4 November 2022
Solving Linear Mixed model Equations
lmt supports two types of solver for solving MME's: a direct solver and an iterative solver
Iterative solver
The iterative solver uses the preconditioned conjugate gradient method and is lmt's default solver. It does not require the explicit construction of any mixed model equation, and is therefore less resource demanding than the direct solver. That is, many models which cannot be solved using the direct solver can still be solved using the iterative solver. Even for small models the iterative solver usually outperforms the direct solver in terms of total processing time.
Whether the iterative solver has converged in round $$i$$ can be evaluated with convergence criterions $$log_e\left(\sqrt{\frac{||(Cx_i-b)||}{||b||}}\right)<t$$ or $$log_e\left(\sqrt{\frac{||(x_{i}-x_{i-1})||'}{||x_{i-1}||}}\right)<t$$, where $$C$$ is the mixed-model coefficient matrix, $$x_i$$ is the solution vector in round $$i$$, $$b$$ is the right-hand side and $$t$$ is the convergence threshold which defaults to -18.42, which is $$log_e(10^{-9})$$.
Direct solver
The direct solver requires the mixed model coefficient matrix to be build and all Kronecker products to be resolved. This can be quite memory demanding and should therefore be used carefully. The direct solver uses a Cholesky decomposition and forward-backward-substitution to solve the mixed model equation system, where especially the decomposition step can be very resource demanding and time consuming.
Variance component estimation
For random factors lmt supports variance of structure $$\Gamma\otimes\Sigma$$, where $$\Sigma$$ is an dense symmetric positive definite matrix to be estimated. For residuals lmt supports variance structures $$I\otimes\Sigma$$ and $$\Theta_L(I_{n_{observations}})\Theta_L^{'}$$, where $$\Theta$$ is symmetric positive definite block-diagonal matrix of $$n$$ symmetric positive definite martices $$\Sigma_i, i=1,..,n$$, $$\Theta_L$$ is the lower Cholesky factor of $$\Theta$$ and $$I_{n_{observations}}$$ is an identity matrix of dimensions equal to the total number of observations across all traits. Note that the number of records associated to a particular $$\Sigma_i$$ should be sufficient to facilitate its estimation.
Gibbs sampling
Single pass Gibbs sampling
lmt's single pass Gibbs sampling algorithm is described in [1]. In short, all location parameters are drawn from their joint conditional posterior distribution. Note that this requires solving the mixed model equation system once per iteration which usually leads to a substantial increase in processing time. Note that ssSNBPLUP models are not supported.
Blocked Gibbs sampling
For random factors lmt's blocked Gibbs sampler draws correlated location parameters within factor level from their joint conditional posterior distribution. Location parameters of fixed factors are drawn in scalar mode from their fully conditional posterior. Note that ssGTBLUP and ssSNPBLUP models are not supported.
Restricted Maximum Likelyhood
MC-EM-REML
lmt provides a monte-carlo expectation-maximisation REML algorithms which uses the preconditioned gradient solver for solving the mixed model equations and a blocked Gibbs sampler to sample the necessary traces[2]. Note that ssSNPBLUP and ssGTBLUP models are not supported.
The MC-EM-REML convergence criterion is $$log_e\left(\sqrt{\frac{||(p_{i}-p_{i-1})||}{||p_{i-1}||}}\right)$$ where $$p$$ is the parameter vector and $$i$$ is the iteration counter.
Average information (AI)-REML
lmt provides the calculation of variance components using average information REML [3], [4] and [5]. REML estimates of co-variance matrices can be derived using the phenotypic co-variance matrix $$V$$ or the mixed-model equations system coefficient matrix C.
lmt provides three different AI-REML convergence criterions:
- the relative change of the log-likelihood calculated as $$log_e\left(\sqrt{\frac{||(l_{i}-l_{i-1})||}{||l_{i-1}||}}\right)$$ where $$l$$ is the log-likelihood and $$i$$ is the iteration counter.
- $$log_e\left(\sqrt{\frac{||(p_{i}-p_{i-1})||}{||p_{i-1}||}}\right)$$ where $$p$$ is the parameter vector and $$i$$ is the iteration counter.
- $$log_e\left(\sqrt{||g_{i}||}\right)$$ where $$g$$ is the gradient vector and $$i$$ is the iteration counter.
AI-REML Iteration mechanism
For finding the next parameter vector lmt use a mixture of the Levenberg–Marquardt algorithm and ordinary step length down scaling, which can be described as $$\zeta_i=\zeta_{i-1}+\alpha(\Omega_i+I\beta)^{-1}\xi_i$$, where $$\zeta_i$$ and $$\zeta_{i-1}$$ are parameter vector is round $$i$$ and $$i-1$$, respectively, $$\alpha$$ is the step length, $$\Omega$$ is the AI matrix, $$\xi$$ is the Jacobian and $$\beta$$ is an arbitrary real number $$\geq$$0.
Once $$\Omega_i$$ and $$\xi_i$$ are derived lmt will calculate $$\zeta_i$$ using $$\alpha=1$$ and $$\beta=0$$. If $$\zeta_i$$ is not valid(i.e. the $$\Sigma$$ matrices are not positive definite), it will use the Levenberg-Marquardt algorithm to find a valid $$\zeta_i$$ by setting $$\beta$$ to an ever increasing number. lmt will try this for 10000 iterations. If $$\zeta_i$$ is still not valid lmt will return to $$\Omega_{i-1}$$ and $$\xi_{i-1}$$ and set $$\alpha=\alpha*0.5$$.
This can lead to a situation where lmt down-scales $$\alpha$$ to near zero and the convergence criterion appears to report convergence. It is therefore crucial to check the iteration output. True convergence is only achieved if $$\alpha=1$$ and $$\beta=0$$.
AI-REML-C
lmt supports AI-REML-C, which relies on the construction and factorization of the mixed-model equations system coefficient matrix C.
Note that ssSNPBLUP and ssGTBLUP models are not supported. Further, it is not advisable to use airemlc for ssGBLUP models with several correlated genetic effects.
Fixation of Sigma matrix elements
Elements of $$\Sigma$$ matrices can be exempted from re-estimation in two ways:
- providing a boolean mask matrix $$B$$ where elements set to "T" are related to elements in $$\Sigma$$ which should be regarded as fixed, or by
- setting a diagonal element in $$\Sigma$$ desired to be fixed to 1.0 to 1.0, or by
- setting an off-diagonal element in $$\Sigma$$ desired to be fixed to 0.0 to 0.0.
Note that in case exemption is communicated via options 2 and 3 the $$\Sigma$$ matrix provided at start must still be positive definite. Further note that using option 1 overrides all information contained in $$\Sigma$$. That is if $$\Sigma[1,1]$$ is set to 1.0 but $$B$$[1,1] is set to false, $$\Sigma$$[1,1] is not exempt.
Elements of the inverse of the mixed model coefficient matrix
In principle lmt can generate any element of the inverse mixed model coefficient matrix. However, the user interface is currently limited to the diagonal elements for fixed factors and the diagonal blocks for random factors. These elements can either be sampled or obtained accurately via solving.
Gibbs Sampling
Following the approach of Harville(1999)[6] lmt can sample for fixed factors the diagonal elements of the inverse of the mixed model coefficient matrix, for random factors the diagonal blocks of the inverse of the coefficient matrix where the block size is determined by the dimension of the related $$\Sigma$$ matrix. The blocks are the prediction error co-variance matrices of the factor levels of correlated sub-factors. When sampling prediction error variances lmt can run many Gibbs chains in parallel allowing to exploit multi-core hardware architecture. However, it is recommended to specify not more chains than the number of available real cores excluding hyper-threading technology.
Solving
lmt can obtain elements of the inverse of the coefficient matrix via solving the mixed model equations. This method is currently only supported for the diagonal prediction error co-variance blocks of random factors, where the block size is determined by the dimension of the related $$\Sigma$$ matrix. For this algorithm lmt can utilize either the iterative solver or the direct solver.
Iterative inbreeding
lmt supports the iterative calculation of inbreeding coefficients as described in VanRaden(1992)[7].
References
- ↑ D. Sorensen and D. Gianola; Likelihood, Bayesian, and MCMC Methods in Quantitative Genetics; 2002; 584-588
- ↑ David A. Harville; Making REML computationally feasible for large data sets: use of the Gibbs sampler; Journal of Statistical Computation & Simulation; 2004
- ↑ D.L. Johnson and R. Thompson; Restricted maximum likelihood estimation of variance components for univariate animal models using sparse matrix techniques and average information; Journal of Dairy Science; 1995
- ↑ A. Gilmour et. al.; Average information REML: an efficient algorithm for variance parameter estimation in linear mixed models; Biometrics; 1995
- ↑ J. Jensen et. al.; Residual maximum likelihood estimation of (co) variance components in multivariate mixed linear models using average information; Indian Society of Agricultural Statistics; 1997
- ↑ David A. Harville; Use of the Gibbs sampler to invert large, possibly sparse, positive definite matrices; Linear Algebra and its Applications; 1999
- ↑ PM VanRanden; Accounting for Inbreeding and Crossbreeding in Genetic Evaluation of Large Populations; Journal of Dairy Science; 1992