Modules¶
DMU¶
This module reads a driver file (described later) that specifies the type of analysis to perform, the
input data, the model, the variance structures, the prior variances and covariances, and a number of
optional parameters (see the description of the driver file).
All effect IDs in the data and additional input files are recoded and checked for consistency across
the different input data files.
Based on the ITASK specified on the $ANALYSIS line, one of the modules DMU4, DMU5, DMUAI or RJMC is executed.
DMU4¶
This module can be used to predict future outcomes of random effects (e.g. breeding values) and to estimate
fixed effects. The multiple trait mixed model equations are set up and solved using techniques that require
all non-zero elements of the whole system to be stored in the computer's memory.
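For orientation, the mixed model equations referred to above can be written in their standard (Henderson) form. The notation is an assumption introduced here, not taken from the DMU documentation: the model is y = Xb + Zu + e, with G the covariance matrix of the random effects u and R the covariance matrix of the residuals e.

```latex
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}Z \\
Z'R^{-1}X & Z'R^{-1}Z + G^{-1}
\end{bmatrix}
\begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix}
=
\begin{bmatrix} X'R^{-1}y \\ Z'R^{-1}y \end{bmatrix}
```

The non-zero elements of the coefficient matrix on the left-hand side are what must be held in memory when the system is set up explicitly.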
The multiple trait mixed model equations can be solved by seven different iterative methods using various
forms of adaptive relaxation techniques from the subroutine package ITPACK (Kincaid et al., 1982), or by
direct methods based on FSPAK (Perez-Enciso et al., 1994) or LAPACK subroutines.
The optimum solver depends on the model used, the amount of data, and the data structure. For sparse systems,
the FSPAK and ITPACK solvers are the most efficient. The ITPACK solvers require less memory than the FSPAK
solvers. Among the FSPAK solvers, method 8 has the smallest and method 9 the largest memory requirement.
Computing time for the FSPAK solvers is lowest for method 9, followed by methods 10 and 8. If a direct method
cannot be used because of its memory requirement, ITPACK method 1 (JCG) is a good choice.
For dense systems, as occur in SNP- and G-BLUP models, the dense solver is the most efficient in terms of
computer time. If run on an SMP (multi core/CPU) computer, the solving step is parallelized over all available
CPUs/cores, or over the number of CPUs/cores specified by setting the environment variable MKL_NUM_THREADS=n,
where n is the number of CPUs/cores to use.
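As a minimal sketch of limiting the thread count from a script, the helper below builds an environment in which MKL_NUM_THREADS is set before the analysis is started. The function name `env_with_mkl_threads` and the placeholder command in the usage comment are hypothetical; only the variable name MKL_NUM_THREADS comes from the text above.

```python
import os


def env_with_mkl_threads(n_threads, base_env=None):
    """Return a copy of the environment with MKL_NUM_THREADS set to n_threads."""
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    # Copy so that the calling process's own environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["MKL_NUM_THREADS"] = str(n_threads)
    return env


# Hypothetical usage: pass the environment to whatever command starts the analysis.
# import subprocess
# subprocess.run(["<dmu-command>", "<driver-file>"], env=env_with_mkl_threads(4))
```

Setting the variable in a copied environment keeps the limit local to the launched run rather than to the whole session.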
If solutions are obtained by FSPAK or, for dense systems, by LAPACK, the standard errors of estimates for
fixed effects and the standard errors of prediction for random effects in the model are also computed.
Irrespective of the solver used, standard errors of estimates/predictions and correlations among
estimates/predictions of selected fixed or random effects in the model(s) can be obtained. Such information
can be used to test various hypotheses.
DMU5¶
This module can be used to solve the multiple trait mixed model equations based on iteration on data.
Since the explicit construction of the system of equations is avoided, it is possible to solve much larger
systems with this module than with DMU4. The DMU5 module solves the model by reading and processing
the data file in each round of iteration. The iterative solver is based on the Preconditioned Conjugate
Gradient method.
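The idea of combining iteration on data with a Preconditioned Conjugate Gradient solver can be illustrated with a small sketch. It is not DMU5's implementation. It assumes a deliberately simple single-trait model (an overall mean plus unrelated random animal effects with a variance ratio `lam`), a diagonal (Jacobi) preconditioner, and records held in a Python list rather than read from a file in each round; all names are hypothetical.

```python
import math


def matvec(records, n_animals, lam, v):
    """Multiply the coefficient matrix by v, one record at a time."""
    out = [0.0] * (1 + n_animals)
    for animal, _y in records:
        j = 1 + animal              # position of this animal's effect
        s = v[0] + v[j]             # design row has ones at positions 0 and j
        out[0] += s
        out[j] += s
    for j in range(1, 1 + n_animals):
        out[j] += lam * v[j]        # variance-ratio term on the random effects
    return out


def solve_pcg(records, n_animals, lam, tol=1e-10, max_rounds=100):
    """Solve the equations without ever storing the coefficient matrix."""
    n = 1 + n_animals
    rhs = [0.0] * n
    diag = [0.0] * n                # diagonal, used as the preconditioner
    for animal, y in records:
        for j in (0, 1 + animal):
            rhs[j] += y
            diag[j] += 1.0
    for j in range(1, n):
        diag[j] += lam

    x = [0.0] * n
    r = rhs[:]
    z = [ri / di for ri, di in zip(r, diag)]
    p = z[:]
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for _ in range(max_rounds):     # each round makes one pass over the data
        ap = matvec(records, n_animals, lam, p)
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if math.sqrt(sum(ri * ri for ri in r)) < tol:
            break
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Records are (animal index, observation) pairs.
records = [(0, 10.0), (0, 12.0), (1, 8.0), (2, 11.0)]
solution = solve_pcg(records, n_animals=3, lam=2.0)  # [mean, a0, a1, a2]
```

Only vectors of the length of the solution are kept in memory; the coefficient matrix appears solely through the record-by-record product in `matvec`.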
DMUAI¶
This module can be used for estimation of (co)variance components using Average Information REstricted
Maximum Likelihood (AI-REML) (Jensen et al., 1997).
The algorithm is based on the use of Average Information (AI) as second differentials of the likelihood
function. The AI is obtained by averaging the information matrices based on observed and expected
information. The module can also use Expectation Maximization (EM) to maximize the restricted likelihood
function.
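The averaging step can be written out as follows. This is the standard textbook form, given under the assumptions that the covariance matrix V of the data is linear in the parameters θ, with V_i = ∂V/∂θ_i, and that P = V^{-1} - V^{-1}X(X'V^{-1}X)^{-}X'V^{-1}; the symbols are introduced here for illustration and are not taken from the DMU documentation.

```latex
\begin{aligned}
I_O(\theta_i,\theta_j) &= -\tfrac{1}{2}\,\mathrm{tr}(P V_i P V_j) + y' P V_i P V_j P y
  && \text{(observed information)} \\
I_E(\theta_i,\theta_j) &= \tfrac{1}{2}\,\mathrm{tr}(P V_i P V_j)
  && \text{(expected information)} \\
I_A(\theta_i,\theta_j) &= \tfrac{1}{2}\bigl(I_O + I_E\bigr)
  = \tfrac{1}{2}\, y' P V_i P V_j P y
  && \text{(average information)}
\end{aligned}
```

The trace terms cancel in the average, which is why the AI matrix is cheaper to compute than either the observed or the expected information.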
Only the part of the likelihood that depends on the random effects is maximized. The full likelihood also
contains a factor (constant) that depends on the data and the fixed effects. Therefore, the reported
“-2LogL + constant” value can only be used to compare models that use exactly the same data and fixed effects.
Asymptotic standard errors of estimated (co)variance components are obtained from the Average Information
matrix. For parameters expressed as interclass correlations and correlations between random effects
(e.g. genetic correlations), the standard errors of the parameter estimates are computed based on a Taylor
series approximation.
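A first-order Taylor series (delta method) approximation of this kind can be sketched for a simple variance ratio. The example is illustrative only: it assumes two variance components, `var_a` and `var_e`, and that their sampling covariance matrix is already available (for instance from the inverse of the AI matrix). The function name and the numbers are hypothetical and do not come from DMUAI.

```python
import math


def ratio_and_se(var_a, var_e, cov):
    """Delta-method standard error of var_a / (var_a + var_e).

    cov is the 2x2 sampling covariance matrix of (var_a, var_e).
    """
    total = var_a + var_e
    ratio = var_a / total
    # Partial derivatives of the ratio with respect to var_a and var_e.
    grad = (var_e / total**2, -var_a / total**2)
    # First-order approximation: Var(ratio) ~ grad' * cov * grad.
    variance = sum(grad[i] * cov[i][j] * grad[j]
                   for i in range(2) for j in range(2))
    return ratio, math.sqrt(variance)


ratio, se = ratio_and_se(2.0, 6.0, [[0.25, -0.05], [-0.05, 0.16]])
```

The same pattern applies to a correlation: only the gradient of the function of the (co)variance components changes.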
AI-REML can yield updates of the parameter vector that fall outside the parameter space. To overcome this
problem, several methods are implemented in DMUAI. The methods are:
AI, but combining AI and EM if an update goes outside the parameter space.
EM based on an algorithm by Robin Thompson.
EM based on an algorithm by Esa Mäntysaari.
AI, but with step halving if an update goes outside the parameter space.
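The step-halving idea in the last method can be sketched generically. This is not DMUAI's code: the validity check is passed in as a function, and the example check simply requires all parameters to be positive, which is a simplification of the real requirement that covariance matrices stay inside the parameter space. All names are hypothetical.

```python
def step_halving_update(theta, step, is_valid, max_halvings=20):
    """Shrink the step by halves until theta + scale * step is valid."""
    scale = 1.0
    for _ in range(max_halvings + 1):
        candidate = [t + scale * s for t, s in zip(theta, step)]
        if is_valid(candidate):
            return candidate
        scale *= 0.5                # halve the step and try again
    return list(theta)              # give up and keep the current values


def all_positive(values):
    """Simplified parameter-space check: every component must be positive."""
    return all(v > 0.0 for v in values)


# The full step would give [-2.0, 3.0], which is outside the parameter space.
updated = step_halving_update([1.0, 2.0], [-3.0, 1.0], all_positive)
```

In this example the full step and the half step are both rejected, and the quarter step is the first one accepted.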
DMUAI can use sparse computation based on FSPAK subroutines or dense computation based on LAPACK subroutines.
As for DMU4, the dense computation can run in parallel on an SMP (multi core/CPU) computer.
The default is to parallelize over all available CPUs/cores, but this can be limited by setting the
environment variable MKL_NUM_THREADS=n, where n is the number of CPUs/cores to use.
RJMC¶
Module for single and multiple trait Markov Chain Monte Carlo (MCMC) based estimation of location and
dispersion parameters. The implementation is based on iteration on data techniques and can handle
Gaussian, binary, ordered categorical, 2-component mixture, zero inflated sequential binary traits and
reaction norm models with unknown environmental gradient.
Sampling of genetic dispersion parameters can be restricted so that only information on parental animals
is used. This approach is especially useful for cross-sectional binary traits (Ødegård et al., 2010).