Difference between revisions of "Output files"
Revision as of 04:15, 7 June 2022
Default output files
General output files
lmt.log
General log file always generated. Provides information of the current state of operation.
Operation dependent output files
Solving the mixed model equation system
results.csv
The file contains the solutions for the mixed model equation system. The file has four columns:
- factor name,
- sub-factor name,
- factor level id, and
- solution.
Note that factor names are derived according to the lmt factor naming convention, whereas sub-factor names are user-defined and extracted from the equation system. Further, the factor level id is the original id as provided by the data, pedigree, etc. For variables undergoing polynomial expansions, e.g. sub-factor c in
y=x*b+age(t(co(p(1,2))))*c+id*u(v(my_var(1)))
the sub-factor name is expanded as well to sub-factor name_id, where "id" is the polynomial id. In the above example, c is expanded to c_1 and c_2.
The file has as many records as the system has equations. Levels of fixed factors which have been omitted due to rank deficiencies are not printed.
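The four-column layout described above can be loaded with base R roughly as follows. This is only a sketch: it assumes that results.csv is comma-separated and has no header row, and the function names and column names (read_lmt_solutions, solutions_for, factor, sub_factor, level_id, solution) are made up for illustration and are not part of lmt.

```r
# Sketch only. Assumptions: comma-separated file without a header row.
# Function and column names are illustrative, not defined by lmt.
read_lmt_solutions <- function(path) {
  read.csv(path, header = FALSE, stringsAsFactors = FALSE,
           col.names = c("factor", "sub_factor", "level_id", "solution"),
           # level ids are the original ids and may be alphanumeric
           colClasses = c("character", "character", "character", "numeric"))
}

# Extract the solutions of one sub-factor, e.g. the expanded name "c_1"
solutions_for <- function(sol, sub_factor) {
  sol[sol$sub_factor == sub_factor, c("level_id", "solution")]
}
```

For the model line above, solutions_for(read_lmt_solutions("results.csv"), "c_1") would then return the level ids and solutions of the first polynomial term of c.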
PCG solver output files
so_conv.csv
Contains the iteration statistics with columns
- iteration number
- alpha
- beta
- $$||r_i M r_i||$$, where $$M$$ is the preconditioner matrix and $$r_i=Cx_i-b$$ for a system $$Cx=b$$
- convergence criterion CD
- convergence criterion CR
- seconds per iteration
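To inspect how the solver converged, the columns listed above can be read and summarised as sketched below. The sketch assumes a comma-separated file without a header row and one record per iteration; the column names and both function names are chosen here for illustration only.

```r
# Sketch only. Assumptions: no header row, comma-separated,
# one record per PCG iteration. All names are illustrative.
read_pcg_convergence <- function(path) {
  read.csv(path, header = FALSE,
           col.names = c("iteration", "alpha", "beta", "rMr",
                         "cd", "cr", "seconds"))
}

# Summarise the last iteration and the accumulated solving time
summarise_pcg <- function(conv) {
  last <- conv[nrow(conv), ]
  list(last_iteration = last$iteration,
       final_cd       = last$cd,
       final_cr       = last$cr,
       total_seconds  = sum(conv$seconds))
}
```

A call such as summarise_pcg(read_pcg_convergence("so_conv.csv")) returns the final values of both convergence criteria, which can then be compared with the thresholds requested for the run.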
Variance component estimation using MC-EM-REML
mcemreml_conv.csv
Contains the iteration statistics with one parameter vector per iteration. The parameter vector contains the following elements:
- seconds for the last MC-EM iteration
- seconds for solving the equation system
- seconds for sampling the traces
- convergence criterion cd
<u.d. variance name>_sigma_SA.csv
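The file contains the column-wise upper-triangular elements of the respective $$\Sigma$$ matrix estimated in each round. The file contains as many rows as MC-EM-REML iterations, so the last row holds the estimates at convergence. Estimates at convergence for a $$\Sigma$$ matrix that is part of a variance structure named g may be read into R by
library(data.table)
d <- fread("g_sigma_SA.csv")
n <- floor(sqrt(ncol(d) * 2))  # ncol(d) = n*(n+1)/2
g <- matrix(0, n, n)
g[upper.tri(g, diag = TRUE)] <- unlist(d[nrow(d), ])  # last row = final iteration
g[lower.tri(g)] <- t(g)[lower.tri(g)]  # mirror to get the full symmetric matrix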
Variance component estimation using AI-REML-C
aic_conv.csv
Contains the iteration statistics with one parameter vector per iteration. The parameter vector contains the following elements:
- iteration number
- convergence criterion ng
- convergence criterion cd
- convergence criterion ll
- log-likelihood
- Newton over-relaxation parameter
- number of Newton over-relaxation iterations
- seconds for the last AI-REML iteration
<u.d. variance name>_sigma_UPDATE.csv
The file contains the column-wise upper-triangular elements of the respective $$\Sigma$$ matrix estimated in each round. The file contains as many rows as AI-REML-C iterations plus one. The row before the last contains the co-variance estimates at convergence. The last row contains the approximate standard errors of the parameter estimates. Estimates at convergence for a $$\Sigma$$ matrix that is part of a variance structure named g may be read into R by
library(data.table)
d <- fread("g_sigma_UPDATE.csv")
n <- floor(sqrt(ncol(d) * 2))  # ncol(d) = n*(n+1)/2
g <- matrix(0, n, n)
g[upper.tri(g, diag = TRUE)] <- unlist(d[nrow(d) - 1, ])  # row before the last
g[lower.tri(g)] <- t(g)[lower.tri(g)]  # mirror to get the full symmetric matrix
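Since the last row carries the approximate standard errors in the same layout, both matrices can be extracted in one step. The following is a sketch in base R under the assumption that the file is comma-separated and has no header row; vec_to_sym and read_sigma_update are hypothetical helper names, not lmt functions.

```r
# Sketch only. Assumptions: no header row, comma-separated.
# Helper names are illustrative.

# Rebuild a symmetric matrix from its column-wise upper-triangular elements
vec_to_sym <- function(v) {
  v <- as.numeric(v)
  n <- floor(sqrt(length(v) * 2))
  stopifnot(n * (n + 1) / 2 == length(v))  # length must be a triangular number
  m <- matrix(0, n, n)
  m[upper.tri(m, diag = TRUE)] <- v
  m[lower.tri(m)] <- t(m)[lower.tri(m)]
  m
}

# Return the estimates at convergence and their approximate standard errors
read_sigma_update <- function(path) {
  d <- as.matrix(read.csv(path, header = FALSE))
  stopifnot(nrow(d) >= 2)
  list(estimate = vec_to_sym(d[nrow(d) - 1, ]),  # row before the last
       se       = vec_to_sym(d[nrow(d), ]))      # last row
}
```

For example, res <- read_sigma_update("g_sigma_UPDATE.csv") gives res$estimate and res$se as full symmetric matrices of equal dimension.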
Intermediate output files
Files from processing pedigrees
"u.d. pedigree name"_sorted.csv
File contains a 3-column (ordinary pedigree) or 4-column (probabilistic pedigree) matrix containing the sorted and renumbered pedigree generated from the original pedigree.
"u.d. pedigree name"_crossref.csv
File contains a vector of original ids of individuals in "u.d. pedigree name"_sorted.csv. That is, the original id of individual #1 in "u.d. pedigree name"_sorted.csv is located in record 1 of this file, etc.
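Because record i of the cross-reference file holds the original id of renumbered individual i, translating renumbered ids back is a plain vector lookup. The sketch below assumes one id per record in a text file; make_id_lookup is a hypothetical helper name, and any code outside 1..N (for instance a code for an unknown parent, whose value is not specified here) is mapped to NA.

```r
# Sketch only. Assumption: one original id per record.
# make_id_lookup is an illustrative name, not an lmt function.
make_id_lookup <- function(path) {
  crossref <- scan(path, what = character(), sep = ",", quiet = TRUE)
  function(renumbered) {
    renumbered <- as.integer(renumbered)
    out <- rep(NA_character_, length(renumbered))
    ok <- !is.na(renumbered) & renumbered >= 1 & renumbered <= length(crossref)
    out[ok] <- crossref[renumbered[ok]]  # record i holds the original id of i
    out
  }
}
```

With to_original <- make_id_lookup("my_ped_crossref.csv"), where my_ped stands for the user-defined pedigree name, to_original(1) returns the original id of individual #1 in the sorted pedigree.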
"u.d. pedigree name".bin
Block file in binary format containing four blocks:
- a: real vector of diagonal elements of $$A$$
- ai: real vector of diagonal elements of $$A^{-1}$$
- m: real vector of Mendelian sampling terms
- pe: 3 column integer matrix of the sorted and renumbered pedigree underlying $$A$$
Requested output files
Files from processing pedigrees
Genetic group regression matrix
lmt can write the genetic group regression matrix (aka $$Q$$) of a pedigree containing phantom parents to a user-defined file. For the necessary key string see here.
A $$Q$$ matrix written to the file Q.coocsv may be read into R by
library(data.table)
dims <- scan("Q.coocsv", n = 2, sep = ",")  # first record: number of rows and columns
Q <- matrix(0, dims[1], dims[2])
dat <- fread("Q.coocsv", skip = 1)  # remaining records: row, column, value
Q[cbind(dat$V1, dat$V2)] <- dat$V3
Files from running AI-REML jobs
AI matrix, gradient vector and parameter vector
Upon provision of the respective switch in the instruction file, lmt writes the AI matrix, gradient vector and parameter vector to files ai_ai.csv, ai_ja.csv and ai_pa.csv, respectively. Files ai_ai.csv and ai_ja.csv contain as many records as AI-REML iterations. File ai_pa.csv contains as many records as AI-REML iterations + 1, where the first record is the parameter vector at the start.
Each row of file ai_ai.csv contains the upper-triangular part of the AI matrix in column-major order. The AI matrix of, say, iteration 2 can be reconstructed by
library(data.table)
d <- fread("ai_ai.csv")
n <- floor(sqrt(ncol(d) * 2))  # ncol(d) = n*(n+1)/2
ai <- matrix(0, n, n)
ai[upper.tri(ai, diag = TRUE)] <- unlist(d[2, ])  # row 2 = iteration 2
ai[lower.tri(ai)] <- t(ai)[lower.tri(ai)]  # mirror to get the full symmetric matrix
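The record counts stated above suggest how the three files line up: record i of ai_ai.csv and ai_ja.csv belongs to iteration i, while ai_pa.csv is shifted by one because of the starting vector. The sketch below collects everything for one iteration in base R. It assumes comma-separated files without header rows, and it further assumes that record i of ai_pa.csv is the parameter vector entering iteration i and record i + 1 the vector leaving it, which is an interpretation rather than documented behaviour. read_ai_iteration is a hypothetical helper name.

```r
# Sketch only. Assumptions: no header rows, comma-separated files;
# ai_pa.csv record i = parameters entering iteration i (interpretation).
read_ai_iteration <- function(i, dir = ".") {
  rd <- function(f) as.matrix(read.csv(file.path(dir, f), header = FALSE))
  ai <- rd("ai_ai.csv")
  ja <- rd("ai_ja.csv")
  pa <- rd("ai_pa.csv")
  stopifnot(i >= 1, i <= nrow(ai),
            nrow(ja) == nrow(ai),      # one gradient per iteration
            nrow(pa) == nrow(ai) + 1)  # plus the starting vector

  # rebuild the symmetric AI matrix from its column-major upper triangle
  v <- as.numeric(ai[i, ])
  n <- floor(sqrt(length(v) * 2))
  m <- matrix(0, n, n)
  m[upper.tri(m, diag = TRUE)] <- v
  m[lower.tri(m)] <- t(m)[lower.tri(m)]

  list(ai       = m,
       gradient = as.numeric(ja[i, ]),
       before   = as.numeric(pa[i, ]),
       after    = as.numeric(pa[i + 1, ]))
}
```

Calling read_ai_iteration(2) in the directory holding the three files returns the AI matrix, the gradient and both parameter vectors associated with iteration 2.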