# Output files

## Default output files

### General output files

#### lmt.log

General log file, always generated. It provides information on the current state of operation.

### Operation dependent output files

#### Solving the mixed model equation system

##### results.csv

The file contains the solutions for the mixed model equation system. The file has four columns:

- factor name,
- sub-factor name,
- factor level id, and
- solution.

Note that factor names are derived according to the lmt factor naming convention, whereas sub-factor names are user-defined and are extracted from the equation system. Further, the factor level id is the original id as provided by the data, pedigree, etc. For variables undergoing polynomial expansion, e.g. sub-factor c in

y=x*b+age(t(co(p(1,2))))*c+id*u(v(my_var(1)))

the sub-factor name is expanded as well, to name_id, where "name" is the sub-factor name and "id" is the polynomial id. In the above example, c would be expanded to c_1 and c_2.

The file has as many records as the system has equations. Fixed factor levels which have been omitted due to rank deficiencies are not printed.

For a random factor with user-defined name a with several correlated sub-factors one can recover the factor level effect matrices with the R code

```
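library(data.table)
results<-fread("results.csv") # V1..V4: factor name, sub-factor name, level id, solution
a<-results[V1=="a"]           # records of random factor "a"
data<-matrix(a$V4,ncol=length(unique(a$V2)),byrow=FALSE) # one column per sub-factor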
```
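The reshape above fills the matrix column by column, so it relies on the records of factor a being ordered by sub-factor with the same level order within each sub-factor. As a hedged alternative that does not depend on record order, the following sketch reshapes by level id using `dcast` from data.table. The toy table stands in for results.csv, its values are made up for illustration only, and the names `wide` and `effects` are hypothetical.

```r
library(data.table)

# Toy stand-in for results.csv (made-up values, for illustration only).
# V1: factor name, V2: sub-factor name, V3: factor level id, V4: solution
results <- data.table(
  V1 = c("a", "a", "a", "a", "b"),
  V2 = c("u1", "u1", "u2", "u2", "x"),
  V3 = c("id1", "id2", "id1", "id2", "1"),
  V4 = c(0.1, 0.2, 0.3, 0.4, 5.0)
)

# One row per factor level, one column per sub-factor,
# matched on the level id rather than on record order.
wide <- dcast(results[V1 == "a"], V3 ~ V2, value.var = "V4")
effects <- as.matrix(wide[, -1])
rownames(effects) <- wide$V3
```

Rows of `effects` are then labelled with the original level ids and columns with the sub-factor names.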

##### PCG solver output files

###### so_conv.csv

Contains the iteration statistics with columns

- iteration number
- alpha
- beta
- $$||r_i M r_i||$$, where $$M$$ is the preconditioner matrix and $$r_i=Cx_i-b$$ for a system $$Cx=b$$
- convergence criterion CD
- convergence criterion CR
- seconds per iteration

#### Variance component estimation using Gibbs sampling

##### <u.d. variance name>_sigma_SA.csv

The file contains the column-wise upper-triangular elements of the respective $$\Sigma$$ matrix sampled in each round, with one row per round. Posterior means and standard deviations for a $$\Sigma$$ matrix that is part of the variance structure named g, assuming 10,000 samples, a burn-in of 1,000 samples and a thinning interval of 50 samples, can be obtained by

```
library(data.table)
d<-as.matrix(fread("g_sigma_SA.csv"))
n<-floor(sqrt(ncol(d)*2))
gMean<-matrix(0,n,n);gSd<-gMean
# discard the first 1,000 samples (burn-in), then keep every 50th sample
keep<-seq(1001,nrow(d),50)
gMean[upper.tri(gMean,diag=TRUE)]<-colMeans(d[keep,,drop=FALSE])
gSd[upper.tri(gSd,diag=TRUE)]<-apply(d[keep,,drop=FALSE],2,sd)
```
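The dimension recovery in the code above rests on a simple count, sketched here for reference. The symbols $$m$$, $$n$$, $$i$$, $$j$$ and $$k$$ are introduced for illustration only. A symmetric $$n \times n$$ matrix has $$m=n(n+1)/2$$ elements in its upper triangle including the diagonal, where $$m$$ is the number of columns in the file. Hence

$$2m=n^2+n, \qquad n^2 \le 2m < (n+1)^2 \quad \Rightarrow \quad n=\lfloor \sqrt{2m} \rfloor$$

Assuming the column-wise order described above, element $$(i,j)$$ with $$i \le j$$ is stored in column

$$k=\frac{j(j-1)}{2}+i$$

For $$n=3$$ the columns therefore hold the elements $$(1,1),(1,2),(2,2),(1,3),(2,3),(3,3)$$ in that order, which is also the order in which R fills the positions selected by upper.tri.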

#### Variance component estimation using MC-EM-REML

##### mcemreml_conv.csv

Contains the iteration statistics with one parameter vector per iteration. The parameter vector contains the following elements:

- seconds for the last MC-EM iteration
- seconds for solving the equation system
- seconds for sampling the traces
- convergence criterion **cd**

##### <u.d. variance name>_sigma_SA.csv

The file contains the column-wise upper-triangular elements of the respective $$\Sigma$$ matrix estimated in each round. The file contains as many rows as MC-EM-REML iterations. Estimates at convergence for a $$\Sigma$$ matrix that is part of the variance structure named g may be read into R by

```
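library(data.table)
d<-as.matrix(fread("g_sigma_SA.csv"))
n<-floor(sqrt(ncol(d)*2))
# the last row holds the estimates at convergence; the lower triangle of g stays zero
g<-matrix(0,n,n);g[upper.tri(g,diag=TRUE)]<-d[nrow(d),]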
```

#### Variance component estimation using AI-REML-C

##### aic_conv.csv

Contains the iteration statistics with one parameter vector per iteration. The parameter vector contains the following elements:

- iteration number
- convergence criterion **ng**
- convergence criterion **cd**
- convergence criterion **ll**
- log-likelihood
- $$\beta$$
- number of Levenberg-Marquardt iterations
- seconds for the last AI-REML iteration
- $$\alpha$$

##### <u.d. variance name>_sigma_UPDATE.csv

The file contains the column-wise upper-triangular elements of the respective $$\Sigma$$ matrix estimated in each round. The file contains as many rows as AI-REML-C iterations plus one. The second-to-last row contains the covariance estimates at convergence. The last row contains the approximate standard errors of the parameter estimates. Estimates at convergence for a $$\Sigma$$ matrix that is part of the variance structure named g may be read into R by

```
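library(data.table)
d<-as.matrix(fread("g_sigma_UPDATE.csv"))
n<-floor(sqrt(ncol(d)*2))
# the second-to-last row holds the estimates at convergence; the lower triangle of g stays zero
g<-matrix(0,n,n);g[upper.tri(g,diag=TRUE)]<-d[nrow(d)-1,]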
```

Note that when restarted, **lmt** does not overwrite this file. Instead, it appends records.
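As a minimal sketch of how the last two rows of this file could be turned into full symmetric matrices of estimates and approximate standard errors, consider the following. The helper name `unpack_upper` is hypothetical and not part of lmt, and the small matrix `d` is a made-up stand-in for the numeric content of g_sigma_UPDATE.csv.

```r
# Hypothetical helper: rebuild a symmetric matrix from its column-wise
# upper-triangular elements (diagonal included).
unpack_upper <- function(v) {
  v <- as.numeric(v)
  n <- floor(sqrt(length(v) * 2))
  stopifnot(n * (n + 1) / 2 == length(v))  # length must be a triangular number
  m <- matrix(0, n, n)
  m[upper.tri(m, diag = TRUE)] <- v
  m[lower.tri(m)] <- t(m)[lower.tri(m)]    # mirror the upper triangle
  m
}

# Made-up stand-in for the file content: two iterations followed by
# the row of approximate standard errors.
d <- rbind(c(1.0, 0.2, 2.0),
           c(1.1, 0.3, 2.1),
           c(0.05, 0.02, 0.08))

g_hat <- unpack_upper(d[nrow(d) - 1, ])  # estimates at convergence
g_se  <- unpack_upper(d[nrow(d), ])      # approximate standard errors
```

With a real file, `d` would be obtained via `as.matrix(fread(...))` as in the code above. The length check guards against rows whose column count does not match an upper triangle.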

### Intermediate output files

#### Files from processing pedigrees

##### "u.d. pedigree name"_sorted.csv

File contains a 3-column (ordinary pedigree) or 4-column (probabilistic pedigree) matrix holding the sorted and renumbered pedigree generated from the original pedigree.

##### "u.d. pedigree name"_crossref.csv

File contains a vector of original ids of individuals in **"u.d. pedigree name"_sorted.csv**. That is, the original id of individual #1 in **"u.d. pedigree name"_sorted.csv** is located in record 1 of this file, etc.

##### "u.d. pedigree name".bin

Block file in binary format containing four blocks:

- a: real vector of diagonal elements of $$A$$
- ai: real vector of diagonal elements of $$A^{-1}$$
- m: real vector of Mendelian sampling terms
- pe: 3 column integer matrix of the sorted and renumbered pedigree underlying $$A$$

## Requested output files

### Files from processing pedigrees

#### Genetic group regression matrix

lmt can write the genetic group regression matrix (aka $$Q$$) of a pedigree containing phantom parents to a user-defined file. For the necessary key string see here.

A $$Q$$ matrix written to the file Q.coocsv may be read into R by

```
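library(data.table)
dims<-scan("Q.coocsv",n=2,sep=",") # first line: number of rows and columns
Q<-matrix(0,dims[1],dims[2])
dat<-fread("Q.coocsv",skip=1)      # remaining lines: row, column, value
Q[cbind(dat$V1,dat$V2)]<-dat$V3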
```
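For large pedigrees a dense $$Q$$ may not fit comfortably in memory. The following sketch stores it as a sparse matrix using `sparseMatrix` from the Matrix package instead. It assumes the file layout implied by the code above, namely a first line with the two dimensions followed by row, column, value triplets. The toy file and its values are made up for illustration only.

```r
library(data.table)
library(Matrix)

# Toy file in the assumed layout (made-up values): dimensions on the
# first line, then one row,column,value triplet per line.
f <- tempfile(fileext = ".coocsv")
writeLines(c("3,2", "1,1,1.0", "2,1,0.5", "2,2,0.5", "3,2,1.0"), f)

dims <- as.integer(scan(f, n = 2, sep = ",", quiet = TRUE))
dat <- fread(f, skip = 1, header = FALSE)

# Only the listed elements are stored; all other elements are implicit zeros.
Q <- sparseMatrix(i = dat$V1, j = dat$V2, x = dat$V3, dims = dims)
```

For a real file, replace `f` with the path of the written matrix, e.g. "Q.coocsv".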

### Files from building GRMs

#### Genomic relationship matrix

lmt can write a genomic relationship matrix to a user-nominated file after it has been constructed. Supported output file formats are csc and bin, where the latter denotes a block file in binary format. In both cases, only the upper triangle, in column-major order, is written out. The matrix may be reconstructed in R by

```
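d<-scan("mygrm.csv",sep=",") # upper triangle incl. diagonal, column-major order
n<-floor(sqrt(length(d)*2))
G<-matrix(0,n,n)
G[upper.tri(G,diag=TRUE)]<-d
G[lower.tri(G)]<-t(G)[lower.tri(G)] # mirror to obtain the full symmetric matrix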
```

### Files from running AI-REML jobs

#### AI matrix, gradient vector and parameter vector

Upon provision of the respective switch in the instruction file, **lmt** writes the AI matrix, gradient vector and parameter vector to files ai_ai.csv, ai_ja.csv and ai_pa.csv, respectively. Files ai_ai.csv and ai_ja.csv contain as many records as AI-REML iterations. File ai_pa.csv contains as many records as AI-REML iterations plus one, where the first record is the parameter vector at the start.

Each row of file ai_ai.csv contains the upper triangle of the AI matrix in column-major order. The AI matrix of, say, iteration 2 can be reconstructed by

```
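library(data.table)
d<-as.matrix(fread("ai_ai.csv"))
n<-floor(sqrt(ncol(d)*2))
# row 2 holds the AI matrix of iteration 2; the lower triangle of ai stays zero
ai<-matrix(0,n,n);ai[upper.tri(ai,diag=TRUE)]<-d[2,]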
```