Run It
{{Colored box|title=.bashrc example|content=
ulimit -s unlimited
export OMP_NUM_THREADS=16
export OMP_STACKSIZE=2000M
export OMP_DYNAMIC=false
export OMP_PLACES=cores
export OMP_PROC_BIND=true
export OMP_NESTED=true
export KMP_AFFINITY=granularity=core,scatter
}}
Revision as of 10:58, 29 November 2020
lmt is developed for Linux operating systems on computers with an Intel architecture. Using lmt on an AMD architecture will therefore result in increased run time. For executables running on Windows or Mac, please contact the author.
lmt requires that the stack size limit and some environment variables are set to specific values. lmt checks those settings at startup and stops if they are wrong.
- stack size must be set to unlimited via ulimit -s unlimited
- OMP_DYNAMIC=FALSE
- OMP_STACKSIZE set to a reasonable value (e.g. 2000M)
- OMP_NESTED=TRUE
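The requirements above can be verified before a run with a small pre-flight check. The following is a minimal sketch and not part of lmt: the function name check_lmt_env is hypothetical, and the case-insensitive comparison is an assumption (lmt itself may be stricter about spelling).

```shell
#!/usr/bin/env bash
# Pre-flight check for the settings listed above (a sketch, not part of lmt).
# check_lmt_env is a hypothetical helper name.

check_lmt_env() {
    local problems=0 value

    # The stack size limit of the current shell must be "unlimited".
    if [ "$(ulimit -s)" != "unlimited" ]; then
        echo "stack size limit is $(ulimit -s); run: ulimit -s unlimited"
        problems=1
    fi

    # OMP_DYNAMIC must be FALSE (compared case-insensitively here, an assumption).
    value=$(printf '%s' "${OMP_DYNAMIC:-}" | tr '[:lower:]' '[:upper:]')
    if [ "$value" != "FALSE" ]; then
        echo "OMP_DYNAMIC is '${OMP_DYNAMIC:-}', expected FALSE"
        problems=1
    fi

    # OMP_NESTED must be TRUE (same case-insensitive comparison).
    value=$(printf '%s' "${OMP_NESTED:-}" | tr '[:lower:]' '[:upper:]')
    if [ "$value" != "TRUE" ]; then
        echo "OMP_NESTED is '${OMP_NESTED:-}', expected TRUE"
        problems=1
    fi

    # OMP_STACKSIZE only needs to be set; 2000M is the value suggested above.
    if [ -z "${OMP_STACKSIZE:-}" ]; then
        echo "OMP_STACKSIZE is not set (for example 2000M)"
        problems=1
    fi

    # Exit status 0 means all checks passed, 1 means at least one failed.
    return "$problems"
}
```

Calling check_lmt_env in the shell that will start lmt prints one line per problem and returns a non-zero status if anything is missing.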
lmt is highly threaded and will try to use all available computing resources. While this is desirable for very large models, it can hamper performance when crunching medium to small data sets, resulting in an increased run time. What "medium to small" means depends on the actual computer and must therefore be determined by the user. The user can set the lmt threading behaviour via the two environment variables OMP_NUM_THREADS and MKL_NUM_THREADS.
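As an illustration, both variables can be capped with one small helper before starting a smaller run. This is a sketch under assumptions: the function name set_lmt_threads is hypothetical, and using the same value for both variables is a starting point rather than a recommendation from the lmt author.

```shell
# Cap the number of threads for a small or medium-sized run (sketch).
# set_lmt_threads is a hypothetical helper name.
set_lmt_threads() {
    local n="${1:-}"

    # Reject anything that is not a positive integer.
    case "$n" in
        ''|*[!0-9]*|0)
            echo "usage: set_lmt_threads <positive integer>" >&2
            return 1
            ;;
    esac

    # Assumption: both variables get the same value.
    export OMP_NUM_THREADS="$n"
    export MKL_NUM_THREADS="$n"
}
```

For example, set_lmt_threads 4 limits both variables to four threads for every program started afterwards from the same shell.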
The performance of multi-threaded programs can also be heavily affected by thread affinity settings, and the optimal settings must be found by trial and error. As a starting point, however, it is advisable to set the environment variable KMP_AFFINITY=granularity=core,scatter.
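The trial-and-error search can be partly automated by timing the same command under several affinity settings. The sketch below is illustrative only: try_affinities is a hypothetical helper, the second candidate (compact) is simply another valid KMP_AFFINITY type chosen for comparison, and the whole-second resolution of the timing is only useful for runs that take a while.

```shell
# Time one command under several KMP_AFFINITY candidates (sketch).
# try_affinities is a hypothetical helper; pass your usual lmt command line to it.
try_affinities() {
    local setting start

    for setting in "granularity=core,scatter" "granularity=core,compact"; do
        start=$(date +%s)

        # Run the given command with the candidate setting; discard its stdout.
        KMP_AFFINITY="$setting" "$@" > /dev/null

        # Report the elapsed wall-clock time in whole seconds.
        echo "$setting: $(( $(date +%s) - start )) s"
    done
}
```

The helper prints one line per candidate, so the faster setting for a given model and machine can be read off directly.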
If you run several instances of lmt at the same time on the same computer, you may experience a severe drop in the performance of each run. This is usually caused by competition for resources, and parallel runs of several lmt instances should therefore be avoided. However, the numactl utility may provide a way to avoid such performance drops. Please check out numactl --help for further reading.
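One way to use numactl is to bind each instance to its own NUMA node, so that concurrent runs do not compete for the same cores and memory. The following is a minimal sketch, assuming a machine with at least two NUMA nodes (numactl --hardware lists them). The helper name run_on_node is hypothetical, and whether this removes the performance drop on a given machine has to be tested.

```shell
# Run a command with its CPUs and memory bound to one NUMA node (sketch).
# run_on_node is a hypothetical helper name.
run_on_node() {
    local node="$1"
    shift

    # --cpunodebind restricts execution to the CPUs of the given node,
    # --membind restricts memory allocation to the same node.
    numactl --cpunodebind="$node" --membind="$node" "$@"
}

# Usage (placeholders, replace with your actual lmt command lines):
#   run_on_node 0 <first lmt command> &
#   run_on_node 1 <second lmt command> &
#   wait
```

When instances are pinned like this, it is probably also sensible to limit OMP_NUM_THREADS and MKL_NUM_THREADS for each instance to the number of cores on its node.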