Monday, December 23, 2019

Working on an SGE cluster

Using SGE:                                     qconf -help
* list all queue names (-q):                   qconf -sql
* list all parallel environments (-pe):        qconf -spl
* show a PE's allocation rule, e.g.:           qconf -sp mpi
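A minimal SGE job-script sketch using these pieces; the queue name, PE name, and slot count below are placeholders, take real values from the qconf listings above:

#!/bin/bash
#$ -N testjob                 # job name
#$ -q all.q                   # a queue name from qconf -sql (placeholder)
#$ -pe mpi 16                 # a PE from qconf -spl, plus slot count (placeholder)
#$ -cwd                       # run in the submission directory
#$ -j y                       # merge stdout and stderr
mpirun -np $NSLOTS ./a.out    # $NSLOTS is set by SGE to the granted slot count

submit with: qsub job.sh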

* setup modulefiles: edit the file "/home1/p001cao/.bashrc" and add this line:
module use /home1/p001cao/local/1myModfiles
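After re-login (or re-sourcing .bashrc), the personal module tree should be visible:

source /home1/p001cao/.bashrc
module avail                  # modulefiles under 1myModfiles should now be listed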

Ia. OpenMPI-4.0.3 with Intel-xe2019u5 (USC2)

module load intel/compiler-xe19u5
module load compiler/gcc/7.4.0

check: icpc -v

tar xvzf openmpi-4.0.3.tar.gz
cd openmpi-4.0.3
mkdir build_IB
cd build_IB

# install without ucx
../configure CC=icc CXX=icpc FC=ifort F77=ifort \
--with-sge --with-verbs --without-cma --without-ucx \
--prefix=/home1/p001cao/local/app/openmpi/4.0.3-intelxe19u5 
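Configure alone does not build anything; the usual follow-up steps (the -j value is arbitrary):

make -j 8 all
make install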
# install with ucx:
- install UCX (see the build sketch below)
- install Mellanox OFED (needed by UCX)
https://docs.mellanox.com/display/MLNXOFEDv451010/Installing+Mellanox+OFED
https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed
tar xvzf MLNX_OFED_LINUX-4.7-3.2.9.0-rhel6.9-x86_64.tgz
./mlnxofedinstall
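The UCX build itself is not shown in these notes; a minimal sketch, assuming a ucx-1.7.0 release tarball and the same prefix passed to --with-ucx below:

tar xvzf ucx-1.7.0.tar.gz
cd ucx-1.7.0
./contrib/configure-release --prefix=/home1/p001cao/local/app/ucx-1.7
make -j 8
make install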

../configure CC=icc CXX=icpc FC=ifort F77=ifort \
--with-sge --with-verbs --without-cma --enable-mca-no-build=btl-uct \
--with-ucx=/home1/p001cao/local/app/ucx-1.7 \
--with-mxm-libdir=/opt/mellanox/mxm/lib \
--prefix=/home1/p001cao/local/app/openmpi/4.0.3-intelxe19u5_ucx
Note:
* --with-verbs : to use InfiniBand (the older option name is --with-openib)
* for OpenMPI 4.0 or later, UCX is used by default; to run over openib without UCX, must set:
export OMPI_MCA_btl_openib_allow_ib=1
export OMPI_MCA_btl_openib_if_include="mlx4_0:1"
* UCX does not work here, so far; see:
https://github.com/openucx/ucx/wiki/OpenMPI-and-OpenSHMEM-installation-with-UCX
https://developer.arm.com/tools-and-software/server-and-hpc/help/porting-and-tuning/building-open-mpi-with-openucx/single-page
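Whichever build is used, ompi_info can confirm what got compiled in:

ompi_info | grep gridengine   # expect the gridengine ras component (SGE support)
ompi_info | grep openib       # expect the openib btl (verbs/InfiniBand)
ompi_info | grep ucx          # expect the ucx pml when built --with-ucx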

Ib. OpenMPI-3.1.5

module load intel/compiler-xe19u5
module load compiler/gcc/7.4.0

tar xvzf openmpi-3.1.5.tar.gz
cd openmpi-3.1.5
mkdir build
cd build
../configure CC=icc CXX=icpc FC=ifort F77=ifort \
--with-sge --with-verbs  --without-cma \
--prefix=/home1/p001cao/local/app/openmpi/3.1.5-intelxe19u5 
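To make this build loadable like the other tools here, a matching modulefile in 1myModfiles (same pattern as the conda one below) could contain:

set     topdir          /home1/p001cao/local/app/openmpi/3.1.5-intelxe19u5
prepend-path    PATH                    $topdir/bin
prepend-path    LD_LIBRARY_PATH         $topdir/lib
prepend-path    MANPATH                 $topdir/share/man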

Ic. IMPI

# add these variables into qbash (the job script)
For IB:
export I_MPI_FABRICS=shm:dapl
export I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-1

For Ethernet:
export I_MPI_FABRICS=shm:tcp

IMPI-2016:
export I_MPI_FABRICS=shm:dapl
export I_MPI_DAPL_PROVIDER=ofa-v2-ib0
export I_MPI_DYNAMIC_CONNECTION=0

IMPI-2019:
export I_MPI_FABRICS=shm:ofi
export FI_PROVIDER=verbs
export I_MPI_DYNAMIC_CONNECTION=0
###################
The Intel® MPI Library switched from the Open Fabrics Alliance* (OFA) framework to the Open Fabrics Interfaces* (OFI) framework and currently supports libfabric*. 
Since IMPI 2019, support has been discontinued for the following fabrics that could previously be specified by I_MPI_FABRICS:
- TCP
- OFA
- DAPL
- TMI
Currently, IMPI 2019 supports only the OFI (intra-/internode) and SHM (intranode) fabrics. OFI is a framework that has replacements for all previous fabrics. Those replacements are called OFI providers:
- TCP fabric - sockets OFI provider
- OFA and DAPL fabrics - verbs OFI provider
- TMI - psm2 OFI provider
The provider can be specified by FI_PROVIDER='OFI provider name' (e.g. FI_PROVIDER=psm2 to use an Intel OPA fabric; FI_PROVIDER=sockets to use Ethernet or OPA). OFI discovers all available hardware and maps it to an appropriate OFI provider (psm2 - Intel OPA; verbs - IB/OPA/iWARP/RoCE; sockets - Ethernet, or OPA/IB over IPoOPA/IPoIB). The IP interface used by the sockets provider (IPoIB or IPoOPA, e.g. ib0; Ethernet, e.g. eth0) can be selected with FI_SOCKETS_IFACE='IP interface name'.
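For example, to force the sockets provider over the IPoIB interface (the interface name ib0 is an assumption; check with ip addr):

export I_MPI_FABRICS=shm:ofi
export FI_PROVIDER=sockets
export FI_SOCKETS_IFACE=ib0
mpirun -np 4 ./a.out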
https://software.intel.com/en-us/articles/intel-mpi-library-2019-over-libfabric

all MPI variables:
https://software.intel.com/en-us/mpi-developer-reference-linux-communication-fabrics-control


II. Miniconda (USC2)

download:    
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh

1. install:
bash Miniconda3-latest-Linux-x86_64.sh

choose the install folder: /home1/p001cao/local/miniconda3...
run conda init? NO
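Alternatively, the installer runs non-interactively with its standard batch flags (-b accepts the license silently, -p sets the prefix):

bash Miniconda3-latest-Linux-x86_64.sh -b -p /home1/p001cao/local/miniconda3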
2. modules:
set     topdir          /home1/p001cao/local/miniconda3
prepend-path    PATH                    $topdir/bin
prepend-path    LD_LIBRARY_PATH         $topdir/lib
prepend-path    INCLUDE                 $topdir/include
python envs:

module load conda/conda3
conda create -n     py37ompi     python=3.7

set     topdir    /home1/p001cao/local/miniconda3/envs/py37ompi
prepend-path    PATH                    $topdir/bin
prepend-path    LD_LIBRARY_PATH         $topdir/lib
prepend-path    INCLUDE                 $topdir/include
3. install pkgs:
module load conda/conda3
source activate py37ompi

conda install numpy scipy scikit-learn pandas

Voro++, Ovito:
pip install tess ovito

mpi4py with OpenMPI:
conda search -c intel       mpi4py
conda install -c conda-forge mpi4py=3.0.3=py37hd0bea5a_0
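A quick check that mpi4py actually linked against the intended OpenMPI (run with the openmpi module loaded):

python -c "from mpi4py import MPI; print(MPI.Get_library_version())"
mpirun -np 2 python -c "from mpi4py import MPI; print(MPI.COMM_WORLD.Get_rank())"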

google jax:
module load conda/conda3
source activate  py37                    # use for plumed
pip install jax jaxlib
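A quick import check (the default pip jaxlib wheel is CPU-only):

python -c "import jax; print(jax.devices())"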

