OpenMPI on SubMIT¶
The Message Passing Interface (MPI) is a standardized and widely used communication protocol designed for parallel computing in distributed-memory systems. It enables processes running on different nodes of a cluster to exchange data efficiently, making it essential for high-performance computing (HPC) applications. OpenMPI is a popular open-source implementation of the MPI standard, offering high flexibility, scalability, and performance optimizations. Developed collaboratively by the HPC community, OpenMPI supports multiple network interfaces and integrates seamlessly with modern supercomputing environments. Its modular architecture allows users to tailor configurations for specific hardware, making it a preferred choice for researchers and engineers running large-scale simulations, numerical computations, and machine learning workloads.
Here we briefly introduce the OpenMPI installation on SubMIT and provide some examples of running your program with it.
Centralized installation of OpenMPI on SubMIT¶
We provide a centralized OpenMPI installation through the module
system. One can load the OpenMPI module using:
module load mpi
You can check that it loaded successfully by running:
module list
mpirun --version
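If the module loaded correctly, the MPI compiler wrappers should also be on your PATH; a quick additional check is:
which mpicc
mpicc --version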
Running MPI Programs in C/C++¶
Here is a simple example of an MPI program in C (mpi_hello.c):
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);                /* initialize the MPI environment */

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

    printf("Hello from process %d out of %d\n", rank, size);

    MPI_Finalize();                        /* clean up the MPI environment */
    return 0;
}
Compile the code using:
mpicc -o mpi_hello mpi_hello.c # For C
mpicxx -o mpi_hello mpi_hello.cpp # For C++
Run the program with:
mpirun -np 4 ./mpi_hello
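With four ranks, the program should print one line per process (the ordering is not deterministic), along the lines of:
Hello from process 0 out of 4
Hello from process 2 out of 4
Hello from process 1 out of 4
Hello from process 3 out of 4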
Running MPI Programs in Python¶
To use OpenMPI with Python, you can install mpi4py or an equivalent Python package. We recommend installing it with conda using the following command:
conda install -c conda-forge mpi4py openmpi=4.1.*=external_*
The last part of the command specifies that the system-provided OpenMPI libraries should be used. Otherwise, conda will install its own OpenMPI build or another MPI distribution (which usually works as well, but there is no guarantee that it is as well optimized as the system-provided one).
Please also install the ucx library (as the necessary point-to-point messaging layer):
conda install -c conda-forge ucx
Here is a simple Python MPI example (mpi_example.py):
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
print(f"Hello from process {rank} out of {size}")
Run the script using:
mpirun -np 4 python mpi_example.py
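The example above only prints each rank. As a minimal sketch of actual communication, the following script (the file name mpi_sendrecv.py is just an illustration) sends a Python object from rank 0 to rank 1 using mpi4py's pickle-based send/recv:
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    data = {"step": 1, "value": 3.14}
    comm.send(data, dest=1, tag=11)     # pickle-based send of a generic Python object
    print(f"Rank 0 sent {data}")
elif rank == 1:
    data = comm.recv(source=0, tag=11)  # blocking receive from rank 0
    print(f"Rank 1 received {data}")
Run it with at least two processes, e.g. mpirun -np 2 python mpi_sendrecv.py.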
Submitting Jobs to Slurm¶
An example submission script to Slurm is:
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --nodes=3
#SBATCH --ntasks=12
#SBATCH --cpus-per-task=1
#SBATCH --time=00:20:00
#SBATCH --mem-per-cpu=100
module load mpi
mpirun -np $SLURM_NTASKS ./my_program
Here we request 3 nodes with 12 tasks in total. Each task gets 1 CPU, so we are only doing multi-processing here, not multi-threading; a hybrid MPI + OpenMP variant is sketched below.
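If your program is itself multi-threaded (for example with OpenMP), you can give each task several CPUs instead. A minimal sketch, assuming my_program reads OMP_NUM_THREADS to set its thread count:
#!/bin/bash
#SBATCH --job-name=test-hybrid
#SBATCH --nodes=3
#SBATCH --ntasks=12
#SBATCH --cpus-per-task=4
#SBATCH --time=00:20:00
#SBATCH --mem-per-cpu=100

module load mpi
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
mpirun -np $SLURM_NTASKS ./my_program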
More Testing Scripts for OpenMPI¶
More testing scripts for OpenMPI can be found at:
https://github.com/mit-submit/submit-examples/blob/main/openmpi/
To use them, load the OpenMPI module on SubMIT and run make to compile the code.
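For example, assuming you clone the repository into your work area:
git clone https://github.com/mit-submit/submit-examples.git
cd submit-examples/openmpi
module load mpi
make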
OpenMPI in containers when using HTCondor¶
This part walks you through running OpenMPI applications on an HTCondor-managed external cluster.
Unfortunately, in this case we cannot use the centralized OpenMPI installation, since that is only available on the SubMIT machines. Here we show how to run some testing scripts as an example.
We first need to create a container image that includes OpenMPI and the necessary libraries. Our mpi.def contains:
Bootstrap: docker
From: centos:7

%post
    # Redirect repos to vault.centos.org
    sed -i 's|^mirrorlist=|#mirrorlist=|g' /etc/yum.repos.d/CentOS-Base.repo
    sed -i 's|^#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-Base.repo
    yum clean all
    yum install -y openmpi openmpi-devel hwloc numactl

    export CFLAGS="-march=core2 -mtune=generic -mno-avx -mno-avx2 -mno-sse4"
    /usr/lib64/openmpi/bin/mpicc $CFLAGS -o /usr/local/bin/hello_c /hello_c.c
    /usr/lib64/openmpi/bin/mpicc $CFLAGS -o /usr/local/bin/ring_c /ring_c.c

%files
    hello_c.c /hello_c.c
    ring_c.c /ring_c.c

%environment
    export PATH=/usr/lib64/openmpi/bin:$PATH
    export LD_LIBRARY_PATH=/usr/lib64/openmpi/lib:$LD_LIBRARY_PATH
You can find the hello_c.c and ring_c.c files in the repository linked in the "More Testing Scripts for OpenMPI" section above; copy them into your current directory. Here we use centos:7 as the base image to ensure compatibility with the HTCondor-managed external cluster. We then build the container image via:
singularity build mpi.sif mpi.def
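Before submitting, you can do a quick local sanity check of the image (assuming singularity is available on the machine where you built it); the paths come from the mpi.def above:
singularity exec mpi.sif /usr/lib64/openmpi/bin/mpirun --version
singularity exec mpi.sif /usr/lib64/openmpi/bin/mpirun -n 2 /usr/local/bin/hello_c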
Then we create a sample HTCondor submit file (condor.sub):
universe = vanilla
request_disk = 1024
executable = job.sh
transfer_input_files = mpi.sif, job.sh
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
request_cpus = 2
output = test.out
error = test.err
log = test.log
+DESIRED_Sites = "mit_tier3"
queue 1
And a corresponding job.sh:
#!/bin/bash
# Create a per-job temporary directory
export TMPDIR=$(mktemp -d /tmp/openmpi.XXXXXX)
# use container
singularity exec mpi.sif /usr/lib64/openmpi/bin/mpirun -n 2 /usr/local/bin/hello_c
singularity exec mpi.sif /usr/lib64/openmpi/bin/mpirun -n 2 /usr/local/bin/ring_c
rm -rf "$TMPDIR"
We can then submit the job via:
condor_submit condor.sub
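While the job is in the queue you can check on it with the standard HTCondor tools, for example:
condor_q
condor_q -better-analyze JOB_ID   # JOB_ID is a placeholder; explains why a job stays idle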
The output in test.out should look like:
Hello, world, I am 0 of 2, (Open MPI v1.10.7, package: Open MPI mockbuild@x86-02.bsys.centos.org Distribution, ident: 1.10.7, repo rev: v1.10.6-48-g5e373bf, May 16, 2017, 142)
Hello, world, I am 1 of 2, (Open MPI v1.10.7, package: Open MPI mockbuild@x86-02.bsys.centos.org Distribution, ident: 1.10.7, repo rev: v1.10.6-48-g5e373bf, May 16, 2017, 142)
Process 0 sending 10 to 1, tag 201 (2 processes in ring)
Process 0 sent to 1
Process 0 decremented value: 9
Process 0 decremented value: 8
Process 0 decremented value: 7
Process 0 decremented value: 6
Process 0 decremented value: 5
Process 0 decremented value: 4
Process 0 decremented value: 3
Process 0 decremented value: 2
Process 0 decremented value: 1
Process 0 decremented value: 0
Process 0 exiting
Process 1 exiting
Note that warnings are expected in general, as the OpenMPI installed in the container is usually not optimized for the hardware of the external cluster. The most common one complains about the lack of OpenFabrics support. You can try to suppress it by adding --mca btl ^openib,ofi to the mpirun commands in the job.sh file, but this is not strictly necessary.
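With that option, the hello_c line in job.sh would read, for example:
singularity exec mpi.sif /usr/lib64/openmpi/bin/mpirun --mca btl ^openib,ofi -n 2 /usr/local/bin/hello_c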