Prerequisites
- MPI (either OpenMPI or Intel-MPI)
- GCC or Intel Fortran compiler
- (optional) Parallel HDF5 (compiled with either OpenMPI or Intel-MPI)
All of these can be installed on a local Linux machine (e.g., with the apt package manager): sudo apt install build-essential libopenmpi-dev (for gcc and openmpi), and sudo apt install libhdf5-openmpi-dev hdf5-tools (for hdf5).
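To verify that the toolchain is picked up correctly, you can run a quick (optional) sanity check:
# check the MPI compiler wrapper and launcher
mpif90 --version
mpirun --version
# check the HDF5 command-line tools (only needed if hdf5 output is used)
h5dump --version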
Configuring, making and running
# to view all configuration options
python3 configure.py --help
# to configure the code (example)
python3 configure.py -mpi08 -hdf5 --user=user_2d_rec -2d
# compile and link (-j compiles in parallel which is much faster)
make all -j
# run the code (on clusters you will typically need `srun` instead of `mpirun`)
mpirun -np [NCORES] ./bin/tristan-mp2d -i [input_file_name] -o [output_dir_name] -r [restart_dir_name]
# clean up the compiled files
make clean
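As a concrete example, after the configuration above a local run on 4 cores could look like this (the directory and file names are placeholders, matching the defaults used in the slurm script further below):
# run the freshly compiled 2d executable on 4 cores
mpirun -np 4 ./bin/tristan-mp2d -i input -o output -r restart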
Always run make clean before recompiling the code.

Docker
Another way to avoid the tedium of installing libraries (especially for local development) is to use Docker containers. This approach allows you to quickly create an isolated Linux environment with all the necessary packages preinstalled (similar to a VM, but much lighter). Best of all, VSCode can natively attach to a running container, and all the development can be done there (make sure to install the appropriate extension).
To get started with this approach, make sure to install Docker (as well as Docker Compose), then simply follow these steps:
# from the tristan root directory
cd docker
# launch the container in the background (first-time run might take a few mins)
docker-compose up -d
# to force a rebuild of the container, add the `--build` flag
docker-compose up -d --build
# ensure the container is running
docker ps
Then you can attach to the container via VSCode, or if you prefer the terminal, simply attach to the running container by doing:
docker exec -it trv2 zsh
# then the code will be in the `/home/$USER/tristan-v2` directory
cd /home/$USER/tristan-v2
To stop the container, run docker-compose stop. To stop and delete the container, run docker-compose down from the same docker/ directory.
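For reference, the corresponding commands are:
# stop the container (it can be brought back up later with `docker-compose up -d`)
docker-compose stop
# stop and delete the container
docker-compose down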
The code itself is kept in the container's /root/tristan-v2 directory; any changes to the rest of the container's filesystem are discarded when the container is deleted (either via docker-compose down or directly docker rm <CONTAINER>).

Cluster specific customization
The code can be configured for a specific cluster using the presets saved in the configure.py file. For example, to enable all the Perseus-specific configurations (Princeton University), include the --cluster=perseus flag when calling python configure.py. This will automatically enable the new MPI version, the ifport library, intel compilers and the cluster-specific vectorization flags.
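For example, a hypothetical invocation combining the cluster preset with the example flags used earlier (whether the preset already implies some of these flags is cluster-dependent):
python3 configure.py --cluster=perseus -hdf5 --user=user_2d_rec -2d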
Most clusters use the slurm scheduling system, where jobs are submitted using a submission script. Here is a best-practice example of such a script:
#!/bin/bash
#SBATCH -J myjobname
#SBATCH -n 64
#SBATCH -t 01:00:00
# specify all the variables
EXECUTABLE=tristan-mp2d
INPUT=input
OUTPUT_DIR=output
SLICE_DIR=slices
RESTART_DIR=restart
REPORT_FILE=report
ERROR_FILE=error
# here you'll need to include cluster specific modules ...
# ... these are the ones used on `perseus`
module load intel-mkl/2019.3/3/64
module load intel/19.0/64/19.0.3.199
module load intel-mpi/intel/2018.3/64
module load hdf5/intel-16.0/intel-mpi/1.8.16
# create the output directory
mkdir $OUTPUT_DIR
# backup the executable and the input file
cp $EXECUTABLE $OUTPUT_DIR
cp $INPUT $OUTPUT_DIR
srun $EXECUTABLE -i $INPUT -o $OUTPUT_DIR -s $SLICE_DIR -r $RESTART_DIR > $OUTPUT_DIR/$REPORT_FILE 2> $OUTPUT_DIR/$ERROR_FILE
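The script is then submitted to the scheduler; the file name submit.sh below is just a placeholder:
# submit the job and inspect its status in the queue
sbatch submit.sh
squeue -u $USER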
Stellar (PU)
Modules to load:
1) intel-rt/2021.1.2
2) intel-tbb/2021.1.1
3) intel-mkl/2021.1.1
4) intel-debugger/10.0.0
5) intel-dpl/2021.1.2
6) /opt/intel/oneapi/compiler/2021.1.2/linux/lib/oclfpga/modulefiles/oclfpga
7) intel/2021.1.2
8) ucx/1.9.0
9) intel-mpi/intel/2021.1.1
10) hdf5/intel-2021.1/intel-mpi/1.10.6
11) anaconda3/2020.11
Simply loading the following modules will automatically load all the others:
intel/2021.1.2
intel-mpi/intel/2021.1.1
hdf5/intel-2021.1/intel-mpi/1.10.6
anaconda3/2020.11 # <- used when configuring with python
You can make an alias for simplicity and put it in your .zshrc or .bashrc:
alias tristan_modules='module purge; module load intel/2021.1.2 intel-mpi/intel/2021.1.1 hdf5/intel-2021.1/intel-mpi/1.10.6 anaconda3/2020.11'
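A typical session on stellar then loads the modules via the alias before configuring and compiling; the flags below are the generic example from earlier, not a stellar-specific preset:
tristan_modules
python3 configure.py -mpi08 -hdf5 --user=user_2d_rec -2d
make all -j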
Frontera
Modules:
1) intel/18.0.5 2) impi/18.0.5 3) phdf5/1.10.4 4) python3/3.7.0
Useful alias:
alias tristan_modules="module purge; module load intel/18.0.5; module load impi/18.0.5; module load phdf5/1.10.4; module load python3/3.7.0"
Helios (IAS)
Before running the code do the following:
# if running on a single node:
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
# if running on multiple nodes:
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi2.so
export UCX_TLS=all
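These exports should be set in the shell (or in the submission script) before launching the code; as a rough sketch, a multi-node run could then look like the following (the core count and file names are placeholders):
# example multi-node launch (placeholder values)
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi2.so
export UCX_TLS=all
srun -n 128 ./bin/tristan-mp2d -i input -o output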