Parallel Analog Ensemble (PAnEn) generates accurate forecast ensembles from a single deterministic model simulation and historical observations. The technique was introduced by Luca Delle Monache et al. in the paper Probabilistic Weather Prediction with an Analog Ensemble. Developed and maintained by GEOlab at Penn State, PAnEn aims to provide an efficient implementation of this technique with user-friendly interfaces in R and C++ for researchers who want to apply it in their own work.
The easiest way to use this package is to install the R package, `RAnEn`. C++ libraries are also available, but they are designed for intermediate users who need higher performance. For installation guidance, please refer to the installation section.
To cite this package, you have several options:
LaTeX: Please use this file for citation.
R: Simply type `citation('RAnEn')` and the citation message will be printed.
Weiming Hu, Guido Cervone, Laura Clemente-Harding, and Martina Calovi. (2019). Parallel Analog Ensemble. Zenodo. http://doi.org/10.5281/zenodo.3384321
RAnEn is very easy to install if you are already using R. This is the recommended way to start.
The command is the same for `RAnEn` installation and update. Before installing `RAnEn`, please install its dependency packages first. On Windows, please also install the latest version of Rtools. The following R command installs the latest `RAnEn`:

```r
install.packages("https://github.com/Weiming-Hu/AnalogsEnsemble/raw/master/RAnalogs/releases/RAnEn_latest.tar.gz", repos = NULL)
```
That’s it. You are good to go. Please refer to the tutorials or the R documentation to learn more about using `RAnEn`. You might also want to install the `RAnEnExtra` package, which provides functions for visualization and verification. After the `RAnEn` installation, you can run a quick check as shown below.
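For example, loading the package once from a shell is a minimal sanity check (the `Rscript` one-liner is a suggestion on my part; any R session works equally well):

```bash
# Load the package once; an error here indicates a failed installation
Rscript -e "library(RAnEn)"
```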
Mac users: if the package shows that OpenMP is not supported, you can do one of the following:
Create `~/.R/Makevars` if you do not have it already and add the following content to it. Of course, change the compilers to what you have. If you do not have any alternative compilers other than Clang, Homebrew is your friend.
```
CC=gcc-8
CXX=g++-8
CXX1X=g++-8
CXX14=g++-8
```
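One way to create the file from a shell is sketched below, assuming the GCC 8 compilers named above; adjust the names to whatever your package manager installed:

```bash
# Append the compiler overrides to ~/.R/Makevars, creating the directory if needed
mkdir -p ~/.R
cat >> ~/.R/Makevars <<'EOF'
CC=gcc-8
CXX=g++-8
CXX1X=g++-8
CXX14=g++-8
EOF
```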
Alternatively, follow the instructions from `data.table`. They provide similar solutions but stick with Clang compilers.
After the installation, you can always revert back to your original setup, and `RAnEn` will keep its OpenMP support.
```bash
# Download and run the docker image within docker
docker container run -it weiminghu123/panen:default

# Run the docker image with a local folder mounted inside the image
docker container run -it -v ~/Desktop:/Desktop weiminghu123/panen:default

# Download and run the docker image within singularity
singularity run docker://weiminghu123/panen:default
```
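If you also want a local folder inside the Singularity container, the bind flag is the analogue of docker's `-v`; the mount points below simply mirror the docker example:

```bash
# Mount ~/Desktop into the container at /Desktop
singularity run -B ~/Desktop:/Desktop docker://weiminghu123/panen:default
```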
To install the C++ libraries, please check the following dependencies.
- `Boost` is a very large library. If you don’t want to install the entire package, `PAnEn` is able to build the required components automatically.
- `CppUnit` provides test frameworks. If `CppUnit` is found in the system, test programs will be compiled.
To set up the dependencies, it is recommended to use conda. I chose miniconda instead of anaconda simply because miniconda is the light-weight version. If you already have anaconda, you are fine as well. The following code sets up the environment from scratch:
```bash
# Python version is required because of boost compatibility issues
conda create -n venv_anen python==3.8 -y

# Keep your environment activated during the entire installation process, including CAnEn
conda activate venv_anen

# Required dependencies
conda install -c anaconda cmake boost -y
conda install -c conda-forge netcdf-cxx4 eccodes doxygen -y

# Optional dependency: LibTorch
# If you need libTorch, please go ahead to https://pytorch.org/get-started/locally/ and select
# Stable -> [Your OS] -> LibTorch -> C++/Java -> [Compute Platform] -> cxx11 ABI version
#
# Please see https://github.com/Weiming-Hu/AnalogsEnsemble/issues/86#issuecomment-1047442579 for instructions
# on how to include libTorch during the cmake process.

# Optional dependency: MPI
conda install -c conda-forge openmpi -y
```
After the dependencies are installed, let’s build and install the C++ libraries:
```bash
# Download the source files (~10 Mb)
wget https://github.com/Weiming-Hu/AnalogsEnsemble/archive/master.zip

# Unzip
unzip master.zip

# Create a separate folder to store all intermediate files during the installation process
cd AnalogsEnsemble-master/
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=~/AnalogEnsemble ..

# Compile
make -j 4

# Install
make install
```
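If cmake fails to locate the conda-installed dependencies, one workaround (an assumption on my part, not an official requirement) is to point `CMAKE_PREFIX_PATH` at the active conda environment:

```bash
# $CONDA_PREFIX is set while venv_anen is activated
cmake -DCMAKE_PREFIX_PATH="$CONDA_PREFIX" -DCMAKE_INSTALL_PREFIX=~/AnalogEnsemble ..
```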
Below is a list of CMake parameters you can customize.
| Parameter | Description | Default |
|-----------|-------------|---------|
| `CMAKE_C_COMPILER` | The C compiler to use. | [System dependent] |
| `CMAKE_CXX_COMPILER` | The C++ compiler to use. | [System dependent] |
| `CMAKE_INSTALL_PREFIX` | The installation directory. | [System dependent] |
| `CMAKE_PREFIX_PATH` | Which folder(s) cmake should search for packages besides the default. Paths are surrounded by double quotes and separated with semicolons. | [Empty] |
| `CMAKE_INSTALL_RPATH` | The run-time library path. Paths are surrounded by double quotes and separated with semicolons. | [Empty] |
| `INSTALL_RAnEn` | Build and install the `RAnEn` R package. | |
| `BOOST_URL` | The URL for downloading Boost. This is only used when `PAnEn` builds Boost automatically. | |
| `ENABLE_MPI` | Build the MPI-supported libraries and executables. This requires the MPI dependency. | OFF |
| `ENABLE_OPENMP` | Enable multi-threading with OpenMP. | ON |
| `ENABLE_AI` | Enable PyTorch integration and the power of AI. | OFF |
You can change the default value of a parameter, for example, `cmake -DCMAKE_INSTALL_PREFIX=~/AnalogEnsemble ..`. Don’t forget the extra letter `D` when specifying argument names.
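Multiple parameters can be combined in a single invocation; the particular combination below is only an illustration:

```bash
# Install under a custom prefix, build the MPI programs, and turn off OpenMP
cmake -DCMAKE_INSTALL_PREFIX=~/AnalogEnsemble -DENABLE_MPI=ON -DENABLE_OPENMP=OFF ..
```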
Here are instructions for building and installing `AnEn` on supercomputers.
TL;DR Launching an MPI-OpenMP hybrid program can be tricky. If the performance with MPI is acceptable, disable OpenMP (`cmake -DENABLE_OPENMP=OFF ..`). If the hybrid solution is desired, make sure you have the proper setup.
When `ENABLE_MPI` is turned on, MPI programs will be built. These are hybrid programs (unless you set `-DENABLE_OPENMP=OFF` during cmake) that use both MPI and OpenMP. Please check with your individual supercomputer platform to find out the proper configuration for launching an MPI + OpenMP hybrid program. Users are responsible for not launching too many processes and threads at the same time, which would overtask the machine and might lead to hanging problems (as I have seen on XSEDE Stampede2).
To dive deeper into the hybrid parallelization design: MPI is used for the computationally expensive portions of the code, e.g. file I/O and analog generation, while OpenMP is used by the master process during the bottleneck portions of the code, e.g. data reshaping and information queries.

- When analogs with long search and test periods are desired, MPI is used to distribute forecast files across processes. Each process reads a subset of the forecast files. This solves the problem that serial I/O can be very slow.
- When a large number of stations/grids are present, MPI is used to distribute analog generation across processes. Each process takes charge of generating analogs for a subset of stations.
- Sitting between the file I/O and the analog generation is the bottleneck, which is hard to parallelize with MPI, e.g. reshaping the data and querying test/search times. These portions are therefore parallelized with OpenMP on the master process only.
So if the platform supports heterogeneous task layouts, users can theoretically allocate one core per worker process and more cores for the master process to facilitate its multi-threaded work. But again, only do this when you find the bottleneck taking much longer than file I/O and analog generation. Use `--profile` to include profiling information in the standard output, as sketched below.
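A launch sketch follows, assuming an OpenMPI-style `mpirun`; `anen_program` is a hypothetical name, so substitute the actual executable you built:

```bash
# Limit each MPI process to a few OpenMP threads; tune both numbers to your node layout
export OMP_NUM_THREADS=4
mpirun -np 8 ./anen_program --profile
```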
Some additional tips and caveats can be found in this ticket.
# "`-''-/").___..--''"`-._ # (`6_ 6 ) `-. ( ).`-.__.`) WE ARE ... # (_Y_.)' ._ ) `._ `. ``-..-' PENN STATE! # _ ..`--'_..-_/ /--'_.' ,' # (il),-'' (li),' ((!.-' # # Authors: # Weiming Hu <firstname.lastname@example.org> # Guido Cervone <email@example.com> # Laura Clemente-Harding <firstname.lastname@example.org> # Martina Calovi <email@example.com> # # Contributors: # Luca Delle Monache # # Geoinformatics and Earth Observation Laboratory (http://geolab.psu.edu) # Department of Geography and Institute for CyberScience # The Pennsylvania State University