Parallel Analog Ensemble (PAnEn) generates accurate forecast ensembles from a single deterministic model simulation and historical observations. The technique was introduced by Luca Delle Monache et al. in the paper "Probabilistic Weather Prediction with an Analog Ensemble". Developed and maintained by GEOlab at Penn State, PAnEn aims to provide an efficient implementation of this technique and user-friendly interfaces in R and C++ for researchers who want to use it in their own work.
The easiest way to use this package is to install the R package, ‘RAnEn’. C++ libraries are also available, but they are designed for intermediate users who need higher performance. For installation guidance, please refer to the installation section.
To cite this package, you have several options:
- LaTeX: Please use this file for citation.
- R: Simply type `citation('RAnEn')` and the citation message will be printed.
- Plain text: Please use the following citation format:
Weiming Hu, Guido Cervone, Laura Clemente-Harding, and Martina Calovi. (2019). Parallel Analog Ensemble. Zenodo. http://doi.org/10.5281/zenodo.3384321
RAnEn is very easy to install if you are already using R. This is the recommended way to start.
The command is the same for RAnEn installation and update.
Before installing RAnEn, please install the following packages first:
- If you are using Windows, please also install the latest version of Rtools.
The following R command installs the latest version of RAnEn:
    install.packages("https://github.com/Weiming-Hu/AnalogsEnsemble/raw/master/RAnalogs/releases/RAnEn_latest.tar.gz", repos = NULL)
That’s it. You are good to go. Please refer to the tutorials or the R documentation to learn more about using RAnEn. You might also want to install the RAnEnExtra package, which provides functions for visualization and verification. After RAnEn installation, you can simply run
Mac users: if the package reports that OpenMP is not supported, you can do one of the following:
- Avoid using Clang compilers and switch to GNU compilers. To change the compilers used by R, create the file `~/.R/Makevars` if you do not have it already and add the following content to it. Of course, change the compilers to what you have. If you do not have any alternative compilers other than Clang, Homebrew is your friend.
    CC=gcc-8
    CXX=g++-8
    CXX1X=g++-8
    CXX14=g++-8
- You can also follow the instructions here provided by data.table. They provide similar solutions but stick with Clang compilers.
After the installation, you can always revert to your original setup, and RAnEn will stay supported by OpenMP.
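The Makevars change in the first option can be scripted. The sketch below is one way to do it, assuming GNU compilers are installed under versioned names such as `gcc-8`. The function name `write_makevars` and its arguments are made up for illustration and are not part of RAnEn.

```shell
#!/bin/sh
# Hypothetical helper (not part of RAnEn): write GNU compiler settings
# to a Makevars file, keeping a backup of any existing file.
# Usage: write_makevars <path-to-Makevars> <gcc-major-version>
write_makevars() {
    makevars_file=$1
    gcc_version=$2

    # Create the parent directory (normally ~/.R) if it is missing.
    mkdir -p "$(dirname "$makevars_file")"

    # Keep a copy so the original setup can be restored later.
    if [ -f "$makevars_file" ]; then
        cp "$makevars_file" "$makevars_file.bak"
    fi

    # The same four variables as listed in the first option above.
    cat > "$makevars_file" <<EOF
CC=gcc-$gcc_version
CXX=g++-$gcc_version
CXX1X=g++-$gcc_version
CXX14=g++-$gcc_version
EOF
}
```

Calling `write_makevars ~/.R/Makevars 8` would reproduce the settings listed above. Copying `Makevars.bak` back over `Makevars` afterwards restores the previous compilers.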
To install the C++ libraries, please check the following dependencies:
- Required: CMake is the required build system generator.
- Required: NetCDF provides the file I/O with NetCDF files.
- Required: ecCodes provides the file I/O with GRIB2 files.
- Optional: Boost provides high-performance data structures. Boost is a very large library. If you don’t want to install the entire package, PAnEn is able to build the required ones automatically.
- Optional: CppUnit provides test frameworks. If CppUnit is found in the system, test programs will be compiled.
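Before building, it can help to confirm that the required tools are visible on `PATH`. The sketch below assumes that CMake is found as `cmake`, NetCDF through its `nc-config` utility, and ecCodes through its `codes_info` tool. Installations that ship only libraries without these command-line tools will be reported as missing even when they are usable. The helper name `check_tool` is illustrative.

```shell
#!/bin/sh
# Illustrative helper: report whether a command is available on PATH.
# Prints "found: <name>" or "missing: <name>" and returns 0 or 1.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Assumed command names for the three required dependencies.
for tool in cmake nc-config codes_info; do
    check_tool "$tool" || true
done
```

If a tool is installed in a non-standard location, the `CMAKE_PREFIX_PATH` parameter is the place to point CMake at it rather than relying on `PATH`.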
Please use the following scripts to install the libraries:
    # Download the source files (~10 Mb)
    wget https://github.com/Weiming-Hu/AnalogsEnsemble/archive/master.zip
    # Unzip
    unzip master.zip
    # Create a separate folder to store all intermediate files during the installation process
    cd AnalogsEnsemble-master/
    mkdir build
    cd build
    cmake -DCMAKE_INSTALL_PREFIX=~/AnalogEnsemble ..
    # Compile
    make -j 4
    # Install
    make install
Below is a list of parameters you can change and customize.
| Parameter | Explanation | Default |
|---|---|---|
| CMAKE_C_COMPILER | The C compiler to use. | [System dependent] |
| CMAKE_CXX_COMPILER | The C++ compiler to use. | [System dependent] |
| CMAKE_INSTALL_PREFIX | The installation directory. | [System dependent] |
| CMAKE_PREFIX_PATH | Which folder(s) should cmake search for packages besides the default. Paths are surrounded by double quotes and separated with semicolons. | [Empty] |
| CMAKE_INSTALL_RPATH | The run-time library path. Paths are surrounded by double quotes and separated with semicolons. | [Empty] |
| INSTALL_RAnEn | Build and install the | |
| BOOST_URL | The URL for downloading Boost. This is only used when | |
| ENABLE_MPI | Build the MPI supported libraries and executables. This requires the MPI dependency. | OFF |
| ENABLE_OPENMP | Enable multi-threading with OpenMP | ON |
| ENABLE_AI | Enable PyTorch integration and the power of AI. | OFF |
You can change the defaults of these parameters, for example, `cmake -DCMAKE_INSTALL_PREFIX=~/AnalogEnsemble ..`. Don’t forget the extra letter `D` when specifying argument names.
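Several parameters from the table can be combined in one configure call. The sketch below only assembles and prints such a command so it can be inspected before running. The helper name `print_cmake_command` is hypothetical, and the chosen values (GNU compilers, MPI on, OpenMP off) are just one possible combination.

```shell
#!/bin/sh
# Hypothetical helper: print a cmake configure command built from
# parameters listed in the table above. Nothing is executed.
# Usage: print_cmake_command <install-prefix> <enable-mpi> <enable-openmp>
print_cmake_command() {
    prefix=$1
    enable_mpi=$2
    enable_openmp=$3

    # Every parameter is passed with a leading -D.
    echo "cmake" \
        "-DCMAKE_C_COMPILER=gcc" \
        "-DCMAKE_CXX_COMPILER=g++" \
        "-DCMAKE_INSTALL_PREFIX=$prefix" \
        "-DENABLE_MPI=$enable_mpi" \
        "-DENABLE_OPENMP=$enable_openmp" \
        ".."
}

# Example: an MPI-only build installed under the home directory.
print_cmake_command "$HOME/AnalogEnsemble" ON OFF
```

The printed line is meant to be run from inside the `build` directory, which is why it ends with `..` to point at the source tree.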
High-Performance Computing and Supercomputers
Here is a list of instructions to build and install AnEn on supercomputers.
MPI and OpenMP
TL;DR Launching an MPI-OpenMP hybrid program can be tricky. If the performance with MPI is acceptable, disable OpenMP (`cmake -DENABLE_OPENMP=OFF ..`). If the hybrid solution is desired, make sure you have the proper setup.
When `ENABLE_MPI` is turned on, MPI programs will be built. These MPI programs are hybrid programs (unless you set `ENABLE_OPENMP` to `OFF` for `cmake`) that use both MPI and OpenMP. Please check with your supercomputer platform to find out the proper configuration for launching an MPI + OpenMP hybrid program. Users are responsible for not launching too many processes and threads at the same time, which would overload the machine and might cause the program to hang (as I have seen on XSEDE Stampede2).
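One way to avoid oversubscribing a node is to derive the OpenMP thread count from the core count and the number of MPI processes. The sketch below assumes a launcher that accepts `mpirun -np`, relies on the standard `OMP_NUM_THREADS` environment variable, and uses a placeholder program name. The helper `threads_per_process` is illustrative, and the right launch command depends on your platform.

```shell
#!/bin/sh
# Illustrative helper: largest thread count per MPI process such that
# processes * threads does not exceed the available cores.
# Usage: threads_per_process <total-cores> <mpi-processes>
threads_per_process() {
    total_cores=$1
    processes=$2

    if [ "$processes" -lt 1 ] || [ "$processes" -gt "$total_cores" ]; then
        echo "error: process count must be between 1 and $total_cores" >&2
        return 1
    fi

    # Integer division rounds down, so the core budget is never exceeded.
    echo $(( total_cores / processes ))
}

# Example: 48 cores shared by 8 MPI processes gives 6 threads each.
OMP_NUM_THREADS=$(threads_per_process 48 8)
export OMP_NUM_THREADS

# Placeholder launch line; replace the program name and launcher
# with whatever your platform requires.
echo "mpirun -np 8 ./your_mpi_program  # OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

Rounding down leaves some cores idle when the division is not exact, which is usually preferable to overloading the machine.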
To dive deeper into the hybrid parallelization design: MPI is used for the computationally expensive portions of the code, e.g. file I/O and analog generation, while OpenMP is used by the master process during the bottleneck portions of the code, e.g. data reshaping and information queries.
When analogs with long search and test periods are desired, MPI is used to distribute forecast files across processes. Each process reads a subset of the forecast files. This solves the problem where serial I/O can be very slow.
When a large number of stations/grids are present, MPI is used to distribute analog generation for different stations across processes. Each process takes charge of generating analogs for a subset of stations.
Sitting between the file I/O and the analog generation is the bottleneck, which is hard to parallelize with MPI, e.g. reshaping the data and querying test/search times. Therefore, these steps are parallelized with OpenMP on the master process only.
So if the platform supports a heterogeneous task layout, users can theoretically allocate one core per worker process and more cores to the master process to support its multi-threaded portion. But again, only do this when you find the bottleneck is taking much longer than file I/O and analog generation. Use `--profile` to include profiling information in the standard output messages.
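The heterogeneous layout described above amounts to simple arithmetic: each worker process gets one core and the master process gets the rest. The sketch below only computes that split. The helper name `master_cores` is illustrative, and how such a layout is actually requested depends on the scheduler of your platform.

```shell
#!/bin/sh
# Illustrative helper: cores left for the master process when every
# other MPI process (the workers) is given exactly one core.
# Usage: master_cores <total-cores> <mpi-processes>
master_cores() {
    total_cores=$1
    processes=$2
    workers=$(( processes - 1 ))

    if [ "$processes" -lt 1 ] || [ "$workers" -ge "$total_cores" ]; then
        echo "error: not enough cores for this many processes" >&2
        return 1
    fi

    echo $(( total_cores - workers ))
}

# Example: 48 cores and 41 processes leave 8 cores for the master.
echo "master threads: $(master_cores 48 41)"
```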
Some further tips and caveats are collected in this ticket.
- Delle Monache, Luca, et al. “Probabilistic weather prediction with an analog ensemble.” Monthly Weather Review 141.10 (2013): 3498-3516.
- Clemente-Harding, L. A Beginners Introduction to the Analog Ensemble Technique
- Cervone, Guido, et al. “Short-term photovoltaic power forecasting using Artificial Neural Networks and an Analog Ensemble.” Renewable energy 108 (2017): 274-286.
- Junk, Constantin, et al. “Predictor-weighting strategies for probabilistic wind power forecasting with an analog ensemble.” Meteorologische Zeitschrift 24.4 (2015): 361-379.
- Balasubramanian, Vivek, et al. “Harnessing the power of many: Extensible toolkit for scalable ensemble applications.” 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 2018.
- Hu, Weiming, and Guido Cervone. “Dynamically Optimized Unstructured Grid (DOUG) for Analog Ensemble of numerical weather predictions using evolutionary algorithms.” Computers & Geosciences 133 (2019): 104299.
    # "`-''-/").___..--''"`-._
    # (`6_ 6 ) `-. ( ).`-.__.`) WE ARE ...
    # (_Y_.)' ._ ) `._ `. ``-..-' PENN STATE!
    # _ ..`--'_..-_/ /--'_.' ,'
    # (il),-'' (li),' ((!.-'
    #
    # Authors:
    #     Weiming Hu <firstname.lastname@example.org>
    #     Guido Cervone <email@example.com>
    #     Laura Clemente-Harding <firstname.lastname@example.org>
    #     Martina Calovi <email@example.com>
    #
    # Contributors:
    #     Luca Delle Monache
    #
    # Geoinformatics and Earth Observation Laboratory (http://geolab.psu.edu)
    # Department of Geography and Institute for CyberScience
    # The Pennsylvania State University