MoreParallelR::parallel.apply provides a convenient way to parallelize the apply function on arrays. It first splits the array along the dimensions specified by MARGIN into a list of smaller arrays, and then calls the mclapply function to carry out the rest of the parallelization.
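The split-then-parallelize idea described above can be sketched with base R alone. This is a minimal illustration, not the package's actual implementation: the helper name `split_apply_sketch` is hypothetical, `asplit` (R 3.6.0 or later) stands in for whatever splitting the package performs, and the sketch assumes FUN returns a single value per sub-array.

```r
library(parallel)

# Hypothetical helper, not part of MoreParallelR
split_apply_sketch <- function(X, MARGIN, FUN, ..., cores = 1) {
  # Break X along MARGIN into a list of smaller arrays
  pieces <- asplit(X, MARGIN)
  dim(pieces) <- NULL  # keep a plain list

  # Process the pieces in parallel (forking; use cores = 1 on Windows)
  results <- mclapply(pieces, FUN, ..., mc.cores = cores)

  # Reassemble the scalar results into an array shaped by the MARGIN dimensions
  array(unlist(results), dim = dim(X)[MARGIN])
}

X <- array(seq_len(24), dim = c(2, 3, 4))
out <- split_apply_sketch(X, c(2, 3), sum)

dim(out)                                # same as dim(X)[c(2, 3)]
all.equal(out, apply(X, c(2, 3), sum))  # agrees with the serial apply
```

With `cores = 1` the sketch runs serially, which makes it easy to check against `apply` before raising the core count.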
parallel.apply(X, MARGIN, FUN, ..., verbose = F, cores = 1, progress.bar = F)
Argument | Description |
---|---|
X | An array, including a matrix. |
MARGIN | A vector giving the subscripts which the function will be applied over. |
FUN | The function to be applied. |
... | Optional arguments to FUN. |
verbose | Whether to print progress information. |
cores | The number of cores for parallelization. |
progress.bar | Whether to show a progress bar. This requires the package |
Returns an array.
To see a clear speedup from the parallelization, FUN should have a relatively long runtime. In other words, this solution works best when FUN carries a heavy workload.
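The effect of workload size can be explored with a small timing sketch. It uses `parallel::mclapply` directly rather than parallel.apply, and the `light` and `heavy` functions are hypothetical workloads chosen only for illustration. Timings depend on the machine, so no numbers are shown.

```r
library(parallel)

# Forking is unavailable on Windows, so fall back to a single core there
cores <- if (.Platform$OS.type == "windows") 1 else 2

# Eight small pieces of work, standing in for the sub-arrays of a larger array
pieces <- lapply(1:8, function(i) runif(1000))

# Hypothetical workloads: one trivial, one deliberately expensive
light <- function(v) sum(v)
heavy <- function(v) {
  for (i in 1:2000) v <- sin(v) + cos(v)
  sum(v)
}

# Elapsed seconds for a serial and a parallel run of the same workload
compare <- function(FUN) {
  c(serial = system.time(lapply(pieces, FUN))[["elapsed"]],
    parallel = system.time(mclapply(pieces, FUN, mc.cores = cores))[["elapsed"]])
}

compare(light)  # the cost of starting workers tends to dominate
compare(heavy)  # the per-piece cost is large enough to be worth sharing
```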
This idea was originally inspired by my advisor, Prof. Guido Cervone, during a casual conversation.
This function differs from plyr::laply in that it returns an array with the specified MARGIN as dimensions.
Please be aware of whether your FUN behaves differently for a vector, a matrix, or an array. If you are applying the function on a matrix or an array, lapply and plyr::laply will coerce the high-dimensional object to a vector, but parallel.apply will feed the data AS IT IS to FUN. As a result, this function and apply might return different results.
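A small base R sketch shows why the shape of the input matters. The `FUN` below is a hypothetical function whose result depends on whether it receives a matrix or a plain vector; `piece` stands in for one sub-array produced by splitting a larger array.

```r
# A 3 x 4 sub-array, as it might look after splitting a 3-D array
piece <- array(1:12, dim = c(3, 4))

# Hypothetical FUN whose result depends on the shape of its input
FUN <- function(v) NROW(v)

FUN(piece)             # 3: the input keeps its dimensions
FUN(as.vector(piece))  # 12: the input was flattened to a vector

# Flattening inside FUN makes the result independent of the input shape
FUN.safe <- function(v) NROW(as.vector(v))
FUN.safe(piece) == FUN.safe(as.vector(piece))
```

Calling `as.vector()` at the start of FUN is one way to get the same result regardless of which function feeds it the data.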
# This example shows you how to run parallel.apply on a synthetic
# array and how the performance compares to a serial run.
#
library(profvis)

profvis({
  library(MoreParallelR)
  library(magrittr)

  # Generate synthesized data
  dims <- c(80, 90, 100, 15)
  X <- dims %>%
    prod() %>%
    runif(min = 1, max = 10) %>%
    array(dim = dims)

  MARGIN <- c(2, 4)
  cores <- 4

  FUN <- function(v) {
    library(magrittr)

    # A costly function
    ret <- v %>%
      as.vector() %>%
      sin() %>%
      cos() %>%
      var()

    return(ret)
  }

  # Run the parallel code
  X.new.par <- parallel.apply(
    X, MARGIN, cores = cores, FUN)

  # Run the serial code
  X.new.sq <- apply(X, MARGIN, FUN)

  # Compare results
  identical(X.new.par, X.new.sq)
})