MoreParallelR::parallel.apply provides a convenient way to parallelize the apply function on arrays. It first splits the array along the dimensions specified by MARGIN into a list of smaller arrays and then calls the mclapply function to carry out the rest of the parallelization.
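The split-then-dispatch idea described above can be sketched with base R and parallel::mclapply. This is only a rough illustration, not the package's actual implementation: the helper name sketch_parallel_apply is hypothetical, and the sketch assumes that FUN returns a single value for each slice.

```r
library(parallel)

# Hypothetical helper for illustration only; not part of MoreParallelR.
# Assumes FUN returns one value per slice.
sketch_parallel_apply <- function(X, MARGIN, FUN, ..., cores = 1) {
  d <- dim(X)

  # Every combination of indices along the MARGIN dimensions.
  # expand.grid varies the first dimension fastest (column-major order).
  idx <- as.matrix(expand.grid(lapply(d[MARGIN], seq_len)))

  # Build one smaller array per index combination.
  pieces <- lapply(seq_len(nrow(idx)), function(i) {
    args <- rep(list(TRUE), length(d))       # TRUE keeps a whole dimension
    args[MARGIN] <- as.list(unname(idx[i, ]))
    do.call(`[`, c(list(X), args, list(drop = TRUE)))
  })

  # Hand the list to mclapply (forking is unavailable on Windows,
  # where mc.cores must stay at 1).
  res <- mclapply(pieces, FUN, ..., mc.cores = cores)

  # Reassemble the results with the MARGIN dimensions.
  array(unlist(res), dim = d[MARGIN])
}

# Usage on a small synthetic array
X_demo <- array(runif(2 * 3 * 4), dim = c(2, 3, 4))
out_demo <- sketch_parallel_apply(X_demo, MARGIN = c(1, 3), FUN = sum)
dim(out_demo)
```

The list of pieces is built up front, so every worker receives only the slice it needs.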

parallel.apply(X, MARGIN, FUN, ..., verbose = F, cores = 1,
  progress.bar = F)

Arguments

X

An array, including a matrix.

MARGIN

A vector giving the subscripts over which the function will be applied.

FUN

The function to be applied.

...

Optional arguments to FUN.

verbose

Whether to print progress information.

cores

The number of cores for parallelization.

progress.bar

Whether to show a progress bar. This requires the package pbmcapply.

Value

An array.

Details

The benefit of parallelization grows with the runtime of FUN. In other words, this solution works best when FUN carries a heavy workload.
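A minimal timing sketch of this point, using parallel::mclapply directly rather than parallel.apply. The functions light and heavy are hypothetical stand-ins, with Sys.sleep imitating an expensive computation. Actual timings depend on the machine, so none are shown here.

```r
library(parallel)

# mclapply() relies on forking, which is unavailable on Windows;
# fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Hypothetical workloads: one trivial, one artificially slow.
light <- function(i) i + 1
heavy <- function(i) {
  Sys.sleep(0.05)  # stand-in for an expensive computation
  i + 1
}

tasks <- 1:8

# Light workload: the cost of starting workers can outweigh any gain.
t_light_serial <- system.time(r_ls <- lapply(tasks, light))["elapsed"]
t_light_par <- system.time(
  r_lp <- mclapply(tasks, light, mc.cores = cores))["elapsed"]

# Heavy workload: the per-task cost dominates, so extra cores help more.
t_heavy_serial <- system.time(r_hs <- lapply(tasks, heavy))["elapsed"]
t_heavy_par <- system.time(
  r_hp <- mclapply(tasks, heavy, mc.cores = cores))["elapsed"]

print(c(light_serial = unname(t_light_serial),
        light_parallel = unname(t_light_par),
        heavy_serial = unname(t_heavy_serial),
        heavy_parallel = unname(t_heavy_par)))
```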

This idea was originally inspired by my advisor, Prof. Guido Cervone, during a casual conversation.

This function differs from plyr::laply in that it returns an array with the specified MARGIN as dimensions.

Note

Be aware of whether your FUN behaves differently for a vector, a matrix, or an array. If you apply the function to a matrix or an array, lapply and plyr::laply will coerce the high-dimensional object to a vector, but parallel.apply feeds the data to FUN as it is. Results from this function may therefore differ from those of apply.
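As a small illustration of a function that is sensitive to the shape of its input, consider base R's var, which returns a covariance matrix for a matrix but a single number for a vector. The object names below are made up for this sketch. Calling as.vector() inside FUN is one way to make the behavior independent of the input's shape.

```r
# A 3 x 2 matrix used only for illustration
m <- matrix(c(1, 2, 3, 4, 5, 7), nrow = 3)

# On a matrix, var() returns the covariance matrix of the columns
v_mat <- var(m)

# On a plain vector, var() returns a single variance
v_vec <- var(as.vector(m))

dim(v_mat)
length(v_vec)
```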

Examples

# This example shows how to run parallel.apply on a synthetic
# array and how the performance compares to a serial run.

library(profvis)

profvis({
  library(MoreParallelR)
  library(magrittr)

  # Generate synthesized data
  dims <- c(80, 90, 100, 15)
  X <- dims %>%
    prod() %>%
    runif(min = 1, max = 10) %>%
    array(dim = dims)

  MARGIN <- c(2, 4)
  cores <- 4

  FUN <- function(v) {
    library(magrittr)

    # A costly function
    ret <- v %>%
      as.vector() %>%
      sin() %>%
      cos() %>%
      var()

    return(ret)
  }

  # Run the parallel code
  X.new.par <- parallel.apply(
    X, MARGIN, cores = cores, FUN)

  # Run the serial code
  X.new.sq <- apply(X, MARGIN, FUN)

  # Compare results
  identical(X.new.par, X.new.sq)
})