Running loops in parallel in R using foreach


Loops have a bad reputation in R for being slow. In many cases, loops can be avoided using vectorized functions or apply functions like lapply or the map family of functions from the purrr package.
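As a small illustration of avoiding a loop altogether, the sketch below computes the same result three ways in base R. The vector `x` and the variable names are made up for this example and are not taken from the video.

```r
# Hypothetical input data, chosen only for illustration
x <- c(1, 4, 9, 16)

# Explicit for loop: the result vector has to be pre-allocated and filled by hand
roots_loop <- numeric(length(x))
for (i in seq_along(x)) {
  roots_loop[i] <- sqrt(x[i])
}

# Vectorized call: sqrt() works on the whole vector at once
roots_vec <- sqrt(x)

# apply-style call: vapply() checks that each result is a single number
roots_apply <- vapply(x, sqrt, numeric(1))
```

All three approaches return the same numeric vector; the vectorized call is usually the shortest and the fastest.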

However, if you encounter R code that runs too slowly because of loops, and rewriting it to avoid loops is hard, a quicker yet powerful approach may be to run the loops in parallel. We can do that using the foreach package by Michelle Wallig and Steve Weston.

We compare base R's for loops to the foreach approach. A strength of the latter is that it automatically creates a return object (default: a list), whereas a base R for loop requires you to create and fill the result object yourself. (The return type can be customized, for example with the .combine argument, which I don't do in the video.) Benchmarking shows a great speed improvement for parallelized loops compared to loops running sequentially. However, the clusterApply() approach is still a bit faster in our use case, which runs 200 regression models and returns model summaries.
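The comparison can be sketched roughly as follows. This is not the code from the video: the simulated data, the object names, the doParallel backend and the choice of 2 workers are all assumptions made for illustration.

```r
library(foreach)
library(parallel)    # makeCluster(), stopCluster()
library(doParallel)  # assumed backend; other foreach backends exist

# Simulated data: a hypothetical stand-in for the 200 regression problems
set.seed(1)
n_models <- 200
datasets <- lapply(seq_len(n_models), function(i) {
  x <- rnorm(100)
  data.frame(x = x, y = 2 * x + rnorm(100))
})

# Base R for loop: the result list is pre-allocated and filled manually
res_for <- vector("list", n_models)
for (i in seq_len(n_models)) {
  res_for[[i]] <- summary(lm(y ~ x, data = datasets[[i]]))
}

# foreach: the value of each iteration is collected into a list automatically
cl <- makeCluster(2)   # 2 workers is an arbitrary choice for this sketch
registerDoParallel(cl)
res_par <- foreach(d = datasets) %dopar% {
  summary(lm(y ~ x, data = d))
}
stopCluster(cl)
```

Both `res_for` and `res_par` are lists of 200 model summaries. Whether the parallel version is actually faster depends on the cost per iteration and on the overhead of sending data to the workers, so it is worth benchmarking on your own problem.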

Check out foreach's documentation: It contains well-written vignettes - see help(package = "foreach"). A powerful concept I don't mention in the video is iterators, which allow you to efficiently manage what is sent to the workers in each iteration, to minimize data transfer overhead.
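To give a rough idea of iterators, the sketch below uses iter() from the iterators package to hand one matrix column at a time to the loop body, instead of making the whole matrix available in every iteration. The matrix `m` is made up for this example. The loop uses %do%, which runs sequentially, so that the sketch works without a parallel backend; with a registered backend, %dopar% could be used in its place.

```r
library(foreach)
library(iterators)

# Hypothetical data: a matrix with 10 columns
set.seed(1)
m <- matrix(rnorm(1e4), ncol = 10)

# iter(m, by = "column") yields one column per iteration, so each task
# only receives the piece of data it needs.
# .combine = c collapses the results into a vector instead of a list.
col_means <- foreach(col = iter(m, by = "column"), .combine = c) %do% {
  mean(col)
}
```

The result matches `colMeans(m)`; the point of the sketch is only the way data is fed into the loop, not the computation itself.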

Note that not all loops are suited to running in parallel, especially when each iteration depends on the results of previous iterations, as may be the case in simulations. Here, we assume that each iteration runs independently of the others.
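The difference between dependent and independent iterations can be sketched in base R. The autoregressive path and the variable names below are hypothetical examples, not taken from the video.

```r
set.seed(42)
n <- 10
shocks <- rnorm(n)

# Dependent iterations: step i needs the result of step i - 1,
# so the iterations cannot simply be handed to separate workers.
ar_path <- numeric(n)
ar_path[1] <- shocks[1]
for (i in 2:n) {
  ar_path[i] <- 0.5 * ar_path[i - 1] + shocks[i]
}

# Independent iterations: each result depends only on its own input,
# so the iterations could run in any order, or in parallel.
squares <- vapply(seq_len(n), function(i) shocks[i]^2, numeric(1))
```

Only the second pattern is a candidate for foreach with a parallel backend. In a simulation study, a common compromise is to parallelize across independent replications while keeping the dependent steps within each replication sequential.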

Code can be found here:
https://github.com/fjodor/paralleliza...

Here's the video that explains parallel::clusterApply() in more detail:
   • Running R code in parallel using para...  

Thumbnail image: Chait Goli from Pexels

Contact me, e.g. to discuss (online) R workshops, trainings or webinars:

LinkedIn:   / wolfriepl  
Twitter:   / statistikindd  
Xing: https://www.xing.com/profile/Wolf_Riepl
Facebook:   / statistikdresden  

https://statistik-dresden.de/kontakt
R Workshops: https://statistik-dresden.de/r-schulu...
Blog (German, translate option): https://statistik-dresden.de/statisti...

Playlist: Music chart history
   • Music Chart History  
