Overview
It’s tough to argue with R as a high-quality, cross-platform, open source statistical software product—unless you’re in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets, including three chapters on using R and Hadoop together. You’ll learn the basics of Snow, Multicore, Parallel, Segue, RHIPE, and Hadoop Streaming, including how to find them, how to use them, when they work well, and when they don’t.
With these packages, you can overcome R’s single-threaded nature by spreading work across multiple CPUs, or offloading work to multiple machines to address R’s memory barrier.
- Snow: works well in a traditional cluster environment
- Multicore: popular for multiprocessor and multicore computers
- Parallel: included with R as of the 2.14.0 release
- R+Hadoop: provides low-level access to a popular form of cluster computing
- RHIPE: uses Hadoop’s power with R’s language and interactive shell
- Segue: lets you use Elastic MapReduce as a backend for lapply-style operations
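As a rough illustration of the lapply-style approach these packages share, here is a minimal sketch using the parallel package that ships with R. It is not taken from the book: the function `square_it`, the input vector, and the choice of two worker processes are assumptions made only for this example.

```r
library(parallel)

# Hypothetical workload: stands in for any function applied to each input.
square_it <- function(x) {
  x^2
}

inputs <- 1:8

# Serial baseline: one R process handles every element in turn.
serial_result <- lapply(inputs, square_it)

# Parallel version: start two local worker processes (an arbitrary choice),
# spread the same inputs across them, then shut the workers down.
cl <- makeCluster(2)
parallel_result <- parLapply(cl, inputs, square_it)
stopCluster(cl)

# Both calls return a list with one result per input, in input order.
stopifnot(identical(serial_result, parallel_result))
```

The call shape stays close to a plain `lapply`, which is why swapping in a different backend, such as a cluster of machines, mostly changes how the workers are created rather than how the work is expressed.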
Parallel R (Data Analysis in the Distributed World), ISBN 9781449309923, by Q. Ethan McCallum and Stephen Weston, published by O'Reilly Media (November 2, 2011), is available in paperback. Our minimum order quantity is 25 copies. All standard bulk book orders ship free in the continental USA and are delivered in 4-10 business days.
Unlike Amazon and other retailers that may also offer Parallel R (Data Analysis in the Distributed World) on their websites, we specialize in large quantities and provide personal service from trusted, experienced, friendly people in Portland, Oregon. We offer a Price Match Guarantee and a QuickQuote form to make purchasing quick and easy.
Prefer to work with a human being when you order Parallel R (Data Analysis in the Distributed World) books in bulk? Our Book Specialists are standing by Monday-Friday 8-5 PST, ready to help!