R interface to Apache Spark ™

Interact with Spark using familiar R interfaces, such as dplyr, broom, and DBI.
Gain access to Spark’s distributed Machine Learning libraries, Structure Streaming,and ML Pipelines from R.
Extend your toolbox by adding XGBoost, MLeap, H2O and Graphframes to your Spark plus R analysis.
Connect R wherever Spark runs: Hadoop, Mesos, Kubernetes, Stand Alone, and Livy.
Run distributed R code inside Spark

Get Started

Welcome new users! Start here to learn how to install and use sparklyr.

Guides

“How-to” articles to help you learn how to do things such as: connect AWS S3 buckets, handling Streaming Data, create ML Pipelines and others.

Deployment

Articles on Spark environments. Including AWS EMR, Databricks and Qubole.