The examples below showcase R applications and analyses built with sparklyr. They show how to use dplyr and machine learning functions within R Notebooks, and how to create interactive dashboards backed by Spark connections using flexdashboard and Shiny applications.

Notebooks


One billion NYC taxi trips

You can use Spark and R to analyze data at scale. This document describes how to use sparklyr to access and understand your data. The example works through a toolchain that compares models fit with Spark ML, H2O, and R.

Total US births

Use dplyr syntax to write Apache Spark SQL queries. Use select, where, group by, joins, and window functions in Apache Spark SQL.
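
As a rough illustration of how dplyr verbs map onto Spark SQL, here is a minimal sketch. It assumes a local Spark installation (for example, one set up with spark_install()) and uses the built-in mtcars data as a stand-in; the table name mtcars_spark and the chosen columns are hypothetical and are not taken from the births notebook.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally, e.g. via sparklyr::spark_install().
sc <- spark_connect(master = "local")

# Stand-in data set copied into Spark (hypothetical table name).
cars_tbl <- copy_to(sc, mtcars, "mtcars_spark", overwrite = TRUE)

# select / filter / group_by / summarise are translated to
# SELECT / WHERE / GROUP BY / aggregate functions in Spark SQL.
summary_tbl <- cars_tbl %>%
  select(cyl, mpg, wt) %>%
  filter(wt < 4) %>%
  group_by(cyl) %>%
  summarise(n = n(), avg_mpg = mean(mpg, na.rm = TRUE)) %>%
  arrange(cyl)

# Print the SQL that dplyr generated for the lazy query.
show_query(summary_tbl)

# A grouped mutate with a ranking function becomes a window function
# (RANK() OVER (PARTITION BY cyl ...)).
ranked_tbl <- cars_tbl %>%
  group_by(cyl) %>%
  mutate(mpg_rank = min_rank(desc(mpg))) %>%
  select(cyl, mpg, mpg_rank)

# Nothing runs in Spark until the results are collected into R.
result <- collect(summary_tbl)
ranked <- collect(ranked_tbl)

spark_disconnect(sc)
```

Calling show_query() before collect() is a convenient way to check what SQL Spark will actually run.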

ML classifiers

You can use sparklyr to fit a wide variety of machine learning algorithms in Apache Spark. This analysis compares the performance of six classification models in Apache Spark on the Titanic data set. For the Titanic data, decision trees and random forests performed the best and had comparatively fast run times. See results for a detailed comparison.

Dashboards


Time Gained in Flight

This example is based on the NYC airports data analyzed in the Notebooks section.

Diamonds explorer

This familiar example reads data into Spark using the parquet format. You can sample and filter the data in Spark then collect the results to be visualized.
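
The read, sample, filter, and collect pattern described above might look roughly like the following sketch. It assumes a local Spark installation and the ggplot2 package (for the diamonds data). Because the dashboard's own parquet location is not given here, the sketch first writes a parquet copy to a temporary directory; that path, the table names, and the filter thresholds are all hypothetical.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally, e.g. via sparklyr::spark_install().
sc <- spark_connect(master = "local")

# Prepare a parquet copy of the data so the example is self-contained.
# Factors are converted to character so they are stored as plain strings.
diamonds_df <- ggplot2::diamonds %>%
  mutate(across(where(is.factor), as.character))
parquet_path <- file.path(tempdir(), "diamonds_parquet")  # hypothetical path

copy_to(sc, diamonds_df, "diamonds_src", overwrite = TRUE) %>%
  spark_write_parquet(parquet_path, mode = "overwrite")

# Read the parquet files back into Spark as a new table.
diamonds_tbl <- spark_read_parquet(sc, name = "diamonds", path = parquet_path)

# Filter and sample inside Spark, then bring only the small result into R.
plot_data <- diamonds_tbl %>%
  filter(carat <= 2, cut == "Ideal") %>%
  sdf_sample(fraction = 0.1, replacement = FALSE, seed = 42) %>%
  select(carat, price) %>%
  collect()

spark_disconnect(sc)
```

After collect(), plot_data is an ordinary local data frame that can be passed to any R plotting function.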

Shiny Applications


ML Titanic Classification

You can use sparklyr to fit a wide variety of machine learning algorithms in Apache Spark. This analysis compares the performance of six classification models in Apache Spark on the Titanic data set. For the Titanic data, decision trees and random forests performed the best and had comparatively fast run times. See results for a detailed comparison.

Iris K-means clustering

This familiar example reads data into Spark using the parquet format. You can cluster the data in Spark then collect the results to be visualized.
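
A minimal sketch of the clustering step, assuming a local Spark installation and a recent sparklyr release in which ml_kmeans() accepts a formula and a k argument (older releases used a different signature). For brevity it copies iris straight into Spark rather than reading a parquet file, and the choice of the two petal columns and of three clusters is an illustrative assumption, not necessarily what the app does.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally, e.g. via sparklyr::spark_install().
sc <- spark_connect(master = "local")

# copy_to() replaces dots in column names, so Petal.Width becomes Petal_Width.
iris_tbl <- copy_to(sc, iris, "iris_spark", overwrite = TRUE)

# Fit k-means in Spark on two features (hypothetical choice of columns and k).
kmeans_model <- ml_kmeans(iris_tbl, ~ Petal_Width + Petal_Length, k = 3, seed = 1)

# Score the data in Spark, then collect the cluster assignments into R.
clustered <- ml_predict(kmeans_model, iris_tbl) %>%
  select(Petal_Width, Petal_Length, Species, prediction) %>%
  collect()

spark_disconnect(sc)
```

The collected prediction column holds the cluster index for each row and can be mapped to color in a scatter plot.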

Time gained in flight app

You can connect a Shiny app to a live Spark context. This example uses Spark SQL and ML to create a lookup table. You can use Shiny to filter the lookup table and visualize the results.

sparklyr is an RStudio project. © 2016 RStudio, Inc.