Connecting to Spark

Functions for installing Spark components and managing connections to Spark.

Read Spark Configuration

Manage Spark Connections

Download and Install Various Versions of Spark

View Entries in the Spark Log

Open the Spark Web Interface
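
A minimal sketch of how the connection functions above might fit together, assuming a local installation and the sparklyr function names `spark_install()`, `spark_config()`, `spark_connect()`, `spark_log()`, `spark_web()`, and `spark_disconnect()`. The `master = "local"` setting and the log length are illustrative choices.

```r
library(sparklyr)

# One-time setup: download a local Spark distribution (default version).
spark_install()

# Start from the default configuration and open a local connection.
config <- spark_config()
sc <- spark_connect(master = "local", config = config)

# Show the last few Spark log entries, then open the web interface.
spark_log(sc, n = 10)
spark_web(sc)

# Close the connection when finished.
spark_disconnect(sc)
```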

Reading and Writing Data

Functions for reading and writing Spark DataFrames.

Read a CSV file into a Spark DataFrame

Read a JSON file into a Spark DataFrame

Read a Parquet file into a Spark DataFrame

Read from a Spark Table into a Spark DataFrame

Write a Spark DataFrame to a CSV file

Write a Spark DataFrame to a JSON file

Write a Spark DataFrame to a Parquet file

Write a Spark DataFrame into a Spark Table
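
As a rough illustration of the readers and writers listed above, the following sketch round-trips a small data set through CSV and Parquet. It assumes a local connection; the file paths and table names are made up for the example.

```r
library(sparklyr)

sc <- spark_connect(master = "local")

# Write a small local data set to disk so there is a CSV file to read.
csv_path <- file.path(tempdir(), "mtcars.csv")
write.csv(mtcars, csv_path, row.names = FALSE)

# Read the CSV file into a Spark DataFrame registered as "mtcars_csv".
cars_tbl <- spark_read_csv(sc, name = "mtcars_csv", path = csv_path)

# Write the DataFrame out as Parquet, then read it back in.
parquet_path <- file.path(tempdir(), "mtcars_parquet")
spark_write_parquet(cars_tbl, path = parquet_path, mode = "overwrite")
cars_parquet <- spark_read_parquet(sc, name = "mtcars_parquet", path = parquet_path)

spark_disconnect(sc)
```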

Spark Tables

Functions for manipulating Spark Tables.

Cache a Spark Table

Uncache a Spark Table
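
A short sketch of caching and uncaching, assuming the functions are `tbl_cache()` and `tbl_uncache()` and that a table named `mtcars_spark` (a name chosen for this example) has been registered first.

```r
library(sparklyr)

sc <- spark_connect(master = "local")

# Register a table without loading it into memory.
sdf_copy_to(sc, mtcars, name = "mtcars_spark", memory = FALSE, overwrite = TRUE)

# Load the table into Spark's memory cache, then release it again.
tbl_cache(sc, "mtcars_spark")
tbl_uncache(sc, "mtcars_spark")

spark_disconnect(sc)
```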

Spark DataFrames

Functions for manipulating Spark DataFrames.

Copy an Object into Spark

Mutate a Spark DataFrame

Partition a Spark DataFrame

Model Predictions with Spark DataFrames

Read a Column from a Spark DataFrame

Register a Spark DataFrame

Randomly Sample Rows from a Spark DataFrame

Sort a Spark DataFrame

Add a Unique ID Column to a Spark DataFrame
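
The sketch below chains several of the DataFrame helpers listed above. It assumes the `sdf_*` function names from sparklyr; the table name, split weights, seeds, and column names are illustrative. Newer sparklyr releases may prefer `sdf_random_split()` over `sdf_partition()`.

```r
library(sparklyr)

sc <- spark_connect(master = "local")

# Copy a local data frame into Spark.
mtcars_tbl <- sdf_copy_to(sc, mtcars, name = "mtcars_sdf", overwrite = TRUE)

# Split into named partitions by weight; the seed keeps the split repeatable.
splits <- sdf_partition(mtcars_tbl, training = 0.7, test = 0.3, seed = 1099)

# Sample half the rows, sort them, and add a unique ID column.
sampled <- sdf_sample(mtcars_tbl, fraction = 0.5, replacement = FALSE, seed = 42)
sorted  <- sdf_sort(sampled, c("cyl", "mpg"))
with_id <- sdf_with_unique_id(sorted, id = "row_id")

# Pull a single column back into R as a vector.
mpg_values <- sdf_read_column(with_id, "mpg")

spark_disconnect(sc)
```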

Machine Learning Algorithms

Functions for invoking machine learning algorithms.

Spark ML -- Alternating Least Squares (ALS) Matrix Factorization

Spark ML -- Decision Trees

Spark ML -- Generalized Linear Regression

Spark ML -- Gradient-Boosted Tree

Spark ML -- K-Means Clustering

Spark ML -- Latent Dirichlet Allocation

Spark ML -- Linear Regression

Spark ML -- Logistic Regression

Spark ML -- Multilayer Perceptron

Spark ML -- Naive Bayes

Spark ML -- One vs Rest

Spark ML -- Principal Components Analysis

Spark ML -- Random Forests

Spark ML -- Survival Regression
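
A minimal sketch of invoking one of the algorithms above, using linear regression with a formula. Argument names for the `ml_*` functions have changed between sparklyr releases, so the formula is passed positionally here; the model itself is only an illustration.

```r
library(sparklyr)

sc <- spark_connect(master = "local")
mtcars_tbl <- sdf_copy_to(sc, mtcars, name = "mtcars_ml", overwrite = TRUE)

# Fit a linear regression of fuel economy on weight and cylinder count.
fit <- ml_linear_regression(mtcars_tbl, mpg ~ wt + cyl)

# Inspect the fitted coefficients and summary statistics.
summary(fit)

spark_disconnect(sc)
```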

Machine Learning Transformers

Functions for transforming features in Spark DataFrames.

Feature Transformation -- Binarizer

Feature Transformation -- Bucketizer

Feature Transformation -- Discrete Cosine Transform (DCT)

Feature Transformation -- ElementwiseProduct

Feature Transformation -- IndexToString

Feature Transformation -- OneHotEncoder

Feature Transformation -- QuantileDiscretizer

Feature Transformation -- SQLTransformer

Feature Transformation -- StringIndexer

Feature Transformation -- VectorAssembler
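
The following sketch applies two of the transformers above. Input and output column arguments are passed positionally because their names differ between sparklyr releases (for example `input.col` versus `input_col`); the thresholds, split points, and output column names are assumptions made for the example.

```r
library(sparklyr)

sc <- spark_connect(master = "local")
mtcars_tbl <- sdf_copy_to(sc, mtcars, name = "mtcars_ft", overwrite = TRUE)

# Binarizer: flag cars with mpg above 20 as 1, all others as 0.
binarized <- ft_binarizer(mtcars_tbl, "mpg", "high_mpg", threshold = 20)

# Bucketizer: bin horsepower into three ranges defined by the split points.
bucketed <- ft_bucketizer(binarized, "hp", "hp_bucket",
                          splits = c(0, 100, 200, Inf))

spark_disconnect(sc)
```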

Machine Learning Utilities

Functions for interacting with Spark ML model fits.

Spark ML -- Binary Classification Evaluator

Spark ML -- Classification Evaluator

Save / Load a Spark ML Model Fit

Spark ML -- Feature Importance for Tree Models

Machine Learning Extensions

Functions for creating custom wrappers to other Spark ML algorithms.

Create Dummy Variables

Create an ML Model Object

Options for Spark ML Routines

Prepare a Spark DataFrame for Spark ML Routines

Pre-process the Inputs to a Spark ML Routine

Extensions API

Functions for creating extensions to the sparklyr package.

Compile Scala sources into a Java Archive (jar)

Read Configuration Values for a Connection

Discover the Scala Compiler

Invoke a Method on a JVM Object

Register a Package that Implements a Spark Extension

Access the Spark API

Define a Spark Compilation Specification

Retrieve the Spark Connection Associated with an R Object

Retrieve a Spark DataFrame

Default Compilation Specification for Spark Extensions

Define a Spark Dependency

Retrieve a Spark JVM Object Reference

Get the Spark Version Associated with a Spark Connection
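
A brief sketch of the JVM-level calls that extensions build on, assuming the sparklyr functions `spark_context()`, `spark_dataframe()`, `invoke()`, `invoke_static()`, and `spark_version()`. The methods called (`version`, `count`, `java.lang.Math.hypot`) are ordinary Spark and Java methods chosen only for illustration.

```r
library(sparklyr)

sc <- spark_connect(master = "local")

# Access the underlying SparkContext and call a method on it.
jsc <- spark_context(sc)
invoke(jsc, "version")

# The same information through the dedicated helper.
spark_version(sc)

# Call a static JVM method.
invoke_static(sc, "java.lang.Math", "hypot", 3, 4)

# Retrieve the Spark DataFrame behind an R table reference and count its rows.
mtcars_tbl <- sdf_copy_to(sc, mtcars, name = "mtcars_ext", overwrite = TRUE)
invoke(spark_dataframe(mtcars_tbl), "count")

spark_disconnect(sc)
```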


Livy

Functions to use with the Livy method (experimental).

Create a Spark Configuration for Livy

Install Livy

Start Livy
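
A tentative sketch of a Livy session, assuming the functions are `livy_install()`, `livy_service_start()`, `livy_config()`, and `livy_service_stop()`, and that a local Livy service listens on its usual port 8998. The URL and the absence of credentials are assumptions for a local test setup.

```r
library(sparklyr)

# Install Livy locally and start the service.
livy_install()
livy_service_start()

# Build a Livy configuration (no credentials for a local service)
# and connect through the Livy method.
config <- livy_config()
sc <- spark_connect(master = "http://localhost:8998",
                    method = "livy",
                    config = config)

spark_disconnect(sc)
livy_service_stop()
```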