Compute (Approximate) Quantiles with a Spark DataFrame

Given a numeric column within a Spark DataFrame, compute approximate quantiles, accurate to within a specified relative error.

Usage

sdf_quantile(x, column, probabilities = c(0, 0.25, 0.5, 0.75, 1),
  relative.error = 1e-05)

Arguments

x

An object coercible to a Spark DataFrame (typically, a tbl_spark).

column

The name of the numeric column for which quantiles should be computed.

probabilities

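A numeric vector of probabilities, each between 0 and 1, for which quantiles should be computed. For example, 0 corresponds to the minimum, 0.5 to the median, and 1 to the maximum.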

relative.error

The target relative error of the approximation. Lower values yield more precise quantiles, typically at a greater computational cost.
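
Examples

The following is a minimal sketch of a typical call, not an excerpt from the package's own examples. It assumes a local Spark installation is available; the connection settings, the table name "mtcars_tbl", and the choice of the built-in mtcars data set are illustrative assumptions.

```r
library(sparklyr)

# Assumption: a local Spark installation is available
# (for example, one installed via spark_install()).
sc <- spark_connect(master = "local")

# Copy a small built-in data set into Spark; "mtcars_tbl" is an arbitrary name.
cars_tbl <- sdf_copy_to(sc, mtcars, name = "mtcars_tbl", overwrite = TRUE)

# Quartiles of the mpg column, allowing a 1% relative error.
q <- sdf_quantile(
  cars_tbl,
  column = "mpg",
  probabilities = c(0.25, 0.5, 0.75),
  relative.error = 0.01
)
print(q)

spark_disconnect(sc)
```

A looser relative.error such as 0.01 is usually sufficient for exploratory summaries; tightening it toward the default of 1e-05 trades speed for precision.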