Feature Transformation -- Bucketizer

Usage

ft_bucketizer(x, input_col = NULL, output_col = NULL, splits)

Arguments

x
An object (usually a spark_tbl) coercible to a Spark DataFrame.
input_col
The name of the input column(s).
output_col
The name of the output column.
splits
A numeric vector of cutpoints, in strictly increasing order, indicating the bucket boundaries. A vector of n + 1 cutpoints defines n buckets.
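
As an illustration of how these arguments fit together, the following is a minimal sketch of a call. It assumes a local Spark installation reachable through spark_connect(master = "local"); the data frame, the table name bucketizer_demo, and the column names value and value_bucket are hypothetical and chosen only for this example.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally and can be started from R.
sc <- spark_connect(master = "local")

# Hypothetical demo data, copied into Spark as a table.
demo_df <- data.frame(id = 1:5, value = c(-0.5, 0, 0.5, 1, 2.5))
demo_tbl <- sdf_copy_to(sc, demo_df, name = "bucketizer_demo", overwrite = TRUE)

# Three buckets: below 0, from 0 up to (not including) 1, and 1 or above.
bucketed <- ft_bucketizer(
  demo_tbl,
  input_col = "value",
  output_col = "value_bucket",
  splits = c(-Inf, 0, 1, Inf)
)

# Bring the result back into R to inspect it.
result <- collect(bucketed)

spark_disconnect(sc)
```

Using -Inf and Inf as the outer cutpoints is a common way to make sure every value falls into some bucket.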

Description

Similar to R's cut function, this transforms a numeric column into a discretized column, with breaks specified through the splits parameter.
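
To make the comparison with cut concrete, here is a rough base R sketch of the bucketing rule. It is not the sparklyr implementation; the helper name bucketize is hypothetical. It assumes the behaviour described in the Spark documentation: bucket indices start at 0, and each bucket includes its lower cutpoint but not its upper one, except the last bucket, which also includes the final cutpoint.

```r
# Hypothetical helper approximating the bucketing rule with base R's cut().
bucketize <- function(values, splits) {
  # right = FALSE gives intervals closed on the left: [lower, upper).
  # include.lowest = TRUE (with right = FALSE) also admits the highest cutpoint
  # into the last interval.
  # labels = FALSE returns integer codes starting at 1, so subtract 1.
  cut(values, breaks = splits, right = FALSE,
      include.lowest = TRUE, labels = FALSE) - 1
}

values <- c(-0.5, 0, 0.5, 1, 2.5)
splits <- c(-Inf, 0, 1, Inf)
buckets <- bucketize(values, splits)
print(buckets)
```

One difference to keep in mind: in this sketch, values outside the range of splits become NA, which may not match how Spark treats such values.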

See also

See http://spark.apache.org/docs/latest/ml-features.html for more information on the set of transformations available for DataFrame columns in Spark.

Other feature transformation routines: ft_binarizer, ft_discrete_cosine_transform, ft_elementwise_product, ft_index_to_string, ft_one_hot_encoder, ft_quantile_discretizer, ft_sql_transformer, ft_string_indexer, ft_vector_assembler, sdf_mutate