Feature Transformation -- Binarizer

Apply thresholding to a column, such that values less than or equal to the threshold are assigned the value 0.0, and values greater than the threshold are assigned the value 1.0. Column output is numeric for compatibility with other modeling functions.

ft_binarizer(x, input.col, output.col, threshold = 0.5, ...)

Arguments

x

An object (usually a spark_tbl) coercable to a Spark DataFrame.

input.col

The name of the input column(s).

output.col

The name of the output column.

threshold

The numeric threshold.

...

Optional arguments; currently unused.

See also

See http://spark.apache.org/docs/latest/ml-features for more information on the set of transformations available for DataFrame columns in Spark.

Other feature transformation routines: ft_bucketizer, ft_count_vectorizer, ft_discrete_cosine_transform, ft_elementwise_product, ft_index_to_string, ft_one_hot_encoder, ft_quantile_discretizer, ft_regex_tokenizer, ft_stop_words_remover, ft_string_indexer, ft_tokenizer, ft_vector_assembler, sdf_mutate