Serialize a Spark DataFrame to the Parquet format.

spark_write_parquet(x, path, mode = NULL, options = list())

Arguments

x

A Spark DataFrame or a dplyr operation.

path

The path to write the output to. It must be accessible from the cluster. Supports the "hdfs://", "s3n://" and "file://" protocols.

mode

Specifies the behavior when data or a table already exists at the destination. Supported values include "error", "append", "overwrite" and "ignore".

options

A list of strings with additional options. See http://spark.apache.org/docs/latest/sql-programming-guide.html#configuration.
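
Examples

The following is a minimal sketch of how the arguments above fit together, not a definitive recipe. It assumes a local Spark installation (for example one set up with spark_install()), uses the built-in mtcars data set, and writes to a hypothetical temporary path; the table names "mtcars" and "mtcars_back" are arbitrary. On a real cluster the path would normally be an "hdfs://" or "s3n://" location.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally and can be started in local mode.
sc <- spark_connect(master = "local")

# Copy a small built-in data set into Spark; the table name is arbitrary.
mtcars_tbl <- copy_to(sc, mtcars, "mtcars", overwrite = TRUE)

# Hypothetical output location, chosen only for illustration.
out_path <- file.path(tempdir(), "mtcars_parquet")

# Write the full table, replacing anything already at out_path.
spark_write_parquet(mtcars_tbl, path = out_path, mode = "overwrite")

# x may also be a dplyr operation; here the filtered rows are appended
# to the data written in the previous step.
four_cyl <- mtcars_tbl %>% filter(cyl == 4)
spark_write_parquet(four_cyl, path = out_path, mode = "append")

# Read the result back and count the rows that ended up on disk.
mtcars_back <- spark_read_parquet(sc, name = "mtcars_back", path = out_path)
n_written <- mtcars_back %>% summarise(n = n()) %>% collect()

spark_disconnect(sc)
```

Using mode = "overwrite" for the first write keeps the example repeatable; with mode = "error" a second run would stop because the path already exists.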

See also

Other Spark serialization routines: spark_load_table, spark_read_csv, spark_read_jdbc, spark_read_json, spark_read_parquet, spark_read_table, spark_save_table, spark_write_csv, spark_write_json, spark_write_table