pyspark.sql.DataFrame.checkpoint
DataFrame.checkpoint(eager=True)

Returns a checkpointed version of this DataFrame. Checkpointing can be used to truncate the logical plan of this DataFrame, which is especially useful in iterative algorithms where the plan may grow exponentially. The checkpointed data is saved to files inside the checkpoint directory set with SparkContext.setCheckpointDir().

Parameters
eager – Whether to checkpoint this DataFrame immediately.
Note: Experimental
New in version 2.1.