pyspark.sql.DataFrame.corr
DataFrame.corr(col1, col2, method=None)
Calculates the correlation of two columns of a DataFrame as a double value. Currently only supports the Pearson Correlation Coefficient. DataFrame.corr() and DataFrameStatFunctions.corr() are aliases of each other.

Parameters
col1 – The name of the first column
col2 – The name of the second column
method – The correlation method. Currently only supports “pearson”
New in version 1.4.
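
Examples
A minimal sketch of calling corr() on two numeric columns, with a plain-Python Pearson computation as a cross-check. The column names x and y, the sample rows, and the helper names pearson and corr_with_spark are hypothetical and chosen only for illustration; the Spark call assumes a local Spark installation with a working JVM.

```python
import math


def pearson(xs, ys):
    """Plain-Python Pearson correlation, used here only as a cross-check."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def corr_with_spark(rows):
    """Compute the correlation with DataFrame.corr (needs pyspark and a JVM)."""
    # Imported lazily so the pure-Python helper above works without Spark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("corr-example")
        .getOrCreate()
    )
    try:
        # Hypothetical column names "x" and "y".
        df = spark.createDataFrame(rows, ["x", "y"])
        # method may be omitted; "pearson" is the only supported value.
        return df.corr("x", "y", method="pearson")
    finally:
        spark.stop()


# Hypothetical sample data: y is an exact linear function of x.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
expected = pearson([r[0] for r in rows], [r[1] for r in rows])
```

With Spark available, corr_with_spark(rows) should agree with expected up to floating-point rounding, since both compute the Pearson coefficient over the same pairs.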