FPGrowth

class pyspark.mllib.fpm.FPGrowth

A parallel FP-growth algorithm for mining frequent itemsets.

New in version 1.4.0.

Methods

Methods Documentation

classmethod train(data, minSupport=0.3, numPartitions=-1)

Computes an FP-Growth model that contains frequent itemsets.

Parameters
  • data – The input data set, an RDD in which each element is one transaction (a list of items).

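  • minSupport – The minimum support level, expressed as the fraction of transactions that must contain an itemset for it to count as frequent. (default: 0.3)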

  • numPartitions – The number of partitions used by parallel FP-growth. A value of -1 uses the same number of partitions as the input data. (default: -1)
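To make the meaning of minSupport concrete, the sketch below counts itemset support by brute force in plain Python. It is not the FP-growth algorithm and does not use Spark. The helper name `frequent_itemsets` and the sample transactions are hypothetical, chosen only for illustration, and the sketch assumes that an itemset is frequent when it appears in at least minSupport × N of the N transactions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force illustration of support counting (NOT FP-growth).

    Returns a dict mapping each frequent itemset (a sorted tuple of items)
    to the number of transactions that contain it.
    """
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        # Enumerate every non-empty subset of this transaction.
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] += 1
    # Assumption: "frequent" means count >= min_support * n.
    threshold = min_support * n
    return {itemset: c for itemset, c in counts.items() if c >= threshold}


# Hypothetical sample data: four transactions.
transactions = [
    ["a", "b", "c"],
    ["a", "b", "d", "e"],
    ["a", "c", "e"],
    ["a", "c", "f"],
]

# With min_support=0.6 an itemset must occur in at least 2.4 of the
# 4 transactions, which in practice means 3 or more.
result = frequent_itemsets(transactions, 0.6)
```

Enumerating all subsets is exponential in transaction length, so this is only suitable for tiny inputs; it is meant to show what the threshold selects, not how to compute it at scale.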

New in version 1.4.0.