GaussianMixture
class pyspark.mllib.clustering.GaussianMixture
Learning algorithm for Gaussian mixtures using the expectation-maximization algorithm.
New in version 1.3.0.
Methods Documentation
classmethod train(rdd, k, convergenceTol=0.001, maxIterations=100, seed=None, initialModel=None)
Train a Gaussian Mixture clustering model.
Parameters
rdd – Training points as an RDD of Vector or convertible sequence types.
k – Number of independent Gaussians in the mixture model.
convergenceTol – Maximum change in log-likelihood at which convergence is considered to have occurred. (default: 1e-3)
maxIterations – Maximum number of iterations allowed. (default: 100)
seed – Random seed for the initial Gaussian distributions. Set to None to generate a seed based on system time. (default: None)
initialModel – Initial GMM starting point, bypassing the random initialization. (default: None)
New in version 1.3.0.
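The interplay of k, convergenceTol, maxIterations, seed and initialModel can be illustrated with a small expectation-maximization loop. The following is a minimal sketch in plain Python for one-dimensional data, not Spark's implementation: the function fit_gmm_1d, its initial_means argument (standing in for initialModel) and the variance floor of 1e-6 are all hypothetical names and choices made for illustration only.

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(points, k, convergence_tol=1e-3, max_iterations=100,
               seed=None, initial_means=None):
    """Hypothetical helper: EM for a 1-D Gaussian mixture (illustration only)."""
    n = len(points)
    if initial_means is not None:
        # Analogue of initialModel: skip the random initialization.
        mus = list(initial_means)
    else:
        # Analogue of seed: pick k data points as starting means.
        mus = random.Random(seed).sample(points, k)
    mean = sum(points) / n
    var0 = max(sum((x - mean) ** 2 for x in points) / n, 1e-6)
    variances = [var0] * k
    weights = [1.0 / k] * k

    prev_ll = None
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        # E-step: responsibility of each component for each point,
        # accumulating the log-likelihood of the current parameters.
        ll = 0.0
        resp = []
        for x in points:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, mus, variances)]
            total = sum(dens)
            ll += math.log(total)
            resp.append([d / total for d in dens])

        # Stop once the log-likelihood changes by at most convergence_tol.
        if prev_ll is not None and abs(ll - prev_ll) <= convergence_tol:
            break
        prev_ll = ll

        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            mus[j] = sum(r[j] * x for r, x in zip(resp, points)) / nj
            var_j = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, points)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid a zero variance

    return weights, mus, variances, iteration


# Two well-separated groups of points, fitted with k=2.
data = [-0.2, -0.1, 0.0, 0.1, 0.2, 9.8, 9.9, 10.0, 10.1, 10.2]
weights, means, variances, iterations = fit_gmm_1d(data, 2, initial_means=[1.0, 9.0])
print(weights, means, variances, iterations)
```

In this sketch a smaller convergence_tol or a larger max_iterations lets the loop run longer before stopping, and supplying initial_means makes the result independent of the seed.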
-
classmethod