HashingTF
class pyspark.mllib.feature.HashingTF(numFeatures=1048576)
Maps a sequence of terms to their term frequencies using the hashing trick.
Note
The terms must be hashable (they cannot be dict, set, list…).
Parameters
numFeatures – number of features, i.e. the size of the output vector (default: 2^20 = 1048576)
>>> htf = HashingTF(100)
>>> doc = "a a b b c d".split(" ")
>>> htf.transform(doc)
SparseVector(100, {...})
New in version 1.2.0.
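The hashing trick used above can be illustrated with a minimal pure-Python sketch. It is not Spark's implementation: the function name `hashing_tf_sketch` is hypothetical, and CRC32 of the term's `repr` is an assumed stand-in for the hash function, so the indices it produces will generally differ from those of `HashingTF.transform`. The idea is the same, though: each term is hashed to a bucket index in `[0, numFeatures)`, and the count in that bucket is incremented.

```python
import zlib
from collections import Counter


def hashing_tf_sketch(terms, num_features=1 << 20):
    """Return {bucket_index: term_frequency} for a sequence of terms.

    Illustrative only; this is not the code Spark runs.
    """
    counts = Counter()
    for term in terms:
        # Assumption: CRC32 of repr(term) stands in for the hash function.
        # Spark uses a different hash, so its indices will not match these.
        index = zlib.crc32(repr(term).encode("utf-8")) % num_features
        counts[index] += 1.0
    return dict(counts)


doc = "a a b b c d".split(" ")
tf = hashing_tf_sketch(doc, num_features=100)
# tf maps at most four bucket indices (one per distinct term, fewer if
# two terms collide) to frequencies that sum to len(doc).
```

Because distinct terms can hash to the same bucket, a small `numFeatures` increases the chance of collisions; the default of 2^20 keeps them rare for typical vocabularies.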
Methods
Methods Documentation