Version 0.11.1

Changelog

Bug fixes

  • Fix a bug in SMOTENC where the entries of the one-hot encoding should be divided by sqrt(2) and not 2, since they are plugged into a Euclidean distance computation. #1014 by Guillaume Lemaitre.

  • Raise an informative error message when all support vectors are tagged as noise in SVMSMOTE. #1016 by Guillaume Lemaitre.

  • Fix a bug in SMOTENC where the median of the standard deviations of the continuous features was only computed on the minority class. This statistic is now computed for each class that is up-sampled. #1015 by Guillaume Lemaitre.

  • Fix a bug in SMOTENC so that the case where the median of the standard deviations of the continuous features is zero is also handled in the multiclass case. #1015 by Guillaume Lemaitre.

  • Fix a bug in BorderlineSMOTE version 2 where samples should be generated from the whole dataset and not only from the minority class. #1023 by Guillaume Lemaitre.
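
The sqrt(2) scaling from #1014 can be checked with a little arithmetic. The sketch below is not the library's code: it assumes that two samples differing in one categorical feature should contribute the squared median standard deviation to the squared Euclidean distance, and the helper name and the value of `median_std` are made up for illustration.

```python
import math


def categorical_sq_distance(entry_value):
    """Squared Euclidean distance between two one-hot rows whose active
    categories differ, when each active entry holds `entry_value`."""
    row_a = [entry_value, 0.0]  # sample in category A
    row_b = [0.0, entry_value]  # sample in category B
    return sum((a - b) ** 2 for a, b in zip(row_a, row_b))


median_std = 3.0  # illustrative value, not taken from any dataset

# Two positions differ, so the squared distance is 2 * entry_value**2.
with_half = categorical_sq_distance(median_std / 2)
with_sqrt2 = categorical_sq_distance(median_std / math.sqrt(2))

print(with_half, median_std**2 / 2)   # dividing by 2 gives half the target
print(with_sqrt2, median_std**2)      # dividing by sqrt(2) gives the target
```

Because a category mismatch changes two one-hot positions at once, dividing by 2 under-weights the mismatch by a factor of two in squared distance, while dividing by sqrt(2) yields exactly `median_std**2`.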

Version 0.11.0

July 8, 2023

Changelog

Bug fixes

Compatibility

Deprecation

Enhancements

  • SMOTENC now accepts a categorical_encoder parameter that allows specifying a OneHotEncoder with custom parameters. #1000 by Guillaume Lemaitre.

  • SMOTEN now accepts a categorical_encoder parameter that allows specifying an OrdinalEncoder with custom parameters. A new fitted attribute categorical_encoder_ is exposed to access the fitted encoder. #1001 by Guillaume Lemaitre.

  • RandomUnderSampler and RandomOverSampler (when shrinkage is not None) now accept any data types and will not attempt any data conversion. #1004 by Guillaume Lemaitre.

  • SMOTENC now supports an array-like of str for the categorical_features parameter. #1008 by Guillaume Lemaitre.

  • SMOTENC now supports automatic categorical inference when categorical_features is set to "auto". #1009 by Guillaume Lemaitre.