[Python] Documentation for imblearn, a module for handling imbalanced samples (to be updated)

Installation: https://github.com/scikit-learn-contrib/imbalanced-learn
Reference: https://blog.csdn.net/kizgel/article/details/78553009#214-数学公式

Help on package imblearn:

NAME
    imblearn - Toolbox for imbalanced datasets in machine learning.

DESCRIPTION
    ``imbalanced-learn`` is a set of python methods to deal with imbalanced
    datasets in machine learning and pattern recognition.
    
    Subpackages
    -----------
    combine
        Module which provides methods based on over-sampling and under-sampling.
    ensemble
        Module which provides methods generating an ensemble of
        under-sampled subsets.
    exceptions
        Module including custom warnings and error classes used across
        imbalanced-learn.
    keras
        Module which provides custom generators and layers for deep learning
        using keras.
    metrics
        Module which provides metrics to quantify the classification performance
        with imbalanced datasets.
    over_sampling
        Module which provides methods to over-sample a dataset.
    tensorflow
        Module which provides custom generators and layers for deep learning
        using tensorflow.
    under_sampling
        Module which provides methods to under-sample a dataset.
    utils
        Module including various utilities.
    pipeline
        Module which allows creating pipelines with scikit-learn estimators.

PACKAGE CONTENTS
    _version
    base
    combine (package)
    datasets (package)
    ensemble (package)
    exceptions
    keras (package)
    metrics (package)
    over_sampling (package)
    pipeline
    tensorflow (package)
    tests (package)
    under_sampling (package)
    utils (package)

CLASSES
    imblearn.base.BaseSampler(imblearn.base.SamplerMixin)
        imblearn.base.FunctionSampler
    
    class FunctionSampler(BaseSampler)
     |  Construct a sampler from calling an arbitrary callable.
     |  
     |  Read more in the :ref:`User Guide <function_sampler>`.
     |  
     |  Parameters
     |  ----------
     |  func : callable or None,
     |      The callable to use for the transformation. This will be passed the
     |      same arguments as transform, with args and kwargs forwarded. If func is
     |      None, then func will be the identity function.
     |  
     |  accept_sparse : bool, optional (default=True)
     |      Whether sparse inputs are supported. By default, sparse inputs are
     |      supported.
     |  
     |  kw_args : dict, optional (default=None)
     |      The keyword argument expected by ``func``.
     |  
     |  Notes
     |  -----
     |  
     |  See
     |  :ref:`sphx_glr_auto_examples_plot_outlier_rejections.py`
     |  
     |  Examples
     |  --------
     |  >>> import numpy as np
     |  >>> from sklearn.datasets import make_classification
     |  >>> from imblearn import FunctionSampler
     |  >>> X, y = make_classification(n_classes=2, class_sep=2,
     |  ... weights=[0.1, 0.9], n_informative=3, n_redundant=1, flip_y=0,
     |  ... n_features=20, n_clusters_per_class=1, n_samples=1000, random_state=10)
     |  
     |  For instance, we can create a sampler that selects only the first ten samples.
     |  
     |  >>> def func(X, y):
     |  ...   return X[:10], y[:10]
     |  >>> sampler = FunctionSampler(func=func)
     |  >>> X_res, y_res = sampler.fit_resample(X, y)
     |  >>> np.all(X_res == X[:10])
     |  True
     |  >>> np.all(y_res == y[:10])
     |  True
     |  
     |  We can also create a specific function which takes some arguments.
     |  
     |  >>> from collections import Counter
     |  >>> from imblearn.under_sampling import RandomUnderSampler
     |  >>> def func(X, y, sampling_strategy, random_state):
     |  ...   return RandomUnderSampler(
     |  ...       sampling_strategy=sampling_strategy,
     |  ...       random_state=random_state).fit_resample(X, y)
     |  >>> sampler = FunctionSampler(func=func,
     |  ...                           kw_args={'sampling_strategy': 'auto',
     |  ...                                    'random_state': 0})
     |  >>> X_res, y_res = sampler.fit_resample(X, y)
     |  >>> print('Resampled dataset shape {}'.format(
     |  ...     sorted(Counter(y_res).items())))
     |  Resampled dataset shape [(0, 100), (1, 100)]
     |  
     |  Method resolution order:
     |      FunctionSampler
     |      BaseSampler
     |      SamplerMixin
     |      abc.NewBase
     |      sklearn.base.BaseEstimator
     |      builtins.object
     |  
     |  Methods defined here:
     |  
     |  __init__(self, func=None, accept_sparse=True, kw_args=None)
     |      Initialize self.  See help(type(self)) for accurate signature.
     |  
     |  ----------------------------------------------------------------------
     |  Data and other attributes defined here:
     |  
     |  __abstractmethods__ = frozenset()
     |  
     |  ----------------------------------------------------------------------
     |  Data descriptors inherited from BaseSampler:
     |  
     |  ratio_
     |  
     |  ----------------------------------------------------------------------
     |  Methods inherited from SamplerMixin:
     |  
     |  fit(self, X, y)
     |      Check inputs and statistics of the sampler.
     |      
     |      You should use ``fit_resample`` in all cases.
     |      
     |      Parameters
     |      ----------
     |      X : {array-like, sparse matrix}, shape (n_samples, n_features)
     |          Data array.
     |      
     |      y : array-like, shape (n_samples,)
     |          Target array.
     |      
     |      Returns
     |      -------
     |      self : object
     |          Return the instance itself.
     |  
     |  fit_resample(self, X, y)
     |      Resample the dataset.
     |      
     |      Parameters
     |      ----------
     |      X : {array-like, sparse matrix}, shape (n_samples, n_features)
     |          Matrix containing the data which have to be sampled.
     |      
     |      y : array-like, shape (n_samples,)
     |          Corresponding label for each sample in X.
     |      
     |      Returns
     |      -------
     |      X_resampled : {array-like, sparse matrix}, shape (n_samples_new, n_features)
     |          The array containing the resampled data.
     |      
     |      y_resampled : array-like, shape (n_samples_new,)
     |          The corresponding label of `X_resampled`.
     |  
     |  fit_sample = fit_resample(self, X, y)
     |      Resample the dataset.
     |      
     |      Parameters
     |      ----------
     |      X : {array-like, sparse matrix}, shape (n_samples, n_features)
     |          Matrix containing the data which have to be sampled.
     |      
     |      y : array-like, shape (n_samples,)
     |          Corresponding label for each sample in X.
     |      
     |      Returns
     |      -------
     |      X_resampled : {array-like, sparse matrix}, shape (n_samples_new, n_features)
     |          The array containing the resampled data.
     |      
     |      y_resampled : array-like, shape (n_samples_new,)
     |          The corresponding label of `X_resampled`.
     |  
     |  ----------------------------------------------------------------------
     |  Methods inherited from sklearn.base.BaseEstimator:
     |  
     |  __getstate__(self)
     |  
     |  __repr__(self)
     |      Return repr(self).
     |  
     |  __setstate__(self, state)
     |  
     |  get_params(self, deep=True)
     |      Get parameters for this estimator.
     |      
     |      Parameters
     |      ----------
     |      deep : boolean, optional
     |          If True, will return the parameters for this estimator and
     |          contained subobjects that are estimators.
     |      
     |      Returns
     |      -------
     |      params : mapping of string to any
     |          Parameter names mapped to their values.
     |  
     |  set_params(self, **params)
     |      Set the parameters of this estimator.
     |      
     |      The method works on simple estimators as well as on nested objects
     |      (such as pipelines). The latter have parameters of the form
     |      ``<component>__<parameter>`` so that it's possible to update each
     |      component of a nested object.
     |      
     |      Returns
     |      -------
     |      self
     |  
     |  ----------------------------------------------------------------------
     |  Data descriptors inherited from sklearn.base.BaseEstimator:
     |  
     |  __dict__
     |      dictionary for instance variables (if defined)
     |  
     |  __weakref__
     |      list of weak references to the object (if defined)

DATA
    __all__ = ['FunctionSampler', '__version__']

VERSION
    0.4.0.dev0

FILE
    /opt/conda/lib/python3.6/site-packages/imbalanced_learn-0.4.0.dev0-py3.6.egg/imblearn/__init__.py
