小猪课堂

[Python Machine Learning] DecisionTreeClassifier Explained in Detail

Click through for another author's summary of the DecisionTreeClassifier decision tree classifier.

DecisionTreeClassifier is a classification tree.
The other kind is the regression tree, DecisionTreeRegressor.

First we import the tree module from the sklearn package. We will learn sklearn one small step at a time.

from sklearn import tree

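Anyone who wants to read the full help() output can find it further down. I suspect most people who arrive here from a search would rather not, so let's just understand the theory and then put it to use.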

clf=tree.DecisionTreeClassifier()
clf
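Creating the classifier is only the first step; it still has to be fitted to data before it can predict anything. Here is a minimal sketch of that workflow. It is not from the original post: the iris dataset bundled with sklearn is assumed purely as stand-in data, and random_state=0 is an arbitrary choice.

```python
from sklearn import tree
from sklearn.datasets import load_iris

# Assumption: iris is only sample data; the post itself does not name a dataset.
X, y = load_iris(return_X_y=True)

clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(X, y)                      # build the tree from the training set
predictions = clf.predict(X)       # one predicted class label per row of X
train_accuracy = clf.score(X, y)   # mean accuracy on the data passed in
print(predictions[:5], train_accuracy)
```

Scoring on the training data like this only shows that the calls work; for a real estimate of quality you would hold out a test set or use cross-validation.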

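Let's break DecisionTreeClassifier() down piece by piece. Just remember that the class name is written in CamelCase. That is how sklearn names its classes, while parameter and function names such as min_samples_leaf use lowercase words joined by underscores.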

DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,
                       max_features=None, max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, presort=False,
                       random_state=None, splitter='best')

For example:
DecisionTreeClassifier(criterion='entropy', min_samples_leaf=3) creates a decision tree model. The parameters have the following meanings:

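class_weight: the weight given to each class. Its main purpose is to keep the trained tree from leaning too heavily toward classes that have far more training samples than the others. You can specify the weight of each class yourself, or pass "balanced", in which case the algorithm computes the weights itself and classes with fewer samples get higher weights.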
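The help() output quoted later in this post gives the "balanced" rule as n_samples / (n_classes * np.bincount(y)). The sketch below reproduces that arithmetic in plain Python so the effect is visible; the helper name balanced_class_weights is made up for this illustration and is not part of sklearn.

```python
from collections import Counter

def balanced_class_weights(y):
    """Hypothetical helper: n_samples / (n_classes * samples_in_class)."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}

# Three samples of class 0 and one sample of class 1
y = [0, 0, 0, 1]
weights = balanced_class_weights(y)
print(weights)  # the rare class 1 gets the larger weight
```

In practice you would simply pass class_weight="balanced" to the classifier; the point here is only to show why the rarer class ends up weighted more heavily.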

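criterion: "gini" or "entropy". The former measures split quality with Gini impurity, the latter with information entropy (information gain).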
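To make the two criteria concrete, here is a small sketch that computes both measures for the labels in a node. The function names are made up for this illustration, and the formulas are the standard textbook definitions rather than code taken from sklearn.

```python
import math
from collections import Counter

def class_proportions(labels):
    """Fraction of the node's samples that belongs to each class."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]

def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_k ** 2)."""
    return 1.0 - sum(p * p for p in class_proportions(labels))

def entropy(labels):
    """Information entropy in bits: sum(-p_k * log2(p_k))."""
    return sum(-p * math.log2(p) for p in class_proportions(labels))

mixed = [0, 0, 1, 1]   # evenly mixed node, the worst case for two classes
pure = [1, 1, 1, 1]    # pure node, nothing left to split

print(gini_impurity(mixed), entropy(mixed))
print(gini_impurity(pure), entropy(pure))
```

Both measures are zero for a pure node and largest for an evenly mixed one, which is why the two criteria often lead to similar trees.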

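max_depth: int or None, optional (default=None). Sets the maximum depth of the decision tree. The deeper the tree, the more easily it overfits; as a rule of thumb, try a depth somewhere between 5 and 20.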
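A quick way to see what max_depth does is to compare a fully grown tree with a capped one. This is a sketch under assumptions: iris as sample data, and max_depth=2 chosen only because iris is tiny, not because it is a recommended value.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Assumption: iris is only sample data for the comparison.
X, y = load_iris(return_X_y=True)

full = DecisionTreeClassifier(random_state=0).fit(X, y)
shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# get_depth() reports the depth the fitted tree actually reached
print(full.get_depth(), shallow.get_depth())
print(full.get_n_leaves(), shallow.get_n_leaves())
```

The capped tree has fewer leaves and fits the training data less closely, which is exactly the trade-off against overfitting described above.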

max_features: None (use all features), "log2", "sqrt", or a number N. With fewer than 50 features, all of them are generally used.
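After fitting, the max_features_ attribute shows how many features each setting resolves to. The sketch below assumes iris (4 features) as sample data; the settings tried are arbitrary examples.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Assumption: iris is only sample data; it has 4 features.
X, y = load_iris(return_X_y=True)

resolved = {}
for setting in (None, "sqrt", "log2", 3):
    clf = DecisionTreeClassifier(max_features=setting, random_state=0)
    clf.fit(X, y)
    # max_features_ is the number of features considered at each split
    resolved[setting] = clf.max_features_

print(resolved)
```

With only 4 features, "sqrt" and "log2" both come out to 2, so the difference between them only starts to matter on wider datasets.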

max_leaf_nodes: limiting the maximum number of leaf nodes helps prevent overfitting. The default is None, which means the number of leaf nodes is not limited.

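min_impurity_decrease: float, default 0.0. A node is split only if the split reduces the weighted impurity by at least this value.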
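The help() output further down states the weighted impurity decrease as N_t / N * (impurity - N_t_R / N_t * right_impurity - N_t_L / N_t * left_impurity). Here is that formula worked through for one made-up split; the function name and the sample counts are assumptions for illustration only.

```python
def weighted_impurity_decrease(n_total, n_node, n_left, n_right,
                               impurity, left_impurity, right_impurity):
    """The formula from the help text, written out term by term."""
    return n_node / n_total * (
        impurity
        - n_right / n_node * right_impurity
        - n_left / n_node * left_impurity
    )

# Made-up root node: 10 samples, 5 per class, so its Gini impurity is 0.5.
# Left child: 4 samples of a single class, Gini impurity 0.
# Right child: 6 samples split 1 vs 5 between the classes.
right_gini = 1 - (1 / 6) ** 2 - (5 / 6) ** 2
decrease = weighted_impurity_decrease(
    n_total=10, n_node=10, n_left=4, n_right=6,
    impurity=0.5, left_impurity=0.0, right_impurity=right_gini,
)
print(decrease)  # about 0.333
```

This split would be accepted for any min_impurity_decrease up to about 0.333 and rejected above that.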

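random_state: int, RandomState instance, or None (default=None). It seeds the random number generator used while building the tree. Fix it to an int to get the same tree every time you fit.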
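A small check of the reproducibility that random_state provides: fit twice with the same seed and compare the resulting trees. This is a sketch assuming iris as sample data, with 42 as an arbitrary seed.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Assumption: iris is only sample data; 42 is an arbitrary seed.
X, y = load_iris(return_X_y=True)

a = DecisionTreeClassifier(random_state=42).fit(X, y)
b = DecisionTreeClassifier(random_state=42).fit(X, y)

# tree_.feature and tree_.threshold hold the split feature and split
# threshold of every node, so equal arrays mean identical splits.
same_structure = (np.array_equal(a.tree_.feature, b.tree_.feature)
                  and np.array_equal(a.tree_.threshold, b.tree_.threshold))
print(same_structure)
```

Without a fixed seed, ties between equally good splits may be broken differently from one fit to the next, as the Notes section of the help text points out.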

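min_impurity_split: this value limits the growth of the tree. If a node's impurity (Gini impurity, information gain, mean squared error, mean absolute error) is below this threshold, the node produces no children and becomes a leaf. According to the help text below, it has been deprecated since version 0.19 in favor of min_impurity_decrease.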

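min_samples_leaf: the minimum number of samples a leaf node must hold. If a leaf would end up with fewer samples than this value, it is pruned together with its sibling; in other words, the split that would create it is not made.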

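min_samples_split: the minimum number of samples a node must contain before it can be split. A node with fewer samples than this value is not split any further.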
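One way to confirm that min_samples_leaf is respected is to read the leaf sizes out of the fitted tree. The sketch reuses the criterion='entropy', min_samples_leaf=3 settings from the example above; iris as sample data and random_state=0 are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Assumption: iris is only sample data.
X, y = load_iris(return_X_y=True)

clf = DecisionTreeClassifier(criterion="entropy", min_samples_leaf=3,
                             random_state=0).fit(X, y)

# In the low-level tree_ object a leaf has no children, marked with -1
is_leaf = clf.tree_.children_left == -1
leaf_sizes = clf.tree_.n_node_samples[is_leaf]

print(leaf_sizes)        # number of training samples in each leaf
print(leaf_sizes.min())  # never below min_samples_leaf
```

Raising min_samples_leaf makes every leaf represent more samples, which smooths the model at the cost of some training accuracy.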

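min_weight_fraction_leaf: the minimum share of the total sample weight that a leaf node must hold. A leaf that falls below this value is pruned together with its sibling. The default is 0, which means sample weights are not taken into account.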

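presort: bool, default False. Whether to presort the data to speed up the search for the best split during fitting. It may speed up training on a small dataset or a depth-limited tree, but on a large dataset with default settings it can slow training down.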

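splitter: "best" or "random". "best" chooses the best split at each node, while "random" chooses the best among randomly drawn splits. The default "best" suits datasets that are not too large; when the dataset is very large, "random" is the recommended choice for building the tree.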

For more content, follow the 【小猪课堂】 official account on WeChat; all the practical material you are looking for is there.

The help() output follows. Personally, I would rather not read it.

============================================================

Help on DecisionTreeClassifier in module sklearn.tree.tree object:

class DecisionTreeClassifier(BaseDecisionTree, sklearn.base.ClassifierMixin)
 |  DecisionTreeClassifier(criterion='gini', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, class_weight=None, presort=False)
 |  
 |  A decision tree classifier.
 |  
 |  Read more in the :ref:`User Guide <tree>`.
 |  
 |  Parameters
 |  ----------
 |  criterion : string, optional (default="gini")
 |      The function to measure the quality of a split. Supported criteria are
 |      "gini" for the Gini impurity and "entropy" for the information gain.
 |  
 |  splitter : string, optional (default="best")
 |      The strategy used to choose the split at each node. Supported
 |      strategies are "best" to choose the best split and "random" to choose
 |      the best random split.
 |  
 |  max_depth : int or None, optional (default=None)
 |      The maximum depth of the tree. If None, then nodes are expanded until
 |      all leaves are pure or until all leaves contain less than
 |      min_samples_split samples.
 |  
 |  min_samples_split : int, float, optional (default=2)
 |      The minimum number of samples required to split an internal node:
 |  
 |      - If int, then consider `min_samples_split` as the minimum number.
 |      - If float, then `min_samples_split` is a fraction and
 |        `ceil(min_samples_split * n_samples)` are the minimum
 |        number of samples for each split.
 |  
 |      .. versionchanged:: 0.18
 |         Added float values for fractions.
 |  
 |  min_samples_leaf : int, float, optional (default=1)
 |      The minimum number of samples required to be at a leaf node.
 |      A split point at any depth will only be considered if it leaves at
 |      least ``min_samples_leaf`` training samples in each of the left and
 |      right branches.  This may have the effect of smoothing the model,
 |      especially in regression.
 |  
 |      - If int, then consider `min_samples_leaf` as the minimum number.
 |      - If float, then `min_samples_leaf` is a fraction and
 |        `ceil(min_samples_leaf * n_samples)` are the minimum
 |        number of samples for each node.
 |  
 |      .. versionchanged:: 0.18
 |         Added float values for fractions.
 |  
 |  min_weight_fraction_leaf : float, optional (default=0.)
 |      The minimum weighted fraction of the sum total of weights (of all
 |      the input samples) required to be at a leaf node. Samples have
 |      equal weight when sample_weight is not provided.
 |  
 |  max_features : int, float, string or None, optional (default=None)
 |      The number of features to consider when looking for the best split:
 |  
 |          - If int, then consider `max_features` features at each split.
 |          - If float, then `max_features` is a fraction and
 |            `int(max_features * n_features)` features are considered at each
 |            split.
 |          - If "auto", then `max_features=sqrt(n_features)`.
 |          - If "sqrt", then `max_features=sqrt(n_features)`.
 |          - If "log2", then `max_features=log2(n_features)`.
 |          - If None, then `max_features=n_features`.
 |  
 |      Note: the search for a split does not stop until at least one
 |      valid partition of the node samples is found, even if it requires to
 |      effectively inspect more than ``max_features`` features.
 |  
 |  random_state : int, RandomState instance or None, optional (default=None)
 |      If int, random_state is the seed used by the random number generator;
 |      If RandomState instance, random_state is the random number generator;
 |      If None, the random number generator is the RandomState instance used
 |      by `np.random`.
 |  
 |  max_leaf_nodes : int or None, optional (default=None)
 |      Grow a tree with ``max_leaf_nodes`` in best-first fashion.
 |      Best nodes are defined as relative reduction in impurity.
 |      If None then unlimited number of leaf nodes.
 |  
 |  min_impurity_decrease : float, optional (default=0.)
 |      A node will be split if this split induces a decrease of the impurity
 |      greater than or equal to this value.
 |  
 |      The weighted impurity decrease equation is the following::
 |  
 |          N_t / N * (impurity - N_t_R / N_t * right_impurity
 |                              - N_t_L / N_t * left_impurity)
 |  
 |      where ``N`` is the total number of samples, ``N_t`` is the number of
 |      samples at the current node, ``N_t_L`` is the number of samples in the
 |      left child, and ``N_t_R`` is the number of samples in the right child.
 |  
 |      ``N``, ``N_t``, ``N_t_R`` and ``N_t_L`` all refer to the weighted sum,
 |      if ``sample_weight`` is passed.
 |  
 |      .. versionadded:: 0.19
 |  
 |  min_impurity_split : float, (default=1e-7)
 |      Threshold for early stopping in tree growth. A node will split
 |      if its impurity is above the threshold, otherwise it is a leaf.
 |  
 |      .. deprecated:: 0.19
 |         ``min_impurity_split`` has been deprecated in favor of
 |         ``min_impurity_decrease`` in 0.19. The default value of
 |         ``min_impurity_split`` will change from 1e-7 to 0 in 0.23 and it
 |         will be removed in 0.25. Use ``min_impurity_decrease`` instead.
 |  
 |  class_weight : dict, list of dicts, "balanced" or None, default=None
 |      Weights associated with classes in the form ``{class_label: weight}``.
 |      If not given, all classes are supposed to have weight one. For
 |      multi-output problems, a list of dicts can be provided in the same
 |      order as the columns of y.
 |  
 |      Note that for multioutput (including multilabel) weights should be
 |      defined for each class of every column in its own dict. For example,
 |      for four-class multilabel classification weights should be
 |      [{0: 1, 1: 1}, {0: 1, 1: 5}, {0: 1, 1: 1}, {0: 1, 1: 1}] instead of
 |      [{1:1}, {2:5}, {3:1}, {4:1}].
 |  
 |      The "balanced" mode uses the values of y to automatically adjust
 |      weights inversely proportional to class frequencies in the input data
 |      as ``n_samples / (n_classes * np.bincount(y))``
 |  
 |      For multi-output, the weights of each column of y will be multiplied.
 |  
 |      Note that these weights will be multiplied with sample_weight (passed
 |      through the fit method) if sample_weight is specified.
 |  
 |  presort : bool, optional (default=False)
 |      Whether to presort the data to speed up the finding of best splits in
 |      fitting. For the default settings of a decision tree on large
 |      datasets, setting this to true may slow down the training process.
 |      When using either a smaller dataset or a restricted depth, this may
 |      speed up the training.
 |  
 |  Attributes
 |  ----------
 |  classes_ : array of shape = [n_classes] or a list of such arrays
 |      The classes labels (single output problem),
 |      or a list of arrays of class labels (multi-output problem).
 |  
 |  feature_importances_ : array of shape = [n_features]
 |      The feature importances. The higher, the more important the
 |      feature. The importance of a feature is computed as the (normalized)
 |      total reduction of the criterion brought by that feature.  It is also
 |      known as the Gini importance [4]_.
 |  
 |  max_features_ : int,
 |      The inferred value of max_features.
 |  
 |  n_classes_ : int or list
 |      The number of classes (for single output problems),
 |      or a list containing the number of classes for each
 |      output (for multi-output problems).
 |  
 |  n_features_ : int
 |      The number of features when ``fit`` is performed.
 |  
 |  n_outputs_ : int
 |      The number of outputs when ``fit`` is performed.
 |  
 |  tree_ : Tree object
 |      The underlying Tree object. Please refer to
 |      ``help(sklearn.tree._tree.Tree)`` for attributes of Tree object and
 |      :ref:`sphx_glr_auto_examples_tree_plot_unveil_tree_structure.py`
 |      for basic usage of these attributes.
 |  
 |  Notes
 |  -----
 |  The default values for the parameters controlling the size of the trees
 |  (e.g. ``max_depth``, ``min_samples_leaf``, etc.) lead to fully grown and
 |  unpruned trees which can potentially be very large on some data sets. To
 |  reduce memory consumption, the complexity and size of the trees should be
 |  controlled by setting those parameter values.
 |  
 |  The features are always randomly permuted at each split. Therefore,
 |  the best found split may vary, even with the same training data and
 |  ``max_features=n_features``, if the improvement of the criterion is
 |  identical for several splits enumerated during the search of the best
 |  split. To obtain a deterministic behaviour during fitting,
 |  ``random_state`` has to be fixed.
 |  
 |  See also
 |  --------
 |  DecisionTreeRegressor
 |  
 |  References
 |  ----------
 |  
 |  .. [1] https://en.wikipedia.org/wiki/Decision_tree_learning
 |  
 |  .. [2] L. Breiman, J. Friedman, R. Olshen, and C. Stone, "Classification
 |         and Regression Trees", Wadsworth, Belmont, CA, 1984.
 |  
 |  .. [3] T. Hastie, R. Tibshirani and J. Friedman. "Elements of Statistical
 |         Learning", Springer, 2009.
 |  
 |  .. [4] L. Breiman, and A. Cutler, "Random Forests",
 |         https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm
 |  
 |  Examples
 |  --------
 |  >>> from sklearn.datasets import load_iris
 |  >>> from sklearn.model_selection import cross_val_score
 |  >>> from sklearn.tree import DecisionTreeClassifier
 |  >>> clf = DecisionTreeClassifier(random_state=0)
 |  >>> iris = load_iris()
 |  >>> cross_val_score(clf, iris.data, iris.target, cv=10)
 |  ...                             # doctest: +SKIP
 |  ...
 |  array([ 1.     ,  0.93...,  0.86...,  0.93...,  0.93...,
 |          0.93...,  0.93...,  1.     ,  0.93...,  1.      ])
 |  
 |  Method resolution order:
 |      DecisionTreeClassifier
 |      BaseDecisionTree
 |      sklearn.base.BaseEstimator
 |      sklearn.base.MultiOutputMixin
 |      sklearn.base.ClassifierMixin
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, criterion='gini', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, class_weight=None, presort=False)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  fit(self, X, y, sample_weight=None, check_input=True, X_idx_sorted=None)
 |      Build a decision tree classifier from the training set (X, y).
 |      
 |      Parameters
 |      ----------
 |      X : array-like or sparse matrix, shape = [n_samples, n_features]
 |          The training input samples. Internally, it will be converted to
 |          ``dtype=np.float32`` and if a sparse matrix is provided
 |          to a sparse ``csc_matrix``.
 |      
 |      y : array-like, shape = [n_samples] or [n_samples, n_outputs]
 |          The target values (class labels) as integers or strings.
 |      
 |      sample_weight : array-like, shape = [n_samples] or None
 |          Sample weights. If None, then samples are equally weighted. Splits
 |          that would create child nodes with net zero or negative weight are
 |          ignored while searching for a split in each node. Splits are also
 |          ignored if they would result in any single class carrying a
 |          negative weight in either child node.
 |      
 |      check_input : boolean, (default=True)
 |          Allow to bypass several input checking.
 |          Don't use this parameter unless you know what you do.
 |      
 |      X_idx_sorted : array-like, shape = [n_samples, n_features], optional
 |          The indexes of the sorted training input samples. If many tree
 |          are grown on the same dataset, this allows the ordering to be
 |          cached between trees. If None, the data will be sorted here.
 |          Don't use this parameter unless you know what to do.
 |      
 |      Returns
 |      -------
 |      self : object
 |  
 |  predict_log_proba(self, X)
 |      Predict class log-probabilities of the input samples X.
 |      
 |      Parameters
 |      ----------
 |      X : array-like or sparse matrix of shape = [n_samples, n_features]
 |          The input samples. Internally, it will be converted to
 |          ``dtype=np.float32`` and if a sparse matrix is provided
 |          to a sparse ``csr_matrix``.
 |      
 |      Returns
 |      -------
 |      p : array of shape = [n_samples, n_classes], or a list of n_outputs
 |          such arrays if n_outputs > 1.
 |          The class log-probabilities of the input samples. The order of the
 |          classes corresponds to that in the attribute `classes_`.
 |  
 |  predict_proba(self, X, check_input=True)
 |      Predict class probabilities of the input samples X.
 |      
 |      The predicted class probability is the fraction of samples of the same
 |      class in a leaf.
 |      
 |      check_input : boolean, (default=True)
 |          Allow to bypass several input checking.
 |          Don't use this parameter unless you know what you do.
 |      
 |      Parameters
 |      ----------
 |      X : array-like or sparse matrix of shape = [n_samples, n_features]
 |          The input samples. Internally, it will be converted to
 |          ``dtype=np.float32`` and if a sparse matrix is provided
 |          to a sparse ``csr_matrix``.
 |      
 |      check_input : bool
 |          Run check_array on X.
 |      
 |      Returns
 |      -------
 |      p : array of shape = [n_samples, n_classes], or a list of n_outputs
 |          such arrays if n_outputs > 1.
 |          The class probabilities of the input samples. The order of the
 |          classes corresponds to that in the attribute `classes_`.
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __abstractmethods__ = frozenset()
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from BaseDecisionTree:
 |  
 |  apply(self, X, check_input=True)
 |      Returns the index of the leaf that each sample is predicted as.
 |      
 |      .. versionadded:: 0.17
 |      
 |      Parameters
 |      ----------
 |      X : array_like or sparse matrix, shape = [n_samples, n_features]
 |          The input samples. Internally, it will be converted to
 |          ``dtype=np.float32`` and if a sparse matrix is provided
 |          to a sparse ``csr_matrix``.
 |      
 |      check_input : boolean, (default=True)
 |          Allow to bypass several input checking.
 |          Don't use this parameter unless you know what you do.
 |      
 |      Returns
 |      -------
 |      X_leaves : array_like, shape = [n_samples,]
 |          For each datapoint x in X, return the index of the leaf x
 |          ends up in. Leaves are numbered within
 |          ``[0; self.tree_.node_count)``, possibly with gaps in the
 |          numbering.
 |  
 |  decision_path(self, X, check_input=True)
 |      Return the decision path in the tree
 |      
 |      .. versionadded:: 0.18
 |      
 |      Parameters
 |      ----------
 |      X : array_like or sparse matrix, shape = [n_samples, n_features]
 |          The input samples. Internally, it will be converted to
 |          ``dtype=np.float32`` and if a sparse matrix is provided
 |          to a sparse ``csr_matrix``.
 |      
 |      check_input : boolean, (default=True)
 |          Allow to bypass several input checking.
 |          Don't use this parameter unless you know what you do.
 |      
 |      Returns
 |      -------
 |      indicator : sparse csr array, shape = [n_samples, n_nodes]
 |          Return a node indicator matrix where non zero elements
 |          indicates that the samples goes through the nodes.
 |  
 |  get_depth(self)
 |      Returns the depth of the decision tree.
 |      
 |      The depth of a tree is the maximum distance between the root
 |      and any leaf.
 |  
 |  get_n_leaves(self)
 |      Returns the number of leaves of the decision tree.
 |  
 |  predict(self, X, check_input=True)
 |      Predict class or regression value for X.
 |      
 |      For a classification model, the predicted class for each sample in X is
 |      returned. For a regression model, the predicted value based on X is
 |      returned.
 |      
 |      Parameters
 |      ----------
 |      X : array-like or sparse matrix of shape = [n_samples, n_features]
 |          The input samples. Internally, it will be converted to
 |          ``dtype=np.float32`` and if a sparse matrix is provided
 |          to a sparse ``csr_matrix``.
 |      
 |      check_input : boolean, (default=True)
 |          Allow to bypass several input checking.
 |          Don't use this parameter unless you know what you do.
 |      
 |      Returns
 |      -------
 |      y : array of shape = [n_samples] or [n_samples, n_outputs]
 |          The predicted classes, or the predict values.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from BaseDecisionTree:
 |  
 |  feature_importances_
 |      Return the feature importances.
 |      
 |      The importance of a feature is computed as the (normalized) total
 |      reduction of the criterion brought by that feature.
 |      It is also known as the Gini importance.
 |      
 |      Returns
 |      -------
 |      feature_importances_ : array, shape = [n_features]
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from sklearn.base.BaseEstimator:
 |  
 |  __getstate__(self)
 |  
 |  __repr__(self, N_CHAR_MAX=700)
 |      Return repr(self).
 |  
 |  __setstate__(self, state)
 |  
 |  get_params(self, deep=True)
 |      Get parameters for this estimator.
 |      
 |      Parameters
 |      ----------
 |      deep : boolean, optional
 |          If True, will return the parameters for this estimator and
 |          contained subobjects that are estimators.
 |      
 |      Returns
 |      -------
 |      params : mapping of string to any
 |          Parameter names mapped to their values.
 |  
 |  set_params(self, **params)
 |      Set the parameters of this estimator.
 |      
 |      The method works on simple estimators as well as on nested objects
 |      (such as pipelines). The latter have parameters of the form
 |      ``<component>__<parameter>`` so that it's possible to update each
 |      component of a nested object.
 |      
 |      Returns
 |      -------
 |      self
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from sklearn.base.BaseEstimator:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from sklearn.base.ClassifierMixin:
 |  
 |  score(self, X, y, sample_weight=None)
 |      Returns the mean accuracy on the given test data and labels.
 |      
 |      In multi-label classification, this is the subset accuracy
 |      which is a harsh metric since you require for each sample that
 |      each label set be correctly predicted.
 |      
 |      Parameters
 |      ----------
 |      X : array-like, shape = (n_samples, n_features)
 |          Test samples.
 |      
 |      y : array-like, shape = (n_samples) or (n_samples, n_outputs)
 |          True labels for X.
 |      
 |      sample_weight : array-like, shape = [n_samples], optional
 |          Sample weights.
 |      
 |      Returns
 |      -------
 |      score : float
 |          Mean accuracy of self.predict(X) wrt. y.
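For anyone who skipped the wall of text above, the methods that matter most in day-to-day use can be tried in a few lines. This is a sketch under assumptions: iris as sample data, with max_depth=3 and random_state=0 chosen arbitrarily.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Assumption: iris is only sample data.
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

proba = clf.predict_proba(X[:3])         # class probabilities, columns follow clf.classes_
leaf_ids = clf.apply(X[:3])              # index of the leaf each sample lands in
importances = clf.feature_importances_   # normalized importance, one value per feature

print(clf.classes_)
print(proba)
print(leaf_ids)
print(np.round(importances, 3))
```

predict_proba returns the class fractions of the leaf a sample falls into, so each row sums to 1, and the feature importances are normalized to sum to 1 as well.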
