Notes on functions used in Udacity Machine Learning (Advanced) P3

pandas.DataFrame.drop

DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

Example:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.arange(12).reshape(3,4),
...                   columns=['A', 'B', 'C', 'D'])
>>> df
   A  B   C   D
0  0  1   2   3
1  4  5   6   7
2  8  9  10  11

Drop columns:

>>> df.drop(['B', 'C'], axis=1)
   A   D
0  0   3
1  4   7
2  8  11

>>> df.drop(columns=['B', 'C'])
   A   D
0  0   3
1  4   7
2  8  11

Drop rows by index label:

>>> df.drop([0, 1])
   A  B   C   D
2  8  9  10  11
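
Two parameters from the signature above are not exercised by these examples. As a small sketch: drop returns a new DataFrame and leaves the original untouched unless inplace=True, and errors='ignore' skips labels that do not exist. The column label 'Z' is made up purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=['A', 'B', 'C', 'D'])

# drop returns a new DataFrame; df itself still has all four columns
reduced = df.drop(columns=['B', 'C'])
print(list(reduced.columns))  # ['A', 'D']
print(list(df.columns))       # ['A', 'B', 'C', 'D']

# errors='ignore' silently skips labels that are not present ('Z' is hypothetical)
same = df.drop(columns=['Z'], errors='ignore')
print(same.shape)             # (3, 4)

# inplace=True modifies df directly and returns None
result = df.drop(index=[0, 1], inplace=True)
print(result)                 # None
print(df.index.tolist())      # [2]
```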

sklearn.tree.DecisionTreeRegressor

class sklearn.tree.DecisionTreeRegressor(criterion='mse', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, presort=False)

Example:

>>> from sklearn.datasets import load_boston
>>> from sklearn.model_selection import cross_val_score
>>> from sklearn.tree import DecisionTreeRegressor
>>> boston = load_boston()
>>> regressor = DecisionTreeRegressor(random_state=0)
>>> cross_val_score(regressor, boston.data, boston.target, cv=10)
array([ 0.61..., 0.57..., -0.34..., 0.41..., 0.75...,
        0.07..., 0.29..., 0.33..., -1.42..., -1.77...])
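
load_boston was removed from scikit-learn in version 1.2, so the example above only runs on older releases. Below is a minimal sketch of the same workflow for a recent version, assuming the bundled diabetes dataset (load_diabetes) as a stand-in; the scores it produces will differ from those shown above.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# load_diabetes is a substitute dataset, not the one used in the original example
X, y = load_diabetes(return_X_y=True)

regressor = DecisionTreeRegressor(random_state=0)

# For a regressor, cross_val_score reports the R^2 score of each of the 10 folds
scores = cross_val_score(regressor, X, y, cv=10)
print(scores.shape)   # (10,)
print(scores.mean())
```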

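Nonlinear scaling for skewed data whose median and mean differ greatly

  • One way to do this scaling is the Box-Cox transform, which computes the power transformation that best reduces the skew in the data.

  • A simpler approach that works in most cases is to apply the natural logarithm:
    numpy.log()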
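
To illustrate the effect of the logarithm, here is a sketch on synthetic data (a log-normal sample generated for illustration only, not the project's dataset). It compares the gap between mean and median before and after applying numpy.log.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic right-skewed, strictly positive data (illustrative only)
data = rng.lognormal(mean=0.0, sigma=1.0, size=1000)

# Natural logarithm; this requires every value to be greater than 0
log_data = np.log(data)

# A large mean/median gap signals skew; it should shrink after the transform
gap_before = abs(np.mean(data) - np.median(data))
gap_after = abs(np.mean(log_data) - np.median(log_data))
print(gap_before, gap_after)
```

Note that the logarithm is undefined for zero or negative values, so such data would need to be shifted or handled separately first.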

numpy.percentile

Example:

>>> import numpy as np
>>> a = np.array([[10, 7, 4], [3, 2, 1]])
>>> a
array([[10,  7,  4],
       [ 3,  2,  1]])
>>> np.percentile(a, 50)
3.5
>>> np.percentile(a, 50, axis=0)
array([ 6.5,  4.5,  2.5])
>>> np.percentile(a, 50, axis=1)
array([ 7.,  2.])
>>> np.percentile(a, 50, axis=1, keepdims=True)
array([[ 7.],
       [ 2.]])
>>> m = np.percentile(a, 50, axis=0)
>>> out = np.zeros_like(m)
>>> np.percentile(a, 50, axis=0, out=out)
array([ 6.5,  4.5,  2.5])
>>> m
array([ 6.5,  4.5,  2.5])
>>> b = a.copy()
>>> np.percentile(b, 50, axis=1, overwrite_input=True)
array([ 7.,  2.])
>>> assert not np.all(a == b)
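
A common practical use of np.percentile is computing quartiles, for example to flag outliers with Tukey's rule (points more than 1.5 times the interquartile range beyond Q1 or Q3). The sketch below uses made-up values, and the 1.5 factor is the conventional choice, assumed here rather than taken from these notes.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100])  # made-up sample

q1 = np.percentile(values, 25)  # first quartile
q3 = np.percentile(values, 75)  # third quartile
step = 1.5 * (q3 - q1)          # 1.5 times the interquartile range

# Keep only the points that fall outside [q1 - step, q3 + step]
outliers = values[(values < q1 - step) | (values > q3 + step)]
print(q1, q3)     # 3.0 7.0
print(outliers)   # [100]
```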

Python list extend function

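  • Usage: extend() appends multiple values from another sequence to the end of a list in one call (it extends the original list with the new one)

  • Syntax:

list.extend(seq)
  • Note the difference between extend and append:

    extend adds the elements of the other list directly to the end of the list: [1,2,3] becomes [1,2,3,4,5,6]
    append adds the other list as a single element: [1,2,3] becomes [1,2,3,[4,5,6]]

  • Example: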

#!/usr/bin/env python3

aList = [123, 'xyz', 'zara', 'abc', 123]
bList = [2009, 'manni']
# extend modifies aList in place and returns None
aList.extend(bList)

print("Extended List : ", aList)
  • Output:
Extended List :  [123, 'xyz', 'zara', 'abc', 123, 2009, 'manni']
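
The difference between extend and append can be checked side by side. A short sketch with two illustrative lists:

```python
a = [1, 2, 3]
b = [1, 2, 3]

a.extend([4, 5, 6])  # adds each element of the argument
b.append([4, 5, 6])  # adds the argument itself as one element

print(a)               # [1, 2, 3, 4, 5, 6]
print(b)               # [1, 2, 3, [4, 5, 6]]
print(len(a), len(b))  # 6 4
```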
