scikit-learn:4.6. Kernel Approximation

Reference: http://scikit-learn.org/stable/modules/kernel_approximation.html


The reason for using approximate explicit feature maps instead of the kernel trick is that they make online learning possible and scale to large datasets. Even so, it is recommended to compare approximate and exact kernel methods whenever that is feasible.


1、Nystroem Method for Kernel Approximation

Nystroem is a general method for low-rank approximations of kernels. It achieves this by essentially subsampling the data on which the kernel is evaluated. By default, Nystroem uses the rbf kernel, but a custom kernel can be specified. The number of samples used - which is also the dimensionality of the features computed - is given by the parameter n_components.
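
A minimal sketch (not in the original post) of how Nystroem can be combined with a linear classifier; the toy data and the choices gamma=1, n_components=4 are only illustrative:

>>> from sklearn.kernel_approximation import Nystroem
>>> from sklearn.linear_model import SGDClassifier
>>> X = [[0, 0], [1, 1], [1, 0], [0, 1]]
>>> y = [0, 0, 1, 1]
>>> feature_map = Nystroem(kernel='rbf', gamma=1, n_components=4, random_state=1)
>>> X_transformed = feature_map.fit_transform(X)  # n_components columns per sample
>>> clf = SGDClassifier().fit(X_transformed, y)   # linear model on the approximate feature map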


2、Radial Basis Function Kernel

The RBFSampler constructs an approximate mapping for the radial basis function kernel:

>>> from sklearn.kernel_approximation import RBFSampler
>>> from sklearn.linear_model import SGDClassifier
>>> X = [[0, 0], [1, 1], [1, 0], [0, 1]]
>>> y = [0, 0, 1, 1]
>>> rbf_feature = RBFSampler(gamma=1, random_state=1)
>>> X_features = rbf_feature.fit_transform(X)
>>> clf = SGDClassifier()   
>>> clf.fit(X_features, y)
SGDClassifier(alpha=0.0001, average=False, class_weight=None, epsilon=0.1,
       eta0=0.0, fit_intercept=True, l1_ratio=0.15,
       learning_rate='optimal', loss='hinge', n_iter=5, n_jobs=1,
       penalty='l2', power_t=0.5, random_state=None, shuffle=True,
       verbose=0, warm_start=False)
>>> clf.score(X_features, y)
1.0
The mapping relies on a Monte Carlo approximation to the kernel values. The fit function performs the Monte Carlo sampling, whereas the transform method performs the mapping of the data. Because of this inherent randomness, repeated calls to fit may produce different results unless random_state is fixed.
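
As a small follow-up sketch (the test points below are made up), the sampler fitted above can be reused to map new data before prediction:

>>> X_test = [[0.5, 0.5], [1, 0.2]]                   # hypothetical new samples
>>> X_test_features = rbf_feature.transform(X_test)   # reuse the already fitted sampler
>>> predictions = clf.predict(X_test_features)        # predict in the approximate feature space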



3、Additive Chi Squared Kernel 

The additive chi squared kernel as used here is given by

k(x, y) = \sum_i \frac{2 x_i y_i}{x_i + y_i}

The class AdditiveChi2Sampler implements this component-wise deterministic sampling. Each component is sampled n times, yielding 2n+1 dimensions per input dimension (the factor of two stems from the real and imaginary parts of the Fourier transform). In the literature, n is usually chosen to be 1 or 2, transforming the dataset to shape (n_samples, 5 * n_features) in the case of n=2.
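
A minimal sketch (not from the original post) of applying AdditiveChi2Sampler; the toy array is only illustrative, and note that chi squared kernels assume non-negative features such as histograms:

>>> import numpy as np
>>> from sklearn.kernel_approximation import AdditiveChi2Sampler
>>> X = np.array([[0., 1., 2.], [3., 0., 1.]])         # non-negative features
>>> chi2sampler = AdditiveChi2Sampler(sample_steps=2)  # sample_steps gives the number of sampling points
>>> X_transformed = chi2sampler.fit_transform(X)       # deterministic: no random_state involved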


4、Skewed Chi Squared Kernel

The skewed chi squared kernel is given by:

k(x, y) = \prod_i \frac{2 \sqrt{x_i + c} \sqrt{y_i + c}}{x_i + y_i + 2c}


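A minimal sketch (not in the original post) of SkewedChi2Sampler, reusing the toy data from the RBFSampler example above; the skewedness and n_components values are only illustrative:

>>> from sklearn.kernel_approximation import SkewedChi2Sampler
>>> from sklearn.linear_model import SGDClassifier
>>> X = [[0, 0], [1, 1], [1, 0], [0, 1]]
>>> y = [0, 0, 1, 1]
>>> chi2_feature = SkewedChi2Sampler(skewedness=.01, n_components=10, random_state=0)
>>> X_features = chi2_feature.fit_transform(X)
>>> clf = SGDClassifier().fit(X_features, y)  # linear classifier on the approximate feature map
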
Well, I admit these sections are not translated very well, but that is really all there is to the content...






