Chapter 7 Ensemble Learning and Random Forests

Reading notes on O'Reilly's Hands-On Machine Learning with Scikit-Learn and TensorFlow.

A group of predictors is called an ensemble. The technique of aggregating the predictions of an ensemble to obtain better predictions is called Ensemble Learning, and an Ensemble Learning algorithm is called an Ensemble method.

An ensemble of Decision Trees is called a Random Forest.

The most popular Ensemble methods include bagging, boosting, stacking, and a few others.

7.1 Voting Classifier

The majority-vote classifier is called a hard voting classifier.

Even if each classifier is a weak learner (meaning it does only slightly better than random guessing), the ensemble can still be a strong learner (achieving high accuracy), provided there are a sufficient number of weak learners and they are sufficiently diverse.
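
This can be illustrated with a quick simulation (not from the book's code listings): each simulated classifier below is right only 51% of the time, independently of the others, yet the majority vote is right far more often.

import numpy as np

rng = np.random.RandomState(42)
n_classifiers, n_trials = 1000, 1000
# 1 = a classifier predicts correctly, 0 = it is wrong (independent weak learners)
correct = rng.rand(n_trials, n_classifiers) < 0.51
# The majority vote is correct whenever more than half of the classifiers are correct
majority_correct = correct.sum(axis=1) > n_classifiers / 2
print(majority_correct.mean())  # roughly 0.73 with these settings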

Ensemble methods work best when the predictors are as independent from one another as possible. One way to get diverse classifiers is to train them using very different algorithms. This increases the chance that they will make very different types of errors, improving the ensemble's accuracy.

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2)

from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

log_clf = LogisticRegression(solver="liblinear", random_state=42)
rnd_clf = RandomForestClassifier(n_estimators=10, random_state=42)
svm_clf = SVC(gamma="auto", random_state=42)

voting_clf=VotingClassifier(
    estimators=[('lr',log_clf),('rf',rnd_clf),('svc',svm_clf)],
    voting='hard'
)
voting_clf.fit(X_train,y_train)

from sklearn.metrics import accuracy_score
for clf in (log_clf,rnd_clf,svm_clf,voting_clf):
    clf.fit(X_train,y_train)
    y_pred=clf.predict(X_test)
    print(clf.__class__.__name__,accuracy_score(y_test,y_pred))

If all classifiers are able to estimate class probabilities (i.e., they have a predict_proba() method), then you can tell Scikit-Learn to predict the class with the highest class probability, averaged over all the individual classifiers. This is called soft voting.

log_clf = LogisticRegression(solver="liblinear", random_state=42)
rnd_clf = RandomForestClassifier(n_estimators=10, random_state=42)
# setting probability=True enables the SVC to have a predict_proba() method
svm_clf = SVC(gamma="auto", probability=True, random_state=42)
voting_clf=VotingClassifier(
    estimators=[('lr',log_clf),('rf',rnd_clf),('svc',svm_clf)],
    voting='soft'
)
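
Fitting the soft voting ensemble and checking its predictions follows the same pattern as before (the exact scores depend on the unseeded train/test split):

voting_clf.fit(X_train,y_train)
print(voting_clf.predict_proba(X_test[:3]))  # averaged class probabilities
print(accuracy_score(y_test,voting_clf.predict(X_test)))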

7.2 Bagging and Pasting

One way to get a diverse set of classifiers is to use very different training algorithms, as just discussed. Another approach is to use the same training algorithm for every predictor, but to train them on different random subsets of the training set. When sampling is performed with replacement, this method is called bagging (short for bootstrap aggregating). When sampling is performed without replacement, it is called pasting.

Once all predictors are trained, the ensemble can make a prediction for a new instance by simply aggregating the predictions of all predictors. The aggregation function is typically the statistical mode (i.e., the most frequent prediction, just like a hard voting classifier) for classification, or the average for regression.
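
As a tiny illustration of the aggregation step (not from the book), suppose five classifiers have already predicted labels for four instances; the mode of each column gives the ensemble's prediction, and averaging plays the same role for regression:

import numpy as np

# Hypothetical predictions: one row per predictor, one column per instance
clf_preds = np.array([[1, 0, 1, 1],
                      [1, 0, 0, 1],
                      [0, 0, 1, 1],
                      [1, 1, 1, 0],
                      [1, 0, 1, 1]])
# Statistical mode of each column = the most frequent prediction per instance
majority = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, clf_preds)
print(majority)  # [1 0 1 1]

# For regression, the aggregation is simply the average
reg_preds = np.array([[2.1, 0.4, 1.0],
                      [1.9, 0.6, 1.2],
                      [2.0, 0.5, 0.8]])
print(reg_preds.mean(axis=0))  # [2.  0.5 1. ]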

7.2.1 Bagging and Pasting in Scikit-Learn

Scikit-Learn offers a simple API for both bagging and pasting with the BaggingClassifier class (or BaggingRegressor for regression). The following code trains an ensemble of 500 Decision Tree classifiers, each trained on 100 training instances randomly sampled from the training set with replacement (this is an example of bagging; if you want to use pasting instead, just set bootstrap=False). The n_jobs parameter tells Scikit-Learn the number of CPU cores to use for training and predictions (-1 tells Scikit-Learn to use all available cores):

from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

bag_clf=BaggingClassifier(
    DecisionTreeClassifier(),n_estimators=500,
    max_samples=100,bootstrap=True,n_jobs=-1
)
bag_clf.fit(X_train,y_train)
y_pred=bag_clf.predict(X_test)


The BaggingClassifier automatically performs soft voting instead of hard voting if the base classifier can estimate class probabilities (i.e., if it has a predict_proba() method), which is the case with Decision Tree classifiers.

7.2.2 Out-of-Bag Evaluation

The training instances that are not sampled are called out-of-bag (oob) instances. Since a predictor never sees its oob instances during training, they can be used to evaluate the ensemble without a separate validation set. In Scikit-Learn, you can set oob_score=True when creating a BaggingClassifier to request an automatic oob evaluation after training.

bag_clf=BaggingClassifier(
    DecisionTreeClassifier(),n_estimators=500,
    max_samples=100,bootstrap=True,n_jobs=-1,oob_score=True
)
bag_clf.fit(X_train,y_train)
bag_clf.oob_score_  # 0.9175
y_pred=bag_clf.predict(X_test)
accuracy_score(y_test,y_pred)  # 0.9175
# lower than the 0.93/0.936 reported in the book (the train/test split above is not seeded)
bag_clf.oob_decision_function_

7.3 Random Patches and Random Subspaces

The BaggingClassifier class supports sampling the features as well. This is controlled by two hyperparameters: max_features and bootstrap_features.

Sampling both training instances and features is called the Random Patches method. Keeping all training instances (i.e., bootstrap=False and max_samples=1.0) but sampling features (i.e., bootstrap_features=True and/or max_features smaller than 1.0) is called the Random Subspaces method.

Sampling features results in even more predictor diversity, trading a bit more bias for a lower variance.
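
As a sketch of what these settings look like in code (the specific hyperparameter values below are illustrative, not from the book's listings):

from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Random Patches: sample both training instances and features
patches_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=0.5, bootstrap=True,            # sample training instances
    max_features=0.5, bootstrap_features=True,  # sample features as well
    n_jobs=-1
)

# Random Subspaces: keep all training instances, sample only the features
subspaces_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=1.0, bootstrap=False,           # keep every training instance
    max_features=0.5, bootstrap_features=True,  # sample features
    n_jobs=-1
)
patches_clf.fit(X_train, y_train)
subspaces_clf.fit(X_train, y_train)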

7.4 Random Forest

from sklearn.ensemble import RandomForestClassifier
rnd_clf=RandomForestClassifier(n_estimators=500,
                               max_leaf_nodes=16,n_jobs=-1)
rnd_clf.fit(X_train,y_train)
y_pred_rf=rnd_clf.predict(X_test)

The following BaggingClassifier is roughly equivalent to the previous RandomForestClassifier:

bag_clf=BaggingClassifier(
    DecisionTreeClassifier(splitter="random",max_leaf_nodes=16),
    n_estimators=500,max_samples=1.0,bootstrap=True,n_jobs=-1
)

7.4.1 Extra-Trees

It is possible to make trees even more random by also using random thresholds for each feature rather than searching for the best possible thresholds (like regular Decision Trees do).

A forest of such extremely random trees is simply called an Extremely Randomized Trees ensemble (or Extra-Trees for short). Once again, this trades more bias for a lower variance. It also makes Extra-Trees much faster to train than regular Random Forests since finding the best possible threshold for each feature at every node is one of the most time-consuming tasks of growing a tree.

You can create an Extra-Trees classifier using Scikit-Learn's ExtraTreesClassifier class.
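
Its API mirrors RandomForestClassifier's; a brief example (hyperparameter values copied from the Random Forest code above):

from sklearn.ensemble import ExtraTreesClassifier

ext_clf = ExtraTreesClassifier(n_estimators=500, max_leaf_nodes=16, n_jobs=-1)
ext_clf.fit(X_train, y_train)
y_pred_ext = ext_clf.predict(X_test)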

7.4.2 Feature Importance

In a single Decision Tree, important features are likely to appear closer to the root of the tree, while unimportant features will often appear closer to the leaves (or not at all). You can access the result using the feature_importances_ variable.

from sklearn.datasets import load_iris
iris=load_iris()
rnd_clf=RandomForestClassifier(n_estimators=500,n_jobs=-1)
rnd_clf.fit(iris.data,iris.target)
for name,score in zip(iris["feature_names"],
                      rnd_clf.feature_importances_):
    print(name,score)

7.5 Boosting

Boosting (originally called hypothesis boosting) refers to any Ensemble method that can combine several weak learners into a strong learner. The general idea of most boosting methods is to train predictors sequentially, each trying to correct its predecessor. There are many boosting methods available, but by far the most popular are AdaBoost (short for Adaptive Boosting) and Gradient Boosting.

7.5.1 AdaBoost

To build an AdaBoost classifier, a first base classifier (such as a Decision Tree) is trained and used to make predictions on the training set. The relative weight of misclassified training instances is then increased. A second classifier is trained using the updated weights and again it makes predictions on the training set, weights are updated, and so on.

Let's take a closer look at the AdaBoost algorithm. Each instance weight $w^{(i)}$ is initially set to $\frac{1}{m}$. A first predictor is trained and its weighted error rate $r_1$ is computed on the training set; see Equation 7-1.

Equation 7-1. Weighted error rate of the $j^{th}$ predictor
$$
r_j=\frac{\sum\limits_{\substack{i=1\\ \hat{y}_j^{(i)}\neq y^{(i)}}}^{m} w^{(i)}}{\sum_{i=1}^m w^{(i)}}
\quad\textrm{where } \hat{y}_j^{(i)} \textrm{ is the } j^{th} \textrm{ predictor's prediction for the } i^{th} \textrm{ instance.}
$$
The predictor's weight $\alpha_j$ is then computed using Equation 7-2, where $\eta$ is the learning rate hyperparameter (defaults to 1).

Equation 7-2. Predictor weight
$$
\alpha_j=\eta\log\frac{1-r_j}{r_j}
$$
Next the instance weights are updated using Equation 7-3: the misclassified instances are boosted.

Equation 7-3. Weight update rule
$$
\textrm{for } i=1,2,\cdots,m\\
w^{(i)}\leftarrow\begin{cases}
w^{(i)} & \textrm{if } \hat{y}_j^{(i)}=y^{(i)}\\
w^{(i)}\exp\left(\alpha_j\right) & \textrm{if } \hat{y}_j^{(i)}\neq y^{(i)}
\end{cases}
$$
Then, all the instance weights are normalized (i.e., divided by $\sum_{i=1}^m w^{(i)}$).

Finally, a new predictor is trained using the updated weights, and the whole process is repeated (the new predictor’s weight is computed, the instance weights are updated, then another predictor is trained, and so on). The algorithm stops when the desired number of predictors is reached, or when a perfect predictor is found.

To make predictions, AdaBoost simply computes the predictions of all the predictors and weighs them using the predictor weights $\alpha_j$. The predicted class is the one that receives the majority of weighted votes (see Equation 7-4).

Equation 7-4. AdaBoost predictions
$$
\hat{y}(\mathbf{x})=\underset{k}{\arg\max}\sum\limits_{\substack{j=1\\ \hat{y}_j(\mathbf{x})=k}}^{N}\alpha_j
\quad\textrm{where } N \textrm{ is the number of predictors.}
$$
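
The whole loop can be condensed into a short sketch (a minimal illustration of Equations 7-1 to 7-4, assuming integer class labels stored in NumPy arrays; in practice you would use Scikit-Learn's AdaBoostClassifier, shown below):

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_predictors=50, eta=1.0):
    m = len(X)
    w = np.ones(m) / m                        # each weight starts at 1/m
    predictors, alphas = [], []
    for _ in range(n_predictors):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        y_pred = stump.predict(X)
        r = w[y_pred != y].sum() / w.sum()    # Equation 7-1: weighted error rate
        r = np.clip(r, 1e-10, 1 - 1e-10)      # guard against log(0) / division by zero
        alpha = eta * np.log((1 - r) / r)     # Equation 7-2: predictor weight
        w[y_pred != y] *= np.exp(alpha)       # Equation 7-3: boost misclassified instances
        w /= w.sum()                          # normalize
        predictors.append(stump)
        alphas.append(alpha)
    return predictors, alphas

def adaboost_predict(predictors, alphas, X, classes=(0, 1)):
    # Equation 7-4: the class with the largest sum of predictor weights wins
    votes = np.zeros((len(X), len(classes)))
    for stump, alpha in zip(predictors, alphas):
        y_pred = stump.predict(X)
        for k, c in enumerate(classes):
            votes[y_pred == c, k] += alpha
    return np.asarray(classes)[votes.argmax(axis=1)]

On the moons data from earlier, predictors, alphas = adaboost_fit(X_train, y_train) followed by adaboost_predict(predictors, alphas, X_test) reproduces the scheme described above.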
Scikit-Learn actually uses a multiclass version of AdaBoost called SAMME (which stands for Stagewise Additive Modeling using a Multiclass Exponential loss function). When there are just two classes, SAMME is equivalent to AdaBoost. Moreover, if the predictors can estimate class probabilities (i.e., if they have a predict_proba() method), Scikit-Learn can use a variant of SAMME called SAMME.R (the R stands for "Real"), which relies on class probabilities rather than predictions and generally performs better.

from sklearn.ensemble import AdaBoostClassifier
ada_clf=AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),n_estimators=200,
    algorithm="SAMME.R",learning_rate=0.5
)
ada_clf.fit(X_train,y_train)
y_pred=ada_clf.predict(X_test)
accuracy_score(y_test,y_pred)#0.87

7.5.2 Gradient Boosting

Instead of tweaking the instance weights at every iteration like AdaBoost does, Gradient Boosting tries to fit the new predictor to the residual errors made by the previous predictor.

Let's go through a simple regression example using Decision Trees as the base predictors (of course Gradient Boosting also works great with regression tasks). This is called Gradient Tree Boosting, or Gradient Boosted Regression Trees (GBRT). First, let's fit a DecisionTreeRegressor to the training set (for example, a noisy quadratic training set):

import numpy as np
np.random.seed(42)
m = 200
X = np.random.rand(m, 1)
y = 4 * (X[:, 0] - 0.5) ** 2      # 1-D targets, so the residual computations below broadcast correctly
y = y + np.random.randn(m) / 10   # add Gaussian noise

from sklearn.tree import DecisionTreeRegressor
tree_reg1 = DecisionTreeRegressor(max_depth=2)
tree_reg1.fit(X, y)

# Train a second tree on the residual errors made by the first predictor
y2 = y - tree_reg1.predict(X)
tree_reg2 = DecisionTreeRegressor(max_depth=2)
tree_reg2.fit(X, y2)

# Train a third tree on the residual errors made by the second predictor
y3 = y2 - tree_reg2.predict(X)
tree_reg3 = DecisionTreeRegressor(max_depth=2)
tree_reg3.fit(X, y3)

# The ensemble's prediction for new instances is the sum of the trees' predictions
X_new  = np.random.rand(200, 1)
y_pred = sum(tree.predict(X_new) for tree in (tree_reg1, tree_reg2, tree_reg3))

A simpler way to train GBRT ensembles is to use Scikit-Learn’s GradientBoostingRegressor \verb+GradientBoostingRegressor+ GradientBoostingRegressor class.

from sklearn.ensemble import GradientBoostingRegressor
gbrt=GradientBoostingRegressor(max_depth=2,n_estimators=3,learning_rate=1.0)
gbrt.fit(X,y)

The learning_rate hyperparameter scales the contribution of each tree. If you set it to a low value, such as 0.1, you will need more trees in the ensemble to fit the training set, but the predictions will usually generalize better. This is a regularization technique called shrinkage.
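
For instance, a sketch of a shrunk ensemble (the particular learning_rate and n_estimators values here are illustrative):

gbrt_slow = GradientBoostingRegressor(max_depth=2, n_estimators=200,
                                      learning_rate=0.1)
gbrt_slow.fit(X, y)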

In order to find the optimal number of trees, you can use early stopping (see Chapter 4). A simple way to implement this is to use the staged_predict() method: it returns an iterator over the predictions made by the ensemble at each stage of training (with one tree, two trees, etc.). The following code trains a GBRT ensemble with 120 trees, then measures the validation error at each stage of training to find the optimal number of trees, and finally trains another GBRT ensemble using the optimal number of trees:

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

X_train,X_val,y_train,y_val=train_test_split(X,y)

gbrt=GradientBoostingRegressor(max_depth=2,n_estimators=120)
gbrt.fit(X_train,y_train)

errors=[ mean_squared_error(y_val,y_pred) 
        for y_pred in gbrt.staged_predict(X_val)]
best_n_estimators=np.argmin(errors)+1  # errors[i] is the MSE of the ensemble with i+1 trees

gbrt_best=GradientBoostingRegressor(max_depth=2,n_estimators=best_n_estimators)
gbrt_best.fit(X_train,y_train)

It is also possible to implement early stopping by actually stopping training early (instead of training a large number of trees first and then looking back to find the optimal number). You can do so by setting warm_start=True, which makes Scikit-Learn keep existing trees when the fit() method is called, allowing incremental training. The following code stops training when the validation error does not improve for five iterations in a row:

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

X_train,X_val,y_train,y_val=train_test_split(X,y)

gbrt=GradientBoostingRegressor(max_depth=2,warm_start=True)

min_val_error=float("inf")
error_going_up=0
for n_estimators in range(1,120):
    gbrt.n_estimators=n_estimators
    gbrt.fit(X_train,y_train)  # keeps the previously trained trees thanks to warm_start
    y_pred=gbrt.predict(X_val)
    val_error=mean_squared_error(y_val,y_pred)
    if val_error<min_val_error:
        min_val_error=val_error
        error_going_up=0
    else:
        error_going_up+=1
        if error_going_up==5:
            break  # early stopping

The GradientBoostingRegressor class also supports a subsample hyperparameter, which specifies the fraction of training instances to be used for training each tree. For example, if subsample=0.25, then each tree is trained on 25% of the training instances, selected randomly. As you can probably guess by now, this trades a higher bias for a lower variance. It also speeds up training considerably. This technique is called Stochastic Gradient Boosting.

It is possible to use Gradient Boosting with other cost functions. This is controlled by the loss hyperparameter.
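
Hedged examples of both options (the hyperparameter values are illustrative, not from the book's listings):

from sklearn.ensemble import GradientBoostingRegressor

# Stochastic Gradient Boosting: each tree sees a random 25% of the training instances
sgbrt = GradientBoostingRegressor(max_depth=2, n_estimators=120, subsample=0.25)
sgbrt.fit(X_train, y_train)

# A different cost function, e.g. the Huber loss
gbrt_huber = GradientBoostingRegressor(max_depth=2, n_estimators=120, loss="huber")
gbrt_huber.fit(X_train, y_train)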

7.6 Stacking

Stacking (short for stacked generalization) is based on a simple idea: instead of using trivial functions (such as hard voting) to aggregate the predictions of all predictors in an ensemble, why don't we train a model to perform this aggregation? Each predictor predicts a different value, and the final predictor (called a blender, or a meta learner) takes these predictions as inputs and makes the final prediction.

To train the blender, a common approach is to use a hold-out set. First, the training set is split into two subsets. The first subset is used to train the predictors in the first layer. The second subset is then fed to these first-layer predictors to obtain their predictions. These predicted values, combined with the target values, form a new training set that is used to train the blender in the second layer.
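
Here is a minimal sketch of this hold-out approach for the regression data above (the choice of first-layer predictors and blender is arbitrary; the book's listings do not include a stacking implementation):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression

# Split the training data: subset 1 trains the first layer, subset 2 trains the blender
X_sub1, X_sub2, y_sub1, y_sub2 = train_test_split(X_train, y_train, test_size=0.5)

first_layer = [DecisionTreeRegressor(max_depth=4), SVR(gamma="scale")]
for reg in first_layer:
    reg.fit(X_sub1, y_sub1)

# The first-layer predictions on the held-out subset become the blender's features
blender_features = np.column_stack([reg.predict(X_sub2) for reg in first_layer])
blender = LinearRegression()
blender.fit(blender_features, y_sub2)

# At prediction time: run the first layer, then feed its predictions to the blender
X_val_features = np.column_stack([reg.predict(X_val) for reg in first_layer])
y_stacked_pred = blender.predict(X_val_features)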

To sum up, Ensemble Learning is a technique that aggregates the predictions of multiple models to make better predictions.

  • Voting Classifiers: use different classifiers, but the same training set. VotingClassifier.

    • hard voting: adopt the most frequent prediction
    • soft voting: calculate the average probability of each class over all the classifiers, then choose the class with the highest probability
  • Bagging and Pasting: use different random subsets of the training set, but the same model. The aggregation function is the statistical mode (most frequent prediction) for classification, or the average for regression. Training can be done in parallel. BaggingClassifier, BaggingRegressor.

    • bagging (bootstrap aggregating): sampling with replacement, out-of-bag evaluation
    • pasting: sampling without replacement, bootstrap=False.
  • Random Patches and Random Subspaces: sampling the features. BaggingClassifier with the max_features and bootstrap_features hyperparameters.

    • Random Patches method: sampling both training instances and features
    • Random Subspaces method: keeping all training instances but sampling features
  • Random Forests: only a random subset of the features is considered for splitting. RandomForestClassifier, RandomForestRegressor.

    • Extra-Trees: using random thresholds for each feature rather than searching for the best possible thresholds
  • Boosting (hypothesis boosting): train predictors sequentially, each trying to correct its predecessor. It cannot be parallelized.

    • AdaBoost (Adaptive Boosting): new predictors pay more attention to the training instances that the predecessor underfitted. AdaBoostClassifier with the parameter algorithm="SAMME.R".

      1. The weight $w^{(i)}$ of each instance is initialized equally, i.e., to $1/m$, where $m$ is the number of training instances.
      2. A first predictor is trained and its weighted error rate $r_1$ is computed on the training set. The weighted error rate of the $j$-th predictor is the ratio of the sum of the weights of instances misclassified by the $j$-th predictor to the sum of the weights of all instances (Equation 7-1).
      3. The predictor's weight $\alpha_j$ is then computed using Equation 7-2.
      4. The instance weights are updated using Equation 7-3: the weights of misclassified instances are boosted. Then all weights are normalized.
      5. The predicted class is the one that receives the majority of weighted votes (see Equation 7-4).
    • Gradient Boosting: tries to fit the new predictor to the residual errors made by the previous predictor. GradientBoostingRegressor.

      The error of a sample is the deviation of the sample from the (unobservable) population mean or actual function, while the residual of a sample is the difference between the sample and either (1) the (observed) sample mean or (2) the regressed (fitted) function value.

      https://en.wikipedia.org/wiki/Errors_and_residuals_in_statistics

      • early stopping:
        • use the staged_predict() method of GradientBoostingRegressor
        • set warm_start=True
      • Stochastic Gradient Boosting: for example, by setting subsample=0.25
      • use other cost functions: controlled by the loss hyperparameter
  • Stacking (stacked generalization): train a model to perform the aggregation. The training set is divided into two subsets. The first subset is used to train several regressors. These regressors then make predictions on the second subset. The predicted values, combined with the target values, form the training set of a final regressor (the blender).

  • difference between estimators and estimators_ in Ensemble classes (see the short example below):

    • estimators is a parameter of the model (or Ensemble class), holding the ensemble of models passed to the constructor. It can be modified with the set_params method.
    • estimators_ is an attribute that returns the trained models of the ensemble after fitting.
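
A quick illustration with the voting classifier from Section 7.1 (attribute names as used by Scikit-Learn):

print(voting_clf.estimators)   # the (name, estimator) pairs passed to the constructor
print(voting_clf.estimators_)  # the fitted estimators, available only after fit()
# modifying a sub-estimator through set_params, using its name from `estimators`
voting_clf.set_params(rf=RandomForestClassifier(n_estimators=100))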
