愤斗的橘子

Tools series: TensorFlow Decision Forests (7): Inspecting and debugging decision forest models

Contents

- Setup
- Train a simple Random Forest
- Plot the model
- Inspect the model structure
- Create a model by hand
- Finish writing trees

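In this article, you will learn how to directly inspect and create the structure of a model. We assume you are already familiar with the concepts introduced in the beginner and intermediate tutorials.

In this article, you will:

- Train a Random Forest model and access its structure programmatically.
- Create a Random Forest model by hand and use it as a classical model.

Setup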

# Install the TensorFlow Decision Forests library
!pip install tensorflow_decision_forests

# Install wurlitzer, which is used to display the training logs
!pip install wurlitzer

Successfully installed tensorflow_decision_forests-1.1.0 wurlitzer-3.0.3

# Import the TensorFlow Decision Forests library
import tensorflow_decision_forests as tfdf

# Import os, numpy, pandas, tensorflow, matplotlib.pyplot, math and collections
import os
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
import math
import collections


The hidden code cell below limits the output height in Colab.


# Import the required modules
from IPython.core.magic import register_line_magic
from IPython.display import Javascript
from IPython.display import display as ipy_display

# Define a line magic that sets the maximum height of a cell's output
@register_line_magic
def set_cell_height(size):
  # Run JavaScript that caps the height of the output iframe
  ipy_display(
      Javascript("google.colab.output.setIframeHeight(0, true, {maxHeight: " +
                 str(size) + "})"))

Train a simple Random Forest

We train a Random Forest as in the beginner colab.

# Download the dataset
!wget -q https://storage.googleapis.com/download.tensorflow.org/data/palmer_penguins/penguins.csv -O /tmp/penguins.csv

# Load the dataset into a Pandas DataFrame
dataset_df = pd.read_csv("/tmp/penguins.csv")

# Show the first three examples
print(dataset_df.head(3))

# Convert the Pandas DataFrame into a TensorFlow dataset
dataset_tf = tfdf.keras.pd_dataframe_to_tf_dataset(dataset_df, label="species")

# Train the Random Forest model
model = tfdf.keras.RandomForestModel(compute_oob_variable_importances=True)
model.fit(x=dataset_tf)

  species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  \
0  Adelie  Torgersen            39.1           18.7              181.0   
1  Adelie  Torgersen            39.5           17.4              186.0   
2  Adelie  Torgersen            40.3           18.0              195.0   

   body_mass_g     sex  year  
0       3750.0    male  2007  
1       3800.0  female  2007  
2       3250.0  female  2007  
Use /tmpfs/tmp/tmpvr7urazn as temporary training directory
Reading training dataset...
Training dataset read in 0:00:02.961832. Found 344 examples.
Training model...
Model trained in 0:00:00.093680
Compiling model...


[INFO 2022-12-14T12:24:58.955519768+00:00 kernel.cc:1175] Loading model from path /tmpfs/tmp/tmpvr7urazn/model/ with prefix fb8057db01324481
[INFO 2022-12-14T12:24:58.971817533+00:00 abstract_model.cc:1306] Engine "RandomForestGeneric" built
[INFO 2022-12-14T12:24:58.97187255+00:00 kernel.cc:1021] Use fast generic engine


Model compiled.

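Note the compute_oob_variable_importances=True hyperparameter in the model constructor. This option computes the out-of-bag (OOB) variable importances during training. This is a popular permutation variable importance for Random Forest models.

Computing the OOB variable importances does not affect the final model, but it slows down training on large datasets.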
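
To give a rough intuition for what a permutation variable importance measures, here is a minimal sketch in plain Python. It is not how TF-DF computes its OOB importances: the toy data, the `toy_predict` stand-in model, and the helper names are all assumptions made for illustration, and the accuracy is measured on the full toy dataset rather than on out-of-bag examples.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(predict(row) == label for row, label in zip(rows, labels))
    return correct / len(rows)


def permutation_importance(predict, rows, labels, feature_idx, seed=0):
    """Decrease in accuracy after shuffling a single feature column."""
    baseline = accuracy(predict, rows, labels)
    column = [row[feature_idx] for row in rows]
    random.Random(seed).shuffle(column)
    permuted = [
        row[:feature_idx] + [value] + row[feature_idx + 1:]
        for row, value in zip(rows, column)
    ]
    return baseline - accuracy(predict, permuted, labels)


# Hypothetical toy data: feature 0 decides the label, feature 1 is noise.
rows = [[float(i), float(i % 3)] for i in range(20)]
labels = [int(row[0] >= 10) for row in rows]


def toy_predict(row):
    # Stand-in for a trained model; it only looks at feature 0.
    return int(row[0] >= 10)


importances = {
    name: permutation_importance(toy_predict, rows, labels, idx)
    for idx, name in enumerate(["informative", "noise"])
}
print(importances)
```

Shuffling a feature the model never uses leaves the accuracy unchanged, so its importance is exactly zero. Shuffling a feature the model relies on can only keep or lower the accuracy of this toy model, because it is perfect on the unshuffled data.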

Check the model summary:

# Print the model summary
model.summary()




Model: "random_forest_model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
=================================================================
Total params: 1
Trainable params: 0
Non-trainable params: 1
_________________________________________________________________
Type: "RANDOM_FOREST"
Task: CLASSIFICATION
Label: "__LABEL"

Input Features (7):
	bill_depth_mm
	bill_length_mm
	body_mass_g
	flipper_length_mm
	island
	sex
	year

No weights

Variable Importance: MEAN_DECREASE_IN_ACCURACY:
    1.    "bill_length_mm"  0.151163 ################
    2.            "island"  0.008721 #
    3.     "bill_depth_mm"  0.000000 
    4.       "body_mass_g"  0.000000 
    5.               "sex"  0.000000 
    6.              "year"  0.000000 
    7. "flipper_length_mm" -0.002907 

Variable Importance: MEAN_DECREASE_IN_AP_1_VS_OTHERS:
    1.    "bill_length_mm"  0.083305 ################
    2.            "island"  0.007664 #
    3. "flipper_length_mm"  0.003400 
    4.     "bill_depth_mm"  0.002741 
    5.       "body_mass_g"  0.000722 
    6.               "sex"  0.000644 
    7.              "year"  0.000000 

Variable Importance: MEAN_DECREASE_IN_AP_2_VS_OTHERS:
    1.    "bill_length_mm"  0.508510 ################
    2.            "island"  0.023487 
    3.     "bill_depth_mm"  0.007744 
    4. "flipper_length_mm"  0.006008 
    5.       "body_mass_g"  0.003017 
    6.               "sex"  0.001537 
    7.              "year" -0.000245 

Variable Importance: MEAN_DECREASE_IN_AP_3_VS_OTHERS:
    1.            "island"  0.002192 ################
    2.    "bill_length_mm"  0.001572 ############
    3.     "bill_depth_mm"  0.000497 #######
    4.               "sex"  0.000000 ####
    5.              "year"  0.000000 ####
    6.       "body_mass_g" -0.000053 ####
    7. "flipper_length_mm" -0.000890 

Variable Importance: MEAN_DECREASE_IN_AUC_1_VS_OTHERS:
    1.    "bill_length_mm"  0.071306 ################
    2.            "island"  0.007299 #
    3. "flipper_length_mm"  0.004506 #
    4.     "bill_depth_mm"  0.002124 
    5.       "body_mass_g"  0.000548 
    6.               "sex"  0.000480 
    7.              "year"  0.000000 

Variable Importance: MEAN_DECREASE_IN_AUC_2_VS_OTHERS:
    1.    "bill_length_mm"  0.108642 ################
    2.            "island"  0.014493 ##
    3.     "bill_depth_mm"  0.007406 #
    4. "flipper_length_mm"  0.005195 
    5.       "body_mass_g"  0.001012 
    6.               "sex"  0.000480 
    7.              "year" -0.000053 

Variable Importance: MEAN_DECREASE_IN_AUC_3_VS_OTHERS:
    1.            "island"  0.002126 ################
    2.    "bill_length_mm"  0.001393 ###########
    3.     "bill_depth_mm"  0.000293 #####
    4.               "sex"  0.000000 ###
    5.              "year"  0.000000 ###
    6.       "body_mass_g" -0.000037 ###
    7. "flipper_length_mm" -0.000550 

Variable Importance: MEAN_DECREASE_IN_PRAUC_1_VS_OTHERS:
    1.    "bill_length_mm"  0.083122 ################
    2.            "island"  0.010887 ##
    3. "flipper_length_mm"  0.003425 
    4.     "bill_depth_mm"  0.002731 
    5.       "body_mass_g"  0.000719 
    6.               "sex"  0.000641 
    7.              "year"  0.000000 

Variable Importance: MEAN_DECREASE_IN_PRAUC_2_VS_OTHERS:
    1.    "bill_length_mm"  0.497611 ################
    2.            "island"  0.024045 
    3.     "bill_depth_mm"  0.007734 
    4. "flipper_length_mm"  0.006017 
    5.       "body_mass_g"  0.003000 
    6.               "sex"  0.001528 
    7.              "year" -0.000243 

Variable Importance: MEAN_DECREASE_IN_PRAUC_3_VS_OTHERS:
    1.            "island"  0.002187 ################
    2.    "bill_length_mm"  0.001568 ############
    3.     "bill_depth_mm"  0.000495 #######
    4.               "sex"  0.000000 ####
    5.              "year"  0.000000 ####
    6.       "body_mass_g" -0.000053 ####
    7. "flipper_length_mm" -0.000886 

Variable Importance: MEAN_MIN_DEPTH:
    1.           "__LABEL"  3.479602 ################
    2.              "year"  3.463891 ###############
    3.               "sex"  3.430498 ###############
    4.       "body_mass_g"  2.898112 ###########
    5.            "island"  2.388925 ########
    6.     "bill_depth_mm"  2.336100 #######
    7.    "bill_length_mm"  1.282960 
    8. "flipper_length_mm"  1.270079 

Variable Importance: NUM_AS_ROOT:
    1. "flipper_length_mm" 157.000000 ################
    2.    "bill_length_mm" 76.000000 #######
    3.     "bill_depth_mm" 52.000000 #####
    4.            "island" 12.000000 
    5.       "body_mass_g"  3.000000 

Variable Importance: NUM_NODES:
    1.    "bill_length_mm" 778.000000 ################
    2.     "bill_depth_mm" 463.000000 #########
    3. "flipper_length_mm" 414.000000 ########
    4.            "island" 342.000000 ######
    5.       "body_mass_g" 338.000000 ######
    6.               "sex" 36.000000 
    7.              "year" 19.000000 

Variable Importance: SUM_SCORE:
    1.    "bill_length_mm" 36515.793787 ################
    2. "flipper_length_mm" 35120.434174 ###############
    3.            "island" 14669.408395 ######
    4.     "bill_depth_mm" 14515.446617 ######
    5.       "body_mass_g" 3485.330881 #
    6.               "sex" 354.201073 
    7.              "year" 49.737758 



Winner takes all: true
Out-of-bag evaluation: accuracy:0.976744 logloss:0.0678223
Number of trees: 300
Total number of nodes: 5080

Number of nodes by tree:
Count: 300 Average: 16.9333 StdDev: 3.10197
Min: 11 Max: 31 Ignored: 0
----------------------------------------------
[ 11, 12)  6   2.00%   2.00% #
[ 12, 13)  0   0.00%   2.00%
[ 13, 14) 46  15.33%  17.33% #####
[ 14, 15)  0   0.00%  17.33%
[ 15, 16) 70  23.33%  40.67% ########
[ 16, 17)  0   0.00%  40.67%
[ 17, 18) 84  28.00%  68.67% ##########
[ 18, 19)  0   0.00%  68.67%
[ 19, 20) 46  15.33%  84.00% #####
[ 20, 21)  0   0.00%  84.00%
[ 21, 22) 30  10.00%  94.00% ####
[ 22, 23)  0   0.00%  94.00%
[ 23, 24) 13   4.33%  98.33% ##
[ 24, 25)  0   0.00%  98.33%
[ 25, 26)  2   0.67%  99.00%
[ 26, 27)  0   0.00%  99.00%
[ 27, 28)  2   0.67%  99.67%
[ 28, 29)  0   0.00%  99.67%
[ 29, 30)  0   0.00%  99.67%
[ 30, 31]  1   0.33% 100.00%

Depth by leafs:
Count: 2690 Average: 3.53271 StdDev: 1.06789
Min: 2 Max: 7 Ignored: 0
----------------------------------------------
[ 2, 3) 545  20.26%  20.26% ######
[ 3, 4) 747  27.77%  48.03% ########
[ 4, 5) 888  33.01%  81.04% ##########
[ 5, 6) 444  16.51%  97.55% #####
[ 6, 7)  62   2.30%  99.85% #
[ 7, 7]   4   0.15% 100.00%

Number of training obs by leaf:
Count: 2690 Average: 38.3643 StdDev: 44.8651
Min: 5 Max: 155 Ignored: 0
----------------------------------------------
[   5,  12) 1474  54.80%  54.80% ##########
[  12,  20)  124   4.61%  59.41% #
[  20,  27)   48   1.78%  61.19%
[  27,  35)   74   2.75%  63.94% #
[  35,  42)   58   2.16%  66.10%
[  42,  50)   85   3.16%  69.26% #
[  50,  57)   96   3.57%  72.83% #
[  57,  65)   87   3.23%  76.06% #
[  65,  72)   49   1.82%  77.88%
[  72,  80)   23   0.86%  78.74%
[  80,  88)   30   1.12%  79.85%
[  88,  95)   23   0.86%  80.71%
[  95, 103)   42   1.56%  82.27%
[ 103, 110)   62   2.30%  84.57%
[ 110, 118)  115   4.28%  88.85% #
[ 118, 125)  115   4.28%  93.12% #
[ 125, 133)   98   3.64%  96.77% #
[ 133, 140)   49   1.82%  98.59%
[ 140, 148)   31   1.15%  99.74%
[ 148, 155]    7   0.26% 100.00%

Attribute in nodes:
	778 : bill_length_mm [NUMERICAL]
	463 : bill_depth_mm [NUMERICAL]
	414 : flipper_length_mm [NUMERICAL]
	342 : island [CATEGORICAL]
	338 : body_mass_g [NUMERICAL]
	36 : sex [CATEGORICAL]
	19 : year [NUMERICAL]

Attribute in nodes with depth <= 0:
	157 : flipper_length_mm [NUMERICAL]
	76 : bill_length_mm [NUMERICAL]
	52 : bill_depth_mm [NUMERICAL]
	12 : island [CATEGORICAL]
	3 : body_mass_g [NUMERICAL]

Attribute in nodes with depth <= 1:
	250 : bill_length_mm [NUMERICAL]
	244 : flipper_length_mm [NUMERICAL]
	183 : bill_depth_mm [NUMERICAL]
	170 : island [CATEGORICAL]
	53 : body_mass_g [NUMERICAL]

Attribute in nodes with depth <= 2:
	462 : bill_length_mm [NUMERICAL]
	320 : flipper_length_mm [NUMERICAL]
	310 : bill_depth_mm [NUMERICAL]
	287 : island [CATEGORICAL]
	162 : body_mass_g [NUMERICAL]
	9 : sex [CATEGORICAL]
	5 : year [NUMERICAL]

Attribute in nodes with depth <= 3:
	669 : bill_length_mm [NUMERICAL]
	410 : bill_depth_mm [NUMERICAL]
	383 : flipper_length_mm [NUMERICAL]
	328 : island [CATEGORICAL]
	286 : body_mass_g [NUMERICAL]
	32 : sex [CATEGORICAL]
	10 : year [NUMERICAL]

Attribute in nodes with depth <= 5:
	778 : bill_length_mm [NUMERICAL]
	462 : bill_depth_mm [NUMERICAL]
	413 : flipper_length_mm [NUMERICAL]
	342 : island [CATEGORICAL]
	338 : body_mass_g [NUMERICAL]
	36 : sex [CATEGORICAL]
	19 : year [NUMERICAL]

Condition type in nodes:
	2012 : HigherCondition
	378 : ContainsBitmapCondition
Condition type in nodes with depth <= 0:
	288 : HigherCondition
	12 : ContainsBitmapCondition
Condition type in nodes with depth <= 1:
	730 : HigherCondition
	170 : ContainsBitmapCondition
Condition type in nodes with depth <= 2:
	1259 : HigherCondition
	296 : ContainsBitmapCondition
Condition type in nodes with depth <= 3:
	1758 : HigherCondition
	360 : ContainsBitmapCondition
Condition type in nodes with depth <= 5:
	2010 : HigherCondition
	378 : ContainsBitmapCondition
Node format: NOT_SET

Training OOB:
	trees: 1, Out-of-bag evaluation: accuracy:0.964286 logloss:1.28727
	trees: 13, Out-of-bag evaluation: accuracy:0.959064 logloss:0.4869
	trees: 31, Out-of-bag evaluation: accuracy:0.95614 logloss:0.284603
	trees: 54, Out-of-bag evaluation: accuracy:0.973837 logloss:0.175283
	trees: 73, Out-of-bag evaluation: accuracy:0.97093 logloss:0.175816
	trees: 85, Out-of-bag evaluation: accuracy:0.973837 logloss:0.171781
	trees: 96, Out-of-bag evaluation: accuracy:0.97093 logloss:0.077417
	trees: 116, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0761788
	trees: 127, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0745239
	trees: 137, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0753508
	trees: 150, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0741464
	trees: 160, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0749481
	trees: 170, Out-of-bag evaluation: accuracy:0.979651 logloss:0.0719624
	trees: 190, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0711787
	trees: 203, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0701121
	trees: 213, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0682979
	trees: 224, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0689686
	trees: 248, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0674086
	trees: 260, Out-of-bag evaluation: accuracy:0.976744 logloss:0.068218
	trees: 270, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0680733
	trees: 280, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0685965
	trees: 290, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0683421
	trees: 300, Out-of-bag evaluation: accuracy:0.976744 logloss:0.0678223
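
As a quick sanity check on how to read the summary, the node counts it reports can be cross-checked against each other. The sketch below only copies figures from the summary above; the variable names are ours.

```python
# Figures copied from the model summary above.
num_trees = 300
total_nodes = 5080
num_leaves = 2690          # "Depth by leafs" count
higher_conditions = 2012   # "HigherCondition" nodes
bitmap_conditions = 378    # "ContainsBitmapCondition" nodes

# "Attribute in nodes" counts, grouped by feature type.
numerical_nodes = 778 + 463 + 414 + 338 + 19
categorical_nodes = 342 + 36

num_conditions = higher_conditions + bitmap_conditions
average_nodes_per_tree = total_nodes / num_trees

# Numerical features use HigherCondition, categorical ones the bitmap condition.
assert numerical_nodes == higher_conditions
assert categorical_nodes == bitmap_conditions
# Every node is either a leaf or a condition node.
assert num_leaves + num_conditions == total_nodes
# Each binary tree has exactly one more leaf than it has condition nodes.
assert num_leaves == num_conditions + num_trees

print(f"{num_conditions} condition nodes, {average_nodes_per_tree:.4f} nodes per tree")
# prints: 2390 condition nodes, 16.9333 nodes per tree
```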

Note the multiple variable importances whose names start with MEAN_DECREASE_IN_*.

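Plot the model

Next, plot the model.

A Random Forest is a large model (this model has 300 trees and about 5k nodes; see the summary above). Therefore, plot only the first tree, and limit the nodes to depth 3.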

# Plot the model with plot_model_in_colab from the model_plotter module.
# model: the model to plot.
# tree_idx: index of the tree to plot; 0 plots the first tree.
# max_depth: maximum depth to plot; 3 limits the plot to depth 3.
tfdf.model_plotter.plot_model_in_colab(model, tree_idx=0, max_depth=3)
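
To illustrate what cutting a tree at a maximum depth means, here is a minimal text-only sketch in plain Python. It does not use the TF-DF API: the `Node` class, the `render` helper, and the thresholds in the example tree are hypothetical. The only convention borrowed from the plotter is that the first child is the negative branch and the second child the positive one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical node: a leaf has a value, a non-leaf has a condition."""
    value: Optional[str] = None
    condition: Optional[str] = None
    # children[0] is the negative branch, children[1] the positive one.
    children: Optional[List["Node"]] = None


def render(node, max_depth, depth=0, prefix=""):
    """Returns indented text lines for the tree, cut off at max_depth."""
    indent = "  " * depth
    if node.children is None:
        return [f"{indent}{prefix}leaf: {node.value}"]
    if depth >= max_depth:
        return [f"{indent}{prefix}{node.condition} (subtree hidden)"]
    lines = [f"{indent}{prefix}{node.condition}"]
    lines += render(node.children[0], max_depth, depth + 1, "no: ")
    lines += render(node.children[1], max_depth, depth + 1, "yes: ")
    return lines


# Made-up tree for illustration only; it is not taken from the trained model.
tree = Node(
    condition="bill_length_mm >= 43",
    children=[
        Node(value="Adelie"),
        Node(
            condition="island in [Dream]",
            children=[Node(value="Gentoo"), Node(value="Chinstrap")],
        ),
    ],
)

print("\n".join(render(tree, max_depth=1)))
```

With `max_depth=1`, the nested condition is still shown but its children are hidden, in the same spirit as the plot above, which stops at depth 3.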

/**
 * Plotting of decision trees generated by TF-DF.
 *
 * A tree is a recursive structure of node objects.
 * A node contains one or more of the following components:
 *
 *    - A value: Representing the output of the node. If the node is not a leaf,
 *      the value is only present for analysis i.e. it is not used for
 *      predictions.
 *
 *    - A condition : For non-leaf nodes, the condition (also known as split)
 *      defines a binary test to branch to the positive or negative child.
 *
 *    - An explanation: Generally a plot showing the relation between the label
 *      and the condition to give insights about the effect of the condition.
 *
 *    - Two children : For non-leaf nodes, the children nodes. The first
 *      children (i.e. "node.children[0]") is the negative children (drawn in
 *      red). The second children is the positive one (drawn in green).
 *
 */

/**
 * Plots a single decision tree into a DOM element.
 * @param {!options} options Dictionary of configurations.
 * @param {!tree} raw_tree Recursive tree structure.
 * @param {string} canvas_id Id of the output dom element.
 */
function display_tree(options, raw_tree, canvas_id) {
  console.log(options);

  // Determine the node placement.
  const tree_struct = d3.tree().nodeSize(
      [options.node_y_offset, options.node_x_offset])(d3.hierarchy(raw_tree));

  // Boundaries of the node placement.
  let x_min = Infinity;
  let x_max = -x_min;
  let y_min = Infinity;
  let y_max = -x_min;

  tree_struct.each(d => {
    if (d.x > x_max) x_max = d.x;
    if (d.x < x_min) x_min = d.x;
    if (d.y > y_max) y_max = d.y;
    if (d.y < y_min) y_min = d.y;
  });

  // Size of the plot.
  const width = y_max - y_min + options.node_x_size + options.margin * 2;
  const height = x_max - x_min + options.node_y_size + options.margin * 2 +
      options.node_y_offset - options.node_y_size;

  const plot = d3.select(canvas_id);

  // Tool tip
  options.tooltip = plot.append('div')
                        .attr('width', 100)
                        .attr('height', 100)
                        .style('padding', '4px')
                        .style('background', '#fff')
                        .style('box-shadow', '4px 4px 0px rgba(0,0,0,0.1)')
                        .style('border', '1px solid black')
                        .style('font-family', 'sans-serif')
                        .style('font-size', options.font_size)
                        .style('position', 'absolute')
                        .style('z-index', '10')
                        .attr('pointer-events', 'none')
                        .style('display', 'none');

  // Create canvas
  const svg = plot.append('svg').attr('width', width).attr('height', height);
  const graph =
      svg.style('overflow', 'visible')
          .append('g')
          .attr('font-family', 'sans-serif')
          .attr('font-size', options.font_size)
          .attr(
              'transform',
              () => `translate(${options.margin},${
                  - x_min + options.node_y_offset / 2 + options.margin})`);

  // Plot bounding box.
  if (options.show_plot_bounding_box) {
    svg.append('rect')
        .attr('width', width)
        .attr('height', height)
        .attr('fill', 'none')
        .attr('stroke-width', 1.0)
        .attr('stroke', 'black');
  }

  // Draw the edges.
  display_edges(options, graph, tree_struct);

  // Draw the nodes.
  display_nodes(options, graph, tree_struct);
}

/**
 * Draw the nodes of the tree.
 * @param {!options} options Dictionary of configurations.
 * @param {!graph} graph D3 search handle containing the graph.
 * @param {!tree_struct} tree_struct Structure of the tree (node placement,
 *     data, etc.).
 */
function display_nodes(options, graph, tree_struct) {
  const nodes = graph.append('g')
                    .selectAll('g')
                    .data(tree_struct.descendants())
                    .join('g')
                    .attr('transform', d => `translate(${d.y},${d.x})`);

  nodes.append('rect')
      .attr('x', 0.5)
      .attr('y', 0.5)
      .attr('width', options.node_x_size)
      .attr('height', options.node_y_size)
      .attr('stroke', 'lightgrey')
      .attr('stroke-width', 1)
      .attr('fill', 'white')
      .attr('y', -options.node_y_size / 2);

  // Brackets on the right of condition nodes without children.
  non_leaf_node_without_children =
      nodes.filter(node => node.data.condition != null && node.children == null)
          .append('g')
          .attr('transform', `translate(${options.node_x_size},0)`);

  non_leaf_node_without_children.append('path')
      .attr('d', 'M0,0 C 10,0 0,10 10,10')
      .attr('fill', 'none')
      .attr('stroke-width', 1.0)
      .attr('stroke', '#F00');

  non_leaf_node_without_children.append('path')
      .attr('d', 'M0,0 C 10,0 0,-10 10,-10')
      .attr('fill', 'none')
      .attr('stroke-width', 1.0)
      .attr('stroke', '#0F0');

  const node_content = nodes.append('g').attr(
      'transform',
      `translate(0,${options.node_padding - options.node_y_size / 2})`);

  node_content.append(node => create_node_element(options, node));
}

/**
 * Creates the D3 content for a single node.
 * @param {!options} options Dictionary of configurations.
 * @param {!node} node Node to draw.
 * @return {!d3} D3 content.
 */
function create_node_element(options, node) {
  // Output accumulator.
  let output = {
    // Content to draw.
    content: d3.create('svg:g'),
    // Vertical offset to the next element to draw.
    vertical_offset: 0
  };

  // Conditions.
  if (node.data.condition != null) {
    display_condition(options, node.data.condition, output);
  }

  // Values.
  if (node.data.value != null) {
    display_value(options, node.data.value, output);
  }

  // Explanations.
  if (node.data.explanation != null) {
    display_explanation(options, node.data.explanation, output);
  }

  return output.content.node();
}

/**
 * Adds a single line of text inside of a node.
 * @param {!options} options Dictionary of configurations.
 * @param {string} text Text to display.
 * @param {!output} output Output display accumulator.
 */
function display_node_text(options, text, output) {
  output.content.append('text')
      .attr('x', options.node_padding)
      .attr('y', output.vertical_offset)
      .attr('alignment-baseline', 'hanging')
      .text(text);
  output.vertical_offset += 10;
}

/**
 * Adds a single line of text inside of a node with a tooltip.
 * @param {!options} options Dictionary of configurations.
 * @param {string} text Text to display.
 * @param {string} tooltip Text in the Tooltip.
 * @param {!output} output Output display accumulator.
 */
function display_node_text_with_tooltip(options, text, tooltip, output) {
  const item = output.content.append('text')
                   .attr('x', options.node_padding)
                   .attr('alignment-baseline', 'hanging')
                   .text(text);

  add_tooltip(options, item, () => tooltip);
  output.vertical_offset += 10;
}

/**

Adds a tooltip to a dom element.
@param {!options} options Dictionary of configurations.
@param {!dom} target Dom element to equip with a tooltip.
@param {!func} get_content Generates the html content of the tooltip.
*/
function add_tooltip(options, target, get_content) {
function show(d) {
options.tooltip.style(‘display’, ‘block’);
options.tooltip.html(get_content());
}

function hide(d) {
options.tooltip.style(‘display’, ‘none’);
}

function move(d) {
options.tooltip.style(‘display’, ‘block’);
options.tooltip.style(‘left’, (d.pageX + 5) + ‘px’);
options.tooltip.style(‘top’, d.pageY + ‘px’);
}

target.on(‘mouseover’, show);
target.on(‘mouseout’, hide);
target.on(‘mousemove’, move);
}

/**

Adds a condition inside of a node.
@param {!options} options Dictionary of configurations.
@param {!condition} condition Condition to display.
@param {!output} output Output display accumulator.
*/
function display_condition(options, condition, output) {
threshold_format = d3.format(‘r’);

if (condition.type === ‘IS_MISSING’) {
display_node_text(options, ${condition.attribute} is missing, output);
return;
}

if (condition.type === ‘IS_TRUE’) {
display_node_text(options, ${condition.attribute} is true, output);
return;
}

if (condition.type === ‘NUMERICAL_IS_HIGHER_THAN’) {
format = d3.format(‘r’);
display_node_text(
options,
${condition.attribute} >= ${threshold_format(condition.threshold)},
output);
return;
}

if (condition.type === ‘CATEGORICAL_IS_IN’) {
display_node_text_with_tooltip(
options, ${condition.attribute} in [...],
${condition.attribute} in [${condition.mask}], output);
return;
}

if (condition.type === ‘CATEGORICAL_SET_CONTAINS’) {
display_node_text_with_tooltip(
options, ${condition.attribute} intersect [...],
${condition.attribute} intersect [${condition.mask}], output);
return;
}

if (condition.type === ‘NUMERICAL_SPARSE_OBLIQUE’) {
display_node_text_with_tooltip(
options, Sparse oblique split...,
[${condition.attributes}]*[${condition.weights}]>=${ threshold_format(condition.threshold)},
output);
return;
}

display_node_text(
options, Non supported condition ${condition.type}, output);
}

/**

Adds a value inside of a node.
@param {!options} options Dictionary of configurations.
@param {!value} value Value to display.
@param {!output} output Output display accumulator.
*/
function display_value(options, value, output) {
if (value.type === ‘PROBABILITY’) {
const left_margin = 0;
const right_margin = 50;
const plot_width = options.node_x_size - options.node_padding * 2 -
left_margin - right_margin;

let cusum = Array.from(d3.cumsum(value.distribution));
cusum.unshift(0);
const distribution_plot = output.content.append(‘g’).attr(
‘transform’, translate(0,${output.vertical_offset + 0.5}));

distribution_plot.selectAll(‘rect’)
.data(value.distribution)
.join(‘rect’)
.attr(‘height’, 10)
.attr(
‘x’,
(d, i) =>
(cusum[i] * plot_width + left_margin + options.node_padding))
.attr(‘width’, (d, i) => d * plot_width)
.style(‘fill’, (d, i) => d3.schemeSet1[i]);

const num_examples =
output.content.append(‘g’)
.attr(‘transform’, translate(0,${output.vertical_offset}))
.append(‘text’)
.attr(‘x’, options.node_x_size - options.node_padding)
.attr(‘alignment-baseline’, ‘hanging’)
.attr(‘text-anchor’, ‘end’)
.text((${value.num_examples}));

const distribution_details = d3.create(‘ul’);
distribution_details.selectAll(‘li’)
.data(value.distribution)
.join(‘li’)
.append(‘span’)
.text(
(d, i) =>
‘class ’ + i + ‘: ’ + d3.format(’.3%’)(value.distribution[i]));

add_tooltip(options, distribution_plot, () => distribution_details.html());
add_tooltip(options, num_examples, () => ‘Number of examples’);

output.vertical_offset += 10;
return;
}

if (value.type === ‘REGRESSION’) {
display_node_text(
options,
‘value: ’ + d3.format(‘r’)(value.value) + ( +
d3.format(’.6’)(value.num_examples) + ),
output);
return;
}

display_node_text(options, Non supported value ${value.type}, output);
}

/**

Adds an explanation inside of a node.
@param {!options} options Dictionary of configurations.
@param {!explanation} explanation Explanation to display.
@param {!output} output Output display accumulator.
*/
function display_explanation(options, explanation, output) {
// Margin before the explanation.
output.vertical_offset += 10;

display_node_text(
options, Non supported explanation ${explanation.type}, output);
}

/**

Draw the edges of the tree.
@param {!options} options Dictionary of configurations.
@param {!graph} graph D3 search handle containing the graph.
@param {!tree_struct} tree_struct Structure of the tree (node placement,
```
data, etc.).
```

*/
function display_edges(options, graph, tree_struct) {
// Draw an edge between a parent and a child node with a bezier.
function draw_single_edge(d) {
return ‘M’ + (d.source.y + options.node_x_size) + ‘,’ + d.source.x + ’ C’ +
(d.source.y + options.node_x_size + options.edge_rounding) + ‘,’ +
d.source.x + ’ ’ + (d.target.y - options.edge_rounding) + ‘,’ +
d.target.x + ’ ’ + d.target.y + ‘,’ + d.target.x;
}

graph.append(‘g’)
.attr(‘fill’, ‘none’)
.attr(‘stroke-width’, 1.2)
.selectAll(‘path’)
.data(tree_struct.links())
.join(‘path’)
.attr(‘d’, draw_single_edge)
.attr(
‘stroke’, d => (d.target === d.source.children[0]) ? ‘#0F0’ : ‘#F00’);
}

display_tree({“margin”: 10, “node_x_size”: 160, “node_y_size”: 28, “node_x_offset”: 180, “node_y_offset”: 33, “font_size”: 10, “edge_rounding”: 20, “node_padding”: 2, “show_plot_bounding_box”: false}, {“value”: {“type”: “PROBABILITY”, “distribution”: [0.47093023255813954, 0.19476744186046513, 0.33430232558139533], “num_examples”: 344.0}, “condition”: {“type”: “NUMERICAL_IS_HIGHER_THAN”, “attribute”: “bill_length_mm”, “threshold”: 43.25}, “children”: [{“value”: {“type”: “PROBABILITY”, “distribution”: [0.005847953216374269, 0.3567251461988304, 0.6374269005847953], “num_examples”: 171.0}, “condition”: {“type”: “CATEGORICAL_IS_IN”, “attribute”: “island”, “mask”: [“Biscoe”]}, “children”: [{“value”: {“type”: “PROBABILITY”, “distribution”: [0.00909090909090909, 0.0, 0.990909090909091], “num_examples”: 110.0}, “condition”: {“type”: “NUMERICAL_IS_HIGHER_THAN”, “attribute”: “bill_depth_mm”, “threshold”: 17.225584030151367}, “children”: [{“value”: {“type”: “PROBABILITY”, “distribution”: [0.16666666666666666, 0.0, 0.8333333333333334], “num_examples”: 6.0}}, {“value”: {“type”: “PROBABILITY”, “distribution”: [0.0, 0.0, 1.0], “num_examples”: 104.0}}]}, {“value”: {“type”: “PROBABILITY”, “distribution”: [0.0, 1.0, 0.0], “num_examples”: 61.0}}]}, {“value”: {“type”: “PROBABILITY”, “distribution”: [0.930635838150289, 0.03468208092485549, 0.03468208092485549], “num_examples”: 173.0}, “condition”: {“type”: “NUMERICAL_IS_HIGHER_THAN”, “attribute”: “bill_depth_mm”, “threshold”: 15.100000381469727}, “children”: [{“value”: {“type”: “PROBABILITY”, “distribution”: [0.9640718562874252, 0.03592814371257485, 0.0], “num_examples”: 167.0}, “condition”: {“type”: “NUMERICAL_IS_HIGHER_THAN”, “attribute”: “flipper_length_mm”, “threshold”: 187.5}, “children”: [{“value”: {“type”: “PROBABILITY”, “distribution”: [1.0, 0.0, 0.0], “num_examples”: 104.0}}, {“value”: {“type”: “PROBABILITY”, “distribution”: [0.9047619047619048, 0.09523809523809523, 0.0], “num_examples”: 63.0}, “condition”: {“type”: 
“NUMERICAL_IS_HIGHER_THAN”, “attribute”: “bill_length_mm”, “threshold”: 42.30000305175781}}]}, {“value”: {“type”: “PROBABILITY”, “distribution”: [0.0, 0.0, 1.0], “num_examples”: 6.0}}]}]}, “#tree_plot_05707b35c4f748738efd3da21ab9197f”)

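Inspecting the model structure

The model structure and metadata are available through the inspector created by make_inspector().

**Note:** Depending on the learning algorithm and hyperparameters, the inspector exposes different specialized attributes. For example, the winner_take_all_inference field is specific to Random Forest models.

# Create an inspector to access the model's structure and metadata
inspector = model.make_inspector()

For our model, the available inspector fields are:

# List every public attribute of the inspector (names not starting with "_")
fields = [field for field in dir(inspector) if not field.startswith("_")]
fields  # Display the list in the notebook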

['MODEL_NAME',
 'dataspec',
 'evaluation',
 'export_to_tensorboard',
 'extract_all_trees',
 'extract_tree',
 'features',
 'header',
 'iterate_on_nodes',
 'label',
 'label_classes',
 'metadata',
 'model_type',
 'num_trees',
 'objective',
 'specialized_header',
 'task',
 'training_logs',
 'tuning_logs',
 'variable_importances',
 'winner_take_all_inference']

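Remember to check the API reference, or use ? to view the built-in documentation.

?inspector.model_type

Some of the model metadata:

# Print the model type
print("Model type:", inspector.model_type())

# Print the number of trees in the model
print("Number of trees:", inspector.num_trees())

# Print the training objective of the model
print("Objective:", inspector.objective())

# Print the input features of the model
print("Input features:", inspector.features())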

Model type: RANDOM_FOREST
Number of trees: 300
Objective: Classification(label=__LABEL, class=None, num_classes=3)
Input features: ["bill_depth_mm" (1; #0), "bill_length_mm" (1; #1), "body_mass_g" (1; #2), "flipper_length_mm" (1; #3), "island" (4; #4), "sex" (4; #5), "year" (1; #6)]

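evaluation() returns the evaluation of the model computed during training. The dataset used for this evaluation depends on the algorithm. For example, it can be the validation dataset or the out-of-bag dataset.

**Note:** Although it is computed during training, evaluation() is never computed on the training dataset itself.

# Reuse the inspector created earlier and read the evaluation computed during training
inspector.evaluation()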

Evaluation(num_examples=344, accuracy=0.9767441860465116, loss=0.06782230959804512, rmse=None, ndcg=None, aucs=None, auuc=None, qini=None)
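To make the "out-of-bag dataset" mentioned above more concrete, here is a minimal sketch of bootstrap sampling in plain Python. It only illustrates the general Random Forest idea and is not TF-DF's actual implementation; the helper name bootstrap_split and the seed are assumptions made for this example.

```python
import random

def bootstrap_split(num_examples, seed):
    """Returns (in_bag, out_of_bag) example indices for one hypothetical tree."""
    rng = random.Random(seed)
    # Draw num_examples indices with replacement: the tree's training sample.
    in_bag = [rng.randrange(num_examples) for _ in range(num_examples)]
    # Examples that were never drawn are "out-of-bag" for this tree.
    drawn = set(in_bag)
    out_of_bag = [i for i in range(num_examples) if i not in drawn]
    return in_bag, out_of_bag

in_bag, out_of_bag = bootstrap_split(num_examples=344, seed=0)
print(len(set(in_bag)), "distinct in-bag examples,",
      len(out_of_bag), "out-of-bag examples")
```

Because each tree leaves some examples out of its own sample, those examples can serve as held-out data for that tree, which is why such an evaluation does not count as evaluating on the training set.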

The variable importances are:

# Print the names of the available variable importances
print("Available variable importances:")

# variable_importances() returns a dictionary keyed by importance name
for importance in inspector.variable_importances().keys():
    print("\t", importance)

Available variable importances:
	 MEAN_DECREASE_IN_AP_1_VS_OTHERS
	 MEAN_DECREASE_IN_PRAUC_3_VS_OTHERS
	 SUM_SCORE
	 MEAN_DECREASE_IN_PRAUC_1_VS_OTHERS
	 MEAN_DECREASE_IN_ACCURACY
	 MEAN_DECREASE_IN_AUC_1_VS_OTHERS
	 MEAN_DECREASE_IN_AP_3_VS_OTHERS
	 NUM_AS_ROOT
	 MEAN_DECREASE_IN_AP_2_VS_OTHERS
	 MEAN_DECREASE_IN_AUC_2_VS_OTHERS
	 MEAN_MIN_DEPTH
	 MEAN_DECREASE_IN_AUC_3_VS_OTHERS
	 NUM_NODES
	 MEAN_DECREASE_IN_PRAUC_2_VS_OTHERS

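Different variable importances have different semantics. For example, a feature with a mean decrease in AUC of 0.05 means that removing this feature from the training dataset would reduce the AUC by 0.05, that is, by 5 percentage points.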
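A "mean decrease" importance can be approximated by shuffling one feature and measuring how much a metric drops. The sketch below does this for accuracy with a toy model in plain Python. It is an illustration under stated assumptions, not the computation TF-DF performs internally; toy_predict, the toy data and mean_decrease_in_accuracy are all hypothetical names.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose prediction matches the label."""
    correct = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return correct / len(rows)

def mean_decrease_in_accuracy(predict, rows, labels, feature, num_rounds=10, seed=0):
    """Average accuracy drop when the values of `feature` are shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    decreases = []
    for _ in range(num_rounds):
        shuffled = [row[feature] for row in rows]
        rng.shuffle(shuffled)
        # Copy each row, replacing only the shuffled feature.
        permuted = [dict(row, **{feature: value})
                    for row, value in zip(rows, shuffled)]
        decreases.append(baseline - accuracy(predict, permuted, labels))
    return sum(decreases) / num_rounds

def toy_predict(row):
    # Hypothetical model that only looks at f1.
    return "a" if row["f1"] >= 1.5 else "b"

rows = [{"f1": float(i % 4), "f2": i % 3} for i in range(40)]
labels = [toy_predict(row) for row in rows]

importance_f1 = mean_decrease_in_accuracy(toy_predict, rows, labels, "f1")
importance_f2 = mean_decrease_in_accuracy(toy_predict, rows, labels, "f2")
print("f1:", importance_f1, "f2:", importance_f2)
```

A feature the model never uses (f2 here) gets an importance of exactly zero, which mirrors the 0.0 reported for some features by real importance measures.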

# Mean decrease in AUC of class 1 vs. the other classes
mean_decrease_in_auc_1_vs_others = inspector.variable_importances()["MEAN_DECREASE_IN_AUC_1_VS_OTHERS"]
mean_decrease_in_auc_1_vs_others  # Display the list in the notebook

[("bill_length_mm" (1; #1), 0.0713061951754389),
 ("island" (4; #4), 0.007298519736842035),
 ("flipper_length_mm" (1; #3), 0.004505893640351366),
 ("bill_depth_mm" (1; #0), 0.0021244517543865804),
 ("body_mass_g" (1; #2), 0.0005482456140351033),
 ("sex" (4; #5), 0.00047971491228060437),
 ("year" (1; #6), 0.0)]

Plot the variable importances from the inspector using Matplotlib:

import matplotlib.pyplot as plt

plt.figure(figsize=(12, 4))  # Create a 12x4 figure

# Mean decrease in AUC of class 1 vs. the other classes
variable_importance_metric = "MEAN_DECREASE_IN_AUC_1_VS_OTHERS"
variable_importances = inspector.variable_importances()[variable_importance_metric]

# Extract the feature names and importance values.
#
# `variable_importances` is a list of <feature, importance> tuples.
feature_names = [vi[0].name for vi in variable_importances]
feature_importances = [vi[1] for vi in variable_importances]
# The features are already sorted in decreasing order of importance.
feature_ranks = range(len(feature_names))

bar = plt.barh(feature_ranks, feature_importances, label=[str(x) for x in feature_ranks])  # Horizontal bar chart
plt.yticks(feature_ranks, feature_names)  # Use the feature names as y-axis ticks
plt.gca().invert_yaxis()  # Put the most important feature at the top

# TODO: Replace with "plt.bar_label()" when available.
# Label each bar with its value.
for importance, patch in zip(feature_importances, bar.patches):
  plt.text(patch.get_x() + patch.get_width(), patch.get_y(), f"{importance:.4f}", va="top")

plt.xlabel(variable_importance_metric)  # x-axis label: the importance metric
plt.title("Mean decrease in AUC of the class 1 vs the others")
plt.tight_layout()  # Avoid overlapping labels
plt.show()

Finally, access the actual tree structure:

# Extract a tree from the inspector.
# tree_idx is the index of the tree to extract; 0 is the first tree.
inspector.extract_tree(tree_idx=0)

Tree(root=NonLeafNode(condition=(bill_length_mm >= 43.25; miss=True, score=0.5482327342033386), pos_child=NonLeafNode(condition=(island in ['Biscoe']; miss=True, score=0.6515106558799744), pos_child=NonLeafNode(condition=(bill_depth_mm >= 17.225584030151367; miss=False, score=0.027205035090446472), pos_child=LeafNode(value=ProbabilityValue([0.16666666666666666, 0.0, 0.8333333333333334],n=6.0), idx=7), neg_child=LeafNode(value=ProbabilityValue([0.0, 0.0, 1.0],n=104.0), idx=6), value=ProbabilityValue([0.00909090909090909, 0.0, 0.990909090909091],n=110.0)), neg_child=LeafNode(value=ProbabilityValue([0.0, 1.0, 0.0],n=61.0), idx=5), value=ProbabilityValue([0.005847953216374269, 0.3567251461988304, 0.6374269005847953],n=171.0)), neg_child=NonLeafNode(condition=(bill_depth_mm >= 15.100000381469727; miss=True, score=0.150658518075943), pos_child=NonLeafNode(condition=(flipper_length_mm >= 187.5; miss=True, score=0.036139510571956635), pos_child=LeafNode(value=ProbabilityValue([1.0, 0.0, 0.0],n=104.0), idx=4), neg_child=NonLeafNode(condition=(bill_length_mm >= 42.30000305175781; miss=True, score=0.23430533707141876), pos_child=LeafNode(value=ProbabilityValue([0.0, 1.0, 0.0],n=5.0), idx=3), neg_child=NonLeafNode(condition=(bill_length_mm >= 40.55000305175781; miss=True, score=0.043961383402347565), pos_child=LeafNode(value=ProbabilityValue([0.8, 0.2, 0.0],n=5.0), idx=2), neg_child=LeafNode(value=ProbabilityValue([1.0, 0.0, 0.0],n=53.0), idx=1), value=ProbabilityValue([0.9827586206896551, 0.017241379310344827, 0.0],n=58.0)), value=ProbabilityValue([0.9047619047619048, 0.09523809523809523, 0.0],n=63.0)), value=ProbabilityValue([0.9640718562874252, 0.03592814371257485, 0.0],n=167.0)), neg_child=LeafNode(value=ProbabilityValue([0.0, 0.0, 1.0],n=6.0), idx=0), value=ProbabilityValue([0.930635838150289, 0.03468208092485549, 0.03468208092485549],n=173.0)), value=ProbabilityValue([0.47093023255813954, 0.19476744186046513, 0.33430232558139533],n=344.0)), label_classes=None)

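Extracting a tree is not efficient. If speed matters, model inspection can be done with the iterate_on_nodes() method instead. This method is a depth-first, pre-order traversal iterator over all the nodes of the model.

Note: extract_tree() is implemented using iterate_on_nodes().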
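As a reminder of what a depth-first, pre-order traversal does, here is a minimal sketch on a toy tree in plain Python. The ToyNode class and iterate_pre_order function are hypothetical and unrelated to the TF-DF classes; visiting the negative child before the positive one is an arbitrary choice made for this sketch.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple

@dataclass
class ToyNode:
    name: str
    neg_child: Optional["ToyNode"] = None
    pos_child: Optional["ToyNode"] = None

def iterate_pre_order(node: ToyNode, depth: int = 0) -> Iterator[Tuple[ToyNode, int]]:
    """Yields (node, depth): each parent first, then its subtrees."""
    yield node, depth
    for child in (node.neg_child, node.pos_child):
        if child is not None:
            yield from iterate_pre_order(child, depth + 1)

# A root with one leaf child and one internal child holding two leaves.
toy_tree = ToyNode(
    "A",
    neg_child=ToyNode("B"),
    pos_child=ToyNode("C", neg_child=ToyNode("D"), pos_child=ToyNode("E")))

visited = [(node.name, depth) for node, depth in iterate_pre_order(toy_tree)]
print(visited)
```

Because the traversal is a generator, nodes are produced one at a time and no full tree object has to be built, which is the reason an iterator of this kind is cheaper than extracting whole trees.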

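The following example counts how many times each feature is used (a kind of structural variable importance):

import collections

# Number of times each feature is used in a condition
number_of_use = collections.defaultdict(lambda: 0)

# Depth-first, pre-order traversal of all the nodes of the model
for node_iter in inspector.iterate_on_nodes():

  # Skip the leaf nodes: only non-leaf nodes carry a condition
  if not isinstance(node_iter.node, tfdf.py_tree.node.NonLeafNode):
    continue

  # Iterate over all the features used in the condition.
  # By default, splits are axis-aligned, i.e. each node tests a single feature
  # (oblique splits would test several features at once).
  for feature in node_iter.node.condition.features():
    number_of_use[feature] += 1

# Print the number of condition nodes per feature
print("Number of condition nodes per features:")
for feature, count in number_of_use.items():
  print("\t", feature.name, ":", count)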

Number of condition nodes per features:
	 bill_length_mm : 778
	 bill_depth_mm : 463
	 flipper_length_mm : 414
	 island : 342
	 body_mass_g : 338
	 year : 19
	 sex : 36
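Raw counts are easier to compare once they are sorted and expressed as shares. The short sketch below does this in plain Python, using the counts printed above copied by hand into a dictionary. The variable names are chosen only for this illustration.

```python
# Counts copied from the output above.
counts = {
    "bill_length_mm": 778,
    "bill_depth_mm": 463,
    "flipper_length_mm": 414,
    "island": 342,
    "body_mass_g": 338,
    "year": 19,
    "sex": 36,
}

total = sum(counts.values())
# Sort the features from most used to least used.
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for name, count in ranked:
    print(f"{name:>18}: {count:4d} ({count / total:.1%})")
```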

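Creating a model by hand

In this section, you will create a small Random Forest model by hand. To keep it simple, the model contains only one simple tree:

3 label classes: red, blue and green.
2 features: f1 (numerical) and f2 (string categorical).

f1>=1.5
    ├─(pos)─ f2 in ["cat","dog"]
    │         ├─(pos)─ value: [0.8, 0.1, 0.1]
    │         └─(neg)─ value: [0.1, 0.8, 0.1]
    └─(neg)─ value: [0.1, 0.1, 0.8]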
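Before building the model, it can help to write down by hand what this tree computes. The following plain-Python sketch walks the same two conditions. The function name predict_by_hand is hypothetical, and missing values are not handled in this sketch.

```python
def predict_by_hand(f1, f2):
    """Walks the single tree above; returns [P(red), P(blue), P(green)]."""
    if f1 >= 1.5:
        # Positive branch of the root: test the categorical feature.
        if f2 in ("cat", "dog"):
            return [0.8, 0.1, 0.1]
        return [0.1, 0.8, 0.1]
    # Negative branch of the root: a leaf.
    return [0.1, 0.1, 0.8]

for f1, f2 in [(1.0, "cat"), (2.0, "cat"), (3.0, "bird")]:
    print(f1, f2, "->", predict_by_hand(f1, f2))
```

A model with a single tree should return exactly these leaf distributions, which gives a simple reference to compare the built model's predictions against.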

# Create the model builder
builder = tfdf.builder.RandomForestBuilder(
    path="/tmp/manual_model",  # Directory where the model is written
    objective=tfdf.py_tree.objective.ClassificationObjective(
        label="color",  # Name of the label column
        classes=["red", "blue", "green"]))  # Label classes, in order

Each tree is added one at a time.

Note: The tree object (tfdf.py_tree.tree.Tree) is the same as the one returned by extract_tree() in the previous section.

# Create some aliases
Tree = tfdf.py_tree.tree.Tree  # Tree structure
SimpleColumnSpec = tfdf.py_tree.dataspec.SimpleColumnSpec  # Column specification
ColumnType = tfdf.py_tree.dataspec.ColumnType  # Column type
NonLeafNode = tfdf.py_tree.node.NonLeafNode  # Non-leaf node
LeafNode = tfdf.py_tree.node.LeafNode  # Leaf node
NumericalHigherThanCondition = tfdf.py_tree.condition.NumericalHigherThanCondition  # "feature >= threshold" condition
CategoricalIsInCondition = tfdf.py_tree.condition.CategoricalIsInCondition  # "feature in set" condition
ProbabilityValue = tfdf.py_tree.value.ProbabilityValue  # Class probability distribution

# Create the tree and add it to the builder
builder.add_tree(
    Tree(
        NonLeafNode(
            condition=NumericalHigherThanCondition(
                feature=SimpleColumnSpec(name="f1", type=ColumnType.NUMERICAL),  # Numerical feature "f1"
                threshold=1.5,
                missing_evaluation=False),  # A missing value evaluates the condition to False
            pos_child=NonLeafNode(
                condition=CategoricalIsInCondition(
                    feature=SimpleColumnSpec(name="f2",type=ColumnType.CATEGORICAL),  # Categorical feature "f2"
                    mask=["cat", "dog"],  # True when f2 is "cat" or "dog"
                    missing_evaluation=False),  # A missing value evaluates the condition to False
                pos_child=LeafNode(value=ProbabilityValue(probability=[0.8, 0.1, 0.1], num_examples=10)),  # Positive leaf, 10 examples
                neg_child=LeafNode(value=ProbabilityValue(probability=[0.1, 0.8, 0.1], num_examples=20))),  # Negative leaf, 20 examples
            neg_child=LeafNode(value=ProbabilityValue(probability=[0.1, 0.1, 0.8], num_examples=30)))))  # Negative leaf of the root, 30 examples

Finish writing the trees

# Close the builder; this finalizes the model and writes it to disk
builder.close()

[INFO 2022-12-14T12:25:00.790486355+00:00 kernel.cc:1175] Loading model from path /tmp/manual_model/tmp/ with prefix e09a067144bc479b
[INFO 2022-12-14T12:25:00.790802259+00:00 decision_forest.cc:640] Model loaded with 1 root(s), 5 node(s), and 2 input feature(s).
[INFO 2022-12-14T12:25:00.790878962+00:00 kernel.cc:1021] Use fast generic engine
WARNING:absl:Found untraced functions such as call_get_leaves, _update_step_xla while saving (showing 2 of 2). These functions will not be directly callable after loading.


INFO:tensorflow:Assets written to: /tmp/manual_model/assets


INFO:tensorflow:Assets written to: /tmp/manual_model/assets

Now you can open the model as a regular Keras model and make predictions:

# Load the manually built model from disk
manual_model = tf.keras.models.load_model("/tmp/manual_model")

[INFO 2022-12-14T12:25:01.436506097+00:00 kernel.cc:1175] Loading model from path /tmp/manual_model/assets/ with prefix e09a067144bc479b
[INFO 2022-12-14T12:25:01.436871761+00:00 decision_forest.cc:640] Model loaded with 1 root(s), 5 node(s), and 2 input feature(s).
[INFO 2022-12-14T12:25:01.436909696+00:00 kernel.cc:1021] Use fast generic engine

# Build a tf.data.Dataset of three examples with two features:
# "f1" (float) and "f2" (string). Each example is a dictionary
# with the keys "f1" and "f2"; batch(2) groups them into batches of 2.
examples = tf.data.Dataset.from_tensor_slices({
    "f1": [1.0, 2.0, 3.0],
    "f2": ["cat", "cat", "bird"]
}).batch(2)

# Run the manually built model on the examples
predictions = manual_model.predict(examples)

# Print the predictions
print("predictions:\n", predictions)

1/2 [==============>...............] - ETA: 0s
2/2 [==============================] - 0s 2ms/step
predictions:
 [[0.1 0.1 0.8]
 [0.8 0.1 0.1]
 [0.1 0.8 0.1]]
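Each row of the predictions holds one probability per class, in the order the classes were declared in the builder (red, blue, green). The sketch below maps each row to its most likely class name in plain Python. The probabilities are copied by hand from the output above, and to_class_names is a hypothetical helper rather than a TF-DF function.

```python
# Class order as declared in the builder's ClassificationObjective.
classes = ["red", "blue", "green"]

# Probabilities copied from the printed predictions.
predicted_probabilities = [
    [0.1, 0.1, 0.8],
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
]

def to_class_names(probabilities, class_names):
    """Returns the name of the highest-probability class for each row."""
    names = []
    for row in probabilities:
        best_index = max(range(len(row)), key=lambda i: row[i])
        names.append(class_names[best_index])
    return names

predicted_classes = to_class_names(predicted_probabilities, classes)
print(predicted_classes)
```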

Access the structure:

Note: Because the model was serialized and deserialized, you need to use an alternative but equivalent form.

# Get the path of the underlying Yggdrasil model
yggdrasil_model_path = manual_model.yggdrasil_model_path_tensor().numpy().decode("utf-8")
print("yggdrasil_model_path:",yggdrasil_model_path)

# Create an inspector from that path and print the model's input features
inspector = tfdf.inspector.make_inspector(yggdrasil_model_path)
print("Input features:", inspector.features())

yggdrasil_model_path: /tmp/manual_model/assets/
Input features: ["f1" (1; #1), "f2" (4; #2)]

And of course, you can plot this manually constructed model:

# tfdf provides the model_plotter module
import tensorflow_decision_forests as tfdf

# Plot the structure of manual_model with plot_model_in_colab
tfdf.model_plotter.plot_model_in_colab(manual_model)

[Interactive tree plot of the manual model: the root node tests f1 >= 1.5. Its positive child tests f2 in [cat, dog] and leads to two leaves with distributions [0.8, 0.1, 0.1] (10 examples) and [0.1, 0.8, 0.1] (20 examples). Its negative child is a leaf with distribution [0.1, 0.1, 0.8] (30 examples).]

每周跟踪AI热点新闻动向和震撼发展想要探索生成式人工智能的前沿进展吗？订阅我们的简报，深入解析最新的技术突破、实际应用案例和未来的趋势。与全球数同行一同，从行业内部的深度分析和实用指南中受益。不要错过这个机会，成为AI领域的领跑者。点击订阅，与未来同行！订阅：https://rengongzhineng.io/【本周AI新闻:Deepseek崛起背后：AI智能代理时代正式到来？】https://w
用数据唤醒深度好眠，时序数据库 TDengine 助力安提思脑科学研究涛思数据（TDengine）时序数据库 tdengine 数据库
在智能医疗与脑科学快速发展的今天，高效的数据处理能力已成为突破创新的关键。安提思专注于睡眠监测与神经调控，基于人工智能和边缘计算，实现从生理体征监测、智能干预到效果评估的闭环。面对海量生理数据的存储与实时计算需求，安提思选择TDengine云服务作为核心时序数据库，借助其高效的数据压缩能力和毫秒级查询性能，确保精准分析与稳定运行。目前，安提思已完成经颅磁刺激系统的医疗器械型式检验，并计划开展多中心
全员DeepSeek时代，前端能做些什么？二川bro 前端智能AI 前端 deepseek
全员DeepSeek时代，前端能做些什么？前些天发现了一个巨牛的人工智能学习网站，通俗易懂，风趣幽默，可以分享一下给大家。点击跳转到网站。https://www.captainbed.cn/cccDeepSeek开发阶段测试阶段部署阶段智能代码生成设计稿转代码实时代码审查测试用例生成自动化问题定位构建优化建议性能预测模型一、DeepSeek带来的前端范式变革1.1传统前端开发痛点分析DeepSee
光学超表面的人工智能 Luis Li 的猫猫人工智能专区基础及拓展超表面设计人工智能机器学习算法
光学超表面，即能够控制光传播的平面人工介质，正在从实验室过渡到商业应用。这种转变需要先进的超结构和超表面设计，考虑可制造性并通过后处理算法提高光学性能。人工智能，尤其是机器学习的优化，为这些需求提供了解决方案。该文章系统地回顾了AI在三个关键领域的潜在影响：AI支持的超表面可制造性设计（DFM）、超越经典局部相位近似的设计以及AI赋能的计算后端。Introduction超表面是超材料的二维（2D）
企业AI数据安全白皮书：深寻模型会话保护与安当TDE实战安当加密人工智能
一、引言人工智能正在重塑企业的业务流程与创新模式，从智能客服到辅助决策，从图像识别到自然语言处理，AI模型正逐步渗透到企业运营的各个环节。然而，随着AI技术的深入应用，数据安全问题也如影随形。对于部署在企业内网的DeepSeek模型而言，员工与模型的会话内容往往包含企业的核心商业信息、敏感技术参数以及员工个人隐私等关键数据。一旦这些数据遭到泄露、篡改或恶意利用，不仅会给企业带来巨大的经济损失，还可
就在刚刚！马斯克决定将“地球上最聪明的人工智能”Grok-3免费了！源代码杀手 AI技术快讯人工智能 python
Grok-3概述与关键功能Grok-3是由xAI开发的先进AI模型，于2025年2月19日发布，旨在提升推理能力、计算能力和适应性，特别适用于数学、科学和编程问题。作为xAI系列模型的最新版本，Grok-3延续了公司对构建强大且安全的AI系统的承诺，并推动人工智能在多个领域的应用。Grok-3的核心优势在于其大规模强化学习（RL）优化，能够在几秒到几分钟内进行深度推理，适应复杂任务的需求。配备的D
Python开发行业薪资多少？ Java大师兄-威哥 Python 编程 IT技术程序员 IT
大家都知道，人工智能越来越受欢迎了。而Python由于简单易用，是人工智能领域中使用最广泛的编程语言之一，它可以无缝地与数据结构和其他常用的AI算法一起使用。Python开发行业薪资多少？我们看看图片就能知道个大概。无论是国内还是国外对于编程语言的热度调查中，Python都是数得上名的。Python热度的持续升温，自然也引起了开源团队的项目。由于OSI认可的开放源码许可，程序员可以使用Python