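TensorBoard is a tool that visualizes a TensorFlow program (a neural network) as a graph and displays metrics collected during training, such as loss, accuracy, and weights. It helps users understand, debug, and optimize TensorFlow programs.
The figure below shows the graph of a linear regression model:
TensorBoard renders the neural network by reading TensorFlow event log files. These files record the details of how the network (the computation graph) was built.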
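As a rough illustration of what TensorBoard has to find on disk, the sketch below lists the event files under a log directory. It assumes the usual file-name prefix events.out.tfevents. used by TensorFlow's summary writers. The function name find_event_files and the ./logs path are hypothetical, and this is only a quick way to check that a directory holds something for TensorBoard to read, not a description of how TensorBoard itself scans the directory.

```python
from pathlib import Path


def find_event_files(logdir):
    """Return the sorted paths of event files found anywhere under logdir."""
    root = Path(logdir)
    if not root.is_dir():
        return []
    # Assumption: event files are named "events.out.tfevents.<suffix>".
    return sorted(p for p in root.rglob("events.out.tfevents.*") if p.is_file())


if __name__ == "__main__":
    # "./logs" is a placeholder; replace it with your own log directory.
    for path in find_event_files("./logs"):
        print(path)
```

An empty result usually means the training program has not written any logs yet, or that a different directory was passed.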
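The TensorBoard interface
The main TensorBoard page looks like this:
Main tabs:
- Scalars: shows useful metrics, such as loss, as they change during model training
- Graphs: shows the model graph
- Histograms: shows the weights as histograms
- Distributions: shows the distribution of the weights
- Projector: shows principal component analysis (PCA) and t-SNE, techniques used for dimensionality reduction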
Steps for using TensorBoard
1. Start TensorBoard
tensorboard --logdir=LOGDIR_PATH
2. Open TensorBoard
By default the TensorBoard page is served at http://localhost:6006. Open that address in a browser.
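The two steps above can also be scripted. The sketch below only builds the command line and the page address; the helper names tensorboard_command and tensorboard_url are hypothetical, and ./logs is a placeholder log directory. It uses the --logdir flag shown above together with the --port flag of the tensorboard command-line tool, with 6006 as the default port.

```python
def tensorboard_command(logdir, port=6006):
    """Build the argument list for the tensorboard command-line tool."""
    return ["tensorboard", f"--logdir={logdir}", f"--port={port}"]


def tensorboard_url(port=6006):
    """Return the local address where the TensorBoard page is served."""
    return f"http://localhost:{port}"


if __name__ == "__main__":
    # "./logs" is a placeholder; replace it with your own log directory.
    print("Run: ", " ".join(tensorboard_command("./logs")))
    print("Open:", tensorboard_url())
```

Passing the list returned by tensorboard_command to subprocess.run would start the server from Python, provided the tensorboard executable is on the PATH.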
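TensorBoard example
1. Create a TensorFlow application
The code for the TensorFlow application is shown below. Don't worry if parts of it are unclear; later chapters cover them in detail. For now, let's focus on TensorBoard.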
import numpy as np
import tensorflow.compat.v1 as tf

# tf already refers to the compat.v1 API, so the function is called directly
tf.disable_eager_execution()

# Prepare the data: random features and random targets
X_train = np.random.sample((10000, 5))
y_train = np.random.sample((10000, 1))
X_train.shape

# Define the feature columns and create the model
feature_columns = [tf.feature_column.numeric_column('x', shape=X_train.shape[1:])]
DNN_reg = tf.estimator.DNNRegressor(
    feature_columns=feature_columns,
    # Directory where the logs are written
    model_dir='./train/tensorboard',
    hidden_units=[500, 300],
    optimizer=tf.train.ProximalAdagradOptimizer(
        learning_rate=0.1,
        l1_regularization_strength=0.001
    )
)

# Train the model
train_input = tf.estimator.inputs.numpy_input_fn(
    x={"x": X_train},
    y=y_train,
    shuffle=False,
    num_epochs=None
)
DNN_reg.train(train_input, steps=3000)
2. Run the application
Execution output:
C:\Anaconda3\python.exe "C:\Program Files\JetBrains\PyCharm 2019.1.1\helpers\pydev\pydevconsole.py" --mode=client --port=60385
import sys; print('Python %s on %s' % (sys.version, sys.platform))
sys.path.extend(['C:\\app\\PycharmProjects', 'C:/app/PycharmProjects'])
Python 3.7.6 (default, Jan 8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.12.0 -- An enhanced Interactive Python. Type '?' for help.
PyDev console: using IPython 7.12.0
Python 3.7.6 (default, Jan 8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)] on win32
runfile('C:/app/PycharmProjects/ArtificialIntelligence/test.py', wdir='C:/app/PycharmProjects/ArtificialIntelligence')
INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': './train/tensorboard', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
WARNING:tensorflow:From C:/app/PycharmProjects/ArtificialIntelligence/test.py:25: The name tf.estimator.inputs is deprecated. Please use tf.compat.v1.estimator.inputs instead.
WARNING:tensorflow:From C:/app/PycharmProjects/ArtificialIntelligence/test.py:25: The name tf.estimator.inputs.numpy_input_fn is deprecated. Please use tf.compat.v1.estimator.inputs.numpy_input_fn instead.
WARNING:tensorflow:From C:\Anaconda3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py:1666: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:tensorflow:From C:\Anaconda3\lib\site-packages\tensorflow\python\training\training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:From C:\Anaconda3\lib\site-packages\tensorflow_estimator\python\estimator\inputs\queues\feeding_queue_runner.py:65: QueueRunner.__init__ (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
WARNING:tensorflow:From C:\Anaconda3\lib\site-packages\tensorflow_estimator\python\estimator\inputs\queues\feeding_functions.py:491: add_queue_runner (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:Layer dnn is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2. The layer has dtype float32 because it's dtype defaults to floatx.
If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.
To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`.
To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
2020-06-19 18:00:28.212716: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-06-19 18:00:28.229660: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2edd95abc50 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-19 18:00:28.231634: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
WARNING:tensorflow:From C:\Anaconda3\lib\site-packages\tensorflow\python\training\monitored_session.py:906: start_queue_runners (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into ./train/tensorboard\model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 40.944073, step = 0
INFO:tensorflow:global_step/sec: 87.2018
INFO:tensorflow:loss = 10.192249, step = 100 (1.153 sec)
INFO:tensorflow:global_step/sec: 94.7825
INFO:tensorflow:loss = 10.793806, step = 200 (1.048 sec)
INFO:tensorflow:global_step/sec: 96.9541
INFO:tensorflow:loss = 10.146742, step = 300 (1.031 sec)
INFO:tensorflow:global_step/sec: 94.1023
INFO:tensorflow:loss = 11.024506, step = 400 (1.063 sec)
INFO:tensorflow:global_step/sec: 95.5069
INFO:tensorflow:loss = 11.495777, step = 500 (1.047 sec)
INFO:tensorflow:global_step/sec: 95.507
INFO:tensorflow:loss = 10.501261, step = 600 (1.047 sec)
INFO:tensorflow:global_step/sec: 90.624
INFO:tensorflow:loss = 10.642529, step = 700 (1.105 sec)
INFO:tensorflow:global_step/sec: 94.3478
INFO:tensorflow:loss = 10.793669, step = 800 (1.060 sec)
INFO:tensorflow:global_step/sec: 89.1954
INFO:tensorflow:loss = 11.432982, step = 900 (1.121 sec)
INFO:tensorflow:global_step/sec: 82.931
INFO:tensorflow:loss = 12.161203, step = 1000 (1.204 sec)
INFO:tensorflow:global_step/sec: 95.1947
INFO:tensorflow:loss = 11.331022, step = 1100 (1.052 sec)
INFO:tensorflow:global_step/sec: 95.3069
INFO:tensorflow:loss = 9.95645, step = 1200 (1.047 sec)
INFO:tensorflow:global_step/sec: 95.5069
INFO:tensorflow:loss = 10.853621, step = 1300 (1.047 sec)
INFO:tensorflow:global_step/sec: 96.954
INFO:tensorflow:loss = 9.887156, step = 1400 (1.031 sec)
INFO:tensorflow:global_step/sec: 86.3196
INFO:tensorflow:loss = 9.899103, step = 1500 (1.161 sec)
INFO:tensorflow:global_step/sec: 85.4606
INFO:tensorflow:loss = 9.951197, step = 1600 (1.167 sec)
INFO:tensorflow:global_step/sec: 92.7384
INFO:tensorflow:loss = 11.1549015, step = 1700 (1.078 sec)
INFO:tensorflow:global_step/sec: 89.811
INFO:tensorflow:loss = 10.865185, step = 1800 (1.113 sec)
INFO:tensorflow:global_step/sec: 92.4865
INFO:tensorflow:loss = 11.742689, step = 1900 (1.081 sec)
INFO:tensorflow:global_step/sec: 75.7973
INFO:tensorflow:loss = 11.577065, step = 2000 (1.322 sec)
INFO:tensorflow:global_step/sec: 71.6585
INFO:tensorflow:loss = 11.5444565, step = 2100 (1.405 sec)
INFO:tensorflow:global_step/sec: 75.3075
INFO:tensorflow:loss = 9.6363125, step = 2200 (1.315 sec)
INFO:tensorflow:global_step/sec: 78.8516
INFO:tensorflow:loss = 10.774572, step = 2300 (1.271 sec)
INFO:tensorflow:global_step/sec: 67.4824
INFO:tensorflow:loss = 11.539929, step = 2400 (1.481 sec)
INFO:tensorflow:global_step/sec: 60.0451
INFO:tensorflow:loss = 10.7935705, step = 2500 (1.679 sec)
INFO:tensorflow:global_step/sec: 72.5683
INFO:tensorflow:loss = 10.482113, step = 2600 (1.362 sec)
INFO:tensorflow:global_step/sec: 60.3152
INFO:tensorflow:loss = 10.1633835, step = 2700 (1.658 sec)
INFO:tensorflow:global_step/sec: 63.108
INFO:tensorflow:loss = 10.111056, step = 2800 (1.585 sec)
INFO:tensorflow:global_step/sec: 71.5397
INFO:tensorflow:loss = 10.775657, step = 2900 (1.398 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 3000...
INFO:tensorflow:Saving checkpoints for 3000 into ./train/tensorboard\model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 3000...
INFO:tensorflow:Loss for final step: 12.5020685.
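The loss values in this output are the same kind of data that the Scalars tab plots. As a small sketch, assuming log lines of the form "loss = <value>, step = <n>" as printed above, the following code extracts (step, loss) pairs from the console text. The names LOSS_LINE and parse_losses are hypothetical helpers, not part of TensorFlow. In the output above the loss falls sharply within the first 100 steps and then stays between roughly 9.6 and 12.2, which is plausible given that y_train is random and unrelated to X_train.

```python
import re

# Matches lines such as "INFO:tensorflow:loss = 10.192249, step = 100 (1.153 sec)".
LOSS_LINE = re.compile(r"loss = ([0-9.]+), step = (\d+)")


def parse_losses(log_text):
    """Return a list of (step, loss) pairs found in the training log."""
    return [(int(step), float(loss)) for loss, step in LOSS_LINE.findall(log_text)]


if __name__ == "__main__":
    # A few lines copied from the output above.
    sample = (
        "INFO:tensorflow:loss = 40.944073, step = 0\n"
        "INFO:tensorflow:global_step/sec: 87.2018\n"
        "INFO:tensorflow:loss = 10.192249, step = 100 (1.153 sec)\n"
        "INFO:tensorflow:loss = 10.793806, step = 200 (1.048 sec)\n"
    )
    for step, loss in parse_losses(sample):
        print(step, loss)
```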
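3. Start TensorBoard, pointing --logdir at the model_dir used in the code above (./train/tensorboard):
tensorboard --logdir=./train/tensorboard
4. Open the page
Open http://localhost:6006 in a browser.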