TensorFlow multi-GPU training

This post trains a small CNN on Fashion-MNIST with tf.distribute.MirroredStrategy, doing synchronous data-parallel training across two Tesla T4 GPUs.

import tensorflow as tf

import time
from tensorflow import keras
from tensorflow.keras import layers
print("TensorFlow version:", tf.__version__)
TensorFlow version: 2.6.4
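
Before creating the distribution strategy, it is worth confirming that TensorFlow can actually see more than one GPU. A minimal check, not part of the original notebook:

# List the GPUs visible to TensorFlow; MirroredStrategy uses all of them by default.
print("Visible GPUs:", tf.config.list_physical_devices('GPU'))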
fashion_mnist = tf.keras.datasets.fashion_mnist
(x_train, y_train),(x_test, y_test) = fashion_mnist.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
32768/29515 [=================================] - 0s 0us/step
40960/29515 [=========================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26427392/26421880 [==============================] - 0s 0us/step
26435584/26421880 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
16384/5148 [===============================================================================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4423680/4422102 [==============================] - 0s 0us/step
4431872/4422102 [==============================] - 0s 0us/step
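
The labels are integer class indices 0-9, which is why the model is later compiled with sparse_categorical_crossentropy. For reference, a quick shape check and the Fashion-MNIST class names (added here for illustration; not part of the original output):

# Fashion-MNIST class names, indexed by the integer labels in y_train / y_test
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
print(x_test.shape, y_test.shape)    # (10000, 28, 28) (10000,)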
strategy = tf.distribute.MirroredStrategy() 
print(f"Number of devices: {strategy.num_replicas_in_sync}")

 

2022-10-24 13:22:53.654524: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-24 13:22:54.046939: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-24 13:22:59.626084: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13789 MB memory:  -> device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5
2022-10-24 13:22:59.632274: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 13789 MB memory:  -> device: 1, name: Tesla T4, pci bus id: 0000:00:05.0, compute capability: 7.5
Number of devices: 2
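
MirroredStrategy picks up every visible GPU by default. A subset of devices or a different all-reduce implementation can be passed to the constructor; a sketch of those standard tf.distribute options, not something the original post sets:

# Hypothetical alternative setup: pin the strategy to two specific GPUs and
# choose the all-reduce implementation explicitly (not used in this post).
alt_strategy = tf.distribute.MirroredStrategy(
    devices=["/gpu:0", "/gpu:1"],
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce()
)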
with strategy.scope():
    # Everything that creates variables (the model and its optimizer) must be
    # built inside the strategy scope so the variables are mirrored on every GPU.
    inputs = keras.Input(shape=(28, 28, 1))
    x = layers.Rescaling(1./255)(inputs)
    # Three Conv-BN-ReLU-MaxPool blocks with increasing channel counts
    for size in [32, 64, 128]:
        x = layers.Conv2D(size, 3, strides=1, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
        x = layers.MaxPooling2D((2, 2))(x)

    # Final conv block, then global pooling and a dropout-regularized classifier head
    x = layers.Conv2D(256, 3, strides=1, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(10, activation='softmax')(x)
    model = keras.Model(inputs, outputs)

    model.compile(
        optimizer=keras.optimizers.Adam(1e-3),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )

model.summary()

 

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 28, 28, 1)]       0         
_________________________________________________________________
rescaling (Rescaling)        (None, 28, 28, 1)         0         
_________________________________________________________________
conv2d (Conv2D)              (None, 28, 28, 32)        320       
_________________________________________________________________
batch_normalization (BatchNo (None, 28, 28, 32)        128       
_________________________________________________________________
activation (Activation)      (None, 28, 28, 32)        0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 14, 14, 64)        18496     
_________________________________________________________________
batch_normalization_1 (Batch (None, 14, 14, 64)        256       
_________________________________________________________________
activation_1 (Activation)    (None, 14, 14, 64)        0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 7, 7, 128)         73856     
_________________________________________________________________
batch_normalization_2 (Batch (None, 7, 7, 128)         512       
_________________________________________________________________
activation_2 (Activation)    (None, 7, 7, 128)         0         
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 3, 3, 128)         0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 3, 3, 256)         295168    
_________________________________________________________________
batch_normalization_3 (Batch (None, 3, 3, 256)         1024      
_________________________________________________________________
activation_3 (Activation)    (None, 3, 3, 256)         0         
_________________________________________________________________
global_average_pooling2d (Gl (None, 256)               0         
_________________________________________________________________
dropout (Dropout)            (None, 256)               0         
_________________________________________________________________
dense (Dense)                (None, 10)                2570      
=================================================================
Total params: 392,330
Trainable params: 391,370
Non-trainable params: 960
_________________________________________________________________
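
With MirroredStrategy, the batch_size passed to fit is the global batch size: each step the strategy splits the batch evenly across the replicas, so 128 means 64 examples per GPU here. A small sketch of scaling the batch with the replica count (per_replica_batch_size is an illustrative variable, not from the original code):

per_replica_batch_size = 64
global_batch_size = per_replica_batch_size * strategy.num_replicas_in_sync  # 128 with 2 GPUs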
since = time.time()
# Time the full 20-epoch run; fit handles the per-replica splitting automatically.
history = model.fit(
    x_train,
    y_train,
    epochs=20,
    validation_split=0.1,
    batch_size=128
)
time_elapsed = time.time() - since
print(f'Training complete in {time_elapsed // 60:.0f}m {time_elapsed % 60:.0f}s')
print("Done!")

 

Epoch 9/20
422/422 [==============================] - 6s 14ms/step - loss: 0.1362 - accuracy: 0.9499 - val_loss: 0.4410 - val_accuracy: 0.8593
Epoch 10/20
422/422 [==============================] - 6s 13ms/step - loss: 0.1223 - accuracy: 0.9550 - val_loss: 0.3559 - val_accuracy: 0.8973
Epoch 11/20
422/422 [==============================] - 6s 14ms/step - loss: 0.1139 - accuracy: 0.9584 - val_loss: 0.4634 - val_accuracy: 0.8618
Epoch 12/20
422/422 [==============================] - 6s 13ms/step - loss: 0.1008 - accuracy: 0.9623 - val_loss: 0.3385 - val_accuracy: 0.8913
Epoch 13/20
422/422 [==============================] - 6s 14ms/step - loss: 0.0916 - accuracy: 0.9663 - val_loss: 0.7406 - val_accuracy: 0.8112
Epoch 14/20
422/422 [==============================] - 6s 14ms/step - loss: 0.0840 - accuracy: 0.9693 - val_loss: 0.6127 - val_accuracy: 0.8398
Epoch 15/20
422/422 [==============================] - 6s 14ms/step - loss: 0.0780 - accuracy: 0.9716 - val_loss: 0.3500 - val_accuracy: 0.8958
Epoch 16/20
422/422 [==============================] - 6s 13ms/step - loss: 0.0692 - accuracy: 0.9746 - val_loss: 0.3722 - val_accuracy: 0.8920
Epoch 17/20
422/422 [==============================] - 6s 14ms/step - loss: 0.0617 - accuracy: 0.9771 - val_loss: 0.3368 - val_accuracy: 0.9172
Epoch 18/20
422/422 [==============================] - 6s 13ms/step - loss: 0.0605 - accuracy: 0.9777 - val_loss: 0.4112 - val_accuracy: 0.9028
Epoch 19/20
422/422 [==============================] - 6s 15ms/step - loss: 0.0513 - accuracy: 0.9816 - val_loss: 0.3355 - val_accuracy: 0.9170
Epoch 20/20
422/422 [==============================] - 6s 13ms/step - loss: 0.0524 - accuracy: 0.9807 - val_loss: 0.4732 - val_accuracy: 0.8945
Training complete in 2m 27s
Done!
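
The test split loaded at the start is never used in the post. A minimal evaluation sketch for the trained model (not part of the original run, so its output is not shown):

# Evaluate on the held-out test set with the same global batch size used for training.
test_loss, test_acc = model.evaluate(x_test, y_test, batch_size=128)
print(f"Test accuracy: {test_acc:.4f}")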
