O'Reilly, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
CHAPTER 12 Custom Models and Training with TensorFlow
Notes on a few of the exercise questions:
1. How would you describe TensorFlow in a short sentence? What are its main features? Can you name other popular Deep Learning libraries?
TensorFlow is an open-source library for numerical computation, particularly well suited and fine-tuned for large-scale Machine Learning. Its core is similar to NumPy, but it also features GPU support, support for distributed computing, computation graph analysis and optimization capabilities (with a portable graph format that allows you to train a TensorFlow model in one environment and run it in another), an optimization API based on reverse-mode autodiff, and several powerful APIs such as tf.keras, tf.data, tf.image, tf.signal, and more. Other popular Deep Learning libraries include PyTorch, MXNet, Microsoft Cognitive Toolkit, Theano, Caffe2, and Chainer.
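A quick illustration of the NumPy-like core (the values here are purely illustrative):

```python
import tensorflow as tf

# Create a constant tensor and apply NumPy-style operations to it
t = tf.constant([[1., 2., 3.], [4., 5., 6.]])
print(t.shape)               # (2, 3)
print(tf.square(t))          # elementwise square
print(t @ tf.transpose(t))   # matrix multiplication, result has shape (2, 2)
```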
2. Is TensorFlow a drop-in replacement for NumPy? What are the main differences between the two?
Although TensorFlow offers most of the functionalities provided by NumPy, it is not a drop-in replacement, for a few reasons. First, the names of the functions are not always the same (for example, tf.reduce_sum() versus np.sum()). Second, some functions do not behave in exactly the same way (for example, tf.transpose() creates a transposed copy of a tensor, while NumPy's T attribute creates a transposed view, without actually copying any data). Lastly, NumPy arrays are mutable, while TensorFlow tensors are not (but you can use a tf.Variable if you need a mutable object).
Related NumPy APIs: the T attribute, the transpose() function, and the swapaxes() function.
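A small sketch contrasting the view-versus-copy and mutability differences described above (the values are illustrative):

```python
import numpy as np
import tensorflow as tf

a = np.arange(6).reshape(2, 3)
a.T[0, 0] = 99           # NumPy's T is a view: writing through it modifies a itself
assert a[0, 0] == 99

t = tf.constant([[1., 2.], [3., 4.]])
t2 = tf.transpose(t)     # returns a new transposed tensor; t is left unchanged
# t[0, 0] = 99           # would raise: TensorFlow tensors are immutable

v = tf.Variable(t)       # tf.Variable is the mutable counterpart
v[0, 0].assign(99.)      # in-place update through the variable
```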
3. Do you get the same result with tf.range(10) and tf.constant(np.arange(10)) ?
Both tf.range(10) and tf.constant(np.arange(10)) return a one-dimensional tensor containing the integers 0 to 9. However, the former uses 32-bit integers while the latter uses 64-bit integers. Indeed, TensorFlow defaults to 32 bits, while NumPy defaults to 64 bits.
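A quick check of the dtype difference:

```python
import numpy as np
import tensorflow as tf

r1 = tf.range(10)                # TensorFlow default: int32
r2 = tf.constant(np.arange(10))  # inherits NumPy's default: int64
print(r1.dtype, r2.dtype)        # int32 vs int64
# Same values, different dtypes, so they are not directly interchangeable:
# most ops would raise on the dtype mismatch unless you cast one of them first
same_values = tf.reduce_all(tf.equal(r1, tf.cast(r2, tf.int32)))
```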
4. Can you name six other data structures available in TensorFlow, beyond regular tensors?
Beyond regular tensors, TensorFlow offers several other data structures, including sparse tensors, tensor arrays, ragged tensors, queues, string tensors, and sets. The last two are actually represented as regular tensors, but TensorFlow provides special functions to manipulate them (in tf.strings and tf.sets).
Note:
TensorFlow ragged tensors
Ragged tensors are TensorFlow's tensors with non-uniform shapes, i.e., rows of variable length.
For example:
digits = tf.ragged.constant([
    [3, 1, 4, 1],
    [],
    [5, 9, 2],
    [6],
    []])
Or:
words = tf.ragged.constant([
    ["So", "long"],
    ["thanks", "for", "all", "the", "fish"]])
Supported operations include tf.add, tf.concat, tf.tile, and tf.strings.substr.
Points to note:
1. A ragged tensor cannot mix element types; for example, this raises an error:
tf.ragged.constant([["one", "two"], [3, 4]])
2. It cannot mix nesting depths either; for example, this raises an error:
tf.ragged.constant(["A", ["B", "C"]])
The correct form is:
tf.ragged.constant([["A"], ["B", "C"]])
5. A custom loss function can be defined by writing a function or by subclassing the keras.losses.Loss class. When would you use each option?
When you want to define a custom loss function, in general you can just implement it as a regular Python function. However, if your custom loss function must support some hyperparameters (or any other state), then you should subclass the
keras.losses.Loss class and implement the __init__() and call() methods. If you want the loss function’s hyperparameters to be saved along with the model, then you must also implement the get_config() method.
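As a sketch of the subclassing option, here is a Huber-style loss with a threshold hyperparameter (the class name and the default threshold of 1.0 are illustrative choices):

```python
import tensorflow as tf
from tensorflow import keras

class HuberLoss(keras.losses.Loss):
    """Huber loss with a configurable threshold hyperparameter."""
    def __init__(self, threshold=1.0, **kwargs):
        self.threshold = threshold
        super().__init__(**kwargs)

    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small = tf.abs(error) < self.threshold
        squared = tf.square(error) / 2
        linear = self.threshold * tf.abs(error) - self.threshold ** 2 / 2
        return tf.where(is_small, squared, linear)

    def get_config(self):
        # Saves the hyperparameter along with the model
        base = super().get_config()
        return {**base, "threshold": self.threshold}
```

Compiling with loss=HuberLoss(2.0) then saves and reloads the threshold via get_config().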
6. Similarly, a custom metric can be defined in a function or a subclass of keras.metrics.Metric . When would you use each option?
Much like custom loss functions, most metrics can be defined as regular Python functions. But if you want your custom metric to support some hyperparameters (or any other state), then you should subclass the keras.metrics.Metric class. Moreover, if computing the metric over a whole epoch is not equivalent to computing the mean metric over all batches in that epoch (e.g., as for the precision and recall metrics), then you should subclass the keras.metrics.Metric class and implement the __init__(), update_state(), and result() methods to keep track of a running metric during each epoch. You should also implement the
reset_states() method unless all it needs to do is reset all variables to 0.0. If you want the state to be saved along with the model, then you should implement the get_config() method as well.
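A minimal sketch of such a streaming metric, tracking a Huber-style quantity with running total and count variables (the class and variable names are illustrative):

```python
import tensorflow as tf
from tensorflow import keras

class HuberMetric(keras.metrics.Metric):
    """Streaming Huber metric: accumulates total loss and sample count across batches."""
    def __init__(self, threshold=1.0, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold
        self.total = self.add_weight(name="total", initializer="zeros")
        self.count = self.add_weight(name="count", initializer="zeros")

    def huber(self, y_true, y_pred):
        error = y_true - y_pred
        is_small = tf.abs(error) < self.threshold
        squared = tf.square(error) / 2
        linear = self.threshold * tf.abs(error) - self.threshold ** 2 / 2
        return tf.where(is_small, squared, linear)

    def update_state(self, y_true, y_pred, sample_weight=None):
        # Called once per batch: update the running sums
        self.total.assign_add(tf.reduce_sum(self.huber(y_true, y_pred)))
        self.count.assign_add(tf.cast(tf.size(y_true), tf.float32))

    def result(self):
        # Metric over everything seen so far, not a mean of per-batch means
        return self.total / self.count

    def get_config(self):
        base = super().get_config()
        return {**base, "threshold": self.threshold}
```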
7. When should you create a custom layer versus a custom model?
You should distinguish the internal components of your model (i.e., layers or reusable blocks of layers) from the model itself (i.e., the object you will train). The former should subclass the keras.layers.Layer class, while the latter should subclass the keras.models.Model class.
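A minimal sketch of the distinction, using a hypothetical residual block and regressor (the names and layer sizes are illustrative; the block assumes its input width matches n_neurons so the residual addition is valid):

```python
import tensorflow as tf
from tensorflow import keras

class ResidualBlock(keras.layers.Layer):
    """Reusable internal component: subclass keras.layers.Layer."""
    def __init__(self, n_layers, n_neurons, **kwargs):
        super().__init__(**kwargs)
        self.hidden = [keras.layers.Dense(n_neurons, activation="relu")
                       for _ in range(n_layers)]

    def call(self, inputs):
        z = inputs
        for layer in self.hidden:
            z = layer(z)
        return inputs + z  # skip connection

class ResidualRegressor(keras.Model):
    """The trainable object itself: subclass keras.Model."""
    def __init__(self, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.block = ResidualBlock(2, 30)  # expects 30-dimensional inputs
        self.out = keras.layers.Dense(output_dim)

    def call(self, inputs):
        return self.out(self.block(inputs))
```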
8. What are some use cases that require writing your own custom training loop?
Writing your own custom training loop is fairly advanced, so you should only do it if you really need to. Keras provides several tools to customize training without having to write a custom training loop: callbacks, custom regularizers, custom constraints, custom losses, and so on. You should use these instead of writing a custom training loop whenever possible: writing a custom training loop is more error-prone, and it will be harder to reuse the custom code you write. However, in some cases writing a custom training loop is necessary—for example, if you want to use different optimizers for different parts of your neural network, like in the Wide & Deep paper. A custom training loop can also be useful when debugging, or when trying to understand exactly how training works.
9. Can custom Keras components contain arbitrary Python code, or must they be convertible to TF Functions?
Custom Keras components should be convertible to TF Functions, which means they should stick to TF operations as much as possible and respect all the rules listed in “TF Function Rules” on page 409. If you absolutely need to include arbitrary Python code in a custom component, you can either wrap it in a tf.py_function() operation (but this will reduce performance and limit your model’s portability) or set dynamic=True when creating the custom layer or model (or set run_eagerly=True when calling the model’s compile() method).
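A minimal sketch of the tf.py_function() option (the NumPy sorting here is just an arbitrary stand-in for Python code that a TF Function could not trace):

```python
import numpy as np
import tensorflow as tf

def plain_python_op(x):
    # Arbitrary Python/NumPy code that autograph cannot trace
    return np.sort(x.numpy())[::-1].copy()

@tf.function
def traced(x):
    # Wrap the Python code so it can live inside a TF Function,
    # at the cost of performance and portability
    y = tf.py_function(plain_python_op, inp=[x], Tout=tf.float32)
    y.set_shape(x.shape)  # py_function loses static shape information
    return y

print(traced(tf.constant([1., 3., 2.])))  # tensor [3. 2. 1.]
```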
10. What are the main rules to respect if you want a function to be convertible to a TF Function?
Please refer to “TF Function Rules” on page 409 for the list of rules to respect when creating a TF Function.
11. When would you need to create a dynamic Keras model? How do you do that? Why not make all your models dynamic?
Creating a dynamic Keras model can be useful for debugging, as it will not compile any custom component to a TF Function, and you can use any Python debugger to debug your code. It can also be useful if you want to include arbitrary Python code in your model (or in your training code), including calls to external libraries. To make a model dynamic, you must set dynamic=True when creating it. Alternatively, you can set run_eagerly=True when calling the model's compile() method. Making a model dynamic prevents Keras from using any of TensorFlow's graph features, so it will slow down training and inference, and you will not have the possibility to export the computation graph, which will limit your model's portability.
Reference: [TensorFlow framework study] An introductory understanding of TensorFlow "computation graphs".
12. Implement a custom layer that performs Layer Normalization (we will use this type of layer in Chapter 15):
a. The build() method should define two trainable weights α and β, both of shape input_shape[-1:] and data type tf.float32 . α should be initialized with 1s, and β with 0s.
b. The call() method should compute the mean μ and standard deviation σ of each instance's features. For this, you can use tf.nn.moments(inputs, axes=-1, keepdims=True), which returns the mean μ and the variance σ² of all instances (compute the square root of the variance to get the standard deviation). Then the function should compute and return α ⊗ (X − μ)/(σ + ε) + β, where ⊗ represents itemwise multiplication (*) and ε is a smoothing term (a small constant to avoid division by zero, e.g., 0.001).
c. Ensure that your custom layer produces the same (or very nearly the same) output as the keras.layers.LayerNormalization layer.
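A sketch of one possible solution, following the build()/call() specification above (the eps default of 0.001 comes from the exercise statement; the attribute names alpha and beta are illustrative):

```python
import tensorflow as tf
from tensorflow import keras

class LayerNormalization(keras.layers.Layer):
    def __init__(self, eps=0.001, **kwargs):
        super().__init__(**kwargs)
        self.eps = eps

    def build(self, input_shape):
        # alpha (scale) initialized to 1s, beta (offset) to 0s, one per feature
        self.alpha = self.add_weight(name="alpha", shape=input_shape[-1:],
                                     initializer="ones")
        self.beta = self.add_weight(name="beta", shape=input_shape[-1:],
                                    initializer="zeros")

    def call(self, inputs):
        mean, variance = tf.nn.moments(inputs, axes=-1, keepdims=True)
        # alpha * (X - mu) / (sigma + eps) + beta
        return self.alpha * (inputs - mean) / (tf.sqrt(variance) + self.eps) + self.beta
```

Note that the built-in layer computes sqrt(variance + epsilon) rather than sqrt(variance) + eps, so the outputs match only very nearly, as the exercise anticipates.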
13. Train a model using a custom training loop to tackle the Fashion MNIST dataset (see Chapter 10).
a. Display the epoch, iteration, mean training loss, and mean accuracy over each epoch (updated at each iteration), as well as the validation loss and accuracy at the end of each epoch.
b. Try using a different optimizer with a different learning rate for the upper layers and the lower layers.
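A minimal sketch of part (b), using random stand-in data shaped like Fashion MNIST rather than the real download (the model sizes, learning rates, and epoch counts are illustrative):

```python
import tensorflow as tf
from tensorflow import keras

# Random stand-in for Fashion MNIST (28x28 grayscale images, 10 classes);
# swap in keras.datasets.fashion_mnist.load_data() for the real exercise
X_train = tf.random.normal((256, 28, 28))
y_train = tf.random.uniform((256,), maxval=10, dtype=tf.int32)

lower = keras.Sequential([keras.layers.Flatten(),
                          keras.layers.Dense(100, activation="relu")])
upper = keras.Sequential([keras.layers.Dense(10, activation="softmax")])
model = keras.Sequential([lower, upper])

lower_opt = keras.optimizers.SGD(learning_rate=1e-3)  # slower for lower layers
upper_opt = keras.optimizers.SGD(learning_rate=1e-2)  # faster for upper layers
loss_fn = keras.losses.SparseCategoricalCrossentropy()
acc = keras.metrics.SparseCategoricalAccuracy()

batch_size, n_epochs = 32, 2
n_steps = X_train.shape[0] // batch_size
for epoch in range(1, n_epochs + 1):
    for step in range(n_steps):
        X_batch = X_train[step * batch_size:(step + 1) * batch_size]
        y_batch = y_train[step * batch_size:(step + 1) * batch_size]
        with tf.GradientTape() as tape:
            y_pred = model(X_batch, training=True)
            loss = loss_fn(y_batch, y_pred)
        # model.trainable_variables lists lower's variables first, then upper's,
        # so the gradient list can be split and fed to two different optimizers
        grads = tape.gradient(loss, model.trainable_variables)
        n_lower = len(lower.trainable_variables)
        lower_opt.apply_gradients(zip(grads[:n_lower], lower.trainable_variables))
        upper_opt.apply_gradients(zip(grads[n_lower:], upper.trainable_variables))
        acc.update_state(y_batch, y_pred)
    epoch_acc = float(acc.result())
    print(f"epoch {epoch}/{n_epochs}: loss={float(loss):.4f} acc={epoch_acc:.4f}")
    acc.reset_state()  # reset_states() in older Keras versions
```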