TensorFlow uses a dataflow graph to represent your computation in terms of the dependencies between individual operations. This leads to a low-level programming model in which you first define the dataflow graph, then create a TensorFlow session to run parts of the graph across a set of local and remote devices.
This guide will be most useful if you intend to use the low-level programming model directly. Higher-level APIs such as tf.estimator.Estimator and Keras hide the details of graphs and sessions from the end user.
Dataflow has several advantages that TensorFlow leverages when executing your programs:
(1) Compilation. TensorFlow's XLA compiler can use the information in your dataflow graph to generate faster code, for example, by fusing together adjacent operations.
(2) Portability. The dataflow graph is a language-independent representation of the code in your model. You can build a dataflow graph in Python, store it in a SavedModel, and restore it in a C++ program for low-latency inference.
A tf.Graph contains two relevant kinds of information:
(1) Graph structure. The nodes and edges of the graph, indicating how individual operations are composed together, but not prescribing how they should be used. The graph structure is like assembly code: inspecting it can convey some useful information, but it does not contain all of the useful context that source code conveys.
(2) Graph collections. TensorFlow provides a general mechanism for storing collections of metadata in a tf.Graph. The tf.add_to_collection function enables you to associate a list of objects with a key (where tf.GraphKeys defines some of the standard keys), and tf.get_collection enables you to look up all objects associated with a key. Many parts of the TensorFlow library use this facility: for example, when you create a tf.Variable, it is added by default to collections representing "global variables" and "trainable variables". When you later come to create a tf.train.Saver or tf.train.Optimizer, the variables in these collections are used as the default arguments.
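For example, a minimal sketch of the collection mechanism; the collection key "my_collection" used here is an arbitrary illustration:

```python
import tensorflow as tf

# Associate objects with a user-defined key, then look them up later.
tf.add_to_collection("my_collection", tf.constant(1.0))
tf.add_to_collection("my_collection", tf.constant(2.0))
print(tf.get_collection("my_collection"))  # [<tf.Tensor ...>, <tf.Tensor ...>]

# Creating a tf.Variable adds it to the standard collections automatically.
v = tf.Variable(0.0, name="v")
print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))     # contains v
print(tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES))  # contains v
```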
In the graph construction phase, you invoke TensorFlow API functions that construct new tf.Operation (node) and tf.Tensor (edge) objects and add them to a tf.Graph instance.
For example, calling tf.train.Optimizer.minimize will add operations and tensors to the default graph that calculate gradients, and return a tf.Operation that, when run, will apply those gradients to a set of variables.
Calling most functions in the TensorFlow API merely adds operations and tensors to the default graph, but does not perform the actual computation.
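A minimal sketch of this construction phase; the shapes, optimizer, and learning rate below are illustrative assumptions:

```python
import tensorflow as tf

# Each call below only adds nodes (tf.Operation) and edges (tf.Tensor) to the
# default graph; nothing is computed yet.
x = tf.constant([[1.0, 2.0], [3.0, 4.0]])   # a single "Const" operation
w = tf.Variable(tf.random_uniform([2, 2]))  # a variable with a random initializer
y = tf.matmul(x, w)                         # a "MatMul" operation consuming x and w
loss = tf.reduce_sum(y)

# minimize() adds gradient-computation ops to the default graph and returns a
# tf.Operation that applies the gradients when run.
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
```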
A tf.Graph object defines a namespace for the tf.Operation objects it contains. TensorFlow automatically chooses a unique name for each operation in your graph, but giving operations descriptive names can make your program easier to read and debug.
The tf.name_scope function makes it possible to add a name scope prefix to all operations created in a particular context. The current name scope prefix is a "/"-delimited list of the names of all active tf.name_scope context managers. If a name scope has already been used in the current context, TensorFlow appends "_1", "_2", and so on.
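For example, a sketch of how naming and "uniquification" behave (the constant values and scope names are arbitrary):

```python
c_0 = tf.constant(0, name="c")  # => operation named "c"

# Already-used names will be "uniquified".
c_1 = tf.constant(2, name="c")  # => operation named "c_1"

# Name scopes add a prefix to all operations created in the same context.
with tf.name_scope("outer"):
  c_2 = tf.constant(2, name="c")  # => operation named "outer/c"

  # Name scopes nest like paths in a hierarchical file system.
  with tf.name_scope("inner"):
    c_3 = tf.constant(3, name="c")  # => operation named "outer/inner/c"

  # Exiting a name scope context returns to the previous prefix.
  c_4 = tf.constant(4, name="c")  # => operation named "outer/c_1"

  # Already-used name scopes will be "uniquified".
  with tf.name_scope("inner"):
    c_5 = tf.constant(5, name="c")  # => operation named "outer/inner_1/c"
```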
The graph visualizer uses name scopes to group operations and reduce the visual complexity of a graph.
Note that tf.Tensor objects are implicitly named after the tf.Operation that produces the tensor as output. A tensor name has the form "<OP_NAME>:<i>" where:
(1) "<OP_NAME>" is the name of the operation that produces it
(2) "<i>" is an integer representing the index of that tensor among the operation's outputs
A device specification has the following form:
/job:<JOB_NAME>/task:<TASK_INDEX>/device:<DEVICE_TYPE>:<DEVICE_INDEX>
where:
(1) <JOB_NAME> is an alpha-numeric string that does not start with a number
(2) <DEVICE_TYPE> is a registered device type (such as GPU or CPU)
(3) <TASK_INDEX> is a non-negative integer representing the index of the task in the job named <JOB_NAME>. See tf.train.ClusterSpec for an explanation of jobs and tasks
(4) <DEVICE_INDEX> is a non-negative integer representing the index of the device, for example, to distinguish between different GPU devices used in the same process
You do not need to specify every part of a device specification. For example, if you are running in a single-machine configuration with a single GPU, you might use tf.device to pin some operations to the CPU and GPU:
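A sketch of that kind of manual placement; the tensors and shapes here are illustrative assumptions:

```python
with tf.device("/device:CPU:0"):
  # Operations created in this context will be pinned to the CPU.
  a = tf.constant([[1.0, 2.0], [3.0, 4.0]])

with tf.device("/device:GPU:0"):
  # Operations created in this context will be pinned to the GPU.
  b = tf.matmul(a, a)
```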
Operations created without an explicit device are placed on the "best possible" device: for example, if you have a GPU and a CPU available, and the operation has a GPU implementation, TensorFlow will choose the GPU.
If you are deploying TensorFlow in a typical distributed configuration, you might specify the job name and task ID to place variables on a task in the parameter server job ("/job:ps"), and the other operations on a task in the worker job ("/job:worker").
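A sketch of that distributed placement; the layer sizes and the input placeholder are illustrative assumptions:

```python
train_batch = tf.placeholder(tf.float32, shape=[None, 784])

with tf.device("/job:ps/task:0"):
  weights_1 = tf.Variable(tf.truncated_normal([784, 100]))
  biases_1 = tf.Variable(tf.zeros([100]))

with tf.device("/job:ps/task:1"):
  weights_2 = tf.Variable(tf.truncated_normal([100, 10]))
  biases_2 = tf.Variable(tf.zeros([10]))

with tf.device("/job:worker"):
  # Compute ops run on worker tasks; variables live on the parameter servers.
  layer_1 = tf.matmul(train_batch, weights_1) + biases_1
  layer_2 = tf.matmul(layer_1, weights_2) + biases_2
```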
Many TensorFlow operations take tf.Tensor objects as arguments; tf.add_n, for example, takes a list of n tf.Tensor objects. For convenience, these functions will accept a tensor-like object in place of a tf.Tensor, and implicitly convert it to a tf.Tensor using the tf.convert_to_tensor method.
By default, TensorFlow will create a new tf.Tensor each time you use the same tensor-like object. If the tensor-like object is large (e.g. a numpy.ndarray containing a set of training examples) and you use it multiple times, you may run out of memory. To avoid this, manually call tf.convert_to_tensor on the tensor-like object once and use the returned tf.Tensor instead.
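A sketch of this pattern; the array shape is an illustrative assumption:

```python
import numpy as np
import tensorflow as tf

# A large tensor-like object, e.g. a batch of training examples.
examples = np.random.rand(10000, 256).astype(np.float32)

# Convert once and reuse the resulting tf.Tensor, rather than passing the
# ndarray to several ops (which would embed a new constant each time).
examples_t = tf.convert_to_tensor(examples)
mean = tf.reduce_mean(examples_t, axis=0)
total = tf.reduce_sum(examples_t)
```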
TensorFlow uses the tf.Session class to represent a connection between the client program--typically a Python program, although a similar interface is available in other languages--and the C++ runtime. A tf.Session object provides access to devices in the local machine, and remote devices using the distributed TensorFlow runtime. It also caches information about your tf.Graph so that you can efficiently run the same computation multiple times.
Since a tf.Session owns physical resources (such as GPUs and network connections), it is typically used as a context manager (in a with block) that automatically closes the session when you exit the block. It is also possible to create a session without using a with block, but you should explicitly call tf.Session.close when you are finished with it to free the resources.
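A sketch of both styles, using an arbitrary constant as the computation:

```python
x = tf.constant(42.0)

# Preferred: use the session as a context manager so it is closed
# automatically when the block exits.
with tf.Session() as sess:
  print(sess.run(x))

# Alternative: create the session directly, but close it explicitly when done
# to release its resources.
sess = tf.Session()
try:
  print(sess.run(x))
finally:
  sess.close()
```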
tf.Session.__init__ accepts three optional arguments:
(1) target. If this argument is left empty (the default), the session will only use devices in the local machine. However, you may also specify a grpc:// URL to specify the address of a TensorFlow server, which gives the session access to all devices on machines that this server controls. See tf.train.Server for details of how to create a TensorFlow server.
(2) graph. By default, a new tf.Session will be bound to--and only able to run operations in--the current default graph. If you are using multiple graphs in your program (see Programming with multiple graphs for more details), you can specify an explicit tf.Graph when you construct the session.
(3) config. This argument allows you to specify a tf.ConfigProto that controls the behavior of the session (a short configuration sketch follows this list). For example, some of the configuration options include:
(1) allow_soft_placement. Set this to True to enable a "soft" device placement algorithm, which ignores tf.device annotations that attempt to place CPU-only operations on a GPU device, and places them on the CPU instead.
(2) cluster_def. When using distributed TensorFlow, this option allows you to specify what machines to use in the computation, and provide a mapping between job names, task indices, and network addresses. See tf.train.ClusterSpec.as_cluster_def for details.
(3) graph_options.optimizer_options. Provides control over the optimizations that TensorFlow performs on your graph before executing it.
(4) gpu_options.allow_growth. Set this to True to change the GPU memory allocator so that it gradually increases the amount of memory allocated, rather than allocating most of the memory at startup.
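A sketch combining two of the options above; the particular values are illustrative and should be adjusted to your deployment:

```python
config = tf.ConfigProto()
config.allow_soft_placement = True       # fall back to CPU for CPU-only ops
config.gpu_options.allow_growth = True   # grow GPU memory use incrementally

with tf.Session(config=config) as sess:
  print(sess.run(tf.constant(1.0)))
```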
tf.Session.run requires you to specify a list of fetches, which determine the return values, and may be a tf.Operation, a tf.Tensor, or a tensor-like type such as tf.Variable. These fetches determine what subgraph of the overall tf.Graph must be executed to produce the result: this is the subgraph that contains all operations named in the fetch list, plus all operations whose outputs are used to compute the value of the fetches.
tf.Session.run also accepts an optional options argument that enables you to specify options about the call, and an optional run_metadata argument that enables you to collect metadata about the execution.
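A sketch of fetches and execution metadata; the small matrices here are illustrative:

```python
x = tf.constant([[37.0, -23.0], [1.0, 4.0]])
w = tf.Variable(tf.random_uniform([2, 2]))
y = tf.matmul(x, w)
output = tf.nn.softmax(y)

with tf.Session() as sess:
  # A tf.Operation fetch: run the variable's initializer.
  sess.run(w.initializer)

  # A single tf.Tensor fetch returns a NumPy array.
  print(sess.run(output))

  # Multiple fetches execute the shared subgraph only once.
  y_val, output_val = sess.run([y, output])

  # Collect metadata about the call via `options` and `run_metadata`.
  options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
  metadata = tf.RunMetadata()
  sess.run(output, options=options, run_metadata=metadata)
  print(metadata.step_stats)  # per-operation timing information
```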
The easiest way to create a visualization is to pass a tf.Graph when creating the tf.summary.FileWriter:
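For example, a sketch that writes the session's graph for inspection in TensorBoard; the log directory path is an illustrative assumption:

```python
with tf.Session() as sess:
  # Passing `sess.graph` records the graph structure in the event files.
  writer = tf.summary.FileWriter("/tmp/log/example", sess.graph)
  # ... run the graph here, then close the writer to flush the event files.
  writer.close()
```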
By default, utilities like tf.train.Saver use the names of tf.Variable objects (which have names based on an underlying tf.Operation) to identify each variable in a saved checkpoint. When programming this way, you can either use completely separate Python processes to build and execute the graphs, or you can use multiple graphs in the same process.
The default graph stores information about every tf.Operation and tf.Tensor that was ever added to it. If your program creates a large number of unconnected subgraphs, it may be more efficient to use a different tf.Graph to build each subgraph, so that unrelated state can be garbage collected.
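A sketch of building and running two independent graphs in the same process:

```python
g_1 = tf.Graph()
with g_1.as_default():
  # Operations created in this scope will be added to `g_1`.
  c = tf.constant("Node in g_1")

  # Sessions created in this scope will run operations from `g_1`.
  sess_1 = tf.Session()

g_2 = tf.Graph()
with g_2.as_default():
  # Operations created in this scope will be added to `g_2`.
  d = tf.constant("Node in g_2")

# Alternatively, you can pass a graph when constructing a tf.Session:
# `sess_2` will run operations from `g_2`.
sess_2 = tf.Session(graph=g_2)

assert c.graph is g_1
assert sess_1.graph is g_1
assert d.graph is g_2
assert sess_2.graph is g_2
```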