Federated Learning: TFF Study Notes

Federated Learning

federated data

  1. We use the term federated data in this tutorial to refer to a collection of data items hosted across a group of devices in a distributed system.

  2. In programming language design, a first-class citizen (also type, object, entity, or value) in a given programming language is an entity which supports all the operations generally available to other entities. These operations typically include being passed as an argument, returned from a function, modified, and assigned to a variable.

  3. The important point to understand is that we are modeling the entire collection of data items across all devices (e.g., the entire collection of temperature readings from all sensors in a distributed array) as a single federated value.

  4. A collection of temperature readings that materialize across an array of distributed sensors could be modeled as a value of this federated type.

    federated_float_on_clients = tff.FederatedType(tf.float32, tff.CLIENTS)
    
  5. One example of a federated value of such type that might arise in practical scenarios is a hyperparameter (such as a learning rate, a clipping norm, etc.) that has been broadcasted by a server to a group of devices that participate in federated training.

    str(tff.FederatedType(tf.float32, tff.CLIENTS, all_equal=True))
    
  6. More generally, we’ll often use the term federated XYZ to refer to a federated value in which member constituents are XYZ-like. Thus, we will talk about things like federated tuples, federated sequences, federated models, and so on.
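
Although `tff.FederatedType` is TFF-specific, the idea of a federated value can be sketched in plain Python: a hypothetical mapping from client ids to member constituents, where `all_equal` corresponds to every client holding the same value. This is only a conceptual illustration, not how TFF's runtime actually represents federated values.

```python
# Hypothetical, simplified model of a federated value: a dict mapping
# client ids to that client's member constituent. (TFF's real runtime
# does not expose values this way; this is a conceptual sketch only.)
federated_float_on_clients = {'client_0': 68.5, 'client_1': 70.3, 'client_2': 69.8}

# With all_equal=True, every member constituent is identical, so the whole
# federated value can be represented by a single Python float, e.g. a
# learning rate broadcast from the server to all participating clients.
broadcast_learning_rate = 0.01

def members(federated_value):
    """Return the collection of member constituents across clients."""
    return list(federated_value.values())

print(members(federated_float_on_clients))  # [68.5, 70.3, 69.8]
```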

Placements

  1. This is especially important when dealing with, e.g., application data on mobile devices. Since the data is private and can be sensitive, we need the ability to statically verify that this data will never leave the device (and prove facts about how the data is being processed). The placement specifications are one of the mechanisms designed to support this.
  2. TFF focuses on data, where that data materializes, and how it’s being transformed.
  3. There’s no concept of a device or client identity anywhere in the Federated Core API, the underlying set of architectural abstractions, or the core runtime infrastructure we provide to support simulations. All the computation logic you write will be expressed as operations on the entire client group.
  4. Placements are designed to be a first-class citizen in TFF as well, and values of a placement type (to be represented by tff.PlacementType in the API) can appear as parameters and results of computations.
  5. In the future, we plan to provide a variety of operators to transform or combine placements, but this is outside the scope of this tutorial. For now, it suffices to think of placement as an opaque primitive built-in type in TFF, similar to how int and bool are opaque built-in types in Python, with tff.CLIENTS being a constant literal of this type, not unlike 1 being a constant literal of type int.
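
Continuing the analogy in the last point, a placement can be thought of as an opaque built-in type with named constant literals. A minimal Python sketch of that idea (purely illustrative; this is not TFF's implementation of `tff.CLIENTS` and `tff.SERVER`):

```python
import enum

class Placement(enum.Enum):
    # Opaque placement literals, analogous to tff.CLIENTS and tff.SERVER:
    # each is a constant of the placement type, the way 1 is a literal of int.
    CLIENTS = 'clients'
    SERVER = 'server'

# A placement is an opaque primitive: code compares placements by identity
# rather than inspecting any internal structure.
print(Placement.CLIENTS is Placement.CLIENTS)  # True
print(Placement.CLIENTS is Placement.SERVER)   # False
```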

Federated Computations

  1. The basic unit of composition in TFF is a federated computation - a section of logic that may accept federated values as input and return federated values as output.

    # The computation accepts a collection of sensor readings on client
    # devices, and returns a single average on the server.
    @tff.federated_computation(tff.FederatedType(tf.float32, tff.CLIENTS))
    def get_average_temperature(sensor_readings):
      return tff.federated_mean(sensor_readings)

    str(get_average_temperature.type_signature)
    
  2. Where the computation expects a value of a federated type with the all_equal bit set to False, you can feed it as a plain list in Python, and for federated types with the all_equal bit set to True, you can just directly feed the (single) member constituent. This is also how the results are reported back to you.

    get_average_temperature([68.5, 70.3, 69.8])
    
  3. An important restriction to be aware of is that bodies of Python functions decorated with tff.federated_computation must consist only of federated operators, i.e., they cannot directly contain TensorFlow operations. For example, you cannot directly use tf.nest interfaces to add a pair of federated values. TensorFlow code must be confined to blocks of code decorated with a tff.tf_computation discussed in the following section. Only when wrapped in this manner can the wrapped TensorFlow code be invoked in the body of a tff.federated_computation.
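
The calling convention in point 2 can be illustrated with a plain-Python stand-in for get_average_temperature: the non-all_equal federated argument ({float32}@CLIENTS) is fed as an ordinary list, and the server-placed result comes back as a single value. The real computation would of course be built from federated operators such as tff.federated_mean; this sketch only mimics the input/output shape.

```python
def get_average_temperature_local(sensor_readings):
    # Plain-Python stand-in: the {float32}@CLIENTS argument is supplied as a
    # list of member constituents, and the float32@SERVER result is one float.
    return sum(sensor_readings) / len(sensor_readings)

print(get_average_temperature_local([68.5, 70.3, 69.8]))
```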

federated logic

  1. TFF is designed for use with TensorFlow. As such, the bulk of the code you will write in TFF is likely to be ordinary (i.e., locally-executing) TensorFlow code. In order to use such code with TFF, as noted above, it just needs to be decorated with tff.tf_computation.

    import tensorflow as tf
    import tensorflow_federated as tff

    @tff.tf_computation(tf.float32)
    def add_half(x):
      return tf.add(x, 0.5)

    @tff.federated_computation(tff.FederatedType(tf.float32, tff.CLIENTS))
    def add_half_on_clients(x):
      return tff.federated_map(add_half, x)

    add_half_on_clients([1.0, 3.0, 2.0])
    
  2. The only difference between Python methods decorated with tff.federated_computation and those decorated with tff.tf_computation is that the latter are serialized as TensorFlow graphs (whereas the former are not allowed to contain TensorFlow code directly embedded in them).

  3. Eager mode cannot be used (the body of a tff.tf_computation is serialized as a TensorFlow graph).

    def get_constant_10():
      return tf.constant(10.)
    
    @tff.tf_computation(tf.float32)
    def add_ten(x):
      return x + get_constant_10()
    
    add_ten(5.0)
    
  4. the type specification tff.SequenceType(tf.float32) defines an abstract sequence of float elements in TFF. Sequences can contain either tensors, or complex nested structures (we’ll see examples of those later). The concise representation of a sequence of T-typed items is T*.

    float32_sequence = tff.SequenceType(tf.float32)
    
    str(float32_sequence)
    
    @tff.tf_computation(tff.SequenceType(tf.float32))
    def get_local_temperature_average(local_temperatures):
      sum_and_count = (
          local_temperatures.reduce((0.0, 0), lambda x, y: (x[0] + y, x[1] + 1)))
      return sum_and_count[0] / tf.cast(sum_and_count[1], tf.float32)
    
    import numpy as np

    @tff.tf_computation(tff.SequenceType(tf.int32))
    def foo(x):
      return x.reduce(np.int32(0), lambda x, y: x + y)

    foo([1, 2, 3])
    
    import collections

    @tff.tf_computation(
        tff.SequenceType(collections.OrderedDict([('A', tf.int32), ('B', tf.int32)])))
    def foo(ds):
      print('output_types = {}, shapes = {}'.format(
          tf.compat.v1.data.get_output_types(ds),
          tf.compat.v1.data.get_output_shapes(ds)))
      return ds.reduce(np.int32(0), lambda total, x: total + x['A'] * x['B'])

    str(foo.type_signature)
    foo([{'A': 2, 'B': 3}, {'A': 4, 'B': 5}])
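
The sum-and-count accumulator used in get_local_temperature_average can be mirrored with functools.reduce over an ordinary Python list, which makes the reduction logic easy to check outside of TFF and tf.data:

```python
import functools

def local_temperature_average(local_temperatures):
    # Accumulator is a (sum, count) pair, just like the tf.data reduce above.
    total, count = functools.reduce(
        lambda acc, t: (acc[0] + t, acc[1] + 1),
        local_temperatures,
        (0.0, 0))
    return total / count

print(local_temperature_average([68.0, 70.0, 72.0]))  # 70.0
```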
    

Putting It All Together

  1. Now, let’s try again to use our TensorFlow computation in a federated setting. Suppose we have a group of sensors that each have a local sequence of temperature readings. We can compute the global temperature average by averaging the sensors’ local averages as follows.
    @tff.tf_computation(tff.SequenceType(tf.float32))
    def get_local_temperature_average(local_temperatures):
      sum_and_count = (
          local_temperatures.reduce((0.0, 0), lambda x, y: (x[0] + y, x[1] + 1)))
      return sum_and_count[0] / tf.cast(sum_and_count[1], tf.float32)

    # This computes only an unweighted mean of the local averages; in
    # practice, each client's amount of data should be taken into account.
    @tff.federated_computation(
        tff.FederatedType(tff.SequenceType(tf.float32), tff.CLIENTS))
    def get_global_temperature_average(sensor_readings):
      return tff.federated_mean(
          tff.federated_map(get_local_temperature_average, sensor_readings))

    str(get_global_temperature_average.type_signature)

    get_global_temperature_average([[68.0, 70.0], [71.0], [68.0, 72.0, 70.0]])
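
The comment above notes that this is an unweighted mean of the per-client averages. A quick plain-Python check shows how that differs from weighting each client by its number of readings:

```python
def local_avg(xs):
    return sum(xs) / len(xs)

readings = [[68.0, 70.0], [71.0], [68.0, 72.0, 70.0]]

# Unweighted mean of local averages (what the computation above produces):
# every client counts equally, regardless of how many readings it holds.
unweighted = sum(local_avg(r) for r in readings) / len(readings)

# Mean weighted by each client's number of readings: the true global average
# over all individual readings.
weighted = sum(sum(r) for r in readings) / sum(len(r) for r in readings)

print(unweighted, weighted)  # 70.0 vs roughly 69.83
```

tff.federated_mean also accepts an optional weight argument, which is how the per-client data sizes would be taken into account in real TFF code.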
