Py tensorflow-federated: A Detailed Guide to the Introduction, Installation, and Usage of tensorflow-federated
Contents
Introduction to tensorflow-federated
Installing tensorflow-federated
Using tensorflow-federated
1. Basic example
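Introduction to tensorflow-federated
TensorFlow Federated (TFF) is an open-source framework for machine learning and other computations on decentralized data. TFF was developed to facilitate open research and experimentation with federated learning (FL), an approach to machine learning in which a shared global model is trained across many participating clients that keep their training data local. For example, FL has been used to train prediction models for mobile keyboards without uploading sensitive typing data to servers.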
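To make the idea of "training a shared model while the data stays on the clients" concrete, here is a minimal sketch of federated averaging in plain Python. It does not use TFF at all; the function names `local_update` and `federated_averaging_round`, the toy one-parameter model y ≈ w·x, and the sample data are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Example = Tuple[float, float]  # (x, y) pair held privately by one client


def local_update(w: float, data: List[Example], lr: float = 0.1,
                 steps: int = 5) -> float:
    """Runs a few gradient steps of 1-D least squares on one client's data."""
    for _ in range(steps):
        # Gradient of the mean squared error (w*x - y)^2 with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_averaging_round(global_w: float,
                              client_datasets: List[List[Example]]) -> float:
    """One round: each client trains locally, the server averages the results."""
    # Only the updated weight and the example count leave each client.
    updates = [(local_update(global_w, data), len(data))
               for data in client_datasets]
    total_examples = sum(n for _, n in updates)
    # Weight each client's model by how many examples it trained on.
    return sum(w * n for w, n in updates) / total_examples


# Hypothetical local datasets; every point satisfies y = 2 * x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]

w = 0.0
for _ in range(20):
    w = federated_averaging_round(w, clients)
print(w)
```

The raw (x, y) pairs are never passed to the server-side function; only per-client weights and example counts are, which is the property the paragraph above describes.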
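TFF lets developers use the included federated learning algorithms with their own models and data, and experiment with new algorithms. The building blocks that TFF provides can also be used to implement non-learning computations, such as aggregated analytics over decentralized data.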
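As an illustration of a non-learning computation, the following plain-Python sketch computes a global mean from per-client summaries. It is not TFF code; `client_summary`, `server_aggregate`, and the sensor readings are hypothetical names and data used only to show the shape of an aggregated analytics query.

```python
from typing import List, Tuple


def client_summary(values: List[float]) -> Tuple[float, int]:
    """Computed on the client: returns only a (sum, count) summary."""
    return sum(values), len(values)


def server_aggregate(summaries: List[Tuple[float, int]]) -> float:
    """Computed on the server: combines summaries into a global mean."""
    total = sum(s for s, _ in summaries)
    count = sum(n for _, n in summaries)
    return total / count


# Hypothetical temperature readings, one list per client device.
client_readings = [
    [68.5, 70.3, 69.8],
    [71.0],
    [66.0, 67.5],
]

summaries = [client_summary(readings) for readings in client_readings]
global_mean = server_aggregate(summaries)
print(global_mean)
```

The server sees only one sum and one count per client, never the individual readings.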
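TFF's interfaces are organized in two layers: the Federated Learning (FL) API, a set of high-level interfaces for applying the included federated training and evaluation implementations to existing TensorFlow models, and the Federated Core (FC) API, a set of lower-level interfaces for expressing new federated algorithms.
TFF lets developers express federated computations declaratively, so that they can be deployed to different runtime environments. TFF includes a single-machine simulation runtime for experiments.
Official website: http://tensorflow.org/federated
GitHub repository: tensorflow/federated: A framework for implementing federated learning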
Installing tensorflow-federated
pip install tensorflow-federated
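After installing, it can be useful to confirm which version (if any) is present in the current environment. The sketch below uses only the standard library's `importlib.metadata`, so it runs even when the package is missing; the helper name `installed_version` is an assumption introduced for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Returns the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("tensorflow-federated")
if version is None:
    print("tensorflow-federated is not installed in this environment")
else:
    print("tensorflow-federated", version)
```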
Using tensorflow-federated
1. Basic example
Reference: federated/program.py at main · tensorflow/federated · GitHub
import asyncio
import os.path
from typing import Sequence, Tuple, Union
from absl import app
from absl import flags
import tensorflow as tf
import tensorflow_federated as tff
from tensorflow_federated.examples.program import computations
from tensorflow_federated.examples.program import program_logic

_OUTPUT_DIR = flags.DEFINE_string('output_dir', None, 'The output path.')


def _filter_metrics(path: Tuple[Union[str, int], ...]) -> bool:
  # Release only the metric stored at the `METRICS_TOTAL_SUM` path.
  return path == (computations.METRICS_TOTAL_SUM,)


def main(argv: Sequence[str]) -> None:
  if len(argv) > 1:
    raise app.UsageError('Too many command-line arguments.')

  total_rounds = 10
  number_of_clients = 3

  # Configure the platform-specific components; in this example, the TFF native
  # platform is used, but this example could use any platform that conforms to
  # the appropriate abstract interfaces.

  # Create a context in which to execute the program logic.
  # NOTE: the `tff.google` namespace is not exposed by the public pip package.
  # Open-source releases provide an equivalent factory under
  # `tff.backends.native`; its exact name depends on the TFF version.
  context = tff.google.backends.native.create_local_async_cpp_execution_context(
  )
  context = tff.program.NativeFederatedContext(context)
  tff.framework.set_default_context(context)

  # Create data sources that are compatible with the context and computations.
  to_int32 = lambda x: tf.cast(x, tf.int32)
  datasets = [tf.data.Dataset.range(10).map(to_int32)] * 3
  train_data_source = tff.program.DatasetDataSource(datasets)
  evaluation_data_source = tff.program.DatasetDataSource(datasets)

  # Create computations that are compatible with the context and data sources.
  initialize = computations.initialize
  train = computations.train
  evaluation = computations.evaluation

  # Configure the platform-agnostic components.

  # Create release managers with access to customer storage.
  train_metrics_managers = [tff.program.LoggingReleaseManager()]
  evaluation_metrics_managers = [tff.program.LoggingReleaseManager()]
  model_output_manager = tff.program.LoggingReleaseManager()

  if _OUTPUT_DIR.value is not None:
    summary_dir = os.path.join(_OUTPUT_DIR.value, 'summary')
    tensorboard_manager = tff.program.TensorBoardReleaseManager(summary_dir)
    train_metrics_managers.append(tensorboard_manager)

    csv_path = os.path.join(_OUTPUT_DIR.value, 'evaluation_metrics.csv')
    csv_manager = tff.program.CSVFileReleaseManager(csv_path)
    evaluation_metrics_managers.append(csv_manager)

  # Group the metrics release managers; program logic may accept a single
  # release manager to make the implementation of the program logic simpler and
  # easier to maintain, so the program can use a
  # `tff.program.GroupingReleaseManager` to release values to multiple
  # destinations.
  #
  # Filter the metrics before they are released; the program can use a
  # `tff.program.FilteringReleaseManager` to limit the values that are
  # released by the program logic. If a formal privacy guarantee is not
  # required, it may be ok to release all the metrics.
  train_metrics_manager = tff.program.FilteringReleaseManager(
      tff.program.GroupingReleaseManager(train_metrics_managers),
      _filter_metrics)
  evaluation_metrics_manager = tff.program.FilteringReleaseManager(
      tff.program.GroupingReleaseManager(evaluation_metrics_managers),
      _filter_metrics)

  # Create a program state manager with access to platform storage.
  program_state_manager = None

  if _OUTPUT_DIR.value is not None:
    program_state_dir = os.path.join(_OUTPUT_DIR.value, 'program_state')
    program_state_manager = tff.program.FileProgramStateManager(
        program_state_dir)

  # Execute the program logic; the program logic is abstracted into a separate
  # function to illustrate the boundary between the program and the program
  # logic. This program logic is declared as an async def and needs to be
  # executed in an asyncio event loop.
  asyncio.run(
      program_logic.train_federated_model(
          initialize=initialize,
          train=train,
          train_data_source=train_data_source,
          evaluation=evaluation,
          evaluation_data_source=evaluation_data_source,
          total_rounds=total_rounds,
          number_of_clients=number_of_clients,
          train_metrics_manager=train_metrics_manager,
          evaluation_metrics_manager=evaluation_metrics_manager,
          model_output_manager=model_output_manager,
          program_state_manager=program_state_manager))


if __name__ == '__main__':
  app.run(main)
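The program above wraps its release managers in two layers: a grouping manager that fans values out to several destinations, and a filtering manager that drops every metric whose path is rejected by `_filter_metrics`. The sketch below imitates that pattern in plain Python so it can be run without TFF. The classes `ListReleaseManager`, `GroupingManager`, and `FilteringManager`, the function `toy_program_logic`, and the metric names are hypothetical stand-ins, not the `tff.program` implementations, and they handle only flat dictionaries rather than arbitrarily nested structures.

```python
import asyncio
from typing import Any, Callable, Dict, List, Tuple

Path = Tuple[str, ...]


class ListReleaseManager:
  """Hypothetical destination that records released values in memory."""

  def __init__(self) -> None:
    self.released: List[Tuple[int, Dict[str, Any]]] = []

  async def release(self, value: Dict[str, Any], key: int) -> None:
    self.released.append((key, value))


class GroupingManager:
  """Forwards each released value to every wrapped manager."""

  def __init__(self, managers: List[Any]) -> None:
    self._managers = list(managers)

  async def release(self, value: Dict[str, Any], key: int) -> None:
    await asyncio.gather(*(m.release(value, key) for m in self._managers))


class FilteringManager:
  """Keeps only the entries whose path is accepted by `keep`."""

  def __init__(self, manager: Any, keep: Callable[[Path], bool]) -> None:
    self._manager = manager
    self._keep = keep

  async def release(self, value: Dict[str, Any], key: int) -> None:
    # Each top-level name is treated as a one-element path, e.g. ('total_sum',).
    filtered = {name: v for name, v in value.items() if self._keep((name,))}
    await self._manager.release(filtered, key)


def keep_total_sum(path: Path) -> bool:
  return path == ('total_sum',)


async def toy_program_logic(manager: Any, total_rounds: int) -> None:
  """Stand-in for the training loop: releases made-up metrics every round."""
  for round_num in range(1, total_rounds + 1):
    metrics = {'total_sum': round_num * 10, 'debug_value': -1}
    await manager.release(metrics, key=round_num)


log_a = ListReleaseManager()
log_b = ListReleaseManager()
manager = FilteringManager(GroupingManager([log_a, log_b]), keep_total_sum)

# As in the program above, the async logic is driven by an asyncio event loop.
asyncio.run(toy_program_logic(manager, total_rounds=3))
print(log_a.released)
```

Because the loop only ever sees one `manager` object, destinations can be added or the filter tightened without touching the program logic, which is the separation the comments in the original program describe.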