FATE 2.2: Hetero-NN Quick Start, a Binary Classification Task

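Introduction

In this tutorial you will learn how to use Hetero NN. Note that Hetero NN has been upgraded to work in a way similar to Homo NN, so models and datasets can be heavily customized with the PyTorch backend. Customization for Hetero NN is covered in later chapters.

Hetero NN has also improved some of its interfaces, such as the interactive layer interface, which makes its usage logic clearer.

This chapter gives a basic binary classification example using Hetero-NN. The workflow is the same as for other FATE algorithms: you use the reader and transformer interfaces provided by FATE (the Reader and DataTransform components used below) to load table data, then feed the data into the algorithm component. The component then trains with the top/bottom models, optimizer, and loss function that you define. Usage in this version is largely the same as in older versions of FATE.

If you want to understand how the Hetero-NN algorithm works, refer to the Hetero NN (heterogeneous neural network) documentation.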

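Uploading the Tabular Data

First, we upload the data to FATE. We can upload it directly with Pipeline. Here we upload two files: breast_hetero_guest.csv for the guest and breast_hetero_host.csv for the host. Note that this tutorial uses the standalone version; if you are using the cluster version, you need to upload the corresponding data on each machine.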

from pipeline.backend.pipeline import PipeLine  # pipeline class

# There are two parties: the guest, whose data includes the labels,
# and the host, whose data has no labels.
# The dataset is vertically split between them.

dense_data_guest = {"name": "breast_hetero_guest", "namespace": "experiment"}
dense_data_host = {"name": "breast_hetero_host", "namespace": "experiment"}

guest = 9999
host = 10000

pipeline_upload = PipeLine().set_initiator(role='guest', party_id=guest).set_roles(guest=guest, host=host)

partition = 4

# upload the data
pipeline_upload.add_upload_data(file="./examples/data/breast_hetero_guest.csv",
                                table_name=dense_data_guest["name"],             # table name
                                namespace=dense_data_guest["namespace"],         # namespace
                                head=1, partition=partition)               # data info

pipeline_upload.add_upload_data(file="./examples/data/breast_hetero_host.csv",
                                table_name=dense_data_host["name"],
                                namespace=dense_data_host["namespace"],
                                head=1, partition=partition)      # data info

pipeline_upload.upload(drop=1)

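The breast dataset is a binary classification dataset with 30 features. It is vertically split: the guest holds 10 features and the labels, while the host holds the other 20 features.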
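To make the vertical split concrete, here is a minimal sketch on made-up data. It assumes both parties share an `id` column and that the label column is called `y`. The feature names `x0` to `x29` and all values are invented for illustration and are not taken from the real files.

```python
# Toy illustration of a vertical (feature-wise) split; all values are made up.
import random

random.seed(0)

n_samples, n_features = 5, 30
feature_names = [f"x{i}" for i in range(n_features)]  # hypothetical names

# One full record per sample: an id, a binary label y, and 30 features.
full_table = [
    {"id": i, "y": random.randint(0, 1),
     **{name: round(random.random(), 3) for name in feature_names}}
    for i in range(n_samples)
]

guest_cols = ["id", "y"] + feature_names[:10]  # guest: labels + 10 features
host_cols = ["id"] + feature_names[10:]        # host: 20 features, no labels

guest_table = [{c: row[c] for c in guest_cols} for row in full_table]
host_table = [{c: row[c] for c in host_cols} for row in full_table]

# Both tables describe the same samples but hold different columns.
print(len(guest_table[0]), len(host_table[0]))  # prints: 12 21
```

Each party sees every sample but only its own columns, and only the guest ever holds `y`.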

Inspecting the breast_hetero_guest data

import pandas as pd
df = pd.read_csv('../examples/data/breast_hetero_guest.csv')  # adjust the path to match your environment
df

Inspecting the breast_hetero_host data

import pandas as pd
df = pd.read_csv('../examples/data/breast_hetero_host.csv')
df

Writing and Running the Pipeline Script

Once the upload is complete, we can start writing the Pipeline script that submits the FATE job.

import torch as t
from torch import nn
from pipeline.backend.pipeline import PipeLine  # pipeline Class
from pipeline import fate_torch_hook
from pipeline.component import HeteroNN, Reader, DataTransform, Intersection  # Hetero NN Component, Data IO component, PSI component
from pipeline.interface import Data, Model # data, model for defining the work flow

fate_torch_hook

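Be sure to call fate_torch_hook, as in the script below. It modifies certain torch classes so that the torch layers, Sequential containers, optimizers, and loss functions you define in the script can be parsed and submitted by Pipeline.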
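One way to picture what "parsed and submitted" means: a definition only has to record its constructor arguments so that it can be serialized into a job configuration. The sketch below illustrates that idea in plain Python. It is not FATE's implementation, and the classes `RecordedLayer` and `RecordedAdam` are hypothetical names invented for this example.

```python
import json


class RecordedLayer:
    """Hypothetical stand-in for a hooked layer: it stores its constructor
    arguments instead of allocating any weights."""

    def __init__(self, layer_type, **kwargs):
        self.layer_type = layer_type
        self.kwargs = kwargs

    def to_config(self):
        return {"layer": self.layer_type, **self.kwargs}


class RecordedAdam:
    """Hypothetical stand-in for a hooked optimizer. It takes no model
    parameters because nothing is optimized at definition time."""

    def __init__(self, lr=0.001):
        self.lr = lr

    def to_config(self):
        return {"optimizer": "Adam", "lr": self.lr}


model = [
    RecordedLayer("Linear", in_features=10, out_features=2),
    RecordedLayer("ReLU"),
]
optimizer = RecordedAdam(lr=0.01)

# The definitions become plain data that could be shipped with a job.
job_config = json.dumps({
    "model": [layer.to_config() for layer in model],
    "optimizer": optimizer.to_config(),
})
print(job_config)
```

Seen this way, it is less surprising that the optimizer in the script below is created without model parameters: at definition time only its settings are needed.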

from pipeline import fate_torch_hook
t = fate_torch_hook(t)
guest = 9999
host = 10000
pipeline = PipeLine().set_initiator(role='guest', party_id=guest).set_roles(guest=guest, host=host)

guest_train_data = {"name": "breast_hetero_guest", "namespace": "experiment"}
host_train_data = {"name": "breast_hetero_host", "namespace": "experiment"}


# read uploaded dataset
reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role='guest', party_id=guest).component_param(table=guest_train_data)
reader_0.get_party_instance(role='host', party_id=host).component_param(table=host_train_data)
# The transform component converts the uploaded data to the FATE standard format
data_transform_0 = DataTransform(name="data_transform_0")
data_transform_0.get_party_instance(role='guest', party_id=guest).component_param(with_label=True)
data_transform_0.get_party_instance(role='host', party_id=host).component_param(with_label=False)
# intersection
intersection_0 = Intersection(name="intersection_0")
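The Intersection component aligns the two parties' samples so that training only uses ids present on both sides. FATE does this with a private set intersection (PSI) protocol. The sketch below ignores privacy entirely and only shows the alignment result, using made-up ids and values.

```python
# Plain (non-private) illustration of sample alignment; ids and values are made up.
guest_rows = {"a1": [0.1, 0.2], "a2": [0.3, 0.4], "a4": [0.5, 0.6]}
host_rows = {"a2": [1.0], "a3": [2.0], "a4": [3.0]}

# Keep only the ids that both parties hold, in a fixed order.
common_ids = sorted(guest_rows.keys() & host_rows.keys())

# After alignment, row i on the guest and row i on the host describe the same sample.
guest_aligned = [guest_rows[i] for i in common_ids]
host_aligned = [host_rows[i] for i in common_ids]

print(common_ids)  # prints: ['a2', 'a4']
```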

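The Hetero-NN Component

Here we initialize the Hetero-NN component. We use get_party_instance to obtain the guest component and the host component separately. Because the two parties have different model architectures, we must set the model parameters for each party through its own component instance.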

hetero_nn_0 = HeteroNN(name="hetero_nn_0", epochs=2,
                       interactive_layer_lr=0.01, batch_size=-1, validation_freqs=1, task_type='classification', seed=114514)
guest_nn_0 = hetero_nn_0.get_party_instance(role='guest', party_id=guest)
host_nn_0 = hetero_nn_0.get_party_instance(role='host', party_id=host)

Defining the Guest and Host Models

# Guest Bottom, Top Model
guest_bottom = t.nn.Sequential(
    nn.Linear(10, 2),
    nn.ReLU()
)
guest_top = t.nn.Sequential(
    nn.Linear(2, 1),
    nn.Sigmoid()
)

# Host Bottom Model
host_bottom = t.nn.Sequential(
    nn.Linear(20, 2),
    nn.ReLU()
)

# After fate_torch_hook is applied, t.nn provides InteractiveLayer; print it to view its structure
interactive_layer = t.nn.InteractiveLayer(out_dim=2, guest_dim=2, host_dim=2, host_num=1)
print(interactive_layer)

guest_nn_0.add_top_model(guest_top)
guest_nn_0.add_bottom_model(guest_bottom)
host_nn_0.add_bottom_model(host_bottom)

optimizer = t.optim.Adam(lr=0.01)  # Note: after fate_torch_hook, the optimizer can be created without model parameters
loss = t.nn.BCELoss()

hetero_nn_0.set_interactive_layer(interactive_layer)
hetero_nn_0.compile(optimizer=optimizer, loss=loss)

InteractiveLayer(
  (activation): ReLU()
  (guest_model): Linear(in_features=2, out_features=2, bias=True)
  (host_model): ModuleList(
    (0): Linear(in_features=2, out_features=2, bias=True)
  )
  (act_seq): Sequential(
    (0): ReLU()
  )
)
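To see how these pieces fit together dimensionally, here is a minimal single-process sketch in plain PyTorch, with no federation or encryption and without `fate_torch_hook`. It assumes the interactive layer sums the guest and host projections before applying ReLU. That combination rule is an assumption made for illustration, and `inter_guest` and `inter_host` are stand-in names for the `guest_model` and `host_model` entries in the printout.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Same shapes as the models defined in the tutorial.
guest_bottom = nn.Sequential(nn.Linear(10, 2), nn.ReLU())
host_bottom = nn.Sequential(nn.Linear(20, 2), nn.ReLU())
guest_top = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())

# Stand-ins for the interactive layer's guest_model and host_model.
inter_guest = nn.Linear(2, 2)
inter_host = nn.Linear(2, 2)

batch = 4
x_guest = torch.randn(batch, 10)             # guest: 10 features
x_host = torch.randn(batch, 20)              # host: 20 features
y = torch.randint(0, 2, (batch, 1)).float()  # labels, held by the guest only

# Assumption: the two projections are summed before the activation.
fused = torch.relu(inter_guest(guest_bottom(x_guest)) + inter_host(host_bottom(x_host)))
pred = guest_top(fused)       # probabilities in [0, 1], shape (batch, 1)
loss = nn.BCELoss()(pred, y)  # same loss as in the tutorial

print(pred.shape, loss.item())
```

In the real component, each bottom model runs at its own party, and only the interactive layer exchanges intermediate results between them.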
