A graph sequence is simply a sequence of graphs, where "graph" means the graph structure from data structures. "Sequence" means: first randomly generate the first K graphs and represent them with a suitable data structure, then repeatedly predict the next graph from the most recent K graphs until the total number of graphs reaches a given limit.
Implement a data loader for the generated graph sequence that outputs each graph together with its K previous graphs.
The overall idea is much like the first one, but it uses a nonlinear autoregressive model, which involves the sigmoid function.
1.GraphSeqGenerator.initialization():
Randomly generate the probability matrices of the first K undirected graphs as a 3-D tensor of size (K, N, N), where N is the number of nodes in each graph: every slice is a symmetric 2-D matrix with zeros on the diagonal. The adjacency matrices of the corresponding graphs are then obtained from these K probability matrices by Bernoulli sampling.
2.GraphSeqGenerator.sampling():
Given a probability matrix (stored as a tensor), obtain the corresponding adjacency matrix by Bernoulli sampling.
3.GraphSeqGenerator.simulation():
Complete the simulation of the graph sequence on top of the two functions above.
Predict the probability matrix of the (K+1)-th graph with the linear autoregressive model, obtain the corresponding adjacency matrix from it via the sampling() function, and concatenate the predicted (K+1)-th matrix after the first K matrices, so the tensor grows to size (K+1, N, N).
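In code, the linear autoregressive step is just a weighted sum; below is a minimal standalone sketch in which K, N, the weights a, and the Bernoulli(0.3) history are illustrative values, not the assignment's:
import torch

K, N = 3, 5                                           # illustrative order and node count
a = torch.rand(K, 1, 1) + 0.1                         # K positive weights, broadcastable over (N, N)
a = a / a.sum()                                       # normalize so the weights sum to 1
history = torch.bernoulli(0.3 * torch.ones(K, N, N))  # K previous adjacency matrices

# Linear autoregression: the probability matrix of the next graph is the
# weighted sum of the K most recent graphs
prob_next = (a * history).sum(dim=0)                  # shape (N, N), entries in [0, 1]
Because the weights sum to 1 and adjacency entries are 0 or 1, every entry of prob_next is a valid probability.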
4.GraphSeqGenerator2():
On top of what has already been implemented, replace the linear autoregression with the newly given nonlinear autoregression. (In plain terms: only the formula part of the code changes; everything else stays the same.)
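Continuing the sketch above with the same a and history, the nonlinear variant changes only the update rule: center the previous graphs at 0.5, take the same weighted sum, and squash it through a sigmoid so the result stays in (0, 1):
# Nonlinear autoregression: sigmoid of the weighted sum of the centered graphs
prob_next = torch.sigmoid((a * (history - 0.5)).sum(dim=0))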
Install PyTorch and PyTorch Geometric.
For this step I followed an installation tutorial I found online. Link: pytorch无坑超详细图文CPU版小白安装教程(配gpu版链接、conda命令教程)
Read it carefully: on my first attempt I started installing without activating the environment first, so it failed, as one would expect. Wait for each step to finish before moving on to the next.
Also, when using conda config --add channels, do not add the same channel twice; re-adding an existing channel just moves that entry to the top of the channel list.
self.a = torch.rand(self.order, 1, 1) + 0.1  # K positive raw weights, shaped (K, 1, 1) to broadcast over (N, N)
self.a = self.a / torch.sum(self.a)          # normalize so the weights sum to 1
This generates a vector of K coefficients in (0, 1) that sum to 1; the +0.1 shift keeps every raw weight strictly positive before normalization.
def initialization(self) -> torch.FloatTensor:
    """
    Initialize K undirected graphs and formulate them as a float tensor with size (K, N, N)
    :return: a torch float tensor with size (K, N, N)
    """
    # TODO: change the following code to achieve the initialization function
    # Draw a (K, N, N) tensor of uniform random numbers in [0, 1)
    graphs = torch.rand(self.order, self.num_nodes, self.num_nodes).float()
    # Symmetrize each N x N slice by multiplying it element-wise with its transpose
    for i in range(self.order):
        graphs[i] = graphs[i] * graphs[i].t()
    # Thresholding a uniform value against sparsity plays the role of Bernoulli
    # sampling while keeping the matrices symmetric, which torch.bernoulli() on
    # its own would not (note that the product of two uniforms shifts the
    # effective edge density away from sparsity)
    graphs = (graphs <= self.sparsity).float()
    # Zero the diagonals *after* thresholding; zeroing them beforehand would turn
    # them into 1s, because 0 <= sparsity always holds
    for i in range(self.order):
        graphs[i].fill_diagonal_(0)
    return graphs
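A hedged sanity check of what initialization() should return (gen stands for a hypothetical generator instance; the attribute names follow the code above):
graphs = gen.initialization()
assert graphs.shape == (gen.order, gen.num_nodes, gen.num_nodes)
for g in graphs:
    assert torch.equal(g, g.t())           # each slice is symmetric
    assert torch.all(torch.diag(g) == 0)   # zero diagonal: no self-loops
    assert torch.all((g == 0) | (g == 1))  # entries are binary after thresholding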
@staticmethod
def sampling(prob_edges: torch.Tensor) -> torch.FloatTensor:  # turn a probability matrix into an adjacency matrix
    """
    Sample an adjacency matrix of an undirected graph from a probability matrix
    :param prob_edges: (N, N) shaped matrix
    :return: a torch float tensor with size (N, N)
    """
    # TODO: Change the code below to sample an adjacency matrix of an undirected graph from a probability matrix
    # Bernoulli sampling: compare each edge probability against a fresh uniform
    # draw, so entry (j, k) becomes 1 with probability prob_edges[j, k]
    # (thresholding prob_edges at 0.5 would be deterministic rounding, not sampling)
    adj_matrix = (torch.rand_like(prob_edges) < prob_edges).float()
    # Keep the upper triangle and mirror it so the result stays symmetric with a
    # zero diagonal
    adj_matrix = torch.triu(adj_matrix, diagonal=1)
    adj_matrix = adj_matrix + adj_matrix.t()
    # print(adj_matrix)  # printing the result makes problems easy to spot; single-stepping in a debugger also works
    return adj_matrix
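And a quick check that the sampler keeps the result symmetric, using a made-up symmetric probability matrix:
p = torch.full((6, 6), 0.4)  # uniform 0.4 edge probability
p.fill_diagonal_(0)
adj = GraphSeqGenerator.sampling(p)
assert torch.equal(adj, adj.t()) and torch.all(torch.diag(adj) == 0)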
def simulation(self, length: int = None) -> list:
    """
    Simulate a graph sequence based on the initialization and sampling functions, and the autoregressive mechanism
    :param length:
    :return:
    """
    # Requires: from torch_geometric.data import Data
    #           from torch_geometric.utils import dense_to_sparse
    #           from torchvision.utils import save_image
    if length is None:
        length = self.length
    # TODO: 1) simulate graphs via the auto-regressive model;
    # 2) Convert the format of the graph sequence to "Data" Type defined in PyTorch Geometric;
    # Hint: please check the function "dense_to_sparse" and the usage of "Data" class
    # visualize the graph sequence
    undirected_graphs = self.initialization()  # initialize the first K adjacency matrices
    # Predict the i-th graph as the weighted sum of its K predecessors: multiply
    # the K coefficients in self.a by the K most recent graphs and add them up.
    # cumsum(dim=0)[-1] keeps only the last slice of the running sum along dim 0,
    # which equals the full sum (see tensor.cumsum() in the Reference link)
    for i in range(self.order, length):
        prob_edges = (self.a * undirected_graphs[i - self.order:i]).cumsum(dim=0)[-1]
        adj_matrix = self.sampling(prob_edges)  # Bernoulli sampling
        # Concatenate the new 2-D matrix onto the existing tensor; cat() needs
        # matching dimensionality, so view() lifts the matrix to shape (1, N, N)
        undirected_graphs = torch.cat((undirected_graphs,
                                       adj_matrix.view((1, self.num_nodes, self.num_nodes))), dim=0)
    graphs = undirected_graphs  # the complete (length, N, N) sequence
    # Open questions I noted at this point: exactly how to store the graphs into
    # Data (the hint points to dense_to_sparse and the Data class), and whether
    # to convert and save each graph as it is predicted instead of concatenating
    # everything into one 3-D tensor first, as done here.
    # Store each graph in the format Data expects: the feature of a node is its
    # degree, and dense_to_sparse() yields the sparse (COO) form of the graph
    graph_data = []
    for i in range(length):
        features = graphs[i].sum(dim=0).view(-1, 1)  # node degrees, shaped (N, 1)
        edge_index, edge_attr = dense_to_sparse(graphs[i])
        graph_data.append(Data(x=features, edge_index=edge_index, edge_attr=edge_attr))
    # save_image() can write the tensor directly to an image of the whole sequence
    save_image(graphs.view(length, 1, self.num_nodes, self.num_nodes), 'graphs.png',
               nrow=int(length ** 0.5))
    return graph_data
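As a standalone illustration of the dense_to_sparse and Data conversion used at the end of simulation(), here is a toy 3-node triangle graph:
import torch
from torch_geometric.data import Data
from torch_geometric.utils import dense_to_sparse

adj = torch.tensor([[0., 1., 1.],
                    [1., 0., 1.],
                    [1., 1., 0.]])            # dense adjacency of a triangle
edge_index, edge_attr = dense_to_sparse(adj)  # COO edge list and edge weights
x = adj.sum(dim=0).view(-1, 1)                # node degrees as (N, 1) features
data = Data(x=x, edge_index=edge_index, edge_attr=edge_attr)
print(data)  # Data(x=[3, 1], edge_index=[2, 6], edge_attr=[6])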
Use the Batch class from PyTorch Geometric (torch_geometric.data.Batch) to load batches of graphs from self.data (the graph list).
graph_current is the graph (i.e., its data) that the given index maps to.
def __getitem__(self, idx: int):
    """
    Given the index of a graph, output this graph and its K previous graphs
    :param idx: the index of a graph
    :return:
    """
    # TODO: Change the code below to achieve the dataset sampler
    # Hint: 1) graphs_history need to call the functions of the "Batch" Class;
    # 2) Be careful about the range of the index.
    # Shift the index by the history length (assumed here to be stored as
    # self.order) so every sampled graph has exactly K predecessors; batching
    # the whole list would ignore idx, and idx < K would lack a full history
    graphs_history = Batch.from_data_list(self.data[idx: idx + self.order])
    graph_current = self.data[idx + self.order]
    return graphs_history, graph_current
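For context, a minimal sketch of how the surrounding dataset class might handle the index range; the class name GraphSeqDataset and its attributes are assumptions, not the assignment's actual scaffold:
from torch.utils.data import Dataset
from torch_geometric.data import Batch

class GraphSeqDataset(Dataset):  # hypothetical wrapper around the graph list
    def __init__(self, data: list, order: int):
        self.data = data         # list of PyG Data objects from simulation()
        self.order = order       # K, the history length

    def __len__(self):
        # only graphs with K full predecessors are valid sample points
        return len(self.data) - self.order

    def __getitem__(self, idx: int):
        graphs_history = Batch.from_data_list(self.data[idx: idx + self.order])
        graph_current = self.data[idx + self.order]
        return graphs_history, graph_current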
Apart from simulation(), which needs to change, the rest is basically copied from the earlier version.
Only the function that needs modification is shown below:
def simulation(self, length: int = None) -> list:
    """
    Simulate a graph sequence based on the initialization and sampling functions, and the autoregressive mechanism
    :param length:
    :return:
    """
    if length is None:
        length = self.length
    # TODO: 1) simulate graphs via the auto-regressive model;
    # 2) Convert the format of the graph sequence to "Data" Type defined in PyTorch Geometric;
    # Hint: please check the function "dense_to_sparse" and the usage of "Data" class
    # visualize the graph sequence
    undirected_graphs = self.initialization()  # initialize the first K adjacency matrices
    for i in range(self.order, length):
        # Nonlinear step: weighted sum of the centered predecessors, then sigmoid
        prob_edges = (self.a * (undirected_graphs[i - self.order:i] - 0.5)).cumsum(dim=0)[-1]
        prob_edges = torch.sigmoid(prob_edges)
        adj_matrix = self.sampling(prob_edges)  # Bernoulli sampling
        # Concatenate the new graph onto the sequence, lifted to shape (1, N, N)
        undirected_graphs = torch.cat((undirected_graphs,
                                       adj_matrix.view((1, self.num_nodes, self.num_nodes))), dim=0)
    graphs = undirected_graphs  # the complete (length, N, N) sequence
    # Store each graph in the Data format: node degrees as features, plus the
    # sparse form from dense_to_sparse()
    graph_data = []
    for i in range(length):
        features = graphs[i].sum(dim=0).view(-1, 1)  # node degrees, shaped (N, 1)
        edge_index, edge_attr = dense_to_sparse(graphs[i])
        graph_data.append(Data(x=features, edge_index=edge_index, edge_attr=edge_attr))
    # Save the whole sequence directly as an image
    save_image(graphs.view(length, 1, self.num_nodes, self.num_nodes), 'graphs.png',
               nrow=int(length ** 0.5))
    return graph_data
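Finally, a hedged end-to-end usage sketch; the constructor signature here is an assumption about how the generator is parameterized, not the assignment's actual interface:
gen = GraphSeqGenerator2(order=3, num_nodes=10, length=16, sparsity=0.3)  # hypothetical signature
graph_data = gen.simulation()
print(len(graph_data))  # 16 Data objects
print(graph_data[0])    # e.g. Data(x=[10, 1], edge_index=[2, E], edge_attr=[E])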