yyl424525

GraphSAGE NIPS 2017 代码分析（Tensorflow版）

文章目录

数据集

ppi数据集信息

toy-ppi-G.json 图的信息
toy-ppi-class_map.json
toy-ppi-id_map.json
toy-ppi-walks.txt
toy-ppi-feats.npy

实验要求

安装Docker与程序运行
运行

代码分析

`__init__.py`
`utils.py`
`neigh_samplers.py`
`models.py`
`layers.py`
`minibatch.py`
`aggregators.py`
`prediction.py`
`supervised_train.py`
`unsupervised_train.py`
`inits.py`

其他

`citation_eval.py`
`ppi_eval.py`
`reddit_eval.py`

参考

数据集

数据集	#图	#节点	#边	#特征	#标签(y)
Cora	1	2708	5429	1433	7
Citeseer	1	3327	4732	3703	6
Pubmed	1	19717	44338	500	3
PPI	24	56944	818716	50	121
Reddit	1	232965	11606919	602	41
Nell	1	65755	266144	61278	210

ppi数据集信息

toy-ppi-G.json 图的信息

{ 
  directed: false
  graph : {
　　　　　         {name: disjoint_union(,) }
　　　　　　　　　　　nodes:  [
　　　　　　　　　　              {　 
                                test: false
　　　　　　　　　　　              id: 0
　　　　　　　　　　　              features: [ ... ]
　　　　　　　　　　　              val: false
　　　　　　　　　　                lable: [ ... ]
　　　　　　　　　　　　           }
　　　　　　　　　　　　           {...}
　　　　　　　　　　               ...
　　　　　　　　　　　　      ]

　　　　　　　　　　　 links: [
　　　　　　　　　　　　           {　 
                                test_removed: false
　　　　　　　　　　　　            train_removed: false
　　　　　　　　　　　　            target: 800 # 指向的节点id（默认从小节点指向大节点）
　　　　　　　　　　　　            source: 0   # 从0节点按顺序展示
　　　　　　　　　　               }
　　　　　　　　　　               {...}
　　　       　　　　　            ...
　　　　　　　　　　          ]
　　　　　　}
}

name: disjoint_union(,)表示图的名字
toy-ppi-G.json里只有一个图（可能是因为用于节点分类只需要一张图即可，做图分类任务需要多张图）
可以看出，这是个无向图，并且由nodes集和links集合构成，每个集合都是一个list，里面包含的每一个node或link都是词典形式存储的
从github下载的源码中，没有links部分的数据?其实是由于文件过大显示不完整，其实是存在的，比如节点只显示到1883，总共14754个

toy-ppi-class_map.json

格式为：{“0”: [1, 0, 0,…],…,“14754”: [1, 1, 0, 0,…]}

toy-ppi-id_map.json

节点编号与序号的一一对应；数据格式为：{“0”: 0, “1”: 1,…, “14754”: 14754}

toy-ppi-walks.txt

从一点出发随机游走到邻居节点的情况，对于每个点取198次（即可能有重复情况）
例如：0 708 表示从0点走到708点。

toy-ppi-feats.npy

预训练好得到的features。

数据处理的时候主要通过两个函数
(1):np.save(“test.npy”，数据结构） ----存数据
(2):data =np.load('test.npy") ----取数据
例如，存列表

z = [[[1, 2, 3], ['w']], [[1, 2, 3], ['w']]]
np.save('test.npy', z)
x = np.load('test.npy')

x:
->array([[list([1, 2, 3]), list(['w'])],
       [list([1, 2, 3]), list(['w'])]], dtype=object)

例如，存字典

x
-> {0: 'wpy', 1: 'scg'}
np.save('test.npy',x)
x = np.load('test.npy')
x
->array({0: 'wpy', 1: 'scg'}, dtype=object)

在存为字典格式读取后，需要先调用如下语句
data.item()
将数据numpy.ndarray对象转换为dict

实验要求

networkx版本必须小于等于1.11，本人的是2.3版本，因此报错

  File "/home/yyl/桌面/Papers/GraphSage/GraphSAGE-master/graphsage/utils.py", line 20, in load_data
    G_data = json.load(open(prefix + "-G.json"))
FileNotFoundError: [Errno 2] No such file or directory: '-G.json'

换成1.11版本即可：pip install networkx==1.11

安装Docker与程序运行

参考：https://www.cnblogs.com/shiyublog/p/9858786.html

运行

在命令运行unsupervised_train.py

python -m graphsage.unsupervised_train --train_prefix ./example_data/toy-ppi --model graphsage_mean --max_total_steps 1000 --validate_iter 10
等价于
python ./graphsage/unsupervised_train.py  --train_prefix ./example_data/toy-ppi --model graphsage_mean --max_total_steps 1000 --validate_iter 10

注意，上述数据集路径和官方给的不一样。如果是在Pycharm中运行，需要更改train_prefix，model等参数的值，需要注意在ide和命令行中参数的格式，在idea中：

flags.DEFINE_string('model', 'graphsage_mean', 'model names. See README for possible values.')  
flags.DEFINE_string('train_prefix', '../example_data/toy-ppi', 'prefix identifying training data. must be specified.')

model参数可选值

graphsage_mean – GraphSage with mean-based aggregator
graphsage_seq – GraphSage with LSTM-based aggregator
graphsage_maxpool – GraphSage with max-pooling aggregator (as described in the NIPS 2017 paper)
graphsage_meanpool – GraphSage with mean-pooling aggregator (a variant of the pooling aggregator, where the element-wie mean replaces the element-wise max).
gcn – GraphSage with GCN-based aggregator
n2v – an implementation of DeepWalk (called n2v for short in the code.)

date_iter 10
Loading training data..
Removed 0 nodes that lacked proper annotations due to networkx versioning issues
Loaded data.. now preprocessing..
Done loading training data..
Unexpected missing: 0
9716 train nodes
5039 test nodes
2019-11-05 20:49:37.036227: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-11-05 20:49:37.075335: E tensorflow/stream_executor/cuda/cuda_driver.cc:300] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2019-11-05 20:49:37.076045: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] retrieving CUDA diagnostic information for host: yyl-Z370-HD3
2019-11-05 20:49:37.076058: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:170] hostname: yyl-Z370-HD3
2019-11-05 20:49:37.076154: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:194] libcuda reported version is: 430.34.0
2019-11-05 20:49:37.076205: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:198] kernel reported version is: 430.34.0
2019-11-05 20:49:37.076214: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:305] kernel version seems to match DSO: 430.34.0
Epoch: 0001
2019-11-05 20:49:41.997221: W tensorflow/core/framework/allocator.cc:122] Allocation of 25600000 exceeds 10% of system memory.
2019-11-05 20:49:42.001801: W tensorflow/core/framework/allocator.cc:122] Allocation of 25600000 exceeds 10% of system memory.
2019-11-05 20:49:42.018491: W tensorflow/core/framework/allocator.cc:122] Allocation of 25600000 exceeds 10% of system memory.
2019-11-05 20:49:42.024759: W tensorflow/core/framework/allocator.cc:122] Allocation of 25600000 exceeds 10% of system memory.
Iter: 0000 train_loss= 18.78112 train_mrr= 0.25004 train_mrr_ema= 0.25004 val_loss= 19.16457 val_mrr= 0.21838 val_mrr_ema= 0.21838 time= 1.29920
2019-11-05 20:49:42.517739: W tensorflow/core/framework/allocator.cc:122] Allocation of 25600000 exceeds 10% of system memory.
Iter: 0050 train_loss= 18.49755 train_mrr= 0.18471 train_mrr_ema= 0.22649 val_loss= 18.71147 val_mrr= 0.24803 val_mrr_ema= 0.20979 time= 0.14266
Iter: 0100 train_loss= 18.56170 train_mrr= 0.16419 train_mrr_ema= 0.21142 val_loss= 18.31936 val_mrr= 0.18485 val_mrr_ema= 0.21664 time= 0.12880
Iter: 0150 train_loss= 17.91184 train_mrr= 0.19551 train_mrr_ema= 0.20265 val_loss= 18.08941 val_mrr= 0.18543 val_mrr_ema= 0.21273 time= 0.12436
Iter: 0200 train_loss= 17.14944 train_mrr= 0.22213 train_mrr_ema= 0.19652 val_loss= 17.85653 val_mrr= 0.18931 val_mrr_ema= 0.20871 time= 0.12251
Iter: 0250 train_loss= 17.14205 train_mrr= 0.18210 train_mrr_ema= 0.19319 val_loss= 17.21363 val_mrr= 0.20572 val_mrr_ema= 0.20632 time= 0.12138
Iter: 0300 train_loss= 16.96532 train_mrr= 0.18807 train_mrr_ema= 0.19088 val_loss= 16.77304 val_mrr= 0.17903 val_mrr_ema= 0.20116 time= 0.12090
Iter: 0350 train_loss= 16.62087 train_mrr= 0.18385 train_mrr_ema= 0.18813 val_loss= 16.53304 val_mrr= 0.21569 val_mrr_ema= 0.21001 time= 0.12049
Iter: 0400 train_loss= 16.15347 train_mrr= 0.19938 train_mrr_ema= 0.18743 val_loss= 16.20924 val_mrr= 0.20107 val_mrr_ema= 0.20434 time= 0.11976
Iter: 0450 train_loss= 15.92187 train_mrr= 0.18782 train_mrr_ema= 0.18764 val_loss= 16.09361 val_mrr= 0.21507 val_mrr_ema= 0.20851 time= 0.11877
Iter: 0500 train_loss= 15.61726 train_mrr= 0.20567 train_mrr_ema= 0.18762 val_loss= 15.82525 val_mrr= 0.20445 val_mrr_ema= 0.20857 time= 0.11862
Iter: 0550 train_loss= 15.49840 train_mrr= 0.18336 train_mrr_ema= 0.18751 val_loss= 15.63526 val_mrr= 0.21188 val_mrr_ema= 0.20637 time= 0.11833
Iter: 0600 train_loss= 15.29428 train_mrr= 0.18559 train_mrr_ema= 0.18814 val_loss= 15.43966 val_mrr= 0.18432 val_mrr_ema= 0.20902 time= 0.11812
Iter: 0650 train_loss= 15.22701 train_mrr= 0.18361 train_mrr_ema= 0.18770 val_loss= 15.30805 val_mrr= 0.18943 val_mrr_ema= 0.20266 time= 0.11810
Iter: 0700 train_loss= 15.12402 train_mrr= 0.17775 train_mrr_ema= 0.18618 val_loss= 15.19020 val_mrr= 0.19980 val_mrr_ema= 0.20185 time= 0.11768
Iter: 0750 train_loss= 14.99245 train_mrr= 0.18521 train_mrr_ema= 0.18568 val_loss= 15.03517 val_mrr= 0.21548 val_mrr_ema= 0.20307 time= 0.11793
Iter: 0800 train_loss= 14.90705 train_mrr= 0.19703 train_mrr_ema= 0.18656 val_loss= 14.99781 val_mrr= 0.22909 val_mrr_ema= 0.20756 time= 0.11749
Iter: 0850 train_loss= 14.89672 train_mrr= 0.17020 train_mrr_ema= 0.18719 val_loss= 14.98761 val_mrr= 0.21074 val_mrr_ema= 0.21088 time= 0.11691
Iter: 0900 train_loss= 14.79581 train_mrr= 0.20458 train_mrr_ema= 0.18726 val_loss= 14.86050 val_mrr= 0.20545 val_mrr_ema= 0.20940 time= 0.11659
Iter: 0950 train_loss= 14.78135 train_mrr= 0.17666 train_mrr_ema= 0.18843 val_loss= 14.79613 val_mrr= 0.20859 val_mrr_ema= 0.20693 time= 0.11645
Iter: 1000 train_loss= 14.75321 train_mrr= 0.19232 train_mrr_ema= 0.18724 val_loss= 14.77326 val_mrr= 0.21229 val_mrr_ema= 0.20599 time= 0.11612
Optimization Finished!

可以看出，unsupervised_train.py只运行了一个epoch，共1000次迭代，每10个迭代运行一次validation
batch_size：512
python -m graphsage.unsupervised_train 表示以模块运行，不用具体路径
python ./graphsage/unsupervised_train.py 表示以脚本文件直接运行

运行supervised_train.py，注意train_prefix参数的值也需要改:…/example_data/toy-ppi

python -m graphsage.supervised_train --train_prefix ./example_data/toy-ppi --model graphsage_mean --sigmoid
等价于
python ./graphsage/supervised_train.py --train_prefix ./example_data/toy-ppi --model graphsage_mean --sigmoid

若报错：

  File "/home/yyl/anaconda3/lib/python3.5/site-packages/sklearn/metrics/cluster/supervised.py", line 14, in 
    from scipy.misc import comb
ImportError: cannot import name 'comb'

则修改lib\site-packages\sklearn\metrics\cluster\supervised.py中的from scipy.misc import comb为from scipy.special import comb

Loading training data..
Removed 0 nodes that lacked proper annotations due to networkx versioning issues
Loaded data.. now preprocessing..
Done loading training data..
2019-11-05 21:37:11.763295: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-11-05 21:37:11.770739: E tensorflow/stream_executor/cuda/cuda_driver.cc:300] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2019-11-05 21:37:11.770767: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] retrieving CUDA diagnostic information for host: yyl-Z370-HD3
2019-11-05 21:37:11.770772: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:170] hostname: yyl-Z370-HD3
2019-11-05 21:37:11.770792: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:194] libcuda reported version is: 430.34.0
2019-11-05 21:37:11.770810: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:198] kernel reported version is: 430.34.0
2019-11-05 21:37:11.770815: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:305] kernel version seems to match DSO: 430.34.0
Epoch: 0001
2019-11-05 21:37:12.108357: W tensorflow/core/framework/allocator.cc:122] Allocation of 25600000 exceeds 10% of system memory.
2019-11-05 21:37:12.112443: W tensorflow/core/framework/allocator.cc:122] Allocation of 25600000 exceeds 10% of system memory.
/home/yyl/anaconda3/lib/python3.5/site-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/home/yyl/anaconda3/lib/python3.5/site-packages/sklearn/metrics/classification.py:1076: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no true samples.
  'recall', 'true', average, warn_for)
Iter: 0000 train_loss= 0.69289 train_f1_mic= 0.34109 train_f1_mac= 0.29393 val_loss= 0.66505 val_f1_mic= 0.37310 val_f1_mac= 0.13088 time= 0.38797
2019-11-05 21:37:12.349277: W tensorflow/core/framework/allocator.cc:122] Allocation of 25600000 exceeds 10% of system memory.
2019-11-05 21:37:12.353043: W tensorflow/core/framework/allocator.cc:122] Allocation of 25600000 exceeds 10% of system memory.
2019-11-05 21:37:12.411469: W tensorflow/core/framework/allocator.cc:122] Allocation of 25600000 exceeds 10% of system memory.
Iter: 0005 train_loss= 0.58413 train_f1_mic= 0.36509 train_f1_mac= 0.08875 val_loss= 0.66505 val_f1_mic= 0.37310 val_f1_mac= 0.13088 time= 0.11344
Iter: 0010 train_loss= 0.54682 train_f1_mic= 0.40012 train_f1_mac= 0.10253 val_loss= 0.66505 val_f1_mic= 0.37310 val_f1_mac= 0.13088 time= 0.08710
Iter: 0015 train_loss= 0.54642 train_f1_mic= 0.37319 train_f1_mac= 0.08884 val_loss= 0.66505 val_f1_mic= 0.37310 val_f1_mac= 0.13088 time= 0.07637
Epoch: 0002
..........
Epoch: 0010
Iter: 0004 train_loss= 0.48976 train_f1_mic= 0.49845 train_f1_mac= 0.27817 val_loss= 0.52274 val_f1_mic= 0.49617 val_f1_mac= 0.29091 time= 0.05695
Iter: 0009 train_loss= 0.49231 train_f1_mic= 0.48284 train_f1_mac= 0.26799 val_loss= 0.52274 val_f1_mic= 0.49617 val_f1_mac= 0.29091 time= 0.05683
Iter: 0014 train_loss= 0.48515 train_f1_mic= 0.50462 train_f1_mac= 0.28371 val_loss= 0.52274 val_f1_mic= 0.49617 val_f1_mac= 0.29091 time= 0.05664
Optimization Finished!
Full validation stats: loss= 0.51017 f1_micro= 0.52730 f1_macro= 0.32655 time= 0.18505
Writing test set stats to file (don't peak!)

注意

/home/yyl/anaconda3/lib/python3.5/site-packages/sklearn/metrics/classification.py:1076: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no true samples.
  'recall', 'true', average, warn_for)

原因：存在一些样本 label 为 y_true，但是y_pred 并没有预测到，即在预测数据中存在实际类别没有的标签时报此warning，此时F1当作0。
比如
y_true = (0, 1, 2, 3, 4)
y_pred = (0, 1, 1, 3, 4)
label‘2’ 从来没有被预测到，所以F-score没有计算这项 label，因此这种情况下 F-score 就被当作为 0.0 了。
但是又因为，要计算所有分类结果的平均得分就必须将这项得分为 0 的情况考虑进去，所以，scikit-learn出来提醒你，warning警告一下，但不是错误。

/home/yyl/anaconda3/lib/python3.5/site-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)

代码分析

`init.py`

from __future__ import print_function  
#即使在python2.X，使用print就得像python3.X那样加括号使用。

from __future__ import division          
# 导入python未来支持的语言特征division(精确除法)，
# 当我们没有在程序中导入该特征时，"/"操作符执行的是截断除法(Truncating Division)；
# 当我们导入精确除法之后，"/"执行的是精确除法, "//"执行截断除除法

`utils.py`

（1）type() 与 isinstance() 区别
isinstance() 函数来判断一个对象是否是一个已知的类型，类似 type()。
isinstance(object, classinfo)
参数
object – 实例对象。
classinfo – 可以是直接或间接类名、基本类型或者由它们组成的元组。

返回值
如果对象的类型与参数二的类型（classinfo）相同则返回 True，否则返回 False。

>>>a = 2
>>> isinstance (a,int)
True
>>> isinstance (a,str)
False
>>> isinstance (a,(str,int,list))    # 是元组中的一个返回 True
True

type() 不会认为子类是一种父类类型，不考虑继承关系。
isinstance() 会认为子类是一种父类类型，考虑继承关系。

class A:
    pass
 
class B(A):
    pass
 
isinstance(A(), A)    # returns True
type(A()) == A        # returns True
isinstance(B(), A)    # returns True
type(B()) == A        # returns False

（2）G.nodes()
返回的是图中节点n与节点属性nodedata。

    G_data = json.load(open(prefix + "-G.json"))
    G = json_graph.node_link_graph(G_data) #Return graph from node-link data format
    print("G.nodes():",G.nodes)
    #G.nodes(): >

    print("list(G.nodes):",list(G))
    # list(G.nodes): [0, 1, 2, 3, 4,...,14754]

    # print("G.nodes.data():",G.nodes.data()) #AttributeError: 'function' object has no attribute 'data'

    print(list(G.nodes(data=True))) #带nodedata
    #类似这种格式：[(0, {'val': False}，‘label':[...],...), (1, {...}), (2, {...}),...]

    #判断G.nodes()[0] 是否为int型（即不带nodedata）
    if isinstance(G.nodes()[0], int): #若为int型，则将n转为int型
        conversion = lambda n : int(n)
    else:
        conversion = lambda n : n

（3）G.edges()
代码中edge对edges迭代，每次去list中的一个元组，而edge[0], edge[1]则分别表示两个顶点。
若两个顶点中至少有一个的val / test不为空，则将该边的’train_removed’设为True，否则为False。
该操作为保证’train_removed’不为空。

    ## Make sure the graph has edge train_removed annotations
    ## (some datasets might already have this..)
    for edge in G.edges():
        if (G.node[edge[0]]['val'] or G.node[edge[1]]['val'] or
            G.node[edge[0]]['test'] or G.node[edge[1]]['test']):
            G[edge[0]][edge[1]]['train_removed'] = True
        else:
            G[edge[0]][edge[1]]['train_removed'] = False

G.edges() 得到edge_list, [( , ), ( , ), … ( , )]。list中每一个元素是所表示边的两个节点信息。若设置data = True，则会显示边的权重等属性信息。

>>> G = nx.Graph()   # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G.add_path([0,1,2])
>>> G.add_edge(2,3,weight=5)
>>> G.edges()
[(0, 1), (1, 2), (2, 3)]
>>> G.edges(data=True) # default edge data is {} (empty dictionary)
[(0, 1, {}), (1, 2, {}), (2, 3, {'weight': 5})]
>>> list(G.edges_iter(data='weight', default=1))
[(0, 1, 1), (1, 2, 1), (2, 3, 5)]
>>> G.edges([0,3])
[(0, 1), (3, 2)]
>>> G.edges(0)
[(0, 1)]

（4）获取训练数据features并标准化
这里if not feats is None 等价于 if feats is not None。
将val,test均为None的node选为训练数据，通过id_map获取其在feature表中的索引值，添加到train_ids数组中。
根据索引train_ids，train_fests获取这些nodes的features。

if normalize and not feats is None:
        from sklearn.preprocessing import StandardScaler
        train_ids = np.array([id_map[n] for n in G.nodes(
        ) if not G.node[n]['val'] and not G.node[n]['test']])
        train_feats = feats[train_ids]
        scaler = StandardScaler()
        scaler.fit(train_feats)
        feats = scaler.transform(feats)

（5）StandardScaler的用法
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html
Methods:

fit(X[, y]) : Compute the mean and std to be used for later scaling. 预处理的数据，计算矩阵列均值和列标准差
transform(X[, y, copy]) : Perform standardization by centering and scaling：得到标准化的矩阵。用此方法，必须使用fit先进行预处理计算均值和标准差，然后用fit计算的均值和标准差，进行标准化处理 {x_i - u}/标准差
fit_transform(X[, y]) : Fit to data, then transform it. 相当于是fit和transform的组合

>>> from sklearn.preprocessing import StandardScaler
>>> data = [[0, 0], [0, 0], [1, 1], [1, 1]]
>>> scaler = StandardScaler()
>>> print(scaler.fit(data))
StandardScaler(copy=True, with_mean=True, with_std=True)
>>> print(scaler.mean_)
[0.5 0.5]
>>> print(scaler.transform(data))
[[-1. -1.]
 [-1. -1.]
 [ 1.  1.]
 [ 1.  1.]]
>>> print(scaler.transform([[2, 2]]))
[[3. 3.]]

# 计算得
# 均值[0.5, 0.5], 
# 方差：1/4 * [(0 - 0.5)^2 * 2 + (1 - 0.5)^2 * 2] = 1/4 = 0.25
# 标准差：0.5
# 对于[2,2] transform 标准化之后: (2 - 0.5) / 0.5 = 3

（6）map() 的用法
map(function, iterable, …)
map() 会根据提供的函数对指定序列做映射。
第一个参数 function 以参数序列中的每一个元素调用 function 函数，返回包含每次 function 函数返回值的新列表。
例子：

>>>def square(x) :            # 计算平方数
...     return x ** 2
... 
>>> map(square, [1,2,3,4,5])   # 计算列表各个元素的平方
[1, 4, 9, 16, 25]
>>> map(lambda x: x ** 2, [1, 2, 3, 4, 5])  # 使用 lambda 匿名函数
[1, 4, 9, 16, 25]
 
# 提供了两个列表，对相同位置的列表数据进行相加
>>> map(lambda x, y: x + y, [1, 3, 5, 7, 9], [2, 4, 6, 8, 10])
[3, 7, 11, 15, 19]

utils.py完整代码

from __future__ import print_function

import numpy as np
import random
import json
import sys
import os

import networkx as nx
from networkx.readwrite import json_graph
version_info = list(map(int, nx.__version__.split('.')))
major = version_info[0]
minor = version_info[1]
assert (major <= 1) and (minor <= 11), "networkx major version > 1.11"

WALK_LEN=5
N_WALKS=50

def load_data(prefix, normalize=True, load_walks=False):
    G_data = json.load(open(prefix + "-G.json"))
    G = json_graph.node_link_graph(G_data) #Return graph from node-link data format
    print("G.nodes():",G.nodes)
    #G.nodes(): >

    # print("list(G.nodes):",list(G))
    # list(G.nodes): [0, 1, 2, 3, 4,...,14754]

    # print("G.nodes.data():",G.nodes.data()) #AttributeError: 'function' object has no attribute 'data'

    # print(list(G.nodes(data=True))) #带nodedata
    #类似这种格式：[(0, {'val': False}，‘label':[...],...), (1, {...}), (2, {...}),...]

    #定义conversion函数
    #判断G.nodes()[0] 是否为int型（即不带nodedata）
    if isinstance(G.nodes()[0], int): #若为int型，则将n转为int型
        conversion = lambda n : int(n)
    else:
        conversion = lambda n : n

    #节点特征存于.npy文件中，如果不存在则feats = None
    if os.path.exists(prefix + "-feats.npy"):
        feats = np.load(prefix + "-feats.npy")
    else:
        print("No features present.. Only identity features will be used.")
        feats = None

    id_map = json.load(open(prefix + "-id_map.json"))
    # print("id_map:",id_map)
    # id_map :{'143': 143, '10758': 10758, '13438': 13438,..
    id_map = {conversion(k):int(v) for k,v in id_map.items()}   #id_map的迭代中k为str类型，v为int型，将其全部转换成整型
    # print("id_map:",id_map)
    #id_map: {0: 0, 1: 1, 2: 2...

    walks = []
    class_map = json.load(open(prefix + "-class_map.json"))
    # print("class_map:",class_map)
    #{"0": [1, 0, 0,...],...,"14754": [1, 1, 0, 0,...]}

    if isinstance(list(class_map.values())[0], list):
        lab_conversion = lambda n : n
    else:
        lab_conversion = lambda n : int(n)

    class_map = {conversion(k):lab_conversion(v) for k,v in class_map.items()} #同上，将class_map的keys和values都转换成成型，并返回新的class_map

    # print("class_map:",class_map)
    #{0: [1, 0, 0,...],...,14754: [1, 1, 0, 0,...]}


    #这里删除的节点是不具有'val'或'test'属性 的节点，而不是'val'，'test' 属性值为None的节点。
    #区分开 if not 'val' in G.node[node] 和 if not G.node[n]['val']的不同意义。
    ## Remove all nodes that do not have val/test annotations
    ## (necessary because of networkx weirdness with the Reddit data)
    broken_count = 0  #记录删去的没有val 或者 test的属性的节点的数目。
    for node in G.nodes():
        if not 'val' in G.node[node] or not 'test' in G.node[node]:
            G.remove_node(node)
            broken_count += 1
    print("Removed {:d} nodes that lacked proper annotations due to networkx versioning issues".format(broken_count))

    ## Make sure the graph has edge train_removed annotations
    ## (some datasets might already have this..)
    # 代码中edge对edges迭代，每次去list中的一个元组，而edge[0], edge[1]则分别表示两个顶点。
    # 若两个顶点中至少有一个的val / test不为空，则将该边的'train_removed'设为True，否则为False。
    # 该操作为保证'train_removed'不为空。
    print("Loaded data.. now preprocessing..")
    for edge in G.edges():
        if (G.node[edge[0]]['val'] or G.node[edge[1]]['val'] or
            G.node[edge[0]]['test'] or G.node[edge[1]]['test']):
            G[edge[0]][edge[1]]['train_removed'] = True
        else:
            G[edge[0]][edge[1]]['train_removed'] = False


    #获取训练数据features并标准化
    #这里if not feats is None 等价于 if feats is not None
    if normalize and not feats is None:
        from sklearn.preprocessing import StandardScaler
        #将val,test均为None的node选为训练数据，通过id_map获取其在feature表中的索引值，添加到train_ids数组中。
        #根据索引train_ids，train_fests获取这些nodes的features.
        train_ids = np.array([id_map[n] for n in G.nodes() if not G.node[n]['val'] and not G.node[n]['test']])
        train_feats = feats[train_ids]
        ## 标准化数据，保证每个维度的特征数据方差为1，均值为0，使得预测结果不会被某些维度过大的特征值而主导
        scaler = StandardScaler()
        scaler.fit(train_feats)
        feats = scaler.transform(feats)

    #如果load_walks为True则加载walks
    if load_walks:
        with open(prefix + "-walks.txt") as fp:
            for line in fp:
                #map将walk中的数据都转为整型
                #walks初始化为[]， 之后append的是游走的节点对的对象
                walks.append(map(conversion, line.split()))

    # print("walks:",walks)
    #[, , ....]
    print("len(walks):",len(walks))
    #len(walks): 1895817


    return G, feats, id_map, walks, class_map

def run_random_walks(G, nodes, num_walks=N_WALKS):
    pairs = []
    for count, node in enumerate(nodes):
        if G.degree(node) == 0:
            continue
        for i in range(num_walks):
            curr_node = node
            for j in range(WALK_LEN):
                next_node = random.choice(G.neighbors(curr_node))
                # self co-occurrences are useless
                if curr_node != node:
                    pairs.append((node,curr_node))
                curr_node = next_node
        if count % 1000 == 0:
            print("Done walks for", count, "nodes")
    return pairs

if __name__ == "__main__":
    """ Run random walks """
    graph_file = sys.argv[1]
    out_file = sys.argv[2]
    G_data = json.load(open(graph_file))
    G = json_graph.node_link_graph(G_data)
    nodes = [n for n in G.nodes() if not G.node[n]["val"] and not G.node[n]["test"]]
    G = G.subgraph(nodes)
    pairs = run_random_walks(G, nodes)
    with open(out_file, "w") as fp:
        fp.write("\n".join([str(p[0]) + "\t" + str(p[1]) for p in pairs]))

`neigh_samplers.py`

定义了从节点的邻居中采样的均匀采样器。
均匀：shuffle打乱0维的顺序，即打乱行顺序，以此使下面采样可以“均匀”。为了使用shuffle函数，需要在shuffle前后transpose一下。
采样：slice之后，相当于随机挑选了num_samples个样本，并保留了这些样本的全部属性特征。
最后的adj_lists即为均匀采样后的表示邻居信息的矩阵。

neigh_samplers.py完整代码

from __future__ import division
from __future__ import print_function

from graphsage.layers import Layer

import tensorflow as tf
flags = tf.app.flags
FLAGS = flags.FLAGS


"""
Classes that are used to sample node neighborhoods
"""

class UniformNeighborSampler(Layer):
    """
    Uniformly samples neighbors.
    Assumes that adj lists are padded with random re-sampling
    """
    def __init__(self, adj_info, **kwargs):
        super(UniformNeighborSampler, self).__init__(**kwargs)
        self.adj_info = adj_info

    def _call(self, inputs):
        ids, num_samples = inputs
        adj_lists = tf.nn.embedding_lookup(self.adj_info, ids)  #用于根据ids在adj_info中找到各个对应位的向量。
        adj_lists = tf.transpose(tf.random_shuffle(tf.transpose(adj_lists)))
        adj_lists = tf.slice(adj_lists, [0,0], [-1, num_samples])
        return adj_lists

`models.py`

（1）namedtuple 命名元组，可以给tuple命名
例子：

import collections

MyTupleClass = collections.namedtuple('MyTupleClass',['name', 'age', 'job'])
obj = MyTupleClass("Tomsom",12,'Cooker')
print(obj.name)# Tomsom
print(obj.age) # 12
print(obj.job) # Cooker


Person=collections.namedtuple('Person','name age gender')
# 以空格分开，表示这个namedtuple有三个元素

print( 'Type of Person:',type(Person)) # Type of Person: 
Bob=Person(name='Bob',age=30,gender='male') 
print( 'Representation:',Bob) #Representation: Person(name='Bob', age=30, gender='male')
Jane=Person(name='Jane',age=29,gender='female')
print( 'Field by Name:',Jane.name) # Field by Name: Jane
for people in [Bob,Jane]:
    print ("%s is %d years old %s" % people)
# Bob is 30 years old male
# Jane is 29 years old female


# 在使用namedtyuple的时候要注意其中的名称不能使用Python的关键字，如class def等
# 不能有重复的元素名称，比如：不能有两个’age age’。如果出现这些情况，程序会报错。
# 但是，在实际使用的时候可能无法避免这种情况，
# 比如:可能我们的元素名称是从数据库里读出来的记录，这样很难保证一定不会出现Python关键字。
# 这种情况下的解决办法是将namedtuple的重命名模式打开，
# 这样如果遇到Python关键字或者有重复元素名 时，自动进行重命名。

with_class=collections.namedtuple('Person','name age class gender',rename=True)
print(with_class._fields)# ('name', 'age', '_2', 'gender')
two_ages=collections.namedtuple('Person','name age gender age',rename=True)
print(two_ages._fields)# ('name', 'age', 'gender', '_3')

# 使用rename=True的方式打开重命名选项
# 可以看到第一个集合中的class被重命名为 ‘_2' ；
# 第二个集合中重复的age被重命名为 ‘_3'
# namedtuple在重命名的时候使用了下划线 _ 加元素所在索引数的方式进行重命名

（2）class SampleAndAggregate(GeneralizedModel)主要包含的函数

def __init__(self, placeholders, features, adj, degrees, layer_infos, concat=True, aggregator_type=“mean”, model_size=“small”, identity_dim=0, **kwargs)
def sample(self, inputs, layer_infos, batch_size=None)
def aggregate(self, samples, input_features, dims, num_samples, support_sizes, batch_size=None,aggregators=None, name=None, concat=False, model_size=“small”)
def _build(self)
def build(self)
def _loss(self)
def _accuracy(self)

（3）类及其继承关系

      Model 
     /   \
    /     \
  MLP   GeneralizedModel
          /  \
         /    \
Node2VecModel  SampleAndAggregate

其中Model与 GeneralizedModel的区别在于，Model的build()函数中搭建了序列层模型，而在GeneralizedModel中被删去
self.ouput必须在GeneralizedModel的子类build()中被赋值
序列层实现的功能是，给输入，通过layer()返回输出，又将这个输出再次作为输入到下一个layer()中，最终，取最后一层layer的结果作为output

（4）class SampleAndAggregate(GeneralizedModel)

__init__()中self.features的由来

para: features    tf.get_variable()-> identity features
     |                   |
self.features     self.embeds   --> At least one is not None
      \                 /       --> Concat if both are not None 
       \               /
        \             /
         self.features

__init__()中self.dims：
self.dims是一个list, 每一位记录各个神经网络层的维数。
self.dims[0]的值相当于self.features的列数 (0 if features is None else features.shape[1]) + identity_dim)，(注意：括号里features为传入的参数，而非self.features)。
之后各位为各层output_dim，也就是hidden units的个数。
sample(inputs, layer_infos, batch_size=None)
对于sample的算法描述，详见论文Appendix A, algorithm 2。

GraphSAGE NIPS 2017 代码分析（Tensorflow版）_第1张图片

sampler = layer_infos[t].neigh_sampler

当函数被调用时，layer_infos会被赋值，在unsupervised_train.py中，其中neigh_sampler被赋为UniformNeighborSampler，其在neigh_samplers.py中定义：class UniformNeighborSampler(Layer)。

目的是对于输入的samples[k] (即为上一步sample得到的节点，如上图依次得到黄色区域表示的samples[0]，橙色区域表示的samples[1], 粉色区域表示的samples[2]。其中samples[k]是有由对samples[k - 1]中各节点的邻居采样而得)，选取num_samples个数的邻居节点的序号（对应上图N(u)）。（返回值是adj_lists, 即为被截断为num_samples列数的邻接矩阵。）

这里注意区别support_size与num_samples:

num_sample为当前深度每个节点u所选取的邻居节点的个数为num_samples;

support_size表示当前节点u的embedding受多少节点信息的影响。其既受当前层num_samples个直接邻居的影响，其邻居也受更先前深度num_samples个邻居的影响，以此类推。故support_size是到目前深度为止的各深度下num_samples的连乘积。则对于batch_size个输入节点，总的support个数为: support_size * batch_size。

最后将support_size存进support_sizes的数组中。

sample() 函数最终返回包含各深度下采样点的samples数组与各深度下各点受支持节点数目的support_sizes数组。

（5）tf.nn.fixed_unigram_candidate_sampler

按照用户提供的概率分布进行采样。
如果类别服从均匀分布，我们就用uniform_candidate_sampler;
如果词作类别，我们知道词服从 Zipfian, 我们就用 log_uniform_candidate_sampler;
如果能够通过统计或者其他渠道知道类别满足某些分布，用 nn.fixed_unigram_candidate_sampler;
如果实在不知道类别分布，我们还可以用 tf.nn.learned_unigram_candidate_sampler。

(2) Paras:
a. num_sampled:

sampling_candidates的元素是在没有替换（如果unique = True）的情况下绘制的，
或者是从基本分布中替换（如果unique = False）。

unique = True 可以看作无放回抽样；unique = False 可以看作有放回抽样。

b. distortion:

distortion used the word2vec freq energy table formulation
f^(3/4) / total(f^(3/4))
in word2vec energy counted by freq;
in graphsage energy counted by degrees
so in unigrams = [] each ID recored each node's degree

c. unigrams: 

各个节点的度。

(3) Returns:
a. sampled_candidates: A tensor of type int64 and shape [num_sampled]. The sampled classes.
b. true_expected_count: A tensor of type float. Same shape as true_classes. The expected counts under the sampling distribution of each of true_classes.
c. sampled_expected_count: A tensor of type float. Same shape as sampled_candidates. The expected counts under the sampling distribution of each of sampled_candidates.

models.py完整代码

from collections import namedtuple

import tensorflow as tf
import math

import graphsage.layers as layers
import graphsage.metrics as metrics

from .prediction import BipartiteEdgePredLayer
from .aggregators import MeanAggregator, MaxPoolingAggregator, MeanPoolingAggregator, SeqAggregator, GCNAggregator

flags = tf.app.flags
FLAGS = flags.FLAGS

# DISCLAIMER:
# Boilerplate parts of this code file were originally forked from
# https://github.com/tkipf/gcn
# which itself was very inspired by the keras package

class Model(object):
    def __init__(self, **kwargs):
        allowed_kwargs = {'name', 'logging', 'model_size'}
        for kwarg in kwargs.keys():
            assert kwarg in allowed_kwargs, 'Invalid keyword argument: ' + kwarg
        name = kwargs.get('name')
        if not name:
            name = self.__class__.__name__.lower()
        self.name = name

        logging = kwargs.get('logging', False)
        self.logging = logging

        self.vars = {}
        self.placeholders = {}

        self.layers = []
        self.activations = []

        self.inputs = None
        self.outputs = None

        self.loss = 0
        self.accuracy = 0
        self.optimizer = None
        self.opt_op = None

    def _build(self):
        raise NotImplementedError

    def build(self):
        """ Wrapper for _build() """
        with tf.variable_scope(self.name):
            self._build()

        # Build sequential layer model
        self.activations.append(self.inputs)
        for layer in self.layers:
            hidden = layer(self.activations[-1])
            self.activations.append(hidden)
        self.outputs = self.activations[-1]
        # 这部分sequential layer model模型在GeneralizedModel的build()中被删去
        
        
        # Store model variables for easy access
        variables = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope=self.name)
        self.vars = {var.name: var for var in variables}

        # Build metrics
        self._loss()
        self._accuracy()

        self.opt_op = self.optimizer.minimize(self.loss)

    def predict(self):
        pass

    def _loss(self):
        raise NotImplementedError

    def _accuracy(self):
        raise NotImplementedError

    def save(self, sess=None):
        if not sess:
            raise AttributeError("TensorFlow session not provided.")
        saver = tf.train.Saver(self.vars)
        save_path = saver.save(sess, "tmp/%s.ckpt" % self.name)
        print("Model saved in file: %s" % save_path)

    def load(self, sess=None):
        if not sess:
            raise AttributeError("TensorFlow session not provided.")
        saver = tf.train.Saver(self.vars)
        save_path = "tmp/%s.ckpt" % self.name
        saver.restore(sess, save_path)
        print("Model restored from file: %s" % save_path)


class MLP(Model):
    """ A standard multi-layer perceptron """
    def __init__(self, placeholders, dims, categorical=True, **kwargs):
        super(MLP, self).__init__(**kwargs)

        self.dims = dims
        self.input_dim = dims[0]
        self.output_dim = dims[-1]
        self.placeholders = placeholders
        self.categorical = categorical

        self.inputs = placeholders['features']
        self.labels = placeholders['labels']

        self.optimizer = tf.train.AdamOptimizer(learning_rate=FLAGS.learning_rate)

        self.build()

    def _loss(self):
        # Weight decay loss
        for var in self.layers[0].vars.values():
            self.loss += FLAGS.weight_decay * tf.nn.l2_loss(var)

        # Cross entropy error
        if self.categorical:
            self.loss += metrics.masked_softmax_cross_entropy(self.outputs, self.placeholders['labels'],
                    self.placeholders['labels_mask'])
        # L2
        else:
            diff = self.labels - self.outputs
            self.loss += tf.reduce_sum(tf.sqrt(tf.reduce_sum(diff * diff, axis=1)))

    def _accuracy(self):
        if self.categorical:
            self.accuracy = metrics.masked_accuracy(self.outputs, self.placeholders['labels'],
                    self.placeholders['labels_mask'])

    def _build(self):
        self.layers.append(layers.Dense(input_dim=self.input_dim,
                                 output_dim=self.dims[1],
                                 act=tf.nn.relu,
                                 dropout=self.placeholders['dropout'],
                                 sparse_inputs=False,
                                 logging=self.logging))

        self.layers.append(layers.Dense(input_dim=self.dims[1],
                                 output_dim=self.output_dim,
                                 act=lambda x: x,
                                 dropout=self.placeholders['dropout'],
                                 logging=self.logging))

    def predict(self):
        return tf.nn.softmax(self.outputs)

class GeneralizedModel(Model):
    """
    Base class for models that aren't constructed from traditional, sequential layers.
    Subclasses must set self.outputs in _build method

    (Removes the layers idiom from build method of the Model class)
    """

    def __init__(self, **kwargs):
        super(GeneralizedModel, self).__init__(**kwargs)
        

    def build(self):
        """ Wrapper for _build() """
        with tf.variable_scope(self.name):
            self._build()

        # Store model variables for easy access
        variables = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope=self.name)
        self.vars = {var.name: var for var in variables}

        # Build metrics
        self._loss()
        self._accuracy()

        self.opt_op = self.optimizer.minimize(self.loss)

# SAGEInfo is a namedtuple that specifies the parameters of the recursive GraphSAGE layers
SAGEInfo = namedtuple("SAGEInfo",
    ['layer_name', # name of the layer (to get feature embedding etc.)
     'neigh_sampler', # callable neigh_sampler constructor
     'num_samples',
     'output_dim' # the output (i.e., hidden) dimension
    ])

class SampleAndAggregate(GeneralizedModel):
    """
    Base implementation of unsupervised GraphSAGE
    """

    def __init__(self, placeholders, features, adj, degrees,
            layer_infos, concat=True, aggregator_type="mean", 
            model_size="small", identity_dim=0,
            **kwargs):
        '''
        Args:
            - placeholders: Stanford TensorFlow placeholder object.
            - features: Numpy array with node features. 
                        NOTE: Pass a None object to train in featureless mode (identity features for nodes)!
            - adj: Numpy array with adjacency lists (padded with random re-samples)
            - degrees: Numpy array with node degrees. 
            - layer_infos: List of SAGEInfo namedtuples that describe the parameters of all 
                   the recursive layers. See SAGEInfo definition above.
            - concat: whether to concatenate during recursive iterations
            - aggregator_type: how to aggregate neighbor information
            - model_size: one of "small" and "big"
            - identity_dim: Set to positive int to use identity features (slow and cannot generalize, but better accuracy)
        '''
        super(SampleAndAggregate, self).__init__(**kwargs)
        if aggregator_type == "mean":
            self.aggregator_cls = MeanAggregator
        elif aggregator_type == "seq":
            self.aggregator_cls = SeqAggregator
        elif aggregator_type == "maxpool":
            self.aggregator_cls = MaxPoolingAggregator
        elif aggregator_type == "meanpool":
            self.aggregator_cls = MeanPoolingAggregator
        elif aggregator_type == "gcn":
            self.aggregator_cls = GCNAggregator
        else:
            raise Exception("Unknown aggregator: ", self.aggregator_cls)

        # get info from placeholders...
        self.inputs1 = placeholders["batch1"]
        self.inputs2 = placeholders["batch2"]
        self.model_size = model_size
        self.adj_info = adj
        if identity_dim > 0:
           self.embeds = tf.get_variable("node_embeddings", [adj.get_shape().as_list()[0], identity_dim])
        else:
           self.embeds = None
        if features is None: 
            if identity_dim == 0:
                raise Exception("Must have a positive value for identity feature dimension if no input features given.")
            self.features = self.embeds
        else:
            self.features = tf.Variable(tf.constant(features, dtype=tf.float32), trainable=False)
            if not self.embeds is None:
                self.features = tf.concat([self.embeds, self.features], axis=1)
        self.degrees = degrees
        self.concat = concat

        self.dims = [(0 if features is None else features.shape[1]) + identity_dim]
        self.dims.extend([layer_infos[i].output_dim for i in range(len(layer_infos))])
        self.batch_size = placeholders["batch_size"]
        self.placeholders = placeholders
        self.layer_infos = layer_infos

        self.optimizer = tf.train.AdamOptimizer(learning_rate=FLAGS.learning_rate)

        self.build()

    def sample(self, inputs, layer_infos, batch_size=None):
        """ Sample neighbors to be the supportive fields for multi-layer convolutions.

        Args:
            inputs: batch inputs
            batch_size: the number of inputs (different for batch inputs and negative samples).
        """
        
        if batch_size is None:
            batch_size = self.batch_size
        samples = [inputs]
        # size of convolution support at each layer per node
        support_size = 1
        support_sizes = [support_size]
        for k in range(len(layer_infos)):
            t = len(layer_infos) - k - 1
            support_size *= layer_infos[t].num_samples
            sampler = layer_infos[t].neigh_sampler
            node = sampler((samples[k], layer_infos[t].num_samples))
            samples.append(tf.reshape(node, [support_size * batch_size,]))
            support_sizes.append(support_size)
        return samples, support_sizes


    def aggregate(self, samples, input_features, dims, num_samples, support_sizes, batch_size=None,
            aggregators=None, name=None, concat=False, model_size="small"):
        """ At each layer, aggregate hidden representations of neighbors to compute the hidden representations 
            at next layer.
        Args:
            samples: a list of samples of variable hops away for convolving at each layer of the
                network. Length is the number of layers + 1. Each is a vector of node indices.
            input_features: the input features for each sample of various hops away.
            dims: a list of dimensions of the hidden representations from the input layer to the
                final layer. Length is the number of layers + 1.
            num_samples: list of number of samples for each layer.
            support_sizes: the number of nodes to gather information from for each layer.
            batch_size: the number of inputs (different for batch inputs and negative samples).
        Returns:
            The hidden representation at the final layer for all nodes in batch
        """

        if batch_size is None:
            batch_size = self.batch_size

        # length: number of layers + 1
        hidden = [tf.nn.embedding_lookup(input_features, node_samples) for node_samples in samples]
        new_agg = aggregators is None
        if new_agg:
            aggregators = []
        for layer in range(len(num_samples)):
            if new_agg:
                dim_mult = 2 if concat and (layer != 0) else 1
                # aggregator at current layer
                if layer == len(num_samples) - 1:
                    aggregator = self.aggregator_cls(dim_mult*dims[layer], dims[layer+1], act=lambda x : x,
                            dropout=self.placeholders['dropout'], 
                            name=name, concat=concat, model_size=model_size)
                else:
                    aggregator = self.aggregator_cls(dim_mult*dims[layer], dims[layer+1],
                            dropout=self.placeholders['dropout'], 
                            name=name, concat=concat, model_size=model_size)
                aggregators.append(aggregator)
            else:
                aggregator = aggregators[layer]
            # hidden representation at current layer for all support nodes that are various hops away
            next_hidden = []
            # as layer increases, the number of support nodes needed decreases
            for hop in range(len(num_samples) - layer):
                dim_mult = 2 if concat and (layer != 0) else 1
                neigh_dims = [batch_size * support_sizes[hop], 
                              num_samples[len(num_samples) - hop - 1], 
                              dim_mult*dims[layer]]
                h = aggregator((hidden[hop],
                                tf.reshape(hidden[hop + 1], neigh_dims)))
                next_hidden.append(h)
            hidden = next_hidden
        return hidden[0], aggregators

    def _build(self):
        labels = tf.reshape(
                tf.cast(self.placeholders['batch2'], dtype=tf.int64),
                [self.batch_size, 1])
        self.neg_samples, _, _ = (tf.nn.fixed_unigram_candidate_sampler(
            true_classes=labels,
            num_true=1,
            num_sampled=FLAGS.neg_sample_size,
            unique=False,
            range_max=len(self.degrees),
            distortion=0.75,
            unigrams=self.degrees.tolist()))

           
        # perform "convolution"
        samples1, support_sizes1 = self.sample(self.inputs1, self.layer_infos)
        samples2, support_sizes2 = self.sample(self.inputs2, self.layer_infos)
        num_samples = [layer_info.num_samples for layer_info in self.layer_infos]
        self.outputs1, self.aggregators = self.aggregate(samples1, [self.features], self.dims, num_samples,
                support_sizes1, concat=self.concat, model_size=self.model_size)
        self.outputs2, _ = self.aggregate(samples2, [self.features], self.dims, num_samples,
                support_sizes2, aggregators=self.aggregators, concat=self.concat,
                model_size=self.model_size)

        neg_samples, neg_support_sizes = self.sample(self.neg_samples, self.layer_infos,
            FLAGS.neg_sample_size)
        self.neg_outputs, _ = self.aggregate(neg_samples, [self.features], self.dims, num_samples,
                neg_support_sizes, batch_size=FLAGS.neg_sample_size, aggregators=self.aggregators,
                concat=self.concat, model_size=self.model_size)

        dim_mult = 2 if self.concat else 1
        self.link_pred_layer = BipartiteEdgePredLayer(dim_mult*self.dims[-1],
                dim_mult*self.dims[-1], self.placeholders, act=tf.nn.sigmoid, 
                bilinear_weights=False,
                name='edge_predict')

        self.outputs1 = tf.nn.l2_normalize(self.outputs1, 1)
        self.outputs2 = tf.nn.l2_normalize(self.outputs2, 1)
        self.neg_outputs = tf.nn.l2_normalize(self.neg_outputs, 1)

    def build(self):
        self._build()

        # TF graph management
        self._loss()
        self._accuracy()
        self.loss = self.loss / tf.cast(self.batch_size, tf.float32)
        grads_and_vars = self.optimizer.compute_gradients(self.loss)
        clipped_grads_and_vars = [(tf.clip_by_value(grad, -5.0, 5.0) if grad is not None else None, var) 
                for grad, var in grads_and_vars]
        self.grad, _ = clipped_grads_and_vars[0]
        self.opt_op = self.optimizer.apply_gradients(clipped_grads_and_vars)

    def _loss(self):
        for aggregator in self.aggregators:
            for var in aggregator.vars.values():
                self.loss += FLAGS.weight_decay * tf.nn.l2_loss(var)

        self.loss += self.link_pred_layer.loss(self.outputs1, self.outputs2, self.neg_outputs) 
        tf.summary.scalar('loss', self.loss)

    def _accuracy(self):
        # shape: [batch_size]
        aff = self.link_pred_layer.affinity(self.outputs1, self.outputs2)
        # shape : [batch_size x num_neg_samples]
        self.neg_aff = self.link_pred_layer.neg_cost(self.outputs1, self.neg_outputs)
        self.neg_aff = tf.reshape(self.neg_aff, [self.batch_size, FLAGS.neg_sample_size])
        _aff = tf.expand_dims(aff, axis=1)
        self.aff_all = tf.concat(axis=1, values=[self.neg_aff, _aff])
        size = tf.shape(self.aff_all)[1]
        _, indices_of_ranks = tf.nn.top_k(self.aff_all, k=size)
        _, self.ranks = tf.nn.top_k(-indices_of_ranks, k=size)
        self.mrr = tf.reduce_mean(tf.div(1.0, tf.cast(self.ranks[:, -1] + 1, tf.float32)))
        tf.summary.scalar('mrr', self.mrr)


class Node2VecModel(GeneralizedModel):
    def __init__(self, placeholders, dict_size, degrees, name=None,
                 nodevec_dim=50, lr=0.001, **kwargs):
        """ Simple version of Node2Vec/DeepWalk algorithm.

        Args:
            dict_size: the total number of nodes.
            degrees: numpy array of node degrees, ordered as in the data's id_map
            nodevec_dim: dimension of the vector representation of node.
            lr: learning rate of optimizer.
        """

        super(Node2VecModel, self).__init__(**kwargs)

        self.placeholders = placeholders
        self.degrees = degrees
        self.inputs1 = placeholders["batch1"]
        self.inputs2 = placeholders["batch2"]

        self.batch_size = placeholders['batch_size']
        self.hidden_dim = nodevec_dim

        # following the tensorflow word2vec tutorial
        self.target_embeds = tf.Variable(
                tf.random_uniform([dict_size, nodevec_dim], -1, 1),
                name="target_embeds")
        self.context_embeds = tf.Variable(
                tf.truncated_normal([dict_size, nodevec_dim],
                stddev=1.0 / math.sqrt(nodevec_dim)),
                name="context_embeds")
        self.context_bias = tf.Variable(
                tf.zeros([dict_size]),
                name="context_bias")

        self.optimizer = tf.train.GradientDescentOptimizer(learning_rate=lr)

        self.build()

    def _build(self):
        labels = tf.reshape(
                tf.cast(self.placeholders['batch2'], dtype=tf.int64),
                [self.batch_size, 1])
        self.neg_samples, _, _ = (tf.nn.fixed_unigram_candidate_sampler(
            true_classes=labels,
            num_true=1,
            num_sampled=FLAGS.neg_sample_size,
            unique=True,
            range_max=len(self.degrees),
            distortion=0.75,
            unigrams=self.degrees.tolist()))

        self.outputs1 = tf.nn.embedding_lookup(self.target_embeds, self.inputs1)
        self.outputs2 = tf.nn.embedding_lookup(self.context_embeds, self.inputs2)
        self.outputs2_bias = tf.nn.embedding_lookup(self.context_bias, self.inputs2)
        self.neg_outputs = tf.nn.embedding_lookup(self.context_embeds, self.neg_samples)
        self.neg_outputs_bias = tf.nn.embedding_lookup(self.context_bias, self.neg_samples)

        self.link_pred_layer = BipartiteEdgePredLayer(self.hidden_dim, self.hidden_dim,
                self.placeholders, bilinear_weights=False)

    def build(self):
        self._build()
        # TF graph management
        self._loss()
        self._minimize()
        self._accuracy()

    def _minimize(self):
        self.opt_op = self.optimizer.minimize(self.loss)

    def _loss(self):
        aff = tf.reduce_sum(tf.multiply(self.outputs1, self.outputs2), 1) + self.outputs2_bias
        neg_aff = tf.matmul(self.outputs1, tf.transpose(self.neg_outputs)) + self.neg_outputs_bias
        true_xent = tf.nn.sigmoid_cross_entropy_with_logits(
                labels=tf.ones_like(aff), logits=aff)
        negative_xent = tf.nn.sigmoid_cross_entropy_with_logits(
                labels=tf.zeros_like(neg_aff), logits=neg_aff)
        loss = tf.reduce_sum(true_xent) + tf.reduce_sum(negative_xent)
        self.loss = loss / tf.cast(self.batch_size, tf.float32)
        tf.summary.scalar('loss', self.loss)
        
    def _accuracy(self):
        # shape: [batch_size]
        aff = self.link_pred_layer.affinity(self.outputs1, self.outputs2)
       # shape : [batch_size x num_neg_samples]
        self.neg_aff = self.link_pred_layer.neg_cost(self.outputs1, self.neg_outputs)
        self.neg_aff = tf.reshape(self.neg_aff, [self.batch_size, FLAGS.neg_sample_size])
        _aff = tf.expand_dims(aff, axis=1)
        self.aff_all = tf.concat(axis=1, values=[self.neg_aff, _aff])
        size = tf.shape(self.aff_all)[1]
        _, indices_of_ranks = tf.nn.top_k(self.aff_all, k=size)
        _, self.ranks = tf.nn.top_k(-indices_of_ranks, k=size)
        self.mrr = tf.reduce_mean(tf.div(1.0, tf.cast(self.ranks[:, -1] + 1, tf.float32)))
        tf.summary.scalar('mrr', self.mrr)

`layers.py`

（1）_LAYER_UIDS = {}
_LAYER_UIDS = {} 是记录layer及其出现次数的字典
在class Layer中，当未赋variable scope的name时，通过实例化Layer的次数来标定不同的layer_id
例子：

class Layer():
    def __init__(self):
        layer = self.__class__.__name__
        name = layer + '_' + str(get_layer_uid(layer))
        print(name) 

layer1 = Layer()
layer2 = Layer()

# Output:
# Layer_1
# Layer_2

（2）class Layer
class Layer主要定义基本的层的API。
方法：

__init__(): 获取传入的name, logging, model_size参数。初始化实例变量name, vars{}, logging, sparse_inputs
_call(inputs): 定义层的计算图：获取input, 返回output
__call__(inputs): 相当于_call()的装饰器，在实现列_call()基本功能后，丰富了其功能，这里主要通过tf.summary.histogram() 可以查看inputs与outputs分布情况的直方图
_log_vars(): 记录所有变量。实现时主要将vars中的各个变量以直方图形式显示

（2）class Dense
Dense layer主要用于实现全连接层的基本功能。即为了最终得到 Relu(Wx + b)。

__init__(): 用于获取初始化成员变量。其中num_features_nonzero和featureless的作用目前还不清楚
_call(): 用于实现并且返回Relu(Wx + b)

layers.py完整代码

from __future__ import division
from __future__ import print_function

import tensorflow as tf

from graphsage.inits import zeros

flags = tf.app.flags
FLAGS = flags.FLAGS

# DISCLAIMER:
# Boilerplate parts of this code file were originally forked from
# https://github.com/tkipf/gcn
# which itself was very inspired by the keras package

# global unique layer ID dictionary for layer name assignment
_LAYER_UIDS = {} #记录layer及其出现次数的字典

#作用： 在class Layer中，当未赋variable scope的name时，通过实例化Layer的次数来标定不同的layer_id
def get_layer_uid(layer_name=''):
    """Helper function, assigns unique layer IDs."""
    if layer_name not in _LAYER_UIDS:
        _LAYER_UIDS[layer_name] = 1  #若layer_name从未出现过，如今出现了，则将_LAYER_UIDS[layer_name]设为1；否则累加
        return 1
    else:
        _LAYER_UIDS[layer_name] += 1
        return _LAYER_UIDS[layer_name]

class Layer(object):
    """Base layer class. Defines basic API for all layer objects.
    Implementation inspired by keras (http://keras.io).
    # Properties
        name: String, defines the variable scope of the layer.
        logging: Boolean, switches Tensorflow histogram logging on/off

    # Methods
        _call(inputs): Defines computation graph of layer
            (i.e. takes input, returns output)
        __call__(inputs): Wrapper for _call()
        _log_vars(): Log all variables
    """

    def __init__(self, **kwargs):
        allowed_kwargs = {'name', 'logging', 'model_size'}
        for kwarg in kwargs.keys():
            assert kwarg in allowed_kwargs, 'Invalid keyword argument: ' + kwarg
        name = kwargs.get('name')
        if not name:
            layer = self.__class__.__name__.lower()
            name = layer + '_' + str(get_layer_uid(layer))
        self.name = name
        self.vars = {}
        logging = kwargs.get('logging', False)
        self.logging = logging
        self.sparse_inputs = False


    def _call(self, inputs):
        return inputs

    #重写了__call__ python内置函数，可以把对象当函数调用,__call__输入为inputs，输出为outputs
    def __call__(self, inputs):
        with tf.name_scope(self.name):
            if self.logging and not self.sparse_inputs:
                tf.summary.histogram(self.name + '/inputs', inputs)
            outputs = self._call(inputs) #在子类也可以把对象当作函数调用，即调用__call__时也会自动调用_call()函数
            if self.logging:
                tf.summary.histogram(self.name + '/outputs', outputs)
            return outputs

    def _log_vars(self):
        for var in self.vars:
            tf.summary.histogram(self.name + '/vars/' + var, self.vars[var])


class Dense(Layer):
    """Dense layer."""
    def __init__(self, input_dim, output_dim, dropout=0., 
                 act=tf.nn.relu, placeholders=None, bias=True, featureless=False, 
                 sparse_inputs=False, **kwargs):
        super(Dense, self).__init__(**kwargs)

        self.dropout = dropout

        self.act = act
        self.featureless = featureless
        self.bias = bias
        self.input_dim = input_dim
        self.output_dim = output_dim

        # helper variable for sparse dropout
        self.sparse_inputs = sparse_inputs
        if sparse_inputs:
            self.num_features_nonzero = placeholders['num_features_nonzero']

        with tf.variable_scope(self.name + '_vars'):
            self.vars['weights'] = tf.get_variable('weights', shape=(input_dim, output_dim),
                                         dtype=tf.float32, 
                                         initializer=tf.contrib.layers.xavier_initializer(),
                                         regularizer=tf.contrib.layers.l2_regularizer(FLAGS.weight_decay))
            if self.bias:
                self.vars['bias'] = zeros([output_dim], name='bias')

        if self.logging:
            self._log_vars()

    def _call(self, inputs):
        x = inputs

        x = tf.nn.dropout(x, 1-self.dropout)

        # transform
        output = tf.matmul(x, self.vars['weights'])

        # bias
        if self.bias:
            output += self.vars['bias']

        return self.act(output)

`minibatch.py`

minibatch.py完整代码

from __future__ import division
from __future__ import print_function

import numpy as np

np.random.seed(123)

class EdgeMinibatchIterator(object):
    
    """ This minibatch iterator iterates over batches of sampled edges or
    random pairs of co-occuring edges.

    G -- networkx graph
    id2idx -- dict mapping node ids to index in feature tensor
    placeholders -- tensorflow placeholders object
    context_pairs -- if not none, then a list of co-occuring node pairs (from random walks)
    batch_size -- size of the minibatches
    max_degree -- maximum size of the downsampled adjacency lists
    n2v_retrain -- signals that the iterator is being used to add new embeddings to a n2v model
    fixed_n2v -- signals that the iterator is being used to retrain n2v with only existing nodes as context
    """
    def __init__(self, G, id2idx, 
            placeholders, context_pairs=None, batch_size=100, max_degree=25,
            n2v_retrain=False, fixed_n2v=False,
            **kwargs):

        self.G = G
        self.nodes = G.nodes()
        self.id2idx = id2idx
        self.placeholders = placeholders
        self.batch_size = batch_size
        self.max_degree = max_degree
        self.batch_num = 0

        ## 函数shuffle与permutation都是对原来的数组进行重新洗牌,即随机打乱原来的元素顺序
        # permutation不直接在原来的数组上进行操作，而是返回一个新的打乱顺序的数组，并不改变原来的数组。
        self.nodes = np.random.permutation(G.nodes())
        self.adj, self.deg = self.construct_adj()
        self.test_adj = self.construct_test_adj()
        if context_pairs is None:
            edges = G.edges()
        else:
            edges = context_pairs
        self.train_edges = self.edges = np.random.permutation(edges)
        if not n2v_retrain:
            self.train_edges = self._remove_isolated(self.train_edges)
            self.val_edges = [e for e in G.edges() if G[e[0]][e[1]]['train_removed']]
        else:
            if fixed_n2v:
                self.train_edges = self.val_edges = self._n2v_prune(self.edges)
            else:
                self.train_edges = self.val_edges = self.edges

        print(len([n for n in G.nodes() if not G.node[n]['test'] and not G.node[n]['val']]), 'train nodes')
        print(len([n for n in G.nodes() if G.node[n]['test'] or G.node[n]['val']]), 'test nodes')
        self.val_set_size = len(self.val_edges)

    def _n2v_prune(self, edges):
        is_val = lambda n : self.G.node[n]["val"] or self.G.node[n]["test"]
        return [e for e in edges if not is_val(e[1])]

    def _remove_isolated(self, edge_list):
        new_edge_list = []
        missing = 0
        for n1, n2 in edge_list:
            if not n1 in self.G.node or not n2 in self.G.node:
                missing += 1
                continue
            if (self.deg[self.id2idx[n1]] == 0 or self.deg[self.id2idx[n2]] == 0) \
                    and (not self.G.node[n1]['test'] or self.G.node[n1]['val']) \
                    and (not self.G.node[n2]['test'] or self.G.node[n2]['val']):
                continue
            else:
                new_edge_list.append((n1,n2))
        print("Unexpected missing:", missing)
        return new_edge_list

    def construct_adj(self):
        adj = len(self.id2idx)*np.ones((len(self.id2idx)+1, self.max_degree))
        deg = np.zeros((len(self.id2idx),))

        for nodeid in self.G.nodes():
            if self.G.node[nodeid]['test'] or self.G.node[nodeid]['val']:
                continue
            neighbors = np.array([self.id2idx[neighbor] 
                for neighbor in self.G.neighbors(nodeid)
                if (not self.G[nodeid][neighbor]['train_removed'])])
            deg[self.id2idx[nodeid]] = len(neighbors)
            if len(neighbors) == 0:
                continue
            if len(neighbors) > self.max_degree:
                neighbors = np.random.choice(neighbors, self.max_degree, replace=False)
            elif len(neighbors) < self.max_degree:
                neighbors = np.random.choice(neighbors, self.max_degree, replace=True)
            adj[self.id2idx[nodeid], :] = neighbors
        return adj, deg

    def construct_test_adj(self):
        adj = len(self.id2idx)*np.ones((len(self.id2idx)+1, self.max_degree))
        for nodeid in self.G.nodes():
            neighbors = np.array([self.id2idx[neighbor] 
                for neighbor in self.G.neighbors(nodeid)])
            if len(neighbors) == 0:
                continue
            if len(neighbors) > self.max_degree:
                neighbors = np.random.choice(neighbors, self.max_degree, replace=False)
            elif len(neighbors) < self.max_degree:
                neighbors = np.random.choice(neighbors, self.max_degree, replace=True)
            adj[self.id2idx[nodeid], :] = neighbors
        return adj

    def end(self):
        return self.batch_num * self.batch_size >= len(self.train_edges)

    def batch_feed_dict(self, batch_edges):
        batch1 = []
        batch2 = []
        for node1, node2 in batch_edges:
            batch1.append(self.id2idx[node1])
            batch2.append(self.id2idx[node2])

        feed_dict = dict()
        feed_dict.update({self.placeholders['batch_size'] : len(batch_edges)})
        feed_dict.update({self.placeholders['batch1']: batch1})
        feed_dict.update({self.placeholders['batch2']: batch2})

        return feed_dict

    #函数中获取下个edgeminibatch的起始与终止序号，将batch后的边的信息传给batch_feed_dict(self, batch_edges)函数，更新placeholders中的batch1, batch2, batch_size信息
    def next_minibatch_feed_dict(self):
        start_idx = self.batch_num * self.batch_size
        self.batch_num += 1
        end_idx = min(start_idx + self.batch_size, len(self.train_edges))
        batch_edges = self.train_edges[start_idx : end_idx]
        return self.batch_feed_dict(batch_edges)

    def num_training_batches(self):
        return len(self.train_edges) // self.batch_size + 1

    def val_feed_dict(self, size=None):
        edge_list = self.val_edges
        if size is None:
            return self.batch_feed_dict(edge_list)
        else:
            ind = np.random.permutation(len(edge_list))
            val_edges = [edge_list[i] for i in ind[:min(size, len(ind))]]
            return self.batch_feed_dict(val_edges)

    def incremental_val_feed_dict(self, size, iter_num):
        edge_list = self.val_edges
        val_edges = edge_list[iter_num*size:min((iter_num+1)*size, 
            len(edge_list))]
        return self.batch_feed_dict(val_edges), (iter_num+1)*size >= len(self.val_edges), val_edges

    def incremental_embed_feed_dict(self, size, iter_num):
        node_list = self.nodes
        val_nodes = node_list[iter_num*size:min((iter_num+1)*size, 
            len(node_list))]
        val_edges = [(n,n) for n in val_nodes]
        return self.batch_feed_dict(val_edges), (iter_num+1)*size >= len(node_list), val_edges

    def label_val(self):
        train_edges = []
        val_edges = []
        for n1, n2 in self.G.edges():
            if (self.G.node[n1]['val'] or self.G.node[n1]['test'] 
                    or self.G.node[n2]['val'] or self.G.node[n2]['test']):
                val_edges.append((n1,n2))
            else:
                train_edges.append((n1,n2))
        return train_edges, val_edges

    # shuffle直接在原来的数组上进行操作，改变原来数组的顺序，无返回值
    def shuffle(self):
        """ Re-shuffle the training set.
            Also reset the batch number.
        """
        self.train_edges = np.random.permutation(self.train_edges)
        self.nodes = np.random.permutation(self.nodes)
        self.batch_num = 0

class NodeMinibatchIterator(object):
    
    """ 
    This minibatch iterator iterates over nodes for supervised learning.

    G -- networkx graph
    id2idx -- dict mapping node ids to integer values indexing feature tensor
    placeholders -- standard tensorflow placeholders object for feeding
    label_map -- map from node ids to class values (integer or list)
    num_classes -- number of output classes
    batch_size -- size of the minibatches
    max_degree -- maximum size of the downsampled adjacency lists
    """
    def __init__(self, G, id2idx, 
            placeholders, label_map, num_classes, 
            batch_size=100, max_degree=25,
            **kwargs):

        self.G = G
        self.nodes = G.nodes()
        self.id2idx = id2idx
        self.placeholders = placeholders
        self.batch_size = batch_size
        self.max_degree = max_degree
        self.batch_num = 0
        self.label_map = label_map
        self.num_classes = num_classes

        self.adj, self.deg = self.construct_adj()
        self.test_adj = self.construct_test_adj()

        self.val_nodes = [n for n in self.G.nodes() if self.G.node[n]['val']]
        self.test_nodes = [n for n in self.G.nodes() if self.G.node[n]['test']]

        self.no_train_nodes_set = set(self.val_nodes + self.test_nodes)
        self.train_nodes = set(G.nodes()).difference(self.no_train_nodes_set)
        # don't train on nodes that only have edges to test set
        self.train_nodes = [n for n in self.train_nodes if self.deg[id2idx[n]] > 0]

    def _make_label_vec(self, node):
        label = self.label_map[node]
        if isinstance(label, list):
            label_vec = np.array(label)
        else:
            label_vec = np.zeros((self.num_classes))
            class_ind = self.label_map[node]
            label_vec[class_ind] = 1
        return label_vec

    def construct_adj(self):
        # 该矩阵记录训练数据中各节点的邻居节点的编号
        # 采样只取max_degree个邻居节点，采样方法见下
        # 同样进行了行数加一操作
        adj = len(self.id2idx)*np.ones((len(self.id2idx)+1, self.max_degree))

        # 该矩阵记录了每个节点的度数
        deg = np.zeros((len(self.id2idx),))

        for nodeid in self.G.nodes():
            # 在选取邻居节点时进行了筛选，对于G.neighbors(nodeid) 点node的邻居，
            # 只取该node与neighbor相连的边的train_removed = False的neighbor
            # 也就是只取不是val, test的节点
            # neighbors得到了邻居节点编号数列
            if self.G.node[nodeid]['test'] or self.G.node[nodeid]['val']:
                continue
            neighbors = np.array([self.id2idx[neighbor] 
                for neighbor in self.G.neighbors(nodeid)
                if (not self.G[nodeid][neighbor]['train_removed'])])


            # deg各位取值为该位对应nodeid的节点的度数，
            # 也即经过上面筛选后得到的邻居数
            deg[self.id2idx[nodeid]] = len(neighbors)
            if len(neighbors) == 0:
                continue
            if len(neighbors) > self.max_degree:                # np.random.choice为选取size大小的数列
                neighbors = np.random.choice(neighbors, self.max_degree, replace=False)
            elif len(neighbors) < self.max_degree:
                # 经过choice随机选取，得到了固定大小max_degree = 25的直接相连的邻居数列
                neighbors = np.random.choice(neighbors, self.max_degree, replace=True)


            # 把该node的邻居数列，赋值给adj矩阵中对应nodeid位的向量
            adj[self.id2idx[nodeid], :] = neighbors
        return adj, deg

    #在construct_test_adj()  函数中，与上不同之处在于，可以直接得到邻居而无需根据val/test/train_removed筛选
    def construct_test_adj(self):
        adj = len(self.id2idx)*np.ones((len(self.id2idx)+1, self.max_degree))
        for nodeid in self.G.nodes():
            neighbors = np.array([self.id2idx[neighbor] 
                for neighbor in self.G.neighbors(nodeid)])
            if len(neighbors) == 0:
                continue
            if len(neighbors) > self.max_degree:
                neighbors = np.random.choice(neighbors, self.max_degree, replace=False)
            elif len(neighbors) < self.max_degree:
                neighbors = np.random.choice(neighbors, self.max_degree, replace=True)
            adj[self.id2idx[nodeid], :] = neighbors
        return adj

    def end(self):
        return self.batch_num * self.batch_size >= len(self.train_nodes)

    #也即next_minibatch_feed_dict()返回的是下一个edge minibatch的placeholders信息
    def batch_feed_dict(self, batch_nodes, val=False):
        batch1id = batch_nodes
        batch1 = [self.id2idx[n] for n in batch1id]
              
        labels = np.vstack([self._make_label_vec(node) for node in batch1id])
        feed_dict = dict()
        feed_dict.update({self.placeholders['batch_size'] : len(batch1)})
        feed_dict.update({self.placeholders['batch']: batch1})
        feed_dict.update({self.placeholders['labels']: labels})

        return feed_dict, labels

    def node_val_feed_dict(self, size=None, test=False):
        if test:
            val_nodes = self.test_nodes
        else:
            val_nodes = self.val_nodes
        if not size is None:
            val_nodes = np.random.choice(val_nodes, size, replace=True)
        # add a dummy neighbor
        ret_val = self.batch_feed_dict(val_nodes)
        return ret_val[0], ret_val[1]

    def incremental_node_val_feed_dict(self, size, iter_num, test=False):
        if test:
            val_nodes = self.test_nodes
        else:
            val_nodes = self.val_nodes
        val_node_subset = val_nodes[iter_num*size:min((iter_num+1)*size, 
            len(val_nodes))]

        # add a dummy neighbor
        ret_val = self.batch_feed_dict(val_node_subset)
        return ret_val[0], ret_val[1], (iter_num+1)*size >= len(val_nodes), val_node_subset

    def num_training_batches(self):
        return len(self.train_nodes) // self.batch_size + 1

    def next_minibatch_feed_dict(self):
        start_idx = self.batch_num * self.batch_size
        self.batch_num += 1
        end_idx = min(start_idx + self.batch_size, len(self.train_nodes))
        batch_nodes = self.train_nodes[start_idx : end_idx]
        return self.batch_feed_dict(batch_nodes)

    def incremental_embed_feed_dict(self, size, iter_num):
        node_list = self.nodes
        val_nodes = node_list[iter_num*size:min((iter_num+1)*size, 
            len(node_list))]
        return self.batch_feed_dict(val_nodes), (iter_num+1)*size >= len(node_list), val_nodes

    def shuffle(self):
        """ Re-shuffle the training set.
            Also reset the batch number.
        """
        self.train_nodes = np.random.permutation(self.train_nodes)
        self.batch_num = 0

`aggregators.py`

该类主要用于实现:
$\mathbf{h}^k_v \leftarrow \sigma(\mathbf{W} \cdot \mathrm{MEAN}(\lbrace \mathbf{h}^{k-1}_v \rbrace \cup \lbrace \mathbf{h}^{k-1}_u, \forall u \in \mathcal{N}(v) \rbrace))$

（1）class GCNAggregator(Layer)
这里__init__()与MeanAggregator基本相同，在_call()的实现中略有不同

def _call(self, inputs):
    self_vecs, neigh_vecs = inputs

    neigh_vecs = tf.nn.dropout(neigh_vecs, 1-self.dropout)
    self_vecs = tf.nn.dropout(self_vecs, 1-self.dropout)
    means = tf.reduce_mean(tf.concat([neigh_vecs, 
        tf.expand_dims(self_vecs, axis=1)], axis=1), axis=1)
   
    # [nodes] x [out_dim]
    output = tf.matmul(means, self.vars['weights'])

    # bias
    if self.bias:
        output += self.vars['bias']
   
    return self.act(output)

其中对means求解时：

先将self_vecs行列转换(tf.expand_dims(self_vecs, axis=1)),
之后self_vecs的行数与neigh_vecs行数相同时，将二者concat,即相当于在原先的neigh_vecs矩阵后面新增一列self_vecs的转置
最后将得到的矩阵每行求均值，即得means
之后means与权值矩阵vars[‘weights’]求内积，并加上vars[‘bias’], 最终将该值带入激活函数(ReLu)。

下面举个例子简单说明(例子中省略了点乘W的操作)：

import tensorflow as tf

neigh_vecs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
self_vecs = [2, 3, 4]

means = tf.reduce_mean(tf.concat([neigh_vecs,
                                  tf.expand_dims(self_vecs, axis=1)], axis=1), axis=1)

print(tf.shape(self_vecs))

print(tf.expand_dims(self_vecs, axis=0))
# Tensor("ExpandDims_1:0", shape=(1, 3), dtype=int32)

print(tf.expand_dims(self_vecs, axis=1))
# Tensor("ExpandDims_2:0", shape=(3, 1), dtype=int32)

sess = tf.Session()
print(sess.run(tf.expand_dims(self_vecs, axis=1)))
# [[2]
#  [3]
#  [4]]

print(sess.run(tf.concat([neigh_vecs,
                          tf.expand_dims(self_vecs, axis=1)], axis=1)))
# [[1 2 3 2]
#  [4 5 6 3]
#  [7 8 9 4]]

print(means)
# Tensor("Mean:0", shape=(3,), dtype=int32)

print(sess.run(tf.reduce_mean(tf.concat([neigh_vecs,
                                         tf.expand_dims(self_vecs, axis=1)], axis=1), axis=1)))
# [2 4 7]

# [[1 2 3 2]   = 8 // 4  = 2
#  [4 5 6 3]   = 18 // 4 = 4
#  [7 8 9 4]]  = 28 // 4 = 7

bias = [1]
output = means + bias
print(sess.run(output))
# [3 5 8]
# [2 + 1, 4 + 1, 7 + 1] = [3, 5, 8]

aggregators.py完整代码

import tensorflow as tf

from .layers import Layer, Dense
from .inits import glorot, zeros

class MeanAggregator(Layer):
    """
    Aggregates via mean followed by matmul and non-linearity.
    """
    #__init_() 用于获取并初始化成员变量 dropout, bias(False), act(ReLu), concat(False), input_dim, output_dim, name(Variable scopr)
    def __init__(self, input_dim, output_dim, neigh_input_dim=None,
            dropout=0., bias=False, act=tf.nn.relu, 
            name=None, concat=False, **kwargs):
        super(MeanAggregator, self).__init__(**kwargs)

        self.dropout = dropout
        self.bias = bias
        self.act = act
        self.concat = concat

        if neigh_input_dim is None:
            neigh_input_dim = input_dim

        if name is not None:
            name = '/' + name
        else:
            name = ''
        #用glorot()方法初始化节点v的权值矩阵 vars['self_weights'] 和邻居节点均值u的权值矩阵 vars['neigh_weights']
        with tf.variable_scope(self.name + name + '_vars'):
            self.vars['neigh_weights'] = glorot([neigh_input_dim, output_dim],
                                                        name='neigh_weights')
            self.vars['self_weights'] = glorot([input_dim, output_dim],
                                                        name='self_weights')
            #用零向量初始化vars['bias']
            if self.bias:
                self.vars['bias'] = zeros([self.output_dim], name='bias')

        #若logging为True,则调用 layers.py 中 class Layer()的成员函数_log_vars(), 生成vars中各个变量的直方图
        if self.logging:
            self._log_vars()

        self.input_dim = input_dim
        self.output_dim = output_dim

    def _call(self, inputs):
        self_vecs, neigh_vecs = inputs


        #tf.nn.dropout(x, keep_prob, noise_shape=None, seed=None, name=None)
        #输出的非0元素是原来的 “1/keep_prob” 倍，以保证总和不变
        neigh_vecs = tf.nn.dropout(neigh_vecs, 1-self.dropout)
        self_vecs = tf.nn.dropout(self_vecs, 1-self.dropout)
        neigh_means = tf.reduce_mean(neigh_vecs, axis=1)
       
        # [nodes] x [out_dim]
        from_neighs = tf.matmul(neigh_means, self.vars['neigh_weights'])

        from_self = tf.matmul(self_vecs, self.vars["self_weights"])
         
        if not self.concat:
            output = tf.add_n([from_self, from_neighs])
        else:
            output = tf.concat([from_self, from_neighs], axis=1) #在concat后其维数变为之前的2倍

        # bias
        if self.bias:
            output += self.vars['bias']
       
        return self.act(output)




#这里__init__()与MeanAggregator基本相同，在_call()的实现中略有不同
class GCNAggregator(Layer):
    """
    Aggregates via mean followed by matmul and non-linearity.
    Same matmul parameters are used self vector and neighbor vectors.
    """

    def __init__(self, input_dim, output_dim, neigh_input_dim=None,
            dropout=0., bias=False, act=tf.nn.relu, name=None, concat=False, **kwargs):
        super(GCNAggregator, self).__init__(**kwargs)

        self.dropout = dropout
        self.bias = bias
        self.act = act
        self.concat = concat

        if neigh_input_dim is None:
            neigh_input_dim = input_dim

        if name is not None:
            name = '/' + name
        else:
            name = ''

        with tf.variable_scope(self.name + name + '_vars'):
            self.vars['weights'] = glorot([neigh_input_dim, output_dim],
                                                        name='neigh_weights')
            if self.bias:
                self.vars['bias'] = zeros([self.output_dim], name='bias')

        if self.logging:
            self._log_vars()

        self.input_dim = input_dim
        self.output_dim = output_dim

    def _call(self, inputs):
        self_vecs, neigh_vecs = inputs

        neigh_vecs = tf.nn.dropout(neigh_vecs, 1-self.dropout)
        self_vecs = tf.nn.dropout(self_vecs, 1-self.dropout)

        # 其中对means求解时，
        # 1.先将self_vecs行列转换(tf.expand_dims(self_vecs, axis=1)),
        # 2.之后self_vecs的行数与neigh_vecs行数相同时，将二者concat, 即相当于在原先的neigh_vecs矩阵后面新增一列self_vecs的转置
        # 3.最后将得到的矩阵每行求均值，即得means.
        # 之后means与权值矩阵vars['weights']
        # 求内积，并加上vars['bias'], 最终将该值带入激活函数(ReLu)
        means = tf.reduce_mean(tf.concat([neigh_vecs, 
            tf.expand_dims(self_vecs, axis=1)], axis=1), axis=1)
       
        # [nodes] x [out_dim]
        output = tf.matmul(means, self.vars['weights'])

        # bias
        if self.bias:
            output += self.vars['bias']
       
        return self.act(output)


class MaxPoolingAggregator(Layer):
    """ Aggregates via max-pooling over MLP functions.
    """
    def __init__(self, input_dim, output_dim, model_size="small", neigh_input_dim=None,
            dropout=0., bias=False, act=tf.nn.relu, name=None, concat=False, **kwargs):
        super(MaxPoolingAggregator, self).__init__(**kwargs)

        self.dropout = dropout
        self.bias = bias
        self.act = act
        self.concat = concat

        if neigh_input_dim is None:
            neigh_input_dim = input_dim

        if name is not None:
            name = '/' + name
        else:
            name = ''

        if model_size == "small":
            hidden_dim = self.hidden_dim = 512
        elif model_size == "big":
            hidden_dim = self.hidden_dim = 1024

        self.mlp_layers = []
        self.mlp_layers.append(Dense(input_dim=neigh_input_dim,
                                 output_dim=hidden_dim,
                                 act=tf.nn.relu,
                                 dropout=dropout,
                                 sparse_inputs=False,
                                 logging=self.logging))

        with tf.variable_scope(self.name + name + '_vars'):
            self.vars['neigh_weights'] = glorot([hidden_dim, output_dim],
                                                        name='neigh_weights')
           
            self.vars['self_weights'] = glorot([input_dim, output_dim],
                                                        name='self_weights')
            if self.bias:
                self.vars['bias'] = zeros([self.output_dim], name='bias')

        if self.logging:
            self._log_vars()

        self.input_dim = input_dim
        self.output_dim = output_dim
        self.neigh_input_dim = neigh_input_dim

    def _call(self, inputs):
        self_vecs, neigh_vecs = inputs
        neigh_h = neigh_vecs

        dims = tf.shape(neigh_h)
        batch_size = dims[0]
        num_neighbors = dims[1]
        # [nodes * sampled neighbors] x [hidden_dim]
        h_reshaped = tf.reshape(neigh_h, (batch_size * num_neighbors, self.neigh_input_dim))

        for l in self.mlp_layers:
            h_reshaped = l(h_reshaped)
        neigh_h = tf.reshape(h_reshaped, (batch_size, num_neighbors, self.hidden_dim))
        neigh_h = tf.reduce_max(neigh_h, axis=1)
        
        from_neighs = tf.matmul(neigh_h, self.vars['neigh_weights'])
        from_self = tf.matmul(self_vecs, self.vars["self_weights"])
        
        if not self.concat:
            output = tf.add_n([from_self, from_neighs])
        else:
            output = tf.concat([from_self, from_neighs], axis=1)

        # bias
        if self.bias:
            output += self.vars['bias']
       
        return self.act(output)

class MeanPoolingAggregator(Layer):
    """ Aggregates via mean-pooling over MLP functions.
    """
    def __init__(self, input_dim, output_dim, model_size="small", neigh_input_dim=None,
            dropout=0., bias=False, act=tf.nn.relu, name=None, concat=False, **kwargs):
        super(MeanPoolingAggregator, self).__init__(**kwargs)

        self.dropout = dropout
        self.bias = bias
        self.act = act
        self.concat = concat

        if neigh_input_dim is None:
            neigh_input_dim = input_dim

        if name is not None:
            name = '/' + name
        else:
            name = ''

        if model_size == "small":
            hidden_dim = self.hidden_dim = 512
        elif model_size == "big":
            hidden_dim = self.hidden_dim = 1024

        self.mlp_layers = []
        self.mlp_layers.append(Dense(input_dim=neigh_input_dim,
                                 output_dim=hidden_dim,
                                 act=tf.nn.relu,
                                 dropout=dropout,
                                 sparse_inputs=False,
                                 logging=self.logging))

        with tf.variable_scope(self.name + name + '_vars'):
            self.vars['neigh_weights'] = glorot([hidden_dim, output_dim],
                                                        name='neigh_weights')
           
            self.vars['self_weights'] = glorot([input_dim, output_dim],
                                                        name='self_weights')
            if self.bias:
                self.vars['bias'] = zeros([self.output_dim], name='bias')

        if self.logging:
            self._log_vars()

        self.input_dim = input_dim
        self.output_dim = output_dim
        self.neigh_input_dim = neigh_input_dim

    def _call(self, inputs):
        self_vecs, neigh_vecs = inputs
        neigh_h = neigh_vecs

        dims = tf.shape(neigh_h)
        batch_size = dims[0]
        num_neighbors = dims[1]
        # [nodes * sampled neighbors] x [hidden_dim]
        h_reshaped = tf.reshape(neigh_h, (batch_size * num_neighbors, self.neigh_input_dim))

        for l in self.mlp_layers:
            h_reshaped = l(h_reshaped)
        neigh_h = tf.reshape(h_reshaped, (batch_size, num_neighbors, self.hidden_dim))
        neigh_h = tf.reduce_mean(neigh_h, axis=1)
        
        from_neighs = tf.matmul(neigh_h, self.vars['neigh_weights'])
        from_self = tf.matmul(self_vecs, self.vars["self_weights"])
        
        if not self.concat:
            output = tf.add_n([from_self, from_neighs])
        else:
            output = tf.concat([from_self, from_neighs], axis=1)

        # bias
        if self.bias:
            output += self.vars['bias']
       
        return self.act(output)


class TwoMaxLayerPoolingAggregator(Layer):
    """ Aggregates via pooling over two MLP functions.
    """
    def __init__(self, input_dim, output_dim, model_size="small", neigh_input_dim=None,
            dropout=0., bias=False, act=tf.nn.relu, name=None, concat=False, **kwargs):
        super(TwoMaxLayerPoolingAggregator, self).__init__(**kwargs)

        self.dropout = dropout
        self.bias = bias
        self.act = act
        self.concat = concat

        if neigh_input_dim is None:
            neigh_input_dim = input_dim

        if name is not None:
            name = '/' + name
        else:
            name = ''

        if model_size == "small":
            hidden_dim_1 = self.hidden_dim_1 = 512
            hidden_dim_2 = self.hidden_dim_2 = 256
        elif model_size == "big":
            hidden_dim_1 = self.hidden_dim_1 = 1024
            hidden_dim_2 = self.hidden_dim_2 = 512

        self.mlp_layers = []
        self.mlp_layers.append(Dense(input_dim=neigh_input_dim,
                                 output_dim=hidden_dim_1,
                                 act=tf.nn.relu,
                                 dropout=dropout,
                                 sparse_inputs=False,
                                 logging=self.logging))
        self.mlp_layers.append(Dense(input_dim=hidden_dim_1,
                                 output_dim=hidden_dim_2,
                                 act=tf.nn.relu,
                                 dropout=dropout,
                                 sparse_inputs=False,
                                 logging=self.logging))


        with tf.variable_scope(self.name + name + '_vars'):
            self.vars['neigh_weights'] = glorot([hidden_dim_2, output_dim],
                                                        name='neigh_weights')
           
            self.vars['self_weights'] = glorot([input_dim, output_dim],
                                                        name='self_weights')
            if self.bias:
                self.vars['bias'] = zeros([self.output_dim], name='bias')

        if self.logging:
            self._log_vars()

        self.input_dim = input_dim
        self.output_dim = output_dim
        self.neigh_input_dim = neigh_input_dim

    def _call(self, inputs):
        self_vecs, neigh_vecs = inputs
        neigh_h = neigh_vecs

        dims = tf.shape(neigh_h)
        batch_size = dims[0]
        num_neighbors = dims[1]
        # [nodes * sampled neighbors] x [hidden_dim]
        h_reshaped = tf.reshape(neigh_h, (batch_size * num_neighbors, self.neigh_input_dim))

        for l in self.mlp_layers:
            h_reshaped = l(h_reshaped)
        neigh_h = tf.reshape(h_reshaped, (batch_size, num_neighbors, self.hidden_dim_2))
        neigh_h = tf.reduce_max(neigh_h, axis=1)
        
        from_neighs = tf.matmul(neigh_h, self.vars['neigh_weights'])
        from_self = tf.matmul(self_vecs, self.vars["self_weights"])
        
        if not self.concat:
            output = tf.add_n([from_self, from_neighs])
        else:
            output = tf.concat([from_self, from_neighs], axis=1)

        # bias
        if self.bias:
            output += self.vars['bias']
       
        return self.act(output)

class SeqAggregator(Layer):
    """ Aggregates via a standard LSTM.
    """
    def __init__(self, input_dim, output_dim, model_size="small", neigh_input_dim=None,
            dropout=0., bias=False, act=tf.nn.relu, name=None,  concat=False, **kwargs):
        super(SeqAggregator, self).__init__(**kwargs)

        self.dropout = dropout
        self.bias = bias
        self.act = act
        self.concat = concat

        if neigh_input_dim is None:
            neigh_input_dim = input_dim

        if name is not None:
            name = '/' + name
        else:
            name = ''

        if model_size == "small":
            hidden_dim = self.hidden_dim = 128
        elif model_size == "big":
            hidden_dim = self.hidden_dim = 256

        with tf.variable_scope(self.name + name + '_vars'):
            self.vars['neigh_weights'] = glorot([hidden_dim, output_dim],
                                                        name='neigh_weights')
           
            self.vars['self_weights'] = glorot([input_dim, output_dim],
                                                        name='self_weights')
            if self.bias:
                self.vars['bias'] = zeros([self.output_dim], name='bias')

        if self.logging:
            self._log_vars()

        self.input_dim = input_dim
        self.output_dim = output_dim
        self.neigh_input_dim = neigh_input_dim
        self.cell = tf.contrib.rnn.BasicLSTMCell(self.hidden_dim)

    def _call(self, inputs):
        self_vecs, neigh_vecs = inputs

        dims = tf.shape(neigh_vecs)
        batch_size = dims[0]
        initial_state = self.cell.zero_state(batch_size, tf.float32)
        used = tf.sign(tf.reduce_max(tf.abs(neigh_vecs), axis=2))
        length = tf.reduce_sum(used, axis=1)
        length = tf.maximum(length, tf.constant(1.))
        length = tf.cast(length, tf.int32)

        with tf.variable_scope(self.name) as scope:
            try:
                rnn_outputs, rnn_states = tf.nn.dynamic_rnn(
                        self.cell, neigh_vecs,
                        initial_state=initial_state, dtype=tf.float32, time_major=False,
                        sequence_length=length)
            except ValueError:
                scope.reuse_variables()
                rnn_outputs, rnn_states = tf.nn.dynamic_rnn(
                        self.cell, neigh_vecs,
                        initial_state=initial_state, dtype=tf.float32, time_major=False,
                        sequence_length=length)
        batch_size = tf.shape(rnn_outputs)[0]
        max_len = tf.shape(rnn_outputs)[1]
        out_size = int(rnn_outputs.get_shape()[2])
        index = tf.range(0, batch_size) * max_len + (length - 1)
        flat = tf.reshape(rnn_outputs, [-1, out_size])
        neigh_h = tf.gather(flat, index)

        from_neighs = tf.matmul(neigh_h, self.vars['neigh_weights'])
        from_self = tf.matmul(self_vecs, self.vars["self_weights"])
         
        output = tf.add_n([from_self, from_neighs])

        if not self.concat:
            output = tf.add_n([from_self, from_neighs])
        else:
            output = tf.concat([from_self, from_neighs], axis=1)

        # bias
        if self.bias:
            output += self.vars['bias']
       
        return self.act(output)

`prediction.py`

prediction.py完整代码

from __future__ import division
from __future__ import print_function

from graphsage.inits import zeros
from graphsage.layers import Layer
import tensorflow as tf

flags = tf.app.flags
FLAGS = flags.FLAGS


class BipartiteEdgePredLayer(Layer):
    def __init__(self, input_dim1, input_dim2, placeholders, dropout=False, act=tf.nn.sigmoid,
            loss_fn='xent', neg_sample_weights=1.0,
            bias=False, bilinear_weights=False, **kwargs):
        """
        Basic class that applies skip-gram-like loss
        (i.e., dot product of node+target and node and negative samples)
        Args:
            bilinear_weights: use a bilinear weight for affinity calculation: u^T A v. If set to
                false, it is assumed that input dimensions are the same and the affinity will be 
                based on dot product.
        """
        super(BipartiteEdgePredLayer, self).__init__(**kwargs)
        self.input_dim1 = input_dim1
        self.input_dim2 = input_dim2
        self.act = act
        self.bias = bias
        self.eps = 1e-7

        # Margin for hinge loss
        self.margin = 0.1
        self.neg_sample_weights = neg_sample_weights

        self.bilinear_weights = bilinear_weights

        if dropout:
            self.dropout = placeholders['dropout']
        else:
            self.dropout = 0.

        # output a likelihood term
        self.output_dim = 1
        with tf.variable_scope(self.name + '_vars'):
            # bilinear form
            if bilinear_weights:
                #self.vars['weights'] = glorot([input_dim1, input_dim2],
                #                              name='pred_weights')
                self.vars['weights'] = tf.get_variable(
                        'pred_weights', 
                        shape=(input_dim1, input_dim2),
                        dtype=tf.float32, 
                        initializer=tf.contrib.layers.xavier_initializer())

            if self.bias:
                self.vars['bias'] = zeros([self.output_dim], name='bias')

        if loss_fn == 'xent':
            self.loss_fn = self._xent_loss
        elif loss_fn == 'skipgram':
            self.loss_fn = self._skipgram_loss
        elif loss_fn == 'hinge':
            self.loss_fn = self._hinge_loss

        if self.logging:
            self._log_vars()

    def affinity(self, inputs1, inputs2):
        """ Affinity score between batch of inputs1 and inputs2.
        Args:
            inputs1: tensor of shape [batch_size x feature_size].
        """
        # shape: [batch_size, input_dim1]
        if self.bilinear_weights:
            prod = tf.matmul(inputs2, tf.transpose(self.vars['weights']))
            self.prod = prod
            result = tf.reduce_sum(inputs1 * prod, axis=1)
        else:
            result = tf.reduce_sum(inputs1 * inputs2, axis=1)
        return result

    def neg_cost(self, inputs1, neg_samples, hard_neg_samples=None):
        """ For each input in batch, compute the sum of its affinity to negative samples.

        Returns:
            Tensor of shape [batch_size x num_neg_samples]. For each node, a list of affinities to
                negative samples is computed.
        """
        if self.bilinear_weights:
            inputs1 = tf.matmul(inputs1, self.vars['weights'])
        neg_aff = tf.matmul(inputs1, tf.transpose(neg_samples))
        return neg_aff

    def loss(self, inputs1, inputs2, neg_samples):
        """ negative sampling loss.
        Args:
            neg_samples: tensor of shape [num_neg_samples x input_dim2]. Negative samples for all
            inputs in batch inputs1.
        """
        return self.loss_fn(inputs1, inputs2, neg_samples)

    def _xent_loss(self, inputs1, inputs2, neg_samples, hard_neg_samples=None):
        aff = self.affinity(inputs1, inputs2)
        neg_aff = self.neg_cost(inputs1, neg_samples, hard_neg_samples)
        true_xent = tf.nn.sigmoid_cross_entropy_with_logits(
                labels=tf.ones_like(aff), logits=aff)
        negative_xent = tf.nn.sigmoid_cross_entropy_with_logits(
                labels=tf.zeros_like(neg_aff), logits=neg_aff)
        loss = tf.reduce_sum(true_xent) + self.neg_sample_weights * tf.reduce_sum(negative_xent)
        return loss

    def _skipgram_loss(self, inputs1, inputs2, neg_samples, hard_neg_samples=None):
        aff = self.affinity(inputs1, inputs2)
        neg_aff = self.neg_cost(inputs1, neg_samples, hard_neg_samples)
        neg_cost = tf.log(tf.reduce_sum(tf.exp(neg_aff), axis=1))
        loss = tf.reduce_sum(aff - neg_cost)
        return loss

    def _hinge_loss(self, inputs1, inputs2, neg_samples, hard_neg_samples=None):
        aff = self.affinity(inputs1, inputs2)
        neg_aff = self.neg_cost(inputs1, neg_samples, hard_neg_samples)
        diff = tf.nn.relu(tf.subtract(neg_aff, tf.expand_dims(aff, 1) - self.margin), name='diff')
        loss = tf.reduce_sum(diff)
        self.neg_shape = tf.shape(neg_aff)
        return loss

    def weights_norm(self):
        return tf.nn.l2_norm(self.vars['weights'])

`supervised_train.py`

supervised_train.py完整代码

from __future__ import division
from __future__ import print_function

import os
import time
import tensorflow as tf
import numpy as np
import sklearn
from sklearn import metrics

from graphsage.supervised_models import SupervisedGraphsage
from graphsage.models import SAGEInfo
from graphsage.minibatch import NodeMinibatchIterator
from graphsage.neigh_samplers import UniformNeighborSampler
from graphsage.utils import load_data

os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"

# Set random seed
seed = 123
np.random.seed(seed)
tf.set_random_seed(seed)

# Settings
flags = tf.app.flags
FLAGS = flags.FLAGS

tf.app.flags.DEFINE_boolean('log_device_placement', False,
                            """Whether to log device placement.""")
#core params..
flags.DEFINE_string('model', 'graphsage_mean', 'model names. See README for possible values.')  
flags.DEFINE_float('learning_rate', 0.01, 'initial learning rate.')
flags.DEFINE_string("model_size", "small", "Can be big or small; model specific def'ns")
flags.DEFINE_string('train_prefix', '../example_data/toy-ppi', 'prefix identifying training data. must be specified.')

# left to default values in main experiments 
flags.DEFINE_integer('epochs', 10, 'number of epochs to train.')
flags.DEFINE_float('dropout', 0.0, 'dropout rate (1 - keep probability).')
flags.DEFINE_float('weight_decay', 0.0, 'weight for l2 loss on embedding matrix.')
flags.DEFINE_integer('max_degree', 128, 'maximum node degree.')
flags.DEFINE_integer('samples_1', 25, 'number of samples in layer 1')
flags.DEFINE_integer('samples_2', 10, 'number of samples in layer 2')
flags.DEFINE_integer('samples_3', 0, 'number of users samples in layer 3. (Only for mean model)')
flags.DEFINE_integer('dim_1', 128, 'Size of output dim (final is 2x this, if using concat)')
flags.DEFINE_integer('dim_2', 128, 'Size of output dim (final is 2x this, if using concat)')
flags.DEFINE_boolean('random_context', True, 'Whether to use random context or direct edges')
flags.DEFINE_integer('batch_size', 512, 'minibatch size.')
flags.DEFINE_boolean('sigmoid', False, 'whether to use sigmoid loss')
flags.DEFINE_integer('identity_dim', 0, 'Set to positive value to use identity embedding features of that dimension. Default 0.')

#logging, saving, validation settings etc.
flags.DEFINE_string('base_log_dir', '.', 'base directory for logging and saving embeddings')
flags.DEFINE_integer('validate_iter', 5000, "how often to run a validation minibatch.")
flags.DEFINE_integer('validate_batch_size', 256, "how many nodes per validation sample.")
flags.DEFINE_integer('gpu', 1, "which gpu to use.")
flags.DEFINE_integer('print_every', 5, "How often to print training info.")
flags.DEFINE_integer('max_total_steps', 10**10, "Maximum total number of iterations")

os.environ["CUDA_VISIBLE_DEVICES"]=str(FLAGS.gpu)

GPU_MEM_FRACTION = 0.8

def calc_f1(y_true, y_pred):
    if not FLAGS.sigmoid:
        y_true = np.argmax(y_true, axis=1)
        y_pred = np.argmax(y_pred, axis=1)
    else:
        y_pred[y_pred > 0.5] = 1
        y_pred[y_pred <= 0.5] = 0
    return metrics.f1_score(y_true, y_pred, average="micro"), metrics.f1_score(y_true, y_pred, average="macro")

# Define model evaluation function
def evaluate(sess, model, minibatch_iter, size=None):
    t_test = time.time()
    feed_dict_val, labels = minibatch_iter.node_val_feed_dict(size)
    node_outs_val = sess.run([model.preds, model.loss], 
                        feed_dict=feed_dict_val)
    mic, mac = calc_f1(labels, node_outs_val[0])
    return node_outs_val[1], mic, mac, (time.time() - t_test)

def log_dir():
    log_dir = FLAGS.base_log_dir + "/sup-" + FLAGS.train_prefix.split("/")[-2]
    log_dir += "/{model:s}_{model_size:s}_{lr:0.4f}/".format(
            model=FLAGS.model,
            model_size=FLAGS.model_size,
            lr=FLAGS.learning_rate)
    if not os.path.exists(log_dir):
        os.makedirs(log_dir)
    return log_dir

def incremental_evaluate(sess, model, minibatch_iter, size, test=False):
    t_test = time.time()
    finished = False
    val_losses = []
    val_preds = []
    labels = []
    iter_num = 0
    finished = False
    while not finished:
        feed_dict_val, batch_labels, finished, _  = minibatch_iter.incremental_node_val_feed_dict(size, iter_num, test=test)
        node_outs_val = sess.run([model.preds, model.loss], 
                         feed_dict=feed_dict_val)
        val_preds.append(node_outs_val[0])
        labels.append(batch_labels)
        val_losses.append(node_outs_val[1])
        iter_num += 1
    val_preds = np.vstack(val_preds)
    labels = np.vstack(labels)
    f1_scores = calc_f1(labels, val_preds)
    return np.mean(val_losses), f1_scores[0], f1_scores[1], (time.time() - t_test)

def construct_placeholders(num_classes):
    # Define placeholders
    placeholders = {
        'labels' : tf.placeholder(tf.float32, shape=(None, num_classes), name='labels'),
        'batch' : tf.placeholder(tf.int32, shape=(None), name='batch1'),
        'dropout': tf.placeholder_with_default(0., shape=(), name='dropout'),
        'batch_size' : tf.placeholder(tf.int32, name='batch_size'),
    }
    return placeholders

def train(train_data, test_data=None):

    G = train_data[0]
    features = train_data[1]
    id_map = train_data[2]
    class_map  = train_data[4]
    if isinstance(list(class_map.values())[0], list):
        num_classes = len(list(class_map.values())[0])
    else:
        num_classes = len(set(class_map.values()))

    if not features is None:
        # pad with dummy zero vector
        features = np.vstack([features, np.zeros((features.shape[1],))])

    context_pairs = train_data[3] if FLAGS.random_context else None
    placeholders = construct_placeholders(num_classes)
    minibatch = NodeMinibatchIterator(G, 
            id_map,
            placeholders, 
            class_map,
            num_classes,
            batch_size=FLAGS.batch_size,
            max_degree=FLAGS.max_degree, 
            context_pairs = context_pairs)
    adj_info_ph = tf.placeholder(tf.int32, shape=minibatch.adj.shape)
    adj_info = tf.Variable(adj_info_ph, trainable=False, name="adj_info")

    if FLAGS.model == 'graphsage_mean':
        # Create model
        sampler = UniformNeighborSampler(adj_info)
        if FLAGS.samples_3 != 0:
            layer_infos = [SAGEInfo("node", sampler, FLAGS.samples_1, FLAGS.dim_1),
                                SAGEInfo("node", sampler, FLAGS.samples_2, FLAGS.dim_2),
                                SAGEInfo("node", sampler, FLAGS.samples_3, FLAGS.dim_2)]
        elif FLAGS.samples_2 != 0:
            layer_infos = [SAGEInfo("node", sampler, FLAGS.samples_1, FLAGS.dim_1),
                                SAGEInfo("node", sampler, FLAGS.samples_2, FLAGS.dim_2)]
        else:
            layer_infos = [SAGEInfo("node", sampler, FLAGS.samples_1, FLAGS.dim_1)]

        model = SupervisedGraphsage(num_classes, placeholders, 
                                     features,
                                     adj_info,
                                     minibatch.deg,
                                     layer_infos, 
                                     model_size=FLAGS.model_size,
                                     sigmoid_loss = FLAGS.sigmoid,
                                     identity_dim = FLAGS.identity_dim,
                                     logging=True)
    elif FLAGS.model == 'gcn':
        # Create model
        sampler = UniformNeighborSampler(adj_info)
        layer_infos = [SAGEInfo("node", sampler, FLAGS.samples_1, 2*FLAGS.dim_1),
                            SAGEInfo("node", sampler, FLAGS.samples_2, 2*FLAGS.dim_2)]

        model = SupervisedGraphsage(num_classes, placeholders, 
                                     features,
                                     adj_info,
                                     minibatch.deg,
                                     layer_infos=layer_infos, 
                                     aggregator_type="gcn",
                                     model_size=FLAGS.model_size,
                                     concat=False,
                                     sigmoid_loss = FLAGS.sigmoid,
                                     identity_dim = FLAGS.identity_dim,
                                     logging=True)

    elif FLAGS.model == 'graphsage_seq':
        sampler = UniformNeighborSampler(adj_info)
        layer_infos = [SAGEInfo("node", sampler, FLAGS.samples_1, FLAGS.dim_1),
                            SAGEInfo("node", sampler, FLAGS.samples_2, FLAGS.dim_2)]

        model = SupervisedGraphsage(num_classes, placeholders, 
                                     features,
                                     adj_info,
                                     minibatch.deg,
                                     layer_infos=layer_infos, 
                                     aggregator_type="seq",
                                     model_size=FLAGS.model_size,
                                     sigmoid_loss = FLAGS.sigmoid,
                                     identity_dim = FLAGS.identity_dim,
                                     logging=True)

    elif FLAGS.model == 'graphsage_maxpool':
        sampler = UniformNeighborSampler(adj_info)
        layer_infos = [SAGEInfo("node", sampler, FLAGS.samples_1, FLAGS.dim_1),
                            SAGEInfo("node", sampler, FLAGS.samples_2, FLAGS.dim_2)]

        model = SupervisedGraphsage(num_classes, placeholders, 
                                    features,
                                    adj_info,
                                    minibatch.deg,
                                     layer_infos=layer_infos, 
                                     aggregator_type="maxpool",
                                     model_size=FLAGS.model_size,
                                     sigmoid_loss = FLAGS.sigmoid,
                                     identity_dim = FLAGS.identity_dim,
                                     logging=True)

    elif FLAGS.model == 'graphsage_meanpool':
        sampler = UniformNeighborSampler(adj_info)
        layer_infos = [SAGEInfo("node", sampler, FLAGS.samples_1, FLAGS.dim_1),
                            SAGEInfo("node", sampler, FLAGS.samples_2, FLAGS.dim_2)]

        model = SupervisedGraphsage(num_classes, placeholders, 
                                    features,
                                    adj_info,
                                    minibatch.deg,
                                     layer_infos=layer_infos, 
                                     aggregator_type="meanpool",
                                     model_size=FLAGS.model_size,
                                     sigmoid_loss = FLAGS.sigmoid,
                                     identity_dim = FLAGS.identity_dim,
                                     logging=True)

    else:
        raise Exception('Error: model name unrecognized.')

    config = tf.ConfigProto(log_device_placement=FLAGS.log_device_placement)
    config.gpu_options.allow_growth = True
    #config.gpu_options.per_process_gpu_memory_fraction = GPU_MEM_FRACTION
    config.allow_soft_placement = True
    
    # Initialize session
    sess = tf.Session(config=config)
    merged = tf.summary.merge_all()
    summary_writer = tf.summary.FileWriter(log_dir(), sess.graph)
     
    # Init variables
    sess.run(tf.global_variables_initializer(), feed_dict={adj_info_ph: minibatch.adj})
    
    # Train model
    
    total_steps = 0
    avg_time = 0.0
    epoch_val_costs = []

    train_adj_info = tf.assign(adj_info, minibatch.adj)
    val_adj_info = tf.assign(adj_info, minibatch.test_adj)
    for epoch in range(FLAGS.epochs): 
        minibatch.shuffle() 

        iter = 0
        print('Epoch: %04d' % (epoch + 1))
        epoch_val_costs.append(0)
        while not minibatch.end():
            # Construct feed dictionary
            feed_dict, labels = minibatch.next_minibatch_feed_dict()
            feed_dict.update({placeholders['dropout']: FLAGS.dropout})

            t = time.time()
            # Training step
            outs = sess.run([merged, model.opt_op, model.loss, model.preds], feed_dict=feed_dict)
            train_cost = outs[2]

            if iter % FLAGS.validate_iter == 0:
                # Validation
                sess.run(val_adj_info.op)
                if FLAGS.validate_batch_size == -1:
                    val_cost, val_f1_mic, val_f1_mac, duration = incremental_evaluate(sess, model, minibatch, FLAGS.batch_size)
                else:
                    val_cost, val_f1_mic, val_f1_mac, duration = evaluate(sess, model, minibatch, FLAGS.validate_batch_size)
                sess.run(train_adj_info.op)
                epoch_val_costs[-1] += val_cost

            if total_steps % FLAGS.print_every == 0:
                summary_writer.add_summary(outs[0], total_steps)
    
            # Print results
            avg_time = (avg_time * total_steps + time.time() - t) / (total_steps + 1)

            if total_steps % FLAGS.print_every == 0:
                train_f1_mic, train_f1_mac = calc_f1(labels, outs[-1])
                print("Iter:", '%04d' % iter, 
                      "train_loss=", "{:.5f}".format(train_cost),
                      "train_f1_mic=", "{:.5f}".format(train_f1_mic), 
                      "train_f1_mac=", "{:.5f}".format(train_f1_mac), 
                      "val_loss=", "{:.5f}".format(val_cost),
                      "val_f1_mic=", "{:.5f}".format(val_f1_mic), 
                      "val_f1_mac=", "{:.5f}".format(val_f1_mac), 
                      "time=", "{:.5f}".format(avg_time))
 
            iter += 1
            total_steps += 1

            if total_steps > FLAGS.max_total_steps:
                break

        if total_steps > FLAGS.max_total_steps:
                break
    
    print("Optimization Finished!")
    sess.run(val_adj_info.op)
    val_cost, val_f1_mic, val_f1_mac, duration = incremental_evaluate(sess, model, minibatch, FLAGS.batch_size)
    print("Full validation stats:",
                  "loss=", "{:.5f}".format(val_cost),
                  "f1_micro=", "{:.5f}".format(val_f1_mic),
                  "f1_macro=", "{:.5f}".format(val_f1_mac),
                  "time=", "{:.5f}".format(duration))
    with open(log_dir() + "val_stats.txt", "w") as fp:
        fp.write("loss={:.5f} f1_micro={:.5f} f1_macro={:.5f} time={:.5f}".
                format(val_cost, val_f1_mic, val_f1_mac, duration))

    print("Writing test set stats to file (don't peak!)")
    val_cost, val_f1_mic, val_f1_mac, duration = incremental_evaluate(sess, model, minibatch, FLAGS.batch_size, test=True)
    with open(log_dir() + "test_stats.txt", "w") as fp:
        fp.write("loss={:.5f} f1_micro={:.5f} f1_macro={:.5f}".
                format(val_cost, val_f1_mic, val_f1_mac))

def main(argv=None):
    print("Loading training data..")
    train_data = load_data(FLAGS.train_prefix)
    print("Done loading training data..")
    train(train_data)

if __name__ == '__main__':
    tf.app.run()

`unsupervised_train.py`

unsupervised_train.py完整代码

from __future__ import division
from __future__ import print_function

import os
import time
import tensorflow as tf
import numpy as np

from graphsage.models import SampleAndAggregate, SAGEInfo, Node2VecModel
from graphsage.minibatch import EdgeMinibatchIterator
from graphsage.neigh_samplers import UniformNeighborSampler
from graphsage.utils import load_data

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"

# Set random seed
seed = 123
np.random.seed(seed)
tf.set_random_seed(seed)

# Settings
flags = tf.app.flags
FLAGS = flags.FLAGS

tf.app.flags.DEFINE_boolean('log_device_placement', False,
                            """Whether to log device placement.""")
# core params..
flags.DEFINE_string('model', 'graphsage_maxpool', 'model names. See README for possible values.')
flags.DEFINE_float('learning_rate', 0.00001, 'initial learning rate.')
flags.DEFINE_string("model_size", "small", "Can be big or small; model specific def'ns")
flags.DEFINE_string('train_prefix', '../example_data/toy-ppi',
                    'name of the object file that stores the training data. must be specified.')

# left to default values in main experiments
flags.DEFINE_integer('epochs', 1, 'number of epochs to train.')
flags.DEFINE_float('dropout', 0.0, 'dropout rate (1 - keep probability).')
flags.DEFINE_float('weight_decay', 0.0, 'weight for l2 loss on embedding matrix.')
flags.DEFINE_integer('max_degree', 100, 'maximum node degree.')

#对应论文中的K = 1 ，第一层S1 = 25； K = 2 ，第二层S2 = 10。
flags.DEFINE_integer('samples_1', 25, 'number of samples in layer 1')
flags.DEFINE_integer('samples_2', 10, 'number of users samples in layer 2')

#若有concat操作，则维度变为2倍
flags.DEFINE_integer('dim_1', 128, 'Size of output dim (final is 2x this, if using concat)')
flags.DEFINE_integer('dim_2', 128, 'Size of output dim (final is 2x this, if using concat)')


flags.DEFINE_boolean('random_context', True, 'Whether to use random context or direct edges')
flags.DEFINE_integer('neg_sample_size', 20, 'number of negative samples')
flags.DEFINE_integer('batch_size', 512, 'minibatch size.')
flags.DEFINE_integer('n2v_test_epochs', 1, 'Number of new SGD epochs for n2v.')
flags.DEFINE_integer('identity_dim', 0,
                     'Set to positive value to use identity embedding features of that dimension. Default 0.')

# logging, saving, validation settings etc.
flags.DEFINE_boolean('save_embeddings', True, 'whether to save embeddings for all nodes after training')
flags.DEFINE_string('base_log_dir', '.', 'base directory for logging and saving embeddings')
flags.DEFINE_integer('validate_iter', 5000, "how often to run a validation minibatch.")
flags.DEFINE_integer('validate_batch_size', 256, "how many nodes per validation sample.")
flags.DEFINE_integer('gpu', 0, "which gpu to use.")
flags.DEFINE_integer('print_every', 50, "How often to print training info.")
flags.DEFINE_integer('max_total_steps', 10 ** 10, "Maximum total number of iterations")

os.environ["CUDA_VISIBLE_DEVICES"] = str(FLAGS.gpu)  # 使用哪一块gpu，本人只有一块，需将1改为0
#
GPU_MEM_FRACTION = 0.8


def log_dir():
    log_dir = FLAGS.base_log_dir + "/unsup-" + FLAGS.train_prefix.split("/")[-2]
    log_dir += "/{model:s}_{model_size:s}_{lr:0.6f}/".format(
        model=FLAGS.model,
        model_size=FLAGS.model_size,
        lr=FLAGS.learning_rate)
    if not os.path.exists(log_dir):
        os.makedirs(log_dir)
    return log_dir


# Define model evaluation function
def evaluate(sess, model, minibatch_iter, size=None):
    t_test = time.time()
    feed_dict_val = minibatch_iter.val_feed_dict(size)
    outs_val = sess.run([model.loss, model.ranks, model.mrr],
                        feed_dict=feed_dict_val)
    return outs_val[0], outs_val[1], outs_val[2], (time.time() - t_test)


def incremental_evaluate(sess, model, minibatch_iter, size):
    t_test = time.time()
    finished = False
    val_losses = []
    val_mrrs = []
    iter_num = 0
    while not finished:
        feed_dict_val, finished, _ = minibatch_iter.incremental_val_feed_dict(size, iter_num)
        iter_num += 1
        outs_val = sess.run([model.loss, model.ranks, model.mrr],
                            feed_dict=feed_dict_val)
        val_losses.append(outs_val[0])
        val_mrrs.append(outs_val[2])
    return np.mean(val_losses), np.mean(val_mrrs), (time.time() - t_test)


def save_val_embeddings(sess, model, minibatch_iter, size, out_dir, mod=""):
    val_embeddings = []
    finished = False
    seen = set([])
    nodes = []
    iter_num = 0
    name = "val"
    while not finished:
        feed_dict_val, finished, edges = minibatch_iter.incremental_embed_feed_dict(size, iter_num)
        iter_num += 1
        outs_val = sess.run([model.loss, model.mrr, model.outputs1],
                            feed_dict=feed_dict_val)
        # ONLY SAVE FOR embeds1 because of planetoid
        for i, edge in enumerate(edges):
            if not edge[0] in seen:
                val_embeddings.append(outs_val[-1][i, :])
                nodes.append(edge[0])
                seen.add(edge[0])
    if not os.path.exists(out_dir):
        os.makedirs(out_dir)
    val_embeddings = np.vstack(val_embeddings)
    np.save(out_dir + name + mod + ".npy", val_embeddings)
    with open(out_dir + name + mod + ".txt", "w") as fp:
        fp.write("\n".join(map(str, nodes)))


def construct_placeholders():
    # Define placeholders
    placeholders = {
        'batch1': tf.placeholder(tf.int32, shape=(None), name='batch1'),
        'batch2': tf.placeholder(tf.int32, shape=(None), name='batch2'),
        # negative samples for all nodes in the batch ，所有nodes均为负样本
        'neg_samples': tf.placeholder(tf.int32, shape=(None,),
                                      name='neg_sample_size'),
        'dropout': tf.placeholder_with_default(0., shape=(), name='dropout'),
        'batch_size': tf.placeholder(tf.int32, name='batch_size'),
    }
    return placeholders


def train(train_data, test_data=None):
    G = train_data[0]
    features = train_data[1]   # 训练数据的features
    id_map = train_data[2] ## "n" : n；已经删除了节点是不具有'val'或'test'属性 的节点

    class_map=train_data[4]
    # print("class_map:",class_map)

    print("G:", G)
    # G: disjoint_union(, )
    print("features:", features)
    # features: [[-0.08760569 - 0.08760569 - 0.1132336... - 0.13184157 - 0.14681277
    #             - 0.14717815]
    #            [-0.08760569 - 0.08760569 - 0.1132336... - 0.13184157 - 0.14681277
    #             - 0.14717815]
    #            ...
    #           ]
    print("id_map:", id_map)
    # id_map: {0: 0, 1: 1,...,14754: 14754}
    print("feature.length",len(features))
    #feature.length 14755
    print("features.shape:", features.shape)
    # features.shape: (14755, 50)

    if not features is None:
        # pad with dummy zero vector
        #vstack为features添加列一行0向量，用于WX + b中与b相加
        features = np.vstack([features, np.zeros((features.shape[1],))])
    print("features.shape:", features.shape)
    # features.shape: (14756, 50)
    print(features[14755])  # 添加一个0向量
    # [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
    #  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
    #  0. 0.]

    context_pairs = train_data[3] if FLAGS.random_context else None  # #random walk的点对
    placeholders = construct_placeholders()
    minibatch = EdgeMinibatchIterator(G,
                                      id_map,
                                      placeholders, batch_size=FLAGS.batch_size,
                                      max_degree=FLAGS.max_degree,
                                      num_neg_samples=FLAGS.neg_sample_size,
                                      context_pairs=context_pairs)
    adj_info_ph = tf.placeholder(tf.int32, shape=minibatch.adj.shape)
    adj_info = tf.Variable(adj_info_ph, trainable=False, name="adj_info")

    if FLAGS.model == 'graphsage_mean':
        # Create model
        sampler = UniformNeighborSampler(adj_info)
        layer_infos = [SAGEInfo("node", sampler, FLAGS.samples_1, FLAGS.dim_1),
                       SAGEInfo("node", sampler, FLAGS.samples_2, FLAGS.dim_2)]

        model = SampleAndAggregate(placeholders,
                                   features,
                                   adj_info,
                                   minibatch.deg,
                                   layer_infos=layer_infos,
                                   model_size=FLAGS.model_size,
                                   identity_dim=FLAGS.identity_dim,
                                   logging=True)
    elif FLAGS.model == 'gcn':
        # Create model
        sampler = UniformNeighborSampler(adj_info)
        layer_infos = [SAGEInfo("node", sampler, FLAGS.samples_1, 2 * FLAGS.dim_1),
                       SAGEInfo("node", sampler, FLAGS.samples_2, 2 * FLAGS.dim_2)]

        model = SampleAndAggregate(placeholders,
                                   features,
                                   adj_info,
                                   minibatch.deg,
                                   layer_infos=layer_infos,
                                   aggregator_type="gcn",
                                   model_size=FLAGS.model_size,
                                   identity_dim=FLAGS.identity_dim,
                                   concat=False,
                                   logging=True)

    elif FLAGS.model == 'graphsage_seq':
        sampler = UniformNeighborSampler(adj_info)
        layer_infos = [SAGEInfo("node", sampler, FLAGS.samples_1, FLAGS.dim_1),
                       SAGEInfo("node", sampler, FLAGS.samples_2, FLAGS.dim_2)]

        model = SampleAndAggregate(placeholders,
                                   features,
                                   adj_info,
                                   minibatch.deg,
                                   layer_infos=layer_infos,
                                   identity_dim=FLAGS.identity_dim,
                                   aggregator_type="seq",
                                   model_size=FLAGS.model_size,
                                   logging=True)

    elif FLAGS.model == 'graphsage_maxpool':
        sampler = UniformNeighborSampler(adj_info)
        layer_infos = [SAGEInfo("node", sampler, FLAGS.samples_1, FLAGS.dim_1),
                       SAGEInfo("node", sampler, FLAGS.samples_2, FLAGS.dim_2)]

        model = SampleAndAggregate(placeholders,
                                   features,
                                   adj_info,
                                   minibatch.deg,
                                   layer_infos=layer_infos,
                                   aggregator_type="maxpool",
                                   model_size=FLAGS.model_size,
                                   identity_dim=FLAGS.identity_dim,
                                   logging=True)
    elif FLAGS.model == 'graphsage_meanpool':
        sampler = UniformNeighborSampler(adj_info)
        layer_infos = [SAGEInfo("node", sampler, FLAGS.samples_1, FLAGS.dim_1),
                       SAGEInfo("node", sampler, FLAGS.samples_2, FLAGS.dim_2)]

        model = SampleAndAggregate(placeholders,
                                   features,
                                   adj_info,
                                   minibatch.deg,
                                   layer_infos=layer_infos,
                                   aggregator_type="meanpool",
                                   model_size=FLAGS.model_size,
                                   identity_dim=FLAGS.identity_dim,
                                   logging=True)

    elif FLAGS.model == 'n2v':  # node2vec
        model = Node2VecModel(placeholders, features.shape[0],
                              minibatch.deg,
                              # 2x because graphsage uses concat
                              nodevec_dim=2 * FLAGS.dim_1,
                              lr=FLAGS.learning_rate)
    else:
        raise Exception('Error: model name unrecognized.')

    config = tf.ConfigProto(log_device_placement=FLAGS.log_device_placement)
    config.gpu_options.allow_growth = True  # 使用allow_growth option，刚一开始分配少量的GPU容量，然后按需慢慢的增加

    # 设置每个GPU应该拿出多少容量给进程使用，
    # per_process_gpu_memory_fraction =0.4代表 40%
    # config.gpu_options.per_process_gpu_memory_fraction = GPU_MEM_FRACTION


    # 自动选择运行设备
    # 在tf中，通过命令 "with tf.device('/cpu:0'):",允许手动设置操作运行的设备
    # 如果手动设置的设备不存在或者不可用，就会导致tf程序等待或异常，
    # 为了防止这种情况，可以设置tf.ConfigProto()中参数allow_soft_placement=True，
    # 允许tf自动选择一个存在并且可用的设备来运行操作。
    config.allow_soft_placement = True  # 如果指定的设备不存在，允许TF自动分配设备


    # Initialize session
    sess = tf.Session(config=config)


    # tf.summary()能够保存训练过程以及参数分布图并在tensorboard显示。
    # merge_all 可以将所有summary全部保存到磁盘，以便tensorboard显示。
    # 如果没有特殊要求，一般用这一句就可一显示训练时的各种信息了
    merged = tf.summary.merge_all()



    # 指定一个文件用来保存图
    # 格式：tf.summary.FileWritter(path,sess.graph)
    # 可以调用其add_summary（）方法将训练过程数据保存在filewriter指定的文件中
    summary_writer = tf.summary.FileWriter(log_dir(), sess.graph)


    # Init variables
    sess.run(tf.global_variables_initializer(), feed_dict={adj_info_ph: minibatch.adj})

    # Train model

    train_shadow_mrr = None
    shadow_mrr = None

    total_steps = 0
    avg_time = 0.0
    epoch_val_costs = []

    train_adj_info = tf.assign(adj_info, minibatch.adj)
    val_adj_info = tf.assign(adj_info, minibatch.test_adj)
    for epoch in range(FLAGS.epochs):
        minibatch.shuffle()

        iter = 0
        print('Epoch: %04d' % (epoch + 1))
        epoch_val_costs.append(0)
        while not minibatch.end():
            # Construct feed dictionary
            feed_dict = minibatch.next_minibatch_feed_dict()
            feed_dict.update({placeholders['dropout']: FLAGS.dropout})

            t = time.time()
            # Training step
            outs = sess.run([merged, model.opt_op, model.loss, model.ranks, model.aff_all,
                             model.mrr, model.outputs1], feed_dict=feed_dict)
            train_cost = outs[2]
            train_mrr = outs[5]
            if train_shadow_mrr is None:
                train_shadow_mrr = train_mrr  #
            else:
                train_shadow_mrr -= (1 - 0.99) * (train_shadow_mrr - train_mrr)

            if iter % FLAGS.validate_iter == 0:
                # Validation
                sess.run(val_adj_info.op)
                val_cost, ranks, val_mrr, duration = evaluate(sess, model, minibatch, size=FLAGS.validate_batch_size)
                sess.run(train_adj_info.op)
                epoch_val_costs[-1] += val_cost
            if shadow_mrr is None:
                shadow_mrr = val_mrr
            else:
                shadow_mrr -= (1 - 0.99) * (shadow_mrr - val_mrr)

            if total_steps % FLAGS.print_every == 0:
                summary_writer.add_summary(outs[0], total_steps)

            # Print results
            avg_time = (avg_time * total_steps + time.time() - t) / (total_steps + 1)

            if total_steps % FLAGS.print_every == 0:
                print("Iter:", '%04d' % iter,
                      "train_loss=", "{:.5f}".format(train_cost),
                      "train_mrr=", "{:.5f}".format(train_mrr),
                      "train_mrr_ema=", "{:.5f}".format(train_shadow_mrr),  # exponential moving average，指数滑动平均EMA
                      "val_loss=", "{:.5f}".format(val_cost),
                      "val_mrr=", "{:.5f}".format(val_mrr),
                      "val_mrr_ema=", "{:.5f}".format(shadow_mrr),  # exponential moving average
                      "time=", "{:.5f}".format(avg_time))

            iter += 1
            total_steps += 1

            if total_steps > FLAGS.max_total_steps:
                break

        if total_steps > FLAGS.max_total_steps:
            break

    print("Optimization Finished!")
    if FLAGS.save_embeddings:  # 训练以后是否存储节点的embeddings
        sess.run(val_adj_info.op)

        save_val_embeddings(sess, model, minibatch, FLAGS.validate_batch_size, log_dir())

        if FLAGS.model == "n2v":
            # stopping the gradient for the already trained nodes
            train_ids = tf.constant(
                [[id_map[n]] for n in G.nodes_iter() if not G.node[n]['val'] and not G.node[n]['test']],
                dtype=tf.int32)
            test_ids = tf.constant([[id_map[n]] for n in G.nodes_iter() if G.node[n]['val'] or G.node[n]['test']],
                                   dtype=tf.int32)
            update_nodes = tf.nn.embedding_lookup(model.context_embeds, tf.squeeze(test_ids))
            no_update_nodes = tf.nn.embedding_lookup(model.context_embeds, tf.squeeze(train_ids))
            update_nodes = tf.scatter_nd(test_ids, update_nodes, tf.shape(model.context_embeds))
            no_update_nodes = tf.stop_gradient(
                tf.scatter_nd(train_ids, no_update_nodes, tf.shape(model.context_embeds)))
            model.context_embeds = update_nodes + no_update_nodes
            sess.run(model.context_embeds)

            # run random walks
            from graphsage.utils import run_random_walks
            nodes = [n for n in G.nodes_iter() if G.node[n]["val"] or G.node[n]["test"]]
            start_time = time.time()
            pairs = run_random_walks(G, nodes, num_walks=50)
            walk_time = time.time() - start_time

            test_minibatch = EdgeMinibatchIterator(G,
                                                   id_map,
                                                   placeholders, batch_size=FLAGS.batch_size,
                                                   max_degree=FLAGS.max_degree,
                                                   num_neg_samples=FLAGS.neg_sample_size,
                                                   context_pairs=pairs,
                                                   n2v_retrain=True,
                                                   fixed_n2v=True)

            start_time = time.time()
            print("Doing test training for n2v.")
            test_steps = 0
            for epoch in range(FLAGS.n2v_test_epochs):
                test_minibatch.shuffle()
                while not test_minibatch.end():
                    feed_dict = test_minibatch.next_minibatch_feed_dict()
                    feed_dict.update({placeholders['dropout']: FLAGS.dropout})
                    outs = sess.run([model.opt_op, model.loss, model.ranks, model.aff_all,
                                     model.mrr, model.outputs1], feed_dict=feed_dict)
                    if test_steps % FLAGS.print_every == 0:
                        print("Iter:", '%04d' % test_steps,
                              "train_loss=", "{:.5f}".format(outs[1]),
                              "train_mrr=", "{:.5f}".format(outs[-2]))
                    test_steps += 1
            train_time = time.time() - start_time
            save_val_embeddings(sess, model, minibatch, FLAGS.validate_batch_size, log_dir(), mod="-test")
            print("Total time: ", train_time + walk_time)
            print("Walk time: ", walk_time)
            print("Train time: ", train_time)


# main函数，加载数据并训练
def main(argv=None):
    print("Loading training data..")
    train_data = load_data(FLAGS.train_prefix, load_walks=True)
    print("Done loading training data..")
    train(train_data)


if __name__ == '__main__':
    tf.app.run()
# tf.app.run()的作用：通过处理flag解析，然后执行main函数
# 如果代码中的入口函数不叫main()，而是一个其他名字的函数，如test()，则应该这样写入口tf.app.run(test)
# 如果代码中的入口函数叫main()，则可以把入口写成tf.app.run()

`inits.py`

glorot初始化方法：它为了保证前向传播和反向传播时每一层的方差一致:在正向传播时，每层的激活值的方差保持不变；在反向传播时，每层的梯度值的方差保持不变。根据每层的输入个数和输出个数来决定参数随机初始化的分布范围，是一个通过该层的输入和输出参数个数得到的分布范围内的均匀分布。
(推导见：https://blog.csdn.net/yyl424525/article/details/100823398#4_Xavier_21)

这部分和GCN中的一样：

import tensorflow as tf
import numpy as np

#产生一个维度为shape的Tensor，值分布在（-0.005-0.005）之间，且为均匀分布
def uniform(shape, scale=0.05, name=None):
    """Uniform init."""
    initial = tf.random_uniform(shape, minval=-scale, maxval=scale, dtype=tf.float32)
    return tf.Variable(initial, name=name)


def glorot(shape, name=None):
    """Glorot & Bengio (AISTATS 2010) init."""
    #
    init_range = np.sqrt(6.0/(shape[0]+shape[1]))
    initial = tf.random_uniform(shape, minval=-init_range, maxval=init_range, dtype=tf.float32)
    return tf.Variable(initial, name=name)

#产生一个维度为shape，值全为1的Tensor
def zeros(shape, name=None):
    """All zeros."""
    initial = tf.zeros(shape, dtype=tf.float32)
    return tf.Variable(initial, name=name)

#产生一个维度为shape，值全为0的Tensor
def ones(shape, name=None):
    """All ones."""
    initial = tf.ones(shape, dtype=tf.float32)
    return tf.Variable(initial, name=name)

其他

`citation_eval.py`

`ppi_eval.py`

`reddit_eval.py`

参考

博客园 listenviolet 的GraphSAGE 代码解析系列

你可能感兴趣的:(深度学习)

机器学习与深度学习间关系与区别 ℒℴѵℯ心·动ꦿ໊ོ꫞ 人工智能学习深度学习 python
一、机器学习概述定义机器学习（MachineLearning,ML）是一种通过数据驱动的方法，利用统计学和计算算法来训练模型，使计算机能够从数据中学习并自动进行预测或决策。机器学习通过分析大量数据样本，识别其中的模式和规律，从而对新的数据进行判断。其核心在于通过训练过程，让模型不断优化和提升其预测准确性。主要类型1.监督学习（SupervisedLearning）监督学习是指在训练数据集中包含输入
将cmd中命令输出保存为txt文本文件落难Coder Windows cmd window
最近深度学习本地的训练中我们常常要在命令行中运行自己的代码，无可厚非，我们有必要保存我们的炼丹结果，但是复制命令行输出到txt是非常麻烦的，其实Windows下的命令行为我们提供了相应的操作。其基本的调用格式就是：运行指令>输出到的文件名称或者具体保存路径测试下，我打开cmd并且ping一下百度：pingwww.baidu.com>./data.txt看下相同目录下data.txt的输出：如果你再
推荐3家毕业AI论文可五分钟一键生成！文末附免费教程！小猪包333 写论文人工智能 AI写作深度学习计算机视觉
在当前的学术研究和写作领域，AI论文生成器已经成为许多研究人员和学生的重要工具。这些工具不仅能够帮助用户快速生成高质量的论文内容，还能进行内容优化、查重和排版等操作。以下是三款值得推荐的AI论文生成器：千笔-AIPassPaper、懒人论文以及AIPaperPass。千笔-AIPassPaper千笔-AIPassPaper是一款基于深度学习和自然语言处理技术的AI写作助手，旨在帮助用户快速生成高质
AI大模型的架构演进与最新发展季风泯灭的季节 AI大模型应用技术二人工智能架构
随着深度学习的发展，AI大模型（LargeLanguageModels,LLMs）在自然语言处理、计算机视觉等领域取得了革命性的进展。本文将详细探讨AI大模型的架构演进，包括从Transformer的提出到GPT、BERT、T5等模型的历史演变，并探讨这些模型的技术细节及其在现代人工智能中的核心作用。一、基础模型介绍：Transformer的核心原理Transformer架构的背景在Transfo
[实践应用] 深度学习之模型性能评估指标 YuanDaima2048 深度学习工具使用深度学习人工智能损失函数性能评估 pytorch python 机器学习
文章总览：YuanDaiMa2048博客文章总览深度学习之模型性能评估指标分类任务回归任务排序任务聚类任务生成任务其他介绍在机器学习和深度学习领域，评估模型性能是一项至关重要的任务。不同的学习任务需要不同的性能指标来衡量模型的有效性。以下是对一些常见任务及其相应的性能评估指标的详细解释和总结。分类任务分类任务是指模型需要将输入数据分配到预定义的类别或标签中。以下是分类任务中常用的性能指标：准确率(
[实践应用] 深度学习之优化器 YuanDaima2048 深度学习工具使用 pytorch 深度学习人工智能机器学习 python 优化器
文章总览：YuanDaiMa2048博客文章总览深度学习之优化器1.随机梯度下降（SGD）2.动量优化（Momentum）3.自适应梯度（Adagrad）4.自适应矩估计（Adam）5.RMSprop总结其他介绍在深度学习中，优化器用于更新模型的参数，以最小化损失函数。常见的优化函数有很多种，下面是几种主流的优化器及其特点、原理和PyTorch实现：1.随机梯度下降（SGD）原理:随机梯度下降通过
生成式地图制图 Bwywb_3 深度学习机器学习深度学习生成对抗网络
生成式地图制图（GenerativeCartography）是一种利用生成式算法和人工智能技术自动创建地图的技术。它结合了传统的地理信息系统（GIS）技术与现代生成模型（如深度学习、GANs等），能够根据输入的数据自动生成符合需求的地图。这种方法在城市规划、虚拟环境设计、游戏开发等多个领域具有应用前景。主要特点：自动化生成：通过算法和模型，系统能够根据输入的地理或空间数据自动生成地图，而无需人工逐
吴恩达深度学习笔记(30)-正则化的解释极客Array
正则化（Regularization）深度学习可能存在过拟合问题——高方差，有两个解决方法，一个是正则化，另一个是准备更多的数据，这是非常可靠的方法，但你可能无法时时刻刻准备足够多的训练数据或者获取更多数据的成本很高，但正则化通常有助于避免过拟合或减少你的网络误差。如果你怀疑神经网络过度拟合了数据，即存在高方差问题，那么最先想到的方法可能是正则化，另一个解决高方差的方法就是准备更多数据，这也是非常
个人学习笔记7-6：动手学深度学习pytorch版-李沐浪子L 深度学习深度学习笔记计算机视觉 python 人工智能神经网络 pytorch
#人工智能##深度学习##语义分割##计算机视觉##神经网络#计算机视觉13.11全卷积网络全卷积网络（fullyconvolutionalnetwork，FCN）采用卷积神经网络实现了从图像像素到像素类别的变换。引入l转置卷积（transposedconvolution）实现的，输出的类别预测与输入图像在像素级别上具有一一对应关系：通道维的输出即该位置对应像素的类别预测。13.11.1构造模型下
深度学习-点击率预估-研究论文2024-09-14速读 sp_fyf_2024 深度学习人工智能
深度学习-点击率预估-研究论文2024-09-14速读1.DeepTargetSessionInterestNetworkforClick-ThroughRatePredictionHZhong,JMa,XDuan,SGu,JYao-2024InternationalJointConferenceonNeuralNetworks,2024深度目标会话兴趣网络用于点击率预测摘要：这篇文章提出了一种新
损失函数与反向传播 Star_. PyTorch pytorch 深度学习 python
损失函数定义与作用损失函数(lossfunction)在深度学习领域是用来计算搭建模型预测的输出值和真实值之间的误差。1.损失函数越小越好2.计算实际输出与目标之间的差距3.为更新输出提供依据（反向传播)常见的损失函数回归常见的损失函数有：均方差（MeanSquaredError，MSE）、平均绝对误差（MeanAbsoluteErrorLoss，MAE）、HuberLoss是一种将MSE与MAE
【深度学习】训练过程中一个OOM的问题，太难查了 weixin_40293999 深度学习深度学习人工智能
现象：各位大佬又遇到过ubuntu的这个问题么？现象是在训练过程中，ssh上不去了，能ping通，没死机，但是ubunutu的pc侧的显示器，鼠标啥都不好用了。只能重启。问题原因：OOM了95G，尼玛！！！！pytorch爆内存了，然后journald假死了，在journald被watchdog干掉之后，系统就崩溃了。这种规模的爆内存一般，即使被oomkill了，也要卡半天的，确实会这样，能不能配
云服务业界动态简报-20180128 Captain7
一、青云青云QingCloud推出深度学习平台DeepLearningonQingCloud，包含了主流的深度学习框架及数据科学工具包，通过QingCloudAppCenter一键部署交付，可以让算法工程师和数据科学家快速构建深度学习开发环境，将更多的精力放在模型和算法调优。二、腾讯云1.腾讯云正式发布腾讯专有云TCE(TencentCloudEnterprise)矩阵，涵盖企业版、大数据版、AI
机器学习VS深度学习 nfgo 机器学习
机器学习（MachineLearning,ML）和深度学习（DeepLearning,DL）是人工智能（AI）的两个子领域，它们有许多相似之处，但在技术实现和应用范围上也有显著区别。下面从几个方面对两者进行区分：1.概念层面机器学习：是让计算机通过算法从数据中自动学习和改进的技术。它依赖于手动设计的特征和数学模型来进行学习，常用的模型有决策树、支持向量机、线性回归等。深度学习：是机器学习的一个子领
大数据毕业设计hadoop+spark+hive知识图谱租房数据分析可视化大屏租房推荐系统 58同城租房爬虫房源推荐系统房价预测系统计算机毕业设计机器学习深度学习人工智能 2401_84572577 程序员大数据 hadoop 人工智能
做了那么多年开发，自学了很多门编程语言，我很明白学习资源对于学一门新语言的重要性，这些年也收藏了不少的Python干货，对我来说这些东西确实已经用不到了，但对于准备自学Python的人来说，或许它就是一个宝藏，可以给你省去很多的时间和精力。别在网上瞎学了，我最近也做了一些资源的更新，只要你是我的粉丝，这期福利你都可拿走。我先来介绍一下这些东西怎么用，文末抱走。（1）Python所有方向的学习路线（
深度学习-13-小语言模型之SmolLM的使用皮皮冰燃深度学习深度学习
文章附录1SmolLM概述1.1SmolLM简介1.2下载模型2运行2.1在CPU/GPU/多GPU上运行模型2.2使用torch.bfloat162.3通过位和字节的量化版本3应用示例4问题及解决4.1attention_mask和pad_token_id报错4.2max_new_tokens=205参考附录1SmolLM概述1.1SmolLM简介SmolLM是一系列尖端小型语言模型，提供三种规
基于深度学习的农作物病害检测 SEU-WYL 深度学习dnn 深度学习人工智能
基于深度学习的农作物病害检测利用卷积神经网络（CNN）、生成对抗网络（GAN）、Transformer等深度学习技术，自动识别和分类农作物的病害，帮助农业工作者提高作物管理效率、减少损失。1.农作物病害检测的挑战病害种类繁多：农作物病害的类型多样，不同病害在同一作物上的表现差异很大，同时同一种病害在不同生长阶段的症状也可能不同。环境影响：天气、光照、湿度等外部环境因素会影响农作物的表现，使得病害检
基于深度学习的文本引导的图像编辑 SEU-WYL 深度学习dnn 深度学习人工智能
基于深度学习的文本引导的图像编辑（Text-GuidedImageEditing）是一种通过自然语言文本指令对图像进行编辑或修改的技术。它结合了图像生成和自然语言处理（NLP）的最新进展，使用户能够通过描述性文本对图像内容进行精确的调整和操控。1.文本引导的图像编辑的挑战文本和图像之间的对齐：如何将文本中的语义信息准确地映射到图像中的特定区域或元素是一个关键挑战。这涉及到多模态数据的对齐和理解。编
深度学习--对抗生成网络（GAN, Generative Adversarial Network） Ambition_LAO 深度学习生成对抗网络
对抗生成网络（GAN,GenerativeAdversarialNetwork）是一种深度学习模型，由IanGoodfellow等人在2014年提出。GAN主要用于生成数据，通过两个神经网络相互对抗，来生成以假乱真的新数据。以下是对GAN的详细阐述，包括其概念、作用、核心要点、实现过程、代码实现和适用场景。1.概念GAN由两个神经网络组成：生成器（Generator）和判别器（Discrimina
深度学习：怎么看pth文件的参数奥利给少年深度学习人工智能
.pth文件是PyTorch模型的权重文件，它通常包含了训练好的模型的参数。要查看或使用这个文件，你可以按照以下步骤操作：1.确保你有模型的定义你需要有创建这个.pth文件时所用的模型的代码。这意味着你需要有模型的类定义和架构。2.加载模型权重使用PyTorch的load_state_dict方法来加载权重。这里是如何操作的：importtorchimporttorch.nnasnn#定义模型结构
chatgpt赋能python：如何在Python中安装Keras库？ turensu ChatGpt python chatgpt keras 计算机
如何在Python中安装Keras库？Keras是一个简单易用的神经网络库，由FrançoisChollet编写。它在Python编程语言中实现了深度学习的功能，可以使您更轻松地构建和试验不同类型的神经网络。如果您是一名Python开发人员，肯定会想知道如何在您的Python项目中安装Keras库。在本文中，我们将向您展示如何安装和配置Keras库。步骤1：安装Python要使用Keras库，您需
如何理解深度学习的训练过程奋斗的草莓熊深度学习人工智能 python scikit-learn virtualenv numpy pandas
文章目录1.训练是干什么？2.预训练模型进行训练，主要更改的是预训练模型的什么东西？1.训练是干什么？以yolov5为例子，训练的目的是把一组输入猫狗图像放到神经网络中，得到一个输出模型，这个模型下次可以直接用来识别哪个是猫，哪个是狗2.预训练模型进行训练，主要更改的是预训练模型的什么东西？超参数（Hyperparameters）：这是模型结构中定义的参数，比如：卷积核大小（kernel_size
Keras深度学习框架入门及实战指南司莹嫣Maude
Keras深度学习框架入门及实战指南keraskeras-team/keras:是一个基于Python的深度学习库，它没有使用数据库。适合用于深度学习任务的开发和实现，特别是对于需要使用Python深度学习库的场景。特点是深度学习库、Python、无数据库。项目地址:https://gitcode.com/gh_mirrors/ke/keras一、项目介绍Keras简介Keras是一款高级神经网络
深度学习驱动的车牌识别：技术演进与未来挑战逼子歌深度学习车牌识别神经网络字符识别 YOLO 卷积神经网络
一、引言1.1研究背景在当今社会，智能交通系统的发展日益重要，而车牌识别作为其关键组成部分，发挥着至关重要的作用。车牌识别技术广泛应用于交通管理、停车场管理、安防监控等领域。在交通管理中，它可以用于车辆识别、交通违法监控和车流统计等，提高交通管理的效率和准确性。在停车场管理中，实现车辆的自动识别和收费，提升管理和服务水平。在安防监控领域，可用于追踪嫌疑人及犯罪行为。深度学习的出现为车牌识别带来了重
每天五分钟玩转深度学习PyTorch：模型参数优化器torch.optim 幻风_huanfeng 深度学习框架pytorch 深度学习 pytorch 人工智能神经网络机器学习优化算法
本文重点在机器学习或者深度学习中，我们需要通过修改参数使得损失函数最小化(或最大化)，优化算法就是一种调整模型参数更新的策略。在pytorch中定义了优化器optim，我们可以使用它调用封装好的优化算法，然后传递给它神经网络模型参数，就可以对模型进行优化。本文是学习第6步(优化器)，参考链接pytorch的学习路线随机梯度下降算法在深度学习和机器学习中，梯度下降算法是最常用的参数更新方法，它的公式
什么是AIGC？有哪些免费工具？ chent_某位 AIGC
AIGC（AIGeneratedContent），即“人工智能生成内容”，是指通过人工智能技术自动生成各种类型的数字内容。AIGC让机器能够根据输入的信息或数据生成符合人类需求的文本、图像、音频、视频等内容，极大提高了内容创作的效率。AIGC的背景与起源随着深度学习和自然语言处理技术的快速发展，人工智能已经不再局限于简单的任务，如分类、预测和数据分析，而是具备了生成内容的能力。生成式AI模型，如O
transformer架构(Transformer Architecture)原理与代码实战案例讲解 AI架构设计之禅大数据AI人工智能 Python入门实战计算科学神经计算深度学习神经网络大数据人工智能大型语言模型 AI AGI LLM Java Python 架构设计 Agent RPA
transformer架构(TransformerArchitecture)原理与代码实战案例讲解关键词：Transformer,自注意力机制,编码器-解码器,预训练,微调,NLP,机器翻译作者：禅与计算机程序设计艺术/ZenandtheArtofComputerProgramming1.背景介绍1.1问题的由来自然语言处理（NLP）领域的发展经历了从规则驱动到统计驱动再到深度学习驱动的三个阶段。
如何有效的学习AI大模型？ Python程序员罗宾学习人工智能语言模型自然语言处理架构
学习AI大模型是一个系统性的过程，涉及到多个学科的知识。以下是一些建议，帮助你更有效地学习AI大模型：基础知识储备：数学基础：学习线性代数、概率论、统计学和微积分等，这些是理解机器学习算法的数学基础。编程技能：掌握至少一种编程语言，如Python，因为大多数AI模型都是用Python实现的。理论学习：机器学习基础：了解监督学习、非监督学习、强化学习等基本概念。深度学习：学习神经网络的基本结构，如卷
【深度学习】【OnnxRuntime】【Python】模型转化、环境搭建以及模型部署的详细教程牙牙要健康深度学习 onnx onnxruntime 深度学习 python 人工智能
【深度学习】【OnnxRuntime】【Python】模型转化、环境搭建以及模型部署的详细教程提示:博主取舍了很多大佬的博文并亲测有效,分享笔记邀大家共同学习讨论文章目录【深度学习】【OnnxRuntime】【Python】模型转化、环境搭建以及模型部署的详细教程前言模型转换--pytorch转onnxWindows平台搭建依赖环境onnxruntime调用onnx模型ONNXRuntime推理核
基于深度学习的多模态信息检索 SEU-WYL 深度学习dnn 深度学习人工智能
基于深度学习的多模态信息检索（MultimodalInformationRetrieval,MMIR）是指利用深度学习技术，从包含多种模态（如文本、图像、视频、音频等）的数据集中检索出满足用户查询意图的相关信息。这种方法不仅可以处理单一模态的数据，还可以在多种模态之间建立关联，从而更准确地满足用户需求。1.多模态信息检索的挑战异构数据表示：多模态数据通常具有不同的特征和表示形式（如文本的词嵌入与图
[黑洞与暗粒子]没有光的世界 comsci
无论是相对论还是其它现代物理学,都显然有个缺陷,那就是必须有光才能够计算但是,我相信,在我们的世界和宇宙平面中,肯定存在没有光的世界.... 那么,在没有光的世界,光子和其它粒子的规律无法被应用和考察,那么以光速为核心的 &nbs
jQuery Lazy Load 图片延迟加载 aijuans jquery
基于 jQuery 的图片延迟加载插件，在用户滚动页面到图片之后才进行加载。对于有较多的图片的网页，使用图片延迟加载，能有效的提高页面加载速度。版本： jQuery v1.4.4+ jQuery Lazy Load v1.7.2 注意事项：需要真正实现图片延迟加载，必须将真实图片地址写在 data-original 属性中。若 src
使用Jodd的优点 Kai_Ge jodd
1. 简化和统一 controller ，抛弃 extends SimpleFormController ，统一使用 implements Controller 的方式。 2. 简化 JSP 页面的 bind, 不需要一个字段一个字段的绑定。 3. 对 bean 没有任何要求，可以使用任意的 bean 做为 formBean。使用方法简介
jpa Query转hibernate Query 120153216 Hibernate
public List<Map> getMapList(String hql, Map map) { org.hibernate.Query jpaQuery = entityManager.createQuery(hql); if (null != map) { for (String parameter : map.keySet()) { jp
Django_Python3添加MySQL/MariaDB支持 2002wmj mariaDB
现状首先，[email protected] 中默认的引擎为 django.db.backends.mysql 。但是在Python3中如果这样写的话，会发现 django.db.backends.mysql 依赖 MySQLdb[5] ，而 MySQLdb 又不兼容 Python3 于是要找一种新的方式来继续使用MySQL。 MySQL官方的方案首先据MySQL文档[3]说，自从MySQL
在SQLSERVER中查找消耗IO最多的SQL 357029540 SQL Server
返回做IO数目最多的50条语句以及它们的执行计划。 select top 50 (total_logical_reads/execution_count) as avg_logical_reads, (total_logical_writes/execution_count) as avg_logical_writes, (tot
spring UnChecked 异常官方定义！ 7454103 spring
如果你接触过spring的事物管理！那么你必须明白 spring的非捕获异常！即 unchecked 异常！因为 spring 默认这类异常事物自动回滚！！ public static boolean isCheckedException(Throwable ex) { return !(ex instanceof RuntimeExcep
mongoDB 入门指南、示例 adminjun java mongodb 操作
一、准备工作 1、下载mongoDB 下载地址：http://www.mongodb.org/downloads 选择合适你的版本相关文档：http://www.mongodb.org/display/DOCS/Tutorial 2、安装mongoDB A、不解压模式：将下载下来的mongoDB-xxx.zip打开，找到bin目录，运行mongod.exe就可以启动服务，默
CUDA 5 Release Candidate Now Available aijuans CUDA
The CUDA 5 Release Candidate is now available at http://developer.nvidia.com/<wbr></wbr>cuda/cuda-pre-production. Now applicable to a broader set of algorithms, CUDA 5 has advanced fe
Essential Studio for WinRT网格控件测评 Axiba JavaScript html5
Essential Studio for WinRT界面控件包含了商业平板应用程序开发中所需的所有控件，如市场上运行速度最快的grid 和chart、地图、RDL报表查看器、丰富的文本查看器及图表等等。同时，该控件还包含了一组独特的库，用于从WinRT应用程序中生成Excel、Word以及PDF格式的文件。此文将对其另外一个强大的控件——网格控件进行专门的测评详述。网格控件功能 1、
java 获取windows系统安装的证书或证书链 bewithme windows
有时需要获取windows系统安装的证书或证书链，比如说你要通过证书来创建java的密钥库。有关证书链的解释可以查看此处。 public static void main(String[] args) { SunMSCAPI providerMSCAPI = new SunMSCAPI(); S
NoSQL数据库之Redis数据库管理(set类型和zset类型) bijian1013 redis 数据库 NoSQL
4.sets类型 Set是集合，它是string类型的无序集合。set是通过hash table实现的，添加、删除和查找的复杂度都是O(1)。对集合我们可以取并集、交集、差集。通过这些操作我们可以实现sns中的好友推荐和blog的tag功能。 sadd：向名称为key的set中添加元
异常捕获何时用Exception，何时用Throwable bingyingao
用Exception的情况 try { //可能发生空指针、数组溢出等异常 } catch (Exception e) {
【Kafka四】Kakfa伪分布式安装 bit1129 kafka
在http://bit1129.iteye.com/blog/2174791一文中，实现了单Kafka服务器的安装，在Kafka中，每个Kafka服务器称为一个broker。本文简单介绍下，在单机环境下Kafka的伪分布式安装和测试验证 1. 安装步骤 Kafka伪分布式安装的思路跟Zookeeper的伪分布式安装思路完全一样，不过比Zookeeper稍微简单些(不
Project Euler bookjovi haskell
Project Euler是个数学问题求解网站，网站设计的很有意思，有很多problem，在未提交正确答案前不能查看problem的overview，也不能查看关于problem的discussion thread，只能看到现在problem已经被多少人解决了，人数越多往往代表问题越容易。看看problem 1吧： Add all the natural num
Java-Collections Framework学习与总结-ArrayDeque BrokenDreams Collections
表、栈和队列是三种基本的数据结构，前面总结的ArrayList和LinkedList可以作为任意一种数据结构来使用，当然由于实现方式的不同，操作的效率也会不同。这篇要看一下java.util.ArrayDeque。从命名上看
读《研磨设计模式》-代码笔记-装饰模式-Decorator bylijinnan java 设计模式
声明：本文只为方便我个人查阅和理解，详细的分析以及源代码请移步原作者的博客http://chjavach.iteye.com/ import java.io.BufferedOutputStream; import java.io.DataOutputStream; import java.io.FileOutputStream; import java.io.Fi
Maven学习(一) chenyu19891124 Maven私服
学习一门技术和工具总得花费一段时间，5月底6月初自己学习了一些工具，maven+Hudson+nexus的搭建，对于maven以前只是听说，顺便再自己的电脑上搭建了一个maven环境，但是完全不了解maven这一强大的构建工具，还有ant也是一个构建工具，但ant就没有maven那么的简单方便，其实简单点说maven是一个运用命令行就能完成构建，测试，打包，发布一系列功
[原创]JWFD工作流引擎设计----节点匹配搜索算法(用于初步解决条件异步汇聚问题) 补充 comsci 算法工作 PHP 搜索引擎嵌入式
本文主要介绍在JWFD工作流引擎设计中遇到的一个实际问题的解决方案，请参考我的博文"带条件选择的并行汇聚路由问题"中图例A2描述的情况(http://comsci.iteye.com/blog/339756),我现在把我对图例A2的一个解决方案公布出来，请大家多指点节点匹配搜索算法(用于解决标准对称流程图条件汇聚点运行控制参数的算法) 需要解决的问题：已知分支
Linux中用shell获取昨天、明天或多天前的日期 daizj linux shell 上几年昨天获取上几个月
在Linux中可以通过date命令获取昨天、明天、上个月、下个月、上一年和下一年 # 获取昨天 date -d 'yesterday' # 或 date -d 'last day' # 获取明天 date -d 'tomorrow' # 或 date -d 'next day' # 获取上个月 date -d 'last month' #
我所理解的云计算 dongwei_6688 云计算
在刚开始接触到一个概念时，人们往往都会去探寻这个概念的含义，以达到对其有一个感性的认知，在Wikipedia上关于“云计算”是这么定义的，它说： Cloud computing is a phrase used to describe a variety of computing co
YII CMenu配置 dcj3sjt126com yii
Adding id and class names to CMenu We use the id and htmlOptions to accomplish this. Watch. //in your view $this->widget('zii.widgets.CMenu', array( 'id'=>'myMenu', 'items'=>$this-&g
设计模式之静态代理与动态代理 come_for_dream 设计模式
静态代理与动态代理代理模式是java开发中用到的相对比较多的设计模式，其中的思想就是主业务和相关业务分离。所谓的代理设计就是指由一个代理主题来操作真实主题，真实主题执行具体的业务操作，而代理主题负责其他相关业务的处理。比如我们在进行删除操作的时候需要检验一下用户是否登陆，我们可以删除看成主业务，而把检验用户是否登陆看成其相关业务
【转】理解Javascript 系列 gcc2ge JavaScript
理解Javascript_13_执行模型详解摘要: 在《理解Javascript_12_执行模型浅析》一文中,我们初步的了解了执行上下文与作用域的概念，那么这一篇将深入分析执行上下文的构建过程，了解执行上下文、函数对象、作用域三者之间的关系。函数执行环境简单的代码:当调用say方法时，第一步是创建其执行环境，在创建执行环境的过程中，会按照定义的先后顺序完成一系列操作:1.首先会创建一个
Subsets II hcx2013 set
Given a collection of integers that might contain duplicates, nums, return all possible subsets. Note: Elements in a subset must be in non-descending order. The solution set must not conta
Spring4.1新特性——Spring缓存框架增强 jinnianshilongnian spring4
目录 Spring4.1新特性——综述 Spring4.1新特性——Spring核心部分及其他 Spring4.1新特性——Spring缓存框架增强 Spring4.1新特性——异步调用和事件机制的异常处理 Spring4.1新特性——数据库集成测试脚本初始化 Spring4.1新特性——Spring MVC增强 Spring4.1新特性——页面自动化测试框架Spring MVC T
shell嵌套expect执行命令 liyonghui160com
一直都想把expect的操作写到bash脚本里,这样就不用我再写两个脚本来执行了,搞了一下午终于有点小成就,给大家看看吧. 系统:centos 5.x 1.先安装expect yum -y install expect 2.脚本内容: cat auto_svn.sh #!/bin/bash
Linux实用命令整理 pda158 linux
0. 基本命令　　linux 基本命令整理　　1. 压缩解压　　tar -zcvf a.tar.gz a #把a压缩成a.tar.gz 　　tar -zxvf a.tar.gz #把a.tar.gz解压成a 　　2. vim小结　　2.1 vim替换　　:m,ns/word_1/word_2/gc
独立开发人员通向成功的29个小贴士 shoothao 独立开发
概述：本文收集了关于独立开发人员通向成功需要注意的一些东西,对于具体的每个贴士的注解有兴趣的朋友可以查看下面标注的原文地址。明白你从事独立开发的原因和目的。保持坚持制定计划的好习惯。万事开头难，第一份订单是关键。培养多元化业务技能。提供卓越的服务和品质。谨小慎微。营销是必备技能。学会组织，有条理的工作才是最有效率的。 “独立
JAVA中堆栈和内存分配原理 uule java
1、栈、堆 1.寄存器：最快的存储区, 由编译器根据需求进行分配,我们在程序中无法控制.2. 栈：存放基本类型的变量数据和对象的引用，但对象本身不存放在栈中，而是存放在堆（new 出来的对象）或者常量池中（字符串常量对象存放在常量池中。）3. 堆：存放所有new出来的对象。4. 静态域：存放静态成员（static定义的）5. 常量池：存放字符串常量和基本类型常量（public static f

GraphSAGE NIPS 2017 代码分析（Tensorflow版）

文章目录

数据集

ppi数据集信息

toy-ppi-G.json 图的信息

toy-ppi-class_map.json

toy-ppi-id_map.json

toy-ppi-walks.txt

toy-ppi-feats.npy

实验要求

安装Docker与程序运行

运行

代码分析

__init__.py

utils.py

neigh_samplers.py

models.py

layers.py

minibatch.py

aggregators.py

prediction.py

supervised_train.py

unsupervised_train.py

inits.py

其他

citation_eval.py

ppi_eval.py

reddit_eval.py

参考