Configuring and running DeepWalk

Download the source code

https://github.com/phanein/deepwalk

Dataset definition

http://leitang.net/social_dimension.html

Core code

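# Generate number_walks truncated random walks of length walk_length from every node
walks = graph.build_deepwalk_corpus(G, num_paths=args.number_walks, path_length=args.walk_length, alpha=0, rand=random.Random(args.seed))

print("Training...")

# Treat each walk as a sentence; `size` is the gensim < 4.0 name (later releases call it `vector_size`)
model = Word2Vec(walks, size=args.representation_size, window=args.window_size, min_count=0, workers=args.workers)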
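To make the corpus step concrete, here is a minimal, self-contained sketch of what a truncated random walk corpus looks like. It is an illustration written for this note, not the library's implementation: the names `random_walk` and `build_corpus`, the dict-of-lists graph, and the toy 4-cycle are all assumptions, and the `alpha` restart probability is ignored.

```python
import random

def random_walk(adj, start, path_length, rand):
    """Return one truncated random walk of at most path_length nodes."""
    walk = [start]
    while len(walk) < path_length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rand.choice(neighbors))
    # Word2Vec consumes tokens, so nodes are emitted as strings
    return [str(node) for node in walk]

def build_corpus(adj, num_paths, path_length, rand):
    """Start num_paths walks from every node, visiting nodes in shuffled order."""
    walks = []
    nodes = list(adj)
    for _ in range(num_paths):
        rand.shuffle(nodes)
        for node in nodes:
            walks.append(random_walk(adj, node, path_length, rand))
    return walks

# Hypothetical toy graph: a 4-cycle 0-1-2-3-0
toy = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
corpus = build_corpus(toy, num_paths=2, path_length=5, rand=random.Random(0))
```

Each element of `corpus` plays the role of one "sentence" that the Word2Vec call is trained on.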

Installation

  • cd deepwalk-master
  • pip install -r requirements.txt
  • python setup.py install

Reproducing the experimental results

1. BlogCatalog dataset

Generate the embeddings

deepwalk --format mat --input example_graphs/blogcatalog.mat --max-memory-data-size 0 --number-walks 80 --representation-size 128 --walk-length 40 --window-size 10 --workers 1 --output example_graphs/blogcatalog.embeddings

Evaluation

python example_graphs/scoring.py --emb example_graphs/blogcatalog.embeddings --network example_graphs/blogcatalog.mat --num-shuffles 10 --all
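Before scoring, it can help to sanity-check the embeddings file. The sketch below assumes the file is in word2vec text format (a header line with the node count and the dimension, then one line per node: the node id followed by its vector). The function name `load_embeddings` and the sample data are made up for illustration.

```python
import io

def load_embeddings(handle):
    """Parse word2vec text format into {node_id: [floats]}."""
    header = handle.readline().split()
    count, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in handle:
        parts = line.split()
        if not parts:  # skip blank lines
            continue
        vector = [float(x) for x in parts[1:]]
        if len(vector) != dim:
            raise ValueError("bad vector length for node " + parts[0])
        vectors[parts[0]] = vector
    if len(vectors) != count:
        raise ValueError("header says %d nodes, found %d" % (count, len(vectors)))
    return vectors

# Hypothetical two-node, three-dimensional file content
sample = io.StringIO("2 3\n1 0.1 0.2 0.3\n2 -0.4 0.5 0.6\n")
emb = load_embeddings(sample)
```

For a real file, pass an open file object instead of the `io.StringIO` sample.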

2. Karate dataset

Generate the embeddings

--format defaults to adjlist, so the flag can be omitted for .adjlist input.

deepwalk --input example_graphs/karate.adjlist --max-memory-data-size 0 --number-walks 80 --representation-size 128 --walk-length 40 --window-size 10 --workers 1 --output example_graphs/karate.embeddings
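For reference, an adjacency list file has one line per node: the node id followed by the ids of its neighbors, separated by whitespace. Here is a minimal parser sketch under that assumption. `parse_adjlist` and the sample lines are hypothetical, treating the graph as undirected is a choice made for this sketch, and this is not the library's loader.

```python
def parse_adjlist(lines):
    """Build an undirected graph as {node: sorted neighbor list}."""
    adj = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        node, *neighbors = [int(tok) for tok in line.split()]
        adj.setdefault(node, set())
        for nb in neighbors:
            # record the edge in both directions (undirected assumption)
            adj[node].add(nb)
            adj.setdefault(nb, set()).add(node)
    return {node: sorted(nbs) for node, nbs in adj.items()}

# Hypothetical sample: node 4 lists no neighbors but is reachable from 3
sample = ["1 2 3", "2 1", "3 1 4", "4"]
graph = parse_adjlist(sample)
```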

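Evaluation

--network requires a .mat file, so the .adjlist file used above cannot be passed to the scoring script directly.

The options are as follows: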

usage: scoring [-h] --emb EMB --network NETWORK
               [--adj-matrix-name ADJ_MATRIX_NAME]
               [--label-matrix-name LABEL_MATRIX_NAME]
               [--num-shuffles NUM_SHUFFLES] [--all]

optional arguments:
  -h, --help            show this help message and exit
  --emb EMB             Embeddings file (default: None)
  --network NETWORK     A .mat file containing the adjacency matrix and node
                        labels of the input network. (default: None)
  --adj-matrix-name ADJ_MATRIX_NAME
                        Variable name of the adjacency matrix inside the .mat
                        file. (default: network)
  --label-matrix-name LABEL_MATRIX_NAME
                        Variable name of the labels matrix inside the .mat
                        file. (default: group)
  --num-shuffles NUM_SHUFFLES
                        Number of shuffles. (default: 2)
  --all                 The embeddings are evaluated on all training percents
                        from 10 to 90 when this flag is set to true. By
                        default, only training percents of 10, 50 and 90 are
                        used. (default: False)
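How --num-shuffles and --all interact can be sketched as follows. This illustrates the splitting logic described in the help text and is not the code in scoring.py: `training_percents` and `split` are hypothetical names, and the 20-node set is a toy example.

```python
import random

def training_percents(use_all):
    """Training fractions: 0.1..0.9 with --all, otherwise 0.1, 0.5, 0.9."""
    if use_all:
        return [i / 10 for i in range(1, 10)]
    return [0.1, 0.5, 0.9]

def split(items, train_fraction, rand):
    """Shuffle a copy of items and cut it into train and test parts."""
    shuffled = list(items)
    rand.shuffle(shuffled)
    cut = int(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

rand = random.Random(0)
nodes = range(20)
runs = []
for _ in range(2):  # corresponds to --num-shuffles 2
    for pct in training_percents(use_all=False):
        train, test = split(nodes, pct, rand)
        runs.append((pct, len(train), len(test)))
```

Each shuffle produces one train/test split per training fraction, so the number of evaluation runs is the number of shuffles times the number of fractions.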

 

 

 
