DeepWalk Configuration and Usage

Downloading the source code

https://github.com/phanein/deepwalk

Dataset definition

http://leitang.net/social_dimension.html

Core code

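    # Excerpt from the deepwalk entry point: G is the loaded graph, args the parsed command-line options.
    # Start number_walks truncated random walks of length walk_length from every node (alpha=0 means no restarts).
    walks = graph.build_deepwalk_corpus(G, num_paths=args.number_walks, path_length=args.walk_length, alpha=0, rand=random.Random(args.seed))

    print("Training...")

    # Treat every walk as a sentence and train Word2Vec on the corpus (gensim < 4.0 API; gensim 4.x renames size to vector_size).
    model = Word2Vec(walks, size=args.representation_size, window=args.window_size, min_count=0, workers=args.workers)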
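
To make the corpus-building step more concrete, here is a minimal pure-Python sketch of uniform truncated random walks. It is not the deepwalk library's implementation: the names `random_walk` and `build_corpus` and the toy adjacency dictionary are made up for illustration, and restarts (the alpha parameter) are left out.

```python
import random


def random_walk(adj, start, path_length, rand):
    """Return a uniform random walk of at most path_length nodes from start."""
    walk = [start]
    while len(walk) < path_length:
        neighbors = adj[walk[-1]]
        if not neighbors:
            break  # dead end: stop the walk early
        walk.append(rand.choice(neighbors))
    return walk


def build_corpus(adj, num_paths, path_length, rand):
    """Start num_paths walks from every node, visiting nodes in shuffled order."""
    nodes = list(adj)
    walks = []
    for _ in range(num_paths):
        rand.shuffle(nodes)
        for node in nodes:
            walks.append(random_walk(adj, node, path_length, rand))
    return walks


# Hypothetical toy graph: node -> list of neighbors (undirected).
toy_adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
walks = build_corpus(toy_adj, num_paths=2, path_length=5, rand=random.Random(0))
print(len(walks))  # 2 walks per node * 4 nodes = 8
```

Each walk plays the role of a sentence, which is how the excerpt above feeds the real corpus to Word2Vec.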

Installation

    cd deepwalk-master
    pip install -r requirements.txt
    python setup.py install

Reproducing the experimental results

1. BlogCatalog dataset

Generating the embeddings

    deepwalk --format mat --input example_graphs/blogcatalog.mat --max-memory-data-size 0 --number-walks 80 --representation-size 128 --walk-length 40 --window-size 10 --workers 1 --output example_graphs/blogcatalog.embeddings
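
The output file is assumed here to be in the plain-text word2vec format: a header line with the node count and the vector dimension, followed by one line per node holding its ID and the vector components. Under that assumption, a small reader could look like the sketch below. `read_embeddings` is a hypothetical helper, not part of deepwalk, and the sample data is invented.

```python
import io


def read_embeddings(stream):
    """Parse word2vec-style text embeddings into {node_id: [floats]}."""
    header = stream.readline().split()
    count, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in stream:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        values = [float(x) for x in parts[1:]]
        if len(values) != dim:
            raise ValueError("unexpected vector length for node " + parts[0])
        vectors[parts[0]] = values
    if len(vectors) != count:
        raise ValueError("header announces %d nodes, found %d" % (count, len(vectors)))
    return vectors


# Invented two-node, three-dimensional sample in the assumed format.
sample_file = io.StringIO("2 3\n1 0.1 0.2 0.3\n2 0.4 0.5 0.6\n")
vectors = read_embeddings(sample_file)
print(sorted(vectors))
```

For a real run, the `io.StringIO` object would be replaced by `open("example_graphs/blogcatalog.embeddings")`.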

Evaluation

    python example_graphs/scoring.py --emb example_graphs/blogcatalog.embeddings --network example_graphs/blogcatalog.mat --num-shuffles 10 --all

2. Karate dataset

Generating the embeddings

--format defaults to adjlist, so the flag can be omitted when the input is an .adjlist file.

    deepwalk --input example_graphs/karate.adjlist --max-memory-data-size 0 --number-walks 80 --representation-size 128 --walk-length 40 --window-size 10 --workers 1 --output example_graphs/karate.embeddings
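
For readers unfamiliar with the adjacency list format, the sketch below assumes one line per node: the node ID followed by the IDs of its neighbors, separated by whitespace. `parse_adjlist` is a hypothetical helper written for illustration, not the loader deepwalk itself uses, and it treats every edge as undirected.

```python
def parse_adjlist(lines):
    """Build {node: sorted neighbor list} from adjacency-list lines."""
    adj = {}
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        ids = [int(tok) for tok in tokens]
        node, neighbors = ids[0], ids[1:]
        adj.setdefault(node, set()).update(neighbors)
        for nb in neighbors:
            adj.setdefault(nb, set()).add(node)  # mirror the edge (undirected)
    return {node: sorted(nbs) for node, nbs in adj.items()}


# Invented four-node sample; a real file would be read with open(path).
sample_lines = ["1 2 3", "2 1", "3 1 4", "4 3"]
toy_graph = parse_adjlist(sample_lines)
print(toy_graph)  # {1: [2, 3], 2: [1], 3: [1, 4], 4: [3]}
```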

Evaluation

--network requires a .mat file; an .adjlist file is not accepted.

The options are as follows:

    usage: scoring [-h] --emb EMB --network NETWORK
                   [--adj-matrix-name ADJ_MATRIX_NAME]
                   [--label-matrix-name LABEL_MATRIX_NAME]
                   [--num-shuffles NUM_SHUFFLES] [--all]
     
    optional arguments:
      -h, --help            show this help message and exit
      --emb EMB             Embeddings file (default: None)
      --network NETWORK     A .mat file containing the adjacency matrix and node
                            labels of the input network. (default: None)
      --adj-matrix-name ADJ_MATRIX_NAME
                            Variable name of the adjacency matrix inside the .mat
                            file. (default: network)
      --label-matrix-name LABEL_MATRIX_NAME
                            Variable name of the labels matrix inside the .mat
                            file. (default: group)
      --num-shuffles NUM_SHUFFLES
                            Number of shuffles. (default: 2)
      --all                 The embeddings are evaluated on all training percents
                            from 10 to 90 when this flag is set to true. By
                            default, only training percents of 10, 50 and 90 are
                            used. (default: False)
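
The interplay between --num-shuffles and --all can be pictured as a nested loop: every shuffle of the nodes is split once per training percent. The sketch below only enumerates those splits and trains no classifier. It is an illustration under that assumption rather than the code in scoring.py, and `training_percents` and `split_runs` are hypothetical names.

```python
import random


def training_percents(evaluate_all):
    """10..90 in steps of 10 with --all, otherwise only 10, 50 and 90."""
    return list(range(10, 100, 10)) if evaluate_all else [10, 50, 90]


def split_runs(node_ids, num_shuffles, evaluate_all, seed=0):
    """Yield (shuffle index, percent, train ids, test ids) for every run."""
    rand = random.Random(seed)
    for shuffle_index in range(num_shuffles):
        order = list(node_ids)
        rand.shuffle(order)
        for percent in training_percents(evaluate_all):
            cut = len(order) * percent // 100
            yield shuffle_index, percent, order[:cut], order[cut:]


# 20 invented node IDs, 2 shuffles, default percents: 2 * 3 = 6 runs.
runs = list(split_runs(range(20), num_shuffles=2, evaluate_all=False))
print(len(runs))  # 6
```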
---------------------  
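Author: YizhuJiao  
Source: CSDN  
Original: https://blog.csdn.net/YizhuJiao/article/details/81095346  
Copyright notice: This is an original article by the blogger. Please include a link to the original post when reposting.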
