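Before installing GraphVite, make sure you have downloaded the GraphVite source code, installed CUDA (not needed for a CPU-only build), and installed faiss.
GraphVite source: https://github.com/DeepGraphLearning/graphvite
For faiss installation, see the previous post: (1) Building and installing GraphVite from source: installing faiss from source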
------------------------------------------------------------
Installation steps:
1. Download GraphVite v0.1.0 and extract it
wget https://github.com/DeepGraphLearning/graphvite/archive/v0.1.0.tar.gz
tar -xvf v0.1.0.tar.gz
2. Install the dependencies into the conda environment
cd graphvite-0.1.0
conda install -y --file conda/requirements.txt
3. Create a build directory for compilation
mkdir build && cd build
4. (Optional) Edit cmake/FindGFlags.cmake and cmake/FindGlog.cmake so that they point to the gflags and glog installed by conda install
cmake ..
vim ../cmake/FindGFlags.cmake
vim ../cmake/FindGlog.cmake
For example, in FindGFlags.cmake, add the conda include directory to the search paths:
if(WIN32)
find_path(GFLAGS_INCLUDE_DIR gflags/gflags.h
PATHS ${GFLAGS_ROOT_DIR}/src/windows)
else()
find_path(GFLAGS_INCLUDE_DIR gflags/gflags.h
PATHS ${GFLAGS_ROOT_DIR}
/home/deng/anaconda3/envs/graphvite/include) # add gflags include
endif()
4. Compile, option A (if you edited the Find*.cmake files above)
cmake -DFAISS_PATH=/home/deng/usr/local/faiss_gpu/lib -DCMAKE_INSTALL_PREFIX=$HOME/usr/local/graphvite_gpu ..
make
Warnings emitted during the build:
[ 33%] Building CUDA object src/CMakeFiles/graphvite.dir/graphvite.cu.o
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/cast.h(1003): warning: pointless comparison of unsigned integer with zero
detected during:
instantiation of "__nv_bool pybind11::detail::type_caster, void>>::load(pybind11::handle, __nv_bool) [with T=pybind11::detail::intrinsic_t>]"
(1927): here
instantiation of "__nv_bool pybind11::detail::argument_loader::load_impl_sequence(pybind11::detail::function_call &, pybind11::detail::index_sequence) [with Args=>, int, size_t>, Is=<0UL, 1UL, 2UL, 3UL>]"
(1907): here
instantiation of "__nv_bool pybind11::detail::argument_loader::load_args(pybind11::detail::function_call &) [with Args=>, int, size_t>]"
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/pybind11.h(141): here
instantiation of "void pybind11::cpp_function::initialize(Func &&, Return (*)(Args...), const Extra &...) [with Func=lambda [](pybind11::detail::value_and_holder &, std::vector>, int, size_t)->void, Return=void, Args=>, int, size_t>, Extra=]"
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/pybind11.h(72): here
instantiation of "pybind11::cpp_function::cpp_function(Func &&, const Extra &...) [with Func=lambda [](pybind11::detail::value_and_holder &, std::vector>, int, size_t)->void, Extra=, =void]"
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/pybind11.h(1112): here
instantiation of "pybind11::class_ &pybind11::class_::def(const char *, Func &&, const Extra &...) [with type_=graphvite::GraphSolver<128UL, float, unsigned int>, options=<>, Func=lambda [](pybind11::detail::value_and_holder &, std::vector>, int, size_t)->void, Extra=]"
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/detail/init.h(175): here
instantiation of "void pybind11::detail::initimpl::constructor::execute(Class &, const Extra &...) [with Args=>, int, size_t>, Class=pybind11::class_>, Extra=, =0]"
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/pybind11.h(1141): here
instantiation of "pybind11::class_ &pybind11::class_::def(const pybind11::detail::initimpl::constructor &, const Extra &...) [with type_=graphvite::GraphSolver<128UL, float, unsigned int>, options=<>, Args=>, int, size_t>, Extra=]"
/home/deng/project/graphvite-0.1.0/include/bind.h(432): here
instantiation of "pyGraphSolver::pyGraphSolver(pybind11::handle, const char *, const Args &...) [with dim=128UL, Float=float, Index=unsigned int, Args=<>]"
/home/deng/project/graphvite-0.1.0/src/graphvite.cu(53): here
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/cast.h(1003): warning: pointless comparison of unsigned integer with zero
detected during:
instantiation of "__nv_bool pybind11::detail::type_caster, void>>::load(pybind11::handle, __nv_bool) [with T=pybind11::detail::intrinsic_t>]"
(1927): here
instantiation of "__nv_bool pybind11::detail::argument_loader::load_impl_sequence(pybind11::detail::function_call &, pybind11::detail::index_sequence) [with Args=, Is=<0UL, 1UL>]"
(1907): here
instantiation of "__nv_bool pybind11::detail::argument_loader::load_args(pybind11::detail::function_call &) [with Args=]"
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/pybind11.h(141): here
instantiation of "void pybind11::cpp_function::initialize(Func &&, Return (*)(Args...), const Extra &...) [with Func=lambda [](pybind11::detail::value_and_holder &, unsigned int)->void, Return=void, Args=, Extra=]"
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/pybind11.h(72): here
instantiation of "pybind11::cpp_function::cpp_function(Func &&, const Extra &...) [with Func=lambda [](pybind11::detail::value_and_holder &, unsigned int)->void, Extra=, =void]"
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/pybind11.h(1112): here
instantiation of "pybind11::class_ &pybind11::class_::def(const char *, Func &&, const Extra &...) [with type_=DType, options=<>, Func=lambda [](pybind11::detail::value_and_holder &, unsigned int)->void, Extra=]"
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/detail/init.h(239): here
instantiation of "void pybind11::detail::initimpl::factory>>::execute(Class &, const Extra &...) && [with Func=lambda [](unsigned int)->DType, Return=DType, Args=, Class=pybind11::class_, Extra=<>]"
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/pybind11.h(1153): here
instantiation of "pybind11::class_ &pybind11::class_::def(pybind11::detail::initimpl::factory &&, const Extra &...) [with type_=DType, options=<>, Args=DType, pybind11::detail::void_type (*)(), DType (unsigned int), pybind11::detail::void_type ()>, Extra=<>]"
/home/deng/anaconda3/envs/graphvite/include/python3.7m/pybind11/pybind11.h(1564): here
instantiation of "pybind11::enum_::enum_(const pybind11::handle &, const char *, const Extra &...) [with Type=DType, Extra=<>]"
/home/deng/project/graphvite-0.1.0/src/graphvite.cu(79): here
4. Compile, option B (without editing the Find*.cmake files): pass the include directories on the command line
cmake -DFAISS_PATH=/home/deng/usr/local/faiss_gpu/lib -DCMAKE_INSTALL_PREFIX=$HOME/usr/local/graphvite_gpu -DGLOG_INCLUDE_DIR=/home/deng/anaconda3/envs/graphvite/include -DGFLAGS_INCLUDE_DIR=/home/deng/anaconda3/envs/graphvite/include ..
make
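(Correction: an earlier version of this post used -DFAISS_PATH=/home/deng/usr/local/faiss_gpu, which was wrong. According to CMakeLists.txt, FAISS_PATH must point to the lib directory, i.e. -DFAISS_PATH=/home/deng/usr/local/faiss_gpu/lib.) The author's CMakeLists.txt also needs a small change, as follows: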
get_filename_component(FAISS_PARENT ${FAISS_PATH} DIRECTORY)
include_directories(${FAISS_PARENT}/include) # append /include here
link_directories(${FAISS_PATH})
set(FAISS_LIBRARY ${FAISS_PATH}/libfaiss.so)
Without this change, the build fails with a header-not-found error.
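The CMake change above derives the faiss include directory from the parent of FAISS_PATH. As a rough sanity check before running cmake, the same derivation can be sketched in shell. The helper name faiss_parent is hypothetical, and the sketch assumes that FAISS_PATH has no trailing slash and that faiss was installed with include/ and lib/ side by side:

```shell
# Hypothetical helper: mirrors
#   get_filename_component(FAISS_PARENT ${FAISS_PATH} DIRECTORY)
# for a path without a trailing slash.
faiss_parent() {
  dirname "$1"
}

# Path taken from this post; override it by exporting FAISS_PATH first.
FAISS_PATH="${FAISS_PATH:-/home/deng/usr/local/faiss_gpu/lib}"
FAISS_PARENT="$(faiss_parent "$FAISS_PATH")"

echo "include directory: $FAISS_PARENT/include"
echo "library file:      $FAISS_PATH/libfaiss.so"

# Warn about anything that is missing (does not abort).
for p in "$FAISS_PARENT/include" "$FAISS_PATH/libfaiss.so"; do
  [ -e "$p" ] || echo "warning: $p not found" >&2
done
```

If either warning appears, the -DFAISS_PATH value probably does not match the actual faiss install layout.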
5. Install
cd ..
cd python
python setup.py install
error:
Traceback (most recent call last):
File "setup.py", line 22, in <module>
from graphvite import __version__, lib_path, lib_file
File "/home/deng/project/graphvite-0.1.0/python/graphvite/__init__.py", line 36, in <module>
lib = imp.load_dynamic("libgraphvite", lib_file)
File "/home/deng/anaconda3/envs/graphvite/lib/python3.7/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: /home/deng/project/graphvite-0.1.0/lib/libgraphvite.so: undefined symbol: _ZN3fLS13FLAGS_log_dirE
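I spent four days on this problem, from September 28 until today. I traced it to a failure to link against the glog library. The symbols involved are:
Symbol in the libgraphvite.so installed via conda install: _ZN3fLS13FLAGS_log_dirB5cxx11E
Undefined symbol reported in the error above: _ZN3fLS13FLAGS_log_dirE
Symbol in libglog.so: _ZN3fLS13FLAGS_log_dirB5cxx11E
I also checked the runtime link paths, tried every approach I could think of, and talked with the author several times, but it still failed and I do not know why. It only shows that my Linux fundamentals are too weak to brute-force this. The road ahead is long and far; I will keep searching high and low! (20191002)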
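One detail in the symbol names above is worth a closer look. The suffix B5cxx11 is how GCC mangles the [abi:cxx11] tag, which marks symbols that use the newer libstdc++ std::string layout. A reference without the tag therefore usually comes from code compiled with the old ABI (for example with -D_GLIBCXX_USE_CXX11_ABI=0, or with a host compiler older than GCC 5). Whether that was the actual cause here is an assumption, not something confirmed in this post. A minimal sketch for comparing the names, using hypothetical helper names:

```shell
# Hypothetical helper: succeeds if a mangled name carries the [abi:cxx11] tag.
has_cxx11_tag() {
  case "$1" in
    *B5cxx11*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: list the dynamic symbols of a shared library that
# mention FLAGS_log_dir. Needs nm from binutils; a "U" in the output marks
# an undefined symbol that another library is expected to provide.
log_dir_symbols() {
  nm -D "$1" | grep 'FLAGS_log_dir'
}

# The two names seen in this post.
for sym in _ZN3fLS13FLAGS_log_dirE _ZN3fLS13FLAGS_log_dirB5cxx11E; do
  if has_cxx11_tag "$sym"; then
    echo "$sym -> tagged [abi:cxx11]"
  else
    echo "$sym -> no ABI tag"
  fi
done
```

Running log_dir_symbols on /home/deng/project/graphvite-0.1.0/lib/libgraphvite.so and on the libglog.so from the conda environment should show whether the two sides agree on the name.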
------------------------------------------------------------
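Today, with guidance from a senior labmate, I finally solved this problem.
The solution is as follows:
Earlier I had found that FLAGS_log_dir appears only in the header include/util/io.h of the GraphVite source, and nowhere else.
I then looked at the headers that io.h itself includes.
FLAGS_log_dir should be defined in the logging.h header, but searching logging.h for this variable turned up nothing (logging.h is generated when glog is installed). This is where I had been stuck, with no idea how to proceed. My labmate then suggested an approach: FLAGS_log_dir appears only at this one place in the program and is not used anywhere else, so even though we do not yet know exactly what it does, we can try simply removing it (I was stunned). His guess was that, because of a version change, the external library no longer provides this symbol; the relevant change can be found with git blame.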
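The claim that FLAGS_log_dir is used in only one place is easy to verify with a recursive search. The following is a minimal sketch; the helper name find_symbol_uses is hypothetical, and it assumes the command is run from the source root (graphvite-0.1.0), where the include and src directories live:

```shell
# Hypothetical helper: print file:line:text for every occurrence of a name
# under the given directories.
find_symbol_uses() {
  symbol="$1"
  shift
  grep -rn -- "$symbol" "$@"
}

# Run from the source root; skipped if the directories are absent.
if [ -d include ] && [ -d src ]; then
  find_symbol_uses FLAGS_log_dir include src
fi
```

If the only hit is in include/util/io.h, removing that one assignment cannot break any other code path. Note that git blame needs a git clone of the repository; the release tarball carries no history.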
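In the end I followed his approach and removed this log setting, as follows:
vim include/util/io.h # edit the logging code (path relative to the source root)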
void init_logging(int threshold = google::INFO, std::string dir = "", bool verbose = false) {
static bool initialized = false;
FLAGS_minloglevel = threshold;
if (dir == "")
FLAGS_logtostderr = true;
/** else
FLAGS_log_dir = dir;**/ // this part is commented out
FLAGS_log_prefix = verbose;
if (!initialized) {
google::InitGoogleLogging("graphvite");
initialized = true;
}
}
Then rebuild and reinstall:
cmake -DFAISS_PATH=/home/deng/usr/local/faiss_gpu/lib -DCMAKE_INSTALL_PREFIX=$HOME/usr/local/graphvite_gpu -DGLOG_INCLUDE_DIR=/home/deng/anaconda3/envs/graphvite/include -DGFLAGS_INCLUDE_DIR=/home/deng/anaconda3/envs/graphvite/include ..
make
cd ../python/
python setup.py install
GraphVite is then installed automatically into the directory of the environment you created under anaconda:
~/anaconda3/envs/graphvite/bin/graphvite
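A quick way to confirm the install is to check that the command-line tool is on PATH and that the Python module imports. This is only a sketch: the helper name have_cmd is hypothetical, and it assumes the conda environment is activated in the current shell.

```shell
# Hypothetical helper: succeeds if a command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd graphvite; then
  echo "graphvite CLI found at: $(command -v graphvite)"
else
  echo "graphvite CLI not found on PATH (is the conda environment activated?)" >&2
fi

# __version__ is the attribute that setup.py imports in the traceback above.
if have_cmd python; then
  python -c "import graphvite; print(graphvite.__version__)" \
    || echo "importing graphvite failed" >&2
fi
```

An ImportError at this point would mean the undefined-symbol problem described earlier is still present.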
However, the following problem appeared during training:
graphvite baseline quick start
The error output is as follows:
running baseline: quick_start.yaml
loading graph from /home/deng/.graphvite/dataset/blogcatalog/blogcatalog_train.txt
0.00018387%
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Graph
------------------ Graph -------------------
#vertex: 10312, #edge: 333983
as undirected: yes, normalization: no
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[time] GraphApplication.load: 0.282339 s
[time] GraphApplication.build: 0.598882 s
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
GraphSolver<128, float32, uint32>
----------------- Resource -----------------
#worker: 1, #sampler: 7, #partition: 1
tied weights: no, episode size: 500
gpu memory limit: 15.3 GiB
gpu memory cost: 51.5 MiB
----------------- Sampling -----------------
augmentation step: 2, shuffle base: 2
random walk length: 40
random walk batch size: 100
#negative: 1, negative sample exponent: 0.75
----------------- Training -----------------
model: LINE
optimizer: SGD
learning rate: 0.025, lr schedule: linear
weight decay: 0.005
#epoch: 2000, batch size: 100000
resume: no
positive reuse: 1, negative weight: 5
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Batch id: 0 / 6679
loss = 0
Batch id: 1000 / 6679
loss = 0.388631
Batch id: 2000 / 6679
loss = 0.383216
Batch id: 3000 / 6679
loss = 0.380334
Batch id: 4000 / 6679
loss = 0.376892
Batch id: 5000 / 6679
loss = 0.373871
Batch id: 6000 / 6679
loss = 0.372043
[time] GraphApplication.train: 11.2109 s
evaluate on node classification
effective labels: 14476 / 14476
OMP: Error #13: Assertion failure at z_Linux_util.cpp(2361).
OMP: Hint Please submit a bug report with this message, compile and run commands used, and machine configuration info including native compiler and operating system versions. Faster response will be obtained by including all program sources. For information on submitting this issue, please see http://www.intel.com/software/products/support/.
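Yet another big pitfall, but I am still very happy: the one that cost me four days is finally filled in. So exciting!
A little progress every day!
More updates to follow! (20191006)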
------------------------------------------------------------
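This problem has been solved.
Problem analysis: OMP: Error #13: Assertion failure at z_Linux_util.cpp(2361).
This error is a bug in the OpenMP runtime (OpenMP is a parallel-computing library). Switching to intel-openmp=2019.4 fixes it.
Note: not everyone will hit this bug. It does not occur as long as a stable intel-openmp version was installed by default when the environment was created (as far as I know, intel-openmp=2019.5 is the unstable one).
When I later rebuilt the GraphVite project under my labmate's account, I did not hit this bug, precisely because intel-openmp 2019.4 had been installed when that environment was created.
Reference: https://github.com/scikit-learn/scikit-learn/pull/15020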
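To see which intel-openmp version an environment has, and to pin the version that worked here, something like the following can be used. It is a minimal sketch with hypothetical helper names; it assumes conda is on PATH and that the graphvite environment is the active one.

```shell
# Hypothetical helper: read `conda list` output on stdin and print the
# version column of the intel-openmp row (header lines start with '#').
openmp_version() {
  awk '$1 == "intel-openmp" { print $2 }'
}

# Hypothetical helper: succeeds for the version reported as unstable in this post.
is_problem_openmp() {
  [ "$1" = "2019.5" ]
}

if command -v conda >/dev/null 2>&1; then
  ver="$(conda list intel-openmp 2>/dev/null | openmp_version)"
  echo "intel-openmp version: ${ver:-not installed}"
  if is_problem_openmp "$ver"; then
    echo "consider running: conda install intel-openmp=2019.4"
  fi
fi
```

The sketch only prints a suggestion; the actual downgrade is the conda install intel-openmp=2019.4 command described above.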