Build Your Own Crawler Proxy Pool with Python
agentpool is built on Python 3.7.
yum -y install gcc gcc-c++ make zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel libffi-devel mariadb mariadb-server mariadb-devel
wget https://www.python.org/ftp/python/3.7.2/Python-3.7.2.tgz
tar -zxvf Python-3.7.2.tgz
cd Python-3.7.2
./configure
make && make install
python3 -V
wget https://codeload.github.com/wangluozhe/agentpool/zip/master
unzip master
cd agentpool-master
pip3 install -r requirements.txt
wget http://download.redis.io/releases/redis-5.0.8.tar.gz
tar -zxvf redis-5.0.8.tar.gz -C /usr/local/
cd /usr/local/redis-5.0.8
make && make install
ln -sf /usr/local/redis-5.0.8/src/redis-benchmark /usr/local/bin
ln -sf /usr/local/redis-5.0.8/src/redis-check-aof /usr/local/bin
ln -sf /usr/local/redis-5.0.8/src/redis-check-rdb /usr/local/bin
ln -sf /usr/local/redis-5.0.8/src/redis-cli /usr/local/bin
ln -sf /usr/local/redis-5.0.8/src/redis-sentinel /usr/local/bin
ln -sf /usr/local/redis-5.0.8/src/redis-server /usr/local/bin
redis-server --daemonize yes
nohup python3 agentpool.py > agentpool.log 2>&1 &
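Before moving on to Nginx, it helps to confirm that both services are actually listening. The following is a minimal sketch, not part of agentpool: `port_is_open` is a hypothetical helper, 6379 is the Redis default port, and 8000 is assumed to be the agentpool port because that is where the `proxy_pass` line in this article points.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # Assumed ports: 6379 (Redis default) and 8000 (taken from proxy_pass).
    for name, port in (("redis", 6379), ("agentpool", 8000)):
        state = "up" if port_is_open("127.0.0.1", port) else "down"
        print(f"{name} on port {port}: {state}")
```

If either service shows as down, check agentpool.log and the Redis output first.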
... # other configuration
server
{
    ... # other configuration
    location / {
        proxy_pass http://127.0.0.1:8000/;
        #proxy_pass http://www.nginx.com/; # by domain name
        #proxy_pass http://www.nginx.com/index.html; # with a path
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
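If you are not sure how to set this up, take a look at the following three articles.
A Detailed Walkthrough of Compiling and Installing Nginx
Nginx Forward Proxy, Reverse Proxy, Load Balancing, and Static/Dynamic Separation
Nginx Optimization: High-Concurrency Configuration Supporting 20,000 to 30,000 Concurrent Connections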
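Once the pool is running, a crawler routes its requests through a proxy address taken from it. The sketch below shows one way to do that with the standard library only. How agentpool hands out proxies is not covered in this article, so the address is passed in as a plain `host:port` string; `proxy_mapping`, `opener_via`, and `fetch` are hypothetical helper names, not agentpool functions.

```python
import urllib.request


def proxy_mapping(address):
    """Turn 'host:port' into the mapping that urllib's ProxyHandler expects."""
    url = address if "://" in address else f"http://{address}"
    return {"http": url, "https": url}


def opener_via(address):
    """Build an opener that sends HTTP and HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler(proxy_mapping(address))
    return urllib.request.build_opener(handler)


def fetch(url, address, timeout=5.0):
    """Download url through the given proxy and return the raw response body."""
    with opener_via(address).open(url, timeout=timeout) as resp:
        return resp.read()
```

A call such as `fetch("http://example.com/", "203.0.113.10:8080")` would go through that proxy; the address here is only a placeholder. A failing proxy raises an `OSError` subclass, which the crawler can catch in order to retry with another address from the pool.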