Scrapy Tools Gerapy and SpiderKeeper


On my Ubuntu Master Virtual Machine
Check PIP Version
> pip --version
pip 18.1 from /home/carl/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pip (python 3.6)
Install Gerapy
> pip install gerapy
Check the version
> gerapy -version
0.8.5
In the working directory
> pwd
/home/carl/work
Initialize Gerapy, which creates the working directory /home/carl/work/gerapy
> gerapy init
Initialize the database
> cd gerapy/
> gerapy migrate
This creates the SQLite database file db.sqlite3
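The init and migrate steps above can be wrapped in a small script that skips whatever has already been done. This is only a sketch: the function names are made up for illustration, and it assumes db.sqlite3 sits directly in the gerapy directory, as seen here with Gerapy 0.8.5.

```shell
#!/bin/sh
# Sketch: run only the Gerapy setup steps that are still missing.
# Assumption: db.sqlite3 lives directly in WORKDIR/gerapy (seen with 0.8.5).

# needs_init WORKDIR -> true if the gerapy directory does not exist yet.
needs_init() {
    [ ! -d "$1/gerapy" ]
}

# needs_migrate WORKDIR -> true if the SQLite database has not been created.
needs_migrate() {
    [ ! -f "$1/gerapy/db.sqlite3" ]
}

# setup_gerapy WORKDIR -> run `gerapy init` and `gerapy migrate` as needed.
setup_gerapy() {
    if needs_init "$1"; then
        (cd "$1" && gerapy init) || return 1
    fi
    if needs_migrate "$1"; then
        (cd "$1/gerapy" && gerapy migrate) || return 1
    fi
}

# Example: setup_gerapy /home/carl/work
```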
Start the Service
> gerapy runserver
Performing system checks...
System check identified no issues (0 silenced).
March 04, 2019 - 06:44:22
Django version 2.1.7, using settings 'gerapy.server.server.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
Listen on all interfaces so that other machines can reach the server
> gerapy runserver 0.0.0.0:8000
Performing system checks...
System check identified no issues (0 silenced).
March 04, 2019 - 06:45:57
Django version 2.1.7, using settings 'gerapy.server.server.settings'
Starting development server at http://0.0.0.0:8000/
Then we can visit the page
http://ubuntu-master:8000/#/client
In the UI we can add clients and do other things.
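Before opening the page from another machine, it can help to check that the port is actually reachable. A minimal sketch, assuming bash is available (it relies on bash's /dev/tcp redirection); the function names are hypothetical helpers, not part of Gerapy.

```shell
#!/usr/bin/env bash
# Sketch: wait until a TCP port accepts connections.
# Needs bash: /dev/tcp/HOST/PORT is a bash feature, not a real file.

# port_open HOST PORT -> exit status 0 if a TCP connection succeeds.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# wait_for_port HOST PORT [TRIES] -> try up to TRIES times (default 10),
# sleeping one second after each failed attempt.
wait_for_port() {
    wfp_tries=${3:-10}
    while [ "$wfp_tries" -gt 0 ]; do
        port_open "$1" "$2" && return 0
        wfp_tries=$((wfp_tries - 1))
        sleep 1
    done
    return 1
}

# Example: wait_for_port ubuntu-master 8000 30 && echo "Gerapy is reachable"
```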
Install SpiderKeeper
> pip install spiderkeeper
> mkdir spiderkeeper
> cd spiderkeeper/
Start the web UI, pointing it at the local Scrapyd server
> spiderkeeper --server=http://localhost:6800
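If there is more than one Scrapyd server, the --server flag can, as far as I know, be repeated. The helper below is a hypothetical sketch that only builds and prints the command line, so it can be checked before running it; any host names passed to it are placeholders.

```shell
#!/bin/sh
# Sketch: build a spiderkeeper command line for several Scrapyd servers.
# Assumption: spiderkeeper accepts one --server flag per Scrapyd URL.

# spiderkeeper_cmd URL... -> prints the command, one --server flag per URL.
spiderkeeper_cmd() {
    cmd="spiderkeeper"
    for url in "$@"; do
        cmd="$cmd --server=$url"
    done
    printf '%s\n' "$cmd"
}

# Print the command for the single local server used above.
spiderkeeper_cmd http://localhost:6800
# prints: spiderkeeper --server=http://localhost:6800
```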
Since Gerapy has no authentication, we access it through an SSH tunnel and then open the forwarded port locally
> ssh -L 8010:localhost:8010 root@ubuntu-master -N
http://localhost:8010
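The tunnel command follows the pattern -L LOCAL_PORT:localhost:REMOTE_PORT, where "localhost" is resolved on the remote side, and -N tells ssh not to run a remote command. A small sketch that prints the command for given values; tunnel_cmd is a made-up helper name.

```shell
#!/bin/sh
# tunnel_cmd LOCAL_PORT REMOTE_PORT USER HOST
# Prints an ssh command that forwards localhost:LOCAL_PORT on this machine
# to localhost:REMOTE_PORT as seen from HOST.
tunnel_cmd() {
    printf 'ssh -N -L %s:localhost:%s %s@%s\n' "$1" "$2" "$3" "$4"
}

tunnel_cmd 8010 8010 carl ubuntu-master
# prints: ssh -N -L 8010:localhost:8010 carl@ubuntu-master
```

While the tunnel is running, the page is reached on the local machine at http://localhost:LOCAL_PORT.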
Here are the main pieces needed to run Gerapy as a Docker service
Here is the Dockerfile with all the steps
#Set up Gerapy in Docker
#Prepare the OS
FROM centos/python-36-centos7
MAINTAINER Yiyi Kang
#set user
USER root
#install the software
#upgrade pip
RUN pip3 install --upgrade pip
#install gerapy
RUN pip3 install gerapy
#init gerapy
RUN mkdir -p /tool/
WORKDIR /tool/
RUN gerapy init
WORKDIR /tool/gerapy/
RUN gerapy migrate
#set up the app
EXPOSE  8000
RUN     mkdir -p /app/
ADD     start.sh /app/
WORKDIR /app/

CMD    [ "./start.sh" ]
Here is the Makefile with all the steps
IMAGE=sillycat/gerapy
TAG=sillycat-gerapy-1.0
NAME=sillycat-gerapy-1.0

docker-context:
build: docker-context
	docker build -t $(IMAGE):$(TAG) .
run:
	docker run -d -p 127.0.0.1:8010:8000 --restart always --name $(NAME) $(IMAGE):$(TAG)
debug:
	docker run -ti -p 8010:8000 --name $(NAME) $(IMAGE):$(TAG) /bin/bash
clean:
	docker stop ${NAME}
	docker rm ${NAME}
logs:
	docker logs ${NAME}
publish:
	docker push ${IMAGE}:${TAG}
fetch:
	docker pull ${IMAGE}:${TAG}
Here is the start script start.sh
#!/bin/sh -ex
#start the service
cd /tool/gerapy/
gerapy runserver 0.0.0.0:8000
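The Dockerfile copies start.sh with ADD and runs it directly as ./start.sh, and ADD keeps the file mode from the build machine, so the script needs its execute bit set before building. A minimal sketch of a pre-build check; ensure_executable is a hypothetical helper name.

```shell
#!/bin/sh
# ensure_executable FILE -> fail if FILE is missing, otherwise add the
# execute bit when it is not already set.
ensure_executable() {
    [ -f "$1" ] || { echo "missing: $1" >&2; return 1; }
    [ -x "$1" ] || chmod +x "$1"
}

# Example, run next to the Dockerfile before `make build`:
# ensure_executable start.sh
```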
Here is the README, which explains how to access the UI
Gerapy is used to list Scrapyd servers.
## how to build
>make build
## how to run
>make run
## how to stop
>make clean
## WebUI
ssh -L 8010:localhost:8010 carl@ubuntu-master -N
http://localhost:8010/

References:
https://blog.csdn.net/fengltxx/article/details/79894839
https://www.jianshu.com/p/f3447c90a0ec
https://github.com/Gerapy/Gerapy
https://github.com/DormyMo/SpiderKeeper
https://askubuntu.com/questions/112177/how-do-i-tunnel-and-browse-the-server-webpage-on-my-laptop
