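Installing CutyCapt on CentOS 5.8
CutyCapt is a Linux tool that captures screenshots of web pages. It requires Qt to be installed first. The steps below are for CentOS 5.8, 64-bit.
Motivation: some pages are very long, and capturing them by hand is slow.
For example, the 163 home page takes several separate screenshots to cover, which is tedious.
I first tried fetching the page content with curl, but many of the linked images could not be retrieved that way.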
1. Add the ATrpms yum repository
vi /etc/yum.repos.d/atrpms.repo
[atrpms]
baseurl=http://dl.atrpms.net/el$releasever-$basearch/atrpms/testing
enabled=1
gpgcheck=0
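The same repository definition can also be written without opening vi. This is only a sketch: the REPO_FILE variable and the name= line are additions of mine (yum usually warns when a repository has no name), and the default target is a local file so the result can be checked before it is copied to /etc/yum.repos.d/ as root.

```shell
# REPO_FILE is a hypothetical variable; on the real system the target
# would be /etc/yum.repos.d/atrpms.repo (needs root).
REPO_FILE="${REPO_FILE:-./atrpms.repo}"

# The quoted 'EOF' keeps $releasever and $basearch literal,
# so that yum expands them rather than the shell.
cat > "$REPO_FILE" <<'EOF'
[atrpms]
name=ATrpms testing
baseurl=http://dl.atrpms.net/el$releasever-$basearch/atrpms/testing
enabled=1
gpgcheck=0
EOF

echo "wrote $REPO_FILE"
```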
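2. Install qt47 and the related packages (download the packages listed below and force-install them with rpm; the repository above carries two versions, which easily leads to version conflicts)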
qt47-4.7.2-1_18.el5
qt47-devel-4.7.2-1_18.el5
qt47-x11-4.7.2-1_18.el5
qt47-webkit-4.7.2-1_18.el5
qt47-webkit-devel-4.7.2-1_18.el5
phonon-backend-gstreamer-4.7.2-1_18.el5
rpm -Uvh --force --nodeps qt47-devel-4.7.2-1_18.el5.x86_64.rpm
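The command above shows only qt47-devel. To handle all six packages in one go, something like the following might work. It assumes every downloaded file follows the same <name>-<version>.<arch>.rpm pattern as the qt47-devel file and sits in the current directory. DRY_RUN is a hypothetical switch that defaults to printing the command instead of running it.

```shell
# Assumption: every package file is named <name>-<version>.<arch>.rpm,
# like the qt47-devel file in the command above.
VER="4.7.2-1_18.el5"
ARCH="x86_64"
PKGS="qt47 qt47-devel qt47-x11 qt47-webkit qt47-webkit-devel phonon-backend-gstreamer"

# Build the list of rpm file names.
FILES=""
for pkg in $PKGS; do
    FILES="$FILES $pkg-$VER.$ARCH.rpm"
done

# DRY_RUN is a hypothetical switch: by default only print the command.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "rpm -Uvh --force --nodeps$FILES"
else
    # Word splitting of $FILES is intentional here.
    rpm -Uvh --force --nodeps $FILES
fi
```

Installing all files in a single rpm transaction lets rpm see the whole set at once; run with DRY_RUN=0 as root once the printed command looks right.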
3. Install CutyCapt
Get the source with svn
#yum install subversion
svn co https://cutycapt.svn.sourceforge.net/svnroot/cutycapt
mv cutycapt/CutyCapt /usr/local/cutycapt
cd /usr/local/cutycapt/
# Many guides online simply run qmake at this step, but that failed for me because the default qmake belongs to Qt 3
qmake-qt47
make
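Since the plain qmake on this system belongs to Qt 3, it can help to check which qmake is being picked up before building. The helper names below are hypothetical, and the check assumes that a Qt 4 qmake prints a line containing "Qt version 4" when run with -v.

```shell
# Return success if the given qmake reports a Qt 4 installation.
# Assumption: Qt 4 builds of qmake print "Using Qt version 4.x.y ..." for -v.
is_qt4_qmake() {
    "$1" -v 2>&1 | grep -q 'Qt version 4'
}

# Print the first Qt 4 qmake found in PATH; qmake-qt47 is the name used above.
pick_qmake() {
    for candidate in qmake-qt47 qmake; do
        if command -v "$candidate" >/dev/null 2>&1 && is_qt4_qmake "$candidate"; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no Qt 4 qmake found in PATH" >&2
    return 1
}
```

A build could then be started with: QMAKE=$(pick_qmake) && "$QMAKE" && make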
#* Running make may fail with the following error
# make
g++ -Wl,-O1 -o CutyCapt CutyCapt.o moc_CutyCapt.o -L/usr/lib64/qt47 -lQtWebKit -lQtSvg -L/usr/lib64/qt47 -lQtGui -lQtNetwork -lQtCore -lpthread
/usr/lib64/qt47/libQtWebKit.so: undefined reference to `sqlite3_prepare16_v2'
/usr/lib64/qt47/libQtWebKit.so: undefined reference to `sqlite3_column_value'
collect2: ld returned 1 exit status
make: *** [CutyCapt] Error 1
Fix:
Upgrade sqlite to 3.6 in place; do not uninstall and reinstall it. # On CentOS 6.0 the default is already 3.6
yum update sqlite
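Whether the upgrade is needed depends on the installed sqlite version. The version_ge function below is a hypothetical helper that compares only the major.minor part of two version strings, which is enough to test against the 3.6 threshold mentioned above.

```shell
# version_ge HAVE WANT: succeed if HAVE >= WANT, comparing major.minor only.
# Both arguments are assumed to contain at least "major.minor".
version_ge() {
    have_major=$(echo "$1" | cut -d. -f1)
    have_minor=$(echo "$1" | cut -d. -f2)
    want_major=$(echo "$2" | cut -d. -f1)
    want_minor=$(echo "$2" | cut -d. -f2)
    [ "$have_major" -gt "$want_major" ] && return 0
    [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]
}

# Example with literal version strings:
if version_ge "3.3.6" "3.6"; then
    echo "sqlite is new enough"
else
    echo "sqlite needs the update"
fi
```

On the real system the first argument could come from rpm, for example: version_ge "$(rpm -q --qf '%{VERSION}' sqlite)" 3.6 || yum update sqlite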
#* Once make finishes, the CutyCapt executable has been generated.
4. Runtime environment
# ./CutyCapt --help
CutyCapt: cannot connect to X server
#* Many guides online have you install an xvfb-run.sh wrapper, but it does not need to be that complicated:
echo "export DISPLAY=':1.0'" >> /etc/profile
source /etc/profile
vncserver
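Exporting DISPLAY in /etc/profile changes it for every login shell. If that is not wanted, the variable can be set for a single command instead. This sketch assumes the VNC server started above is listening on display :1; with_display and CUTY_DISPLAY are hypothetical names.

```shell
# Run a command with DISPLAY set only for that command.
# CUTY_DISPLAY is a hypothetical override; :1.0 matches the display used above.
with_display() {
    env DISPLAY="${CUTY_DISPLAY:-:1.0}" "$@"
}

# Show what the child process sees.
with_display sh -c 'echo "child DISPLAY is $DISPLAY"'
```

With this in place, with_display ./CutyCapt --help should behave like the plain call after editing /etc/profile, without touching the global environment.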
[root@zhaoyong cutycapt]# ./CutyCapt --help
---------------------------------------------------------------------
Usage: CutyCapt --url=http://www.example.org/ --out=localfile.png
---------------------------------------------------------------------
--help Print this help page and exit
--url=<url> The URL to capture (http:...|file:...|...)
--out=<path> The target file (.png|pdf|ps|svg|jpeg|...)
--out-format=<f> Like extension in --out, overrides heuristic
--min-width=<int> Minimal width for the image (default: 800)
--min-height=<int> Minimal height for the image (default: 600)
--max-wait=<ms> Don't wait more than (default: 90000, inf: 0)
--delay=<ms> After successful load, wait (default: 0)
--user-style-path=<path> Location of user style sheet file, if any
--user-style-string=<css> User style rules specified as text
--header=<name>:<value> request header; repeatable; some can't be set
--method=<get|post|put> Specifies the request method (default: get)
--body-string=<string> Unencoded request body (default: none)
--body-base64=<base64> Base64-encoded request body (default: none)
--app-name=<name> appName used in User-Agent; default is none
--app-version=<version> appVers used in User-Agent; default is none
--user-agent=<string> Override the User-Agent header Qt would set
--javascript=<on|off> JavaScript execution (default: on)
--java=<on|off> Java execution (default: unknown)
--plugins=<on|off> Plugin execution (default: unknown)
--private-browsing=<on|off> Private browsing (default: unknown)
--auto-load-images=<on|off> Automatic image loading (default: on)
--js-can-open-windows=<on|off> Script can open windows? (default: unknown)
--js-can-access-clipboard=<on|off> Script clipboard privs (default: unknown)
--print-backgrounds=<on|off> Backgrounds in PDF/PS output (default: off)
--zoom-factor=<float> Page zoom factor (default: no zooming)
--zoom-text-only=<on|off> Whether to zoom only the text (default: off)
--http-proxy=<url> Address for HTTP proxy server (default: none)
---------------------------------------------------------------------
<f> is svg,ps,pdf,itext,html,rtree,png,jpeg,mng,tiff,gif,bmp,ppm,xbm,xpm
---------------------------------------------------------------------
http://cutycapt.sf.net - (c) 2003-2010 Bjoern Hoehrmann
Install the Chinese font package
# yum install fonts-chinese
Now the page can finally be captured
[root@zhaoyong ~]# cd /usr/local/cutycapt/
[root@zhaoyong cutycapt]# ./CutyCapt --url=http://www.163.com/ --out=/root/163.jpg    # the output path can be anywhere you like
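For more than one page, the capture call can be wrapped in a small function. This is a sketch under a few assumptions: out_name and capture are hypothetical helpers, the default binary path is the one used in the install step, and the --min-width and --delay values are arbitrary picks from the options in the help output.

```shell
# Path from the install step above; override CUTYCAPT to point elsewhere.
CUTYCAPT="${CUTYCAPT:-/usr/local/cutycapt/CutyCapt}"

# Turn a URL into a safe file name (hypothetical helper): drop the scheme
# and trailing slashes, then replace other unsafe characters with "_".
out_name() {
    echo "$1" | sed -e 's#^[a-z]*://##' -e 's#/*$##' -e 's#[^A-Za-z0-9.-]#_#g'
}

# capture URL DIR: save a JPEG of URL into DIR.
capture() {
    "$CUTYCAPT" --url="$1" --out="$2/$(out_name "$1").jpg" \
        --min-width=1024 --delay=1000
}
```

For example, capture http://www.163.com/ /root would write /root/www.163.com.jpg, and a list of URLs can be fed through a simple for loop.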
Crop the full-page capture down to the first screenful
[root@zhaoyong ~]# convert -crop 1024x768+0+0 163.jpg 1632.jpg
Shrink the image
[root@zhaoyong ~]# convert -resize 40%x40% 1632.jpg 1632.jpg
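The crop and the resize can probably be combined into a single ImageMagick call that leaves the source image untouched. make_thumb is a hypothetical helper; the default geometry and scale are the values used above, and +repage is added on the assumption that the crop offset should be discarded before resizing.

```shell
# CONVERT can be overridden; by default ImageMagick's convert is used.
CONVERT="${CONVERT:-convert}"

# make_thumb IN OUT [GEOMETRY] [SCALE] (hypothetical helper):
# crop IN to GEOMETRY, reset the canvas, scale it, and write OUT.
make_thumb() {
    "$CONVERT" "$1" -crop "${3:-1024x768+0+0}" +repage -resize "${4:-40%}" "$2"
}
```

Calling make_thumb 163.jpg 163_thumb.jpg would then replace the two convert commands, and a different crop size can be passed as the third argument.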