I often use wget and curl on the command line to download pages. Sometimes curl ends up downloading nothing at all, for example:
zhuliting@zhuliting:~$ curl 'http://m.youku.com' -o youku.html
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
zhuliting@zhuliting:~$ file youku.html
youku.html: ERROR: cannot open `youku.html' (No such file or directory)
With wget, the same URL downloads normally:
zhuliting@zhuliting:~$ wget 'http://m.youku.com' -O youku.html
--2015-08-06 20:09:44--  http://m.youku.com/
Resolving m.youku.com (m.youku.com)... 211.151.146.60
Connecting to m.youku.com (m.youku.com)|211.151.146.60|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://m.youku.com/wap/ [following]
--2015-08-06 20:09:44--  http://m.youku.com/wap/
Reusing existing connection to m.youku.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 16603 (16K) [text/html]
Saving to: ‘youku.html’
100%[======================================================================================================>] 16,603      86.4KB/s   in 0.2s
2015-08-06 20:09:44 (86.4 KB/s) - ‘youku.html’ saved [16603/16603]
zhuliting@zhuliting:~$ file youku.html
youku.html: HTML document, UTF-8 Unicode text
zhuliting@zhuliting:~$ curl 'http://m.youku.com' -o youku.html -v
* Rebuilt URL to: http://m.youku.com/
* Hostname was NOT found in DNS cache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 211.151.146.60...
* Connected to m.youku.com (211.151.146.60) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.35.0
> Host: m.youku.com
> Accept: */*
>
< HTTP/1.1 302 Moved Temporarily
* Server nginx/0.8.54 is not blacklisted
< Server: nginx/0.8.54
< Date: Thu, 06 Aug 2015 12:14:52 GMT
< Content-Type: text/html;charset=utf-8
< Connection: keep-alive
< Set-Cookie: JSESSIONID=ED87E6E12D2605176672773D570B8377; Path=/; HttpOnly
< Location: http://www.youku.com
< Content-Length: 0
<
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connection #0 to host m.youku.com left intact
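When the full -v output is more than needed, the same diagnosis can be sketched with curl's -w (--write-out) option, which prints the status code and the redirect target without saving the body. The function names `is_redirect_status` and `probe_url` are made up for this illustration; `%{http_code}` and `%{redirect_url}` are standard --write-out variables.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of curl.

# Succeeds (exit status 0) when the given HTTP status code is one of the
# redirect codes that carry a Location header.
is_redirect_status() {
    case "$1" in
        301|302|303|307|308) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints "<status> <redirect target>" for the URL in $1 and discards the body.
# The redirect target is empty when the response has no Location header.
probe_url() {
    curl -s -o /dev/null -w '%{http_code} %{redirect_url}\n' "$1"
}
```

Assuming the server still behaves as in the transcript above, `probe_url 'http://m.youku.com'` would be expected to print a 302 status followed by the Location target.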
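The verbose output shows why: the server answers with 302 Moved Temporarily and Content-Length: 0, and curl does not follow redirects by default, so there is no body to save. According to man curl, adding the -L (--location) option makes curl follow the redirect, which fixes the empty download: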
zhuliting@zhuliting:~$ curl 'http://m.youku.com' -o youku.html -v -L
* Rebuilt URL to: http://m.youku.com/
* Hostname was NOT found in DNS cache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 211.151.146.60...
* Connected to m.youku.com (211.151.146.60) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.35.0
> Host: m.youku.com
> Accept: */*
>
< HTTP/1.1 302 Moved Temporarily
* Server nginx/0.8.54 is not blacklisted
< Server: nginx/0.8.54
< Date: Thu, 06 Aug 2015 12:17:43 GMT
< Content-Type: text/html;charset=utf-8
< Connection: keep-alive
< Set-Cookie: JSESSIONID=5ED7472830728CC0C65B60F9EDCCE087; Path=/; HttpOnly
< Location: http://www.youku.com
< Content-Length: 0
<
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connection #0 to host m.youku.com left intact
* Issue another request to this URL: 'http://www.youku.com'
* Rebuilt URL to: http://www.youku.com/
* Hostname was NOT found in DNS cache
*   Trying 43.250.12.42...
* Connected to www.youku.com (43.250.12.42) port 80 (#1)
> GET / HTTP/1.1
> User-Agent: curl/7.35.0
> Host: www.youku.com
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: text/html
< Accept-Ranges: bytes
< ETag: "2574430301"
< Last-Modified: Thu, 06 Aug 2015 12:16:02 GMT
< Content-Length: 619907
< Connection: close
< Date: Thu, 06 Aug 2015 12:17:43 GMT
* Server b28www4 is not blacklisted
< Server: b28www4
<
{ [data not shown]
100  605k  100  605k    0     0   394k      0  0:00:01  0:00:01 --:--:--  458k
* Closing connection 1
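In a script, the -L fix is often paired with a cap on the number of redirects and a check that something was actually saved. The following is only a minimal sketch: the helper names `saved_nonempty` and `fetch_page` are hypothetical, and the limit of 5 redirects is an arbitrary choice. The options `-L`, `--max-redirs`, `-sS`, and `-f` are existing curl options.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of curl.

# Succeeds when the file in $1 exists and is not empty.
saved_nonempty() {
    [ -s "$1" ]
}

# Downloads the URL in $1 to the file in $2, following at most 5 redirects.
# -sS hides the progress meter but still prints errors.
# -f makes curl fail on HTTP errors (4xx/5xx) instead of saving the error page.
fetch_page() {
    curl -sS -f -L --max-redirs 5 -o "$2" "$1" && saved_nonempty "$2"
}
```

A call such as `fetch_page 'http://m.youku.com' youku.html` then returns a non-zero exit status when the download fails or the saved file is empty, rather than leaving that unnoticed.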