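Parsero is a free script written in Python. It reads a web server's robots.txt file, requests every path listed under a "Disallow" entry, and reports the HTTP status code that each request returns.
For example, the status codes it reports include: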
200 OK The request has succeeded.
403 Forbidden The server understood the request, but is refusing to fulfill it.
404 Not Found The server hasn't found anything matching the Request-URI.
302 Found The requested resource resides temporarily under a different URI.
...
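The reason phrases in the list above are the standard ones for those codes. As a small aside, here is a sketch of how to look them up with Python's standard `http.HTTPStatus` enum. This only illustrates the codes themselves and is not taken from Parsero's source.

```python
from http import HTTPStatus

# The four status codes listed above, mapped to their standard reason phrases.
codes = [200, 403, 404, 302]
phrases = {code: HTTPStatus(code).phrase for code in codes}

for code, phrase in phrases.items():
    print(code, phrase)
```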
Project page: https://github.com/behindthefirewalls/Parsero (Parsero | Robots.txt audit tool)
# Ubuntu 20.04
sudo apt install parsero
root@ubuntu:~# parsero -h
usage: parsero [-h] [-u URL] [-o] [-sb] [-f FILE]
optional arguments:
-h, --help show this help message and exit
-u URL Type the URL which will be analyzed
-o Show only the "HTTP 200" status code
-sb Search in Bing indexed Disallows
-f FILE Scan a list of domains from a list
# See which paths Baidu's robots.txt declares off-limits to crawlers
root@ubuntu:~# curl www.baidu.com/robots.txt
User-agent: Baiduspider
Disallow: /baidu
Disallow: /s?
Disallow: /ulink?
Disallow: /link?
Disallow: /home/news/data/
Disallow: /bh
User-agent: Googlebot
Disallow: /baidu
Disallow: /s?
Disallow: /shifen/
Disallow: /homepage/
Disallow: /cpro
Disallow: /ulink?
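The first step of such an audit is collecting the "Disallow" paths from a file like the one above. Below is a minimal sketch of that step. The function name `disallowed_paths` is hypothetical, and the sketch is an illustration rather than Parsero's actual parsing code. It assumes that the same path listed under several user agents should be collected only once.

```python
def disallowed_paths(robots_txt):
    """Return the unique Disallow paths in robots_txt, in first-seen order."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        field, sep, value = line.partition(":")
        if not sep or field.strip().lower() != "disallow":
            continue  # skip User-agent, Allow, blank, and malformed lines
        path = value.strip()
        if path and path not in paths:  # an empty Disallow blocks nothing
            paths.append(path)
    return paths


# A shortened excerpt of the robots.txt output shown above.
sample = """\
User-agent: Baiduspider
Disallow: /baidu
Disallow: /s?
User-agent: Googlebot
Disallow: /baidu
Disallow: /shifen/
"""
print(disallowed_paths(sample))  # ['/baidu', '/s?', '/shifen/']
```

`/baidu` appears under both user agents but is returned once, so each path would be requested only once.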
# Use parsero to probe the disallowed paths
root@ubuntu:~# parsero -u www.baidu.com
____
| _ \ __ _ _ __ ___ ___ _ __ ___
| |_) / _` | '__/ __|/ _ \ '__/ _ \
| __/ (_| | | \__ \ __/ | | (_) |
|_| \__,_|_| |___/\___|_| \___/
Starting Parsero v0.81 (https://github.com/behindthefirewalls/Parsero) at 12/19/23 13:31:55
Parsero scan report for www.baidu.com
http://www.baidu.com/ 200 OK
http://www.baidu.com/homepage/ 302 Found
http://www.baidu.com/baidu 302 Found
http://www.baidu.com/s? 302 Found
http://www.baidu.com/ulink? 404 Not Found
http://www.baidu.com/cpro 404 Not Found
http://www.baidu.com/bh 302 Found
http://www.baidu.com/link? 404 Not Found
http://www.baidu.com/shifen/ 200 OK
http://www.baidu.com/home/news/data/ 302 Found
[+] 10 links have been analyzed and 2 of them are available!!!
Finished in 0.45 seconds.
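A report like the one above can be approximated in a few lines of Python. The sketch below is an assumption about the general approach, not Parsero's implementation: `fetch_status`, `probe`, and `summarize` are hypothetical names. It uses `http.client`, which does not follow redirects, so a 302 is reported as 302 the way it is in the report. The demo at the bottom swaps in canned responses copied from three lines of that report so it runs without network access.

```python
import http.client


def fetch_status(host, path, timeout=5.0):
    """Send one GET over plain HTTP and return (status, reason); redirects are not followed."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.reason
    finally:
        conn.close()


def probe(host, paths, fetch=fetch_status):
    """Return [(url, status, reason)] for each path; `fetch` can be replaced for offline runs."""
    results = []
    for path in paths:
        status, reason = fetch(host, path)
        results.append((f"http://{host}{path}", status, reason))
    return results


def summarize(results):
    """Count how many probed links answered 200 OK."""
    available = sum(1 for _, status, _ in results if status == 200)
    return f"{len(results)} links analyzed, {available} available"


# Offline demo: canned responses copied from the report above.
canned = {"/baidu": (302, "Found"), "/shifen/": (200, "OK"), "/cpro": (404, "Not Found")}
results = probe("www.baidu.com", list(canned), fetch=lambda host, path: canned[path])
for url, status, reason in results:
    print(url, status, reason)
print(summarize(results))  # 3 links analyzed, 1 available
```

Calling `probe("www.baidu.com", paths)` without the `fetch` argument would send real requests through `fetch_status`. Keeping only the results whose status is 200 would mimic the `-o` option.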
This post is part of the "Practical Computer Networking Tools" series, and more posts in the series will follow.
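My posts focus on computer networking, security, operations, and cloud computing. If these topics interest you, please follow me; I publish new posts from time to time and keep improving the ones already published.
Later on, based on your comments, likes, and follows, I will also record videos on the topics you care about most and put together complete sets of study material to share for free. I look forward to learning together with you.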