Regular expression:
(([a-z]+:/{2})?(?@)?(?:[\d\w]+[-\d\w].)+(?:com|cn|im|xin|shop|ltd|club|top|wang|xyz|site|vip|net|bb|cc|gov|ren|biz|red|link|mobi|info|org|com.cn|net.cn|org.cn|gov.cn|name|io|tt|coop|biz|aero|travel|pub|edu|CC|ink|pro|tv|kim|group|中国|我爱你|公司|网络|网址|集团)|localhost|(?:\d+.){3}\d+)(?::\d+)?(?:[/][%\w-.]+[#\w-.]+)(?:.[a-z]+)?(?:[.#/?](?:&?[a-z]+=?[#\w%-|])*)?)
Test cases:
http://www.test.cn
test.cn
–www.test.cn
a.b
a.bb
www.test.cn
https://open.weixin.qq.com/connect/oauth2/authorize?appid=wx26e7a7c7b3ee376c&redirect_uri=https%3A%2F%2Fwww.test.cn%2Fmobile%2Fmainline%2Finfo%2F4974167072967091022%3FagentId%3D1000018%26tenantKey%3DT7AKVDNF84%26corpId%3Dwx26e7a7c7b3ee376c%26suiteId%3Dtj449fa629ed498b1b&response_type=code&scope=snsapi_base&state=4cea876c-08c3-430c-ba1c-6de5db99b844&connect_redirect=1#wechat_redirect
https://work.weixin.qq.com/api/doc#10029/%E9%80%9A%E8%AE%AF%E5%BD%95%E9%80%89%E4%BA%BA%E6%8E%A5%E5%8F%A3
aa.bb
https://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=1&rsv_idx=1&tn=baidu&wd=%E8%B6%85%E9%93%BE%E6%8E%A5%E5%A4%A7%E5%85%A8&oq=%25E7%2599%25BE%25E5%25BA%25A6%25E7%25BF%25BB%25E8%25AF%2591&rsv_pq=8e10386d000f01e1&rsv_t=99417jLisGhSh24v3BHnrdzH4289GpQVG9CPviAxrvAfwyHv3xlAu3HwSjE&rqlang=cn&rsv_enter=1&inputT=6295&rsv_sug3=52&rsv_sug1=60&rsv_sug7=101&bs=%E7%99%BE%E5%BA%A6%E7%BF%BB%E8%AF%91
shimo.im
http://127.0.0.1:9080/test
http://prototype.test.cn/%E5%85%B6%E4%BB%96/%E5%B8%AE%E5%8A%A9%E4%B8%AD%E5%BF%83
http://e-biaoge.cn
http://localhost:8080
https://www.test.cn/sssss?ddddd
http://shimo.im
http://shimo.org
https://www.test.cn?ddddd&ffffff
http://test.cn
http://www.test.cn/张三
https://www.test.cn/crms/customer?info=view_CustomerView|customerId_6999665833097540289
https://test.cn?ddd=abacd|&suiji=duoshao
http://www.test.cn/,
https://www.test.cn/tasks/7000115655580019362/shareToMe
https://www.test.cn/crms/customer?menu=key_mineCreate|state_0|module_customer|targetId_201&search=view_&table=view_&info=view_&tab=type_
http://restapi.amap.com/v3/staticmap?location=121.474533,31.172761&zoom=13&scale=2&size=200*100&markers=mid,A:121.474533,31.172761&key=bae9eb9e1301652ebfb8b66c413c0b84
https://passport.teems.cn/bindService?openId=456789-8i7MuelU8UVZM-f7_XrUM
http://tencentdba.com/blog/tendb-cluster/
http://dbaplus.cn/news-11-1205-1.html
http://localhost:9080/eteamslogin?ticket=ST-17-aa1MF6jGZxGv4b3AffPD-castest.eteams.cn
这里有个Spider内核分析的文章:
http://dbaplus.cn/news-11-1205-1.html,bb你好
https://redis.io
a.im
a.xin
a.shop
a.ltd
a.club
a.top
a.wang
a.xyz
a.site
a.vip
b.net
b.cc
b.ren
b.biz
b.red
b.link
b.mobi
b.info
b.org
b.gov
c.name
c.ink
c.pro
c.tv
c.kim
c.group
c.我爱你
c.中国
c.公司
c.网络
c.网址
c.集团
Supported suffixes:
(com|cn|im|xin|shop|ltd|club|top|wang|xyz|site|vip|net|cc|ren|biz|red|link|mobi|info|org|com.cn|net.cn|org.cn|gov.cn|name|ink|pro|tv|kim|group|我爱你|中国|公司|网络|网址|集团)