Yandex bot user agent

The 爬虫识别 (Crawler Identification) site has collected and organized all YandexBot user-agent strings to make YandexBot easier to identify.

YandexBot user-agent list

Mozilla/5.0 (compatible; YandexAccessibilityBot/3.0; +http://yandex.com/bots)

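Description: YandexAccessibilityBot downloads pages to check their accessibility for users. It sends at most 3 requests per second to a site. The robot ignores settings in the Yandex.Webmaster interface.

Obeys robots.txt: No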

Mozilla/5.0 (compatible; YandexAdNet/1.0; +http://yandex.com/bots)

Description: Yandex advertising robot.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexBlogs/0.99; robot; +http://yandex.com/bots)

Description: Blog search robot that indexes post comments.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)

Description: Main indexing robot of the Yandex search engine.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexBot/3.0; MirrorDetector; +http://yandex.com/bots)

Description: Robot that detects site mirrors.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexCalendar/1.0; +http://yandex.com/bots)

Description: Yandex.Calendar robot. Downloads calendar files at users' request. These files are often located in directories closed to indexing.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YandexDirect/3.0; +http://yandex.com/bots)

Description: Downloads information about the content of Yandex Advertising Network partner sites to identify their topic categories and match relevant ads.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YandexDirectDyn/1.0; +http://yandex.com/bots)

Description: Generates dynamic banners.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YandexFavicons/1.0; +http://yandex.com/bots)

Description: Downloads a site's favicon file to display in search results.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YaDirectFetcher/1.0; Dyatel; +http://yandex.com/bots)

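Description: Downloads ad landing pages to check their availability and topic. This is required for placing ads in search results and on partner sites.

Obeys robots.txt: No. The robot does not use the robots.txt file and ignores directives set for it.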

Mozilla/5.0 (compatible; YandexForDomain/1.0; +http://yandex.com/bots)

Description: Yandex.Mail domain robot, used to verify domain ownership.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)

Description: Yandex image indexing robot.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexImageResizer/2.0; +http://yandex.com/bots)

Description: Robot for mobile devices.

Obeys robots.txt: Yes

Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B411 Safari/600.1.4 (compatible; YandexBot/3.0; +http://yandex.com/bots)

Description: Yandex search engine indexing robot.

Obeys robots.txt: Yes

Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B411 Safari/600.1.4 (compatible; YandexMobileBot/3.0; +http://yandex.com/bots)

Description: Identifies pages whose layout is suitable for mobile devices.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YandexMarket/1.0; +http://yandex.com/bots)

Description: Yandex.Market robot.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexMarket/2.0; +http://yandex.com/bots)

Description: Yandex.Market robot.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YandexMedia/3.0; +http://yandex.com/bots)

Description: Indexes multimedia data.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexMetrika/2.0; +http://yandex.com/bots yabs01)

Description: Yandex.Metrica robot. Downloads and caches CSS styles to render site pages in Webvisor.

Obeys robots.txt: No. The robot does not use the robots.txt file and ignores directives set for it.

Mozilla/5.0 (compatible; YandexMobileScreenShotBot/1.0; +http://yandex.com/bots)

Description: Takes screenshots of mobile pages.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YandexNews/4.0; +http://yandex.com/bots)

Description: Yandex.News robot.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexOntoDB/1.0; +http://yandex.com/bots)

Description: Object response crawler.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexOntoDBAPI/1.0; +http://yandex.com/bots)

Description: Object response robot that downloads dynamic data.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YandexPagechecker/1.0; +http://yandex.com/bots)

Description: Accesses a page when its micro markup is validated through the structured data validator.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexPartner/3.0; +http://yandex.com/bots)

Description: Downloads information about the content of Yandex partner sites.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YandexRCA/1.0; +http://yandex.com/bots)

Description: Collects data to generate previews, for example, wizard previews.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YandexSearchShop/1.0; +http://yandex.com/bots)

Description: Downloads product catalogs in YML files at users' request. These files are often placed in directories closed to indexing.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YandexSitelinks; Dyatel; +http://yandex.com/bots)

Description: Checks the availability of pages used as sitelinks.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexSpravBot/1.0; +http://yandex.com/bots)

Description: Yandex.Business crawler.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexTracker/1.0; +http://yandex.com/bots)

Description: Yandex.Tracker crawler.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YandexTurbo/1.0; +http://yandex.com/bots)

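Description: Crawls the RSS feeds created for generating Turbo pages. It sends at most 3 requests per second to a site. The robot ignores settings in the Yandex.Webmaster interface and the Crawl-delay directive.

Obeys robots.txt: Yes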

Mozilla/5.0 (compatible; YandexVertis/3.0; +http://yandex.com/bots)

Description: Vertical search robot.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexVerticals/1.0; +http://yandex.com/bots)

Description: Yandex.Verticals robot: Auto.ru, Yandex.Realty, Yandex.Rabota, Yandex.Reviews.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexVideo/3.0; +http://yandex.com/bots)

Description: Yandex.Video indexing crawler. Indexes video clips for display.

Obeys robots.txt: Yes

Mozilla/5.0 (compatible; YandexVideoParser/1.0; +http://yandex.com/bots)

Description: Yandex.Video indexing crawler. Indexes video clips for display.

Obeys robots.txt: No

Mozilla/5.0 (compatible; YandexWebmaster/2.0; +http://yandex.com/bots)

Description: Yandex.Webmaster robot.

Obeys robots.txt: Yes

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z* Safari/537.36 (compatible; YandexScreenshotBot/3.0; +http://yandex.com/bots)

Description: Takes screenshots of pages.

Obeys robots.txt: Yes

* The W.X.Y.Z character combination is a placeholder for the Chrome browser version in the user agent. For example: 101.0.4951.54.
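
Because the Chrome version changes over time, an exact string comparison against the YandexScreenshotBot entry will not match real log lines. The following is a minimal sketch in Python, not an official Yandex method. The helper name `is_screenshot_bot_ua` is hypothetical, and the four-part numeric version format is an assumption based on the example 101.0.4951.54.

```python
import re

# Pattern built from the YandexScreenshotBot entry in the list. The W.X.Y.Z
# placeholder is replaced by four dot-separated numeric groups (an assumption
# about the version format, based on the example 101.0.4951.54).
SCREENSHOT_BOT_RE = re.compile(
    r"^Mozilla/5\.0 \(X11; Linux x86_64\) AppleWebKit/537\.36 "
    r"\(KHTML, like Gecko\) Chrome/\d+\.\d+\.\d+\.\d+ Safari/537\.36 "
    r"\(compatible; YandexScreenshotBot/3\.0; \+http://yandex\.com/bots\)$"
)


def is_screenshot_bot_ua(user_agent: str) -> bool:
    """Return True if user_agent has the YandexScreenshotBot form with any Chrome version."""
    return SCREENSHOT_BOT_RE.match(user_agent) is not None
```

A string that still contains the literal placeholder, or any other user agent, is rejected by this pattern.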

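Summary

This article collects and organizes the user-agent strings of all Yandex crawlers. Because Yandex runs many different services, it operates many different crawlers. If you are unsure whether a visitor to your site is a Yandex crawler, compare the User-agent in your logs against the list above.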
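
The comparison against the list can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `yandex_bot_name` is hypothetical, the set contains only the bot names that appear in this article, and the version number is ignored. A matching string does not prove that the request came from Yandex, since a User-agent can be forged.

```python
import re
from typing import Optional

# Bot names taken from the list in this article (name only, version ignored).
KNOWN_YANDEX_BOTS = {
    "YandexAccessibilityBot", "YandexAdNet", "YandexBlogs", "YandexBot",
    "YandexCalendar", "YandexDirect", "YandexDirectDyn", "YandexFavicons",
    "YaDirectFetcher", "YandexForDomain", "YandexImages", "YandexImageResizer",
    "YandexMobileBot", "YandexMarket", "YandexMedia", "YandexMetrika",
    "YandexMobileScreenShotBot", "YandexNews", "YandexOntoDB",
    "YandexOntoDBAPI", "YandexPagechecker", "YandexPartner", "YandexRCA",
    "YandexSearchShop", "YandexSitelinks", "YandexSpravBot", "YandexTracker",
    "YandexTurbo", "YandexVertis", "YandexVerticals", "YandexVideo",
    "YandexVideoParser", "YandexWebmaster", "YandexScreenshotBot",
}

# In every listed string the bot name follows "(compatible; " and is
# terminated by "/" (before the version) or ";" (no version given).
BOT_NAME_RE = re.compile(r"\(compatible; ([A-Za-z]+)[/;]")


def yandex_bot_name(user_agent: str) -> Optional[str]:
    """Return the listed Yandex bot name found in user_agent, or None."""
    match = BOT_NAME_RE.search(user_agent)
    if match and match.group(1) in KNOWN_YANDEX_BOTS:
        return match.group(1)
    return None
```

For example, `yandex_bot_name("Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)")` yields the name `YandexBot`, which can then be looked up in the list to see whether that robot obeys robots.txt.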

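爬虫识别 (Crawler Identification) is a site dedicated to identifying the various crawlers on the internet, helping you guard against forged and malicious crawlers.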
