An unusual WPO problem: do not underestimate the server load caused by spider crawling


   



<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">The article "<a style="color: #0071bb; padding: 0px; margin: 0px;" title="对照“BlueDavy的网站架构演变”说说外贸B2C网站实际应用" href="http://www.wpowhy.com/site-architecture-for-b2c-foreign-trade-websites-87/" target="_blank">Applying BlueDavy's "Evolution of Website Architecture" to Foreign-Trade B2C Websites in Practice</a>" mentioned that a relatively large China-based foreign-trade B2C website receives roughly 100,000 to 1,000,000 visitors a day.

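<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">In fact, the 100,000 figure counts only the IPs of ordinary visitors and excludes visits from search spiders. Spiders visit very frequently, often several times or even dozens of times as often as the visitor IP count. This is where foreign-trade B2C websites differ from other sites: they do a great deal of dynamic-to-static URL rewriting for search optimization. A B2C site with 30,000 products may have as many as 1,000,000 pages in total, including: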

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;"><span style="color: #ff6600; padding: 0px; margin: 0px;"><em style="padding: 0px; margin: 0px;">2. Paginated product listings: 100,000 to 200,000 (listings can be sorted by product name, price, or listing date, each ascending or descending, and a different number of products per page also yields a different URL). In short, all of these exist to pad out the page count for search optimization.</em></span>

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;"><span style="color: #ff6600; padding: 0px; margin: 0px;"><em style="padding: 0px; margin: 0px;">4. Forum: 10,000 to 100,000</em></span>
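<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">To see how quickly sorted and paginated listings multiply, here is a rough sketch. The sort keys follow the ones named in the list above; the page-size options and the function name <code>listing_url_count</code> are assumptions made up for illustration, not values from any real site.

```python
from itertools import product

# Sort keys named in the article; the direction and page-size options are assumed.
SORT_KEYS = ["name", "price", "date_added"]
SORT_DIRECTIONS = ["asc", "desc"]
PAGE_SIZES = [20, 40, 60]  # hypothetical "products per page" choices


def listing_url_count(products_in_category: int) -> int:
    """Count the distinct listing URLs a single category can generate."""
    total = 0
    for _key, _direction, page_size in product(SORT_KEYS, SORT_DIRECTIONS, PAGE_SIZES):
        # Ceiling division: number of pages needed at this page size.
        total += -(-products_in_category // page_size)
    return total


if __name__ == "__main__":
    # One category with 600 products, under the assumptions above.
    print(listing_url_count(600))
```

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">Under these assumed options, every category contributes 18 listing variants per page of results, which is how a few hundred categories can grow into six-figure URL counts.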

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">————————————————————


<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;"><strong style="padding: 0px; margin: 0px;"><span style="color: #ff0000; padding: 0px; margin: 0px;">The result: the server slows down! (An example from a real website follows.)</span></strong>

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;"><a style="color: #0071bb; padding: 0px; margin: 0px;" href="http://www.wpowhy.com/wp-content/uploads/2011/12/site-performance_nEO_IMG1.jpg"><img class="alignnone size-full wp-image-301" style="margin-top: 4px; margin-right: 10px; margin-bottom: 4px; margin-left: 10px; border-top-left-radius: 4px 4px; border-top-right-radius: 4px 4px; border-bottom-right-radius: 4px 4px; border-bottom-left-radius: 4px 4px; padding: 3px;" title="site-performance_nEO_IMG" src="http://www.wpowhy.com/wp-content/uploads/2011/12/site-performance_nEO_IMG1.jpg" alt="" width="749" height="181"></a>

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">Let's see who is visiting, as shown in the figure below:

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">This is an interesting phenomenon. Recall the earlier example: the article "<a style="color: #0071bb; padding: 0px; margin: 0px;" title="WPO网站性能优化对搜索引擎蜘蛛行为的影响" href="http://www.wpowhy.com/wpo-affecting-seo-spider-37/" target="_blank">How WPO (Web Performance Optimization) Affects Search Engine Spider Behavior</a>" noted that when pages loaded twice as fast, the number of pages and the volume of data crawled by the Google spider increased 7-fold.


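<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">This is only a small site with 5,000 user visits a day, yet with spider visits added the total can reach 500,000. For a site with 100,000 user visits a day, a total 5 to 10 times higher once spiders are included would not be unusual. This is unlikely to happen on news sites or forums, but very likely on B2C sites. Why? Because most B2C sites do a lot of SEO and have a very large number of URLs. Whenever new products are added, the content shown at every URL changes, so spiders treat each page as updated and crawl it again.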
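<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">One way to measure the visitor-to-spider split on your own site is to count access-log requests by User-Agent. The sketch below is a minimal illustration, not the tool used for the figures in this article. It assumes the log is in the common "combined" format (User-Agent as the last quoted field), and the list of spider tokens is a small, incomplete sample.

```python
import re
from collections import Counter

# Substrings that commonly appear in crawler User-Agent headers (not exhaustive).
SPIDER_TOKENS = ["Googlebot", "bingbot", "Baiduspider", "YandexBot", "Slurp"]

# Assumption: combined log format, where the User-Agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def classify(line: str) -> str:
    """Return the matching spider token, or 'visitor' if none matches."""
    match = UA_PATTERN.search(line)
    user_agent = match.group(1).lower() if match else ""
    for token in SPIDER_TOKENS:
        if token.lower() in user_agent:
            return token
    return "visitor"


def count_requests(lines) -> Counter:
    """Tally requests per spider token (plus 'visitor') over log lines."""
    return Counter(classify(line) for line in lines)


if __name__ == "__main__":
    sample = [
        '66.249.66.1 - - [10/Dec/2011:10:00:00 +0800] "GET /p/1.html HTTP/1.1" '
        '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
        '+http://www.google.com/bot.html)"',
        '203.0.113.7 - - [10/Dec/2011:10:00:01 +0800] "GET / HTTP/1.1" '
        '200 1024 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/8.0"',
    ]
    for name, hits in count_requests(sample).most_common():
        print(name, hits)
```

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">User-Agent strings can be forged, so treat the resulting counts as an estimate rather than an exact measurement.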

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;"><a style="color: #0071bb; padding: 0px; margin: 0px;" href="http://www.wpowhy.com/wp-content/uploads/2011/12/google-crawl-speed_nEO_IMG.jpg"><img class="alignnone size-full wp-image-302" style="margin-top: 4px; margin-right: 10px; margin-bottom: 4px; margin-left: 10px; border-top-left-radius: 4px 4px; border-top-right-radius: 4px 4px; border-bottom-right-radius: 4px 4px; border-bottom-left-radius: 4px 4px; padding: 3px;" title="google-crawl-speed_nEO_IMG" src="http://www.wpowhy.com/wp-content/uploads/2011/12/google-crawl-speed_nEO_IMG.jpg" alt="" width="775" height="284"></a>


<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">Submitting more sitemaps is not always better. If there are too many, Google cannot tell which pages matter most and sends its spider to crawl everything.

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">So be firm about deleting unimportant sitemaps, as shown in the figure below:
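<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">If the sitemaps are submitted through a sitemap index file, the same pruning can be done when the index is generated. The following is a sketch under that assumption. The file names and the <code>IMPORTANT</code> set are hypothetical, and which sitemaps count as important is a judgment each site has to make for itself.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical choice of sitemaps worth keeping.
IMPORTANT = {"sitemap-products.xml", "sitemap-categories.xml"}


def build_sitemap_index(sitemap_urls) -> str:
    """Build a sitemap index that lists only the important sitemaps."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        filename = url.rsplit("/", 1)[-1]
        if filename in IMPORTANT:
            entry = ET.SubElement(index, "sitemap")
            ET.SubElement(entry, "loc").text = url
    return ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    candidates = [
        "http://www.example.com/sitemap-products.xml",
        "http://www.example.com/sitemap-categories.xml",
        "http://www.example.com/sitemap-listing-sorted.xml",
        "http://www.example.com/sitemap-forum.xml",
    ]
    print(build_sitemap_index(candidates))
```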


<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">Method 3: block spiders that bring you nothing

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;"><a style="color: #0071bb; padding: 0px; margin: 0px;" href="http://www.wpowhy.com/wp-content/uploads/2011/12/robots-block.jpg"><img class="alignnone size-full wp-image-304" style="margin-top: 4px; margin-right: 10px; margin-bottom: 4px; margin-left: 10px; border-top-left-radius: 4px 4px; border-top-right-radius: 4px 4px; border-bottom-right-radius: 4px 4px; border-bottom-left-radius: 4px 4px; padding: 3px;" title="robots-block" src="http://www.wpowhy.com/wp-content/uploads/2011/12/robots-block.jpg" alt="" width="592" height="403"></a>

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">There are also price-comparison sites, which often run spiders dedicated to crawling B2C sites. If GA shows that the traffic coming from these comparison sites has a very low ROI, simply block them.
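<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">Before deploying a blocking rule, it can be checked with Python's standard <code>urllib.robotparser</code> module. This is a minimal sketch: <code>ExampleBot</code> is a made-up spider name standing in for whichever crawler you decide to block, and the rules shown are an assumption, not the robots.txt from the screenshot above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one unwanted spider, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow:
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Check whether robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/product-1.html"
    print("ExampleBot allowed:", can_crawl("ExampleBot", page))
    print("Googlebot allowed:", can_crawl("Googlebot", page))
```

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">robots.txt is only advisory: well-behaved spiders honor it, while a spider that ignores it would have to be blocked at the server level instead.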

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">Method 4: back-end optimization; see



<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">Copyright: 谭砚耘 (<a style="color: #0071bb; padding: 0px; margin: 0px;" title="TOTHETOP 至尚国际" href="http://www.tothetop.ca/" target="_blank">TOTHETOP至尚国际</a> and <a style="color: #0071bb; padding: 0px; margin: 0px;" title="创思集团" href="http://www.chance.net.cn/" target="_blank">创思集团</a>)

<p style="margin-top: 0.6em; margin-bottom: 0.3em; line-height: 16px; font-family: 'Lucida Grande', 'Lucida Sans Unicode', Calibri, Arial, Helvetica, Sans, FreeSans, Jamrul, Garuda, Kalimati; font-size: 13px; padding: 0px;">To get in touch with the author, send an email to tanyanyun/at/163.com (remember to replace /at/ with the @ sign).

 
