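RSS stands for Really Simple Syndication. That expansion applies to RSS 2.0; for RSS 1.0 the name stands for RDF Site Summary, and 1.0 and 2.0 belong to two separate lineages.
RSS is based on XML, and every RSS document must conform to the XML 1.0 specification published on the W3C website.
In an RSS document the root element is <rss>. It carries one required attribute, version, which states the RSS specification the document follows; a document that follows this specification must set version to 2.0.
The <rss> element has exactly one child element, <channel>, which holds information about the channel. For a blog, the channel is the blog as a whole, and an item is a single article or entry (also called a post).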
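The root-element rules above can be checked mechanically. The following is a minimal Python sketch using only the standard library; the function name check_rss_root and the sample feed are made up for illustration, and the checks cover only the three rules stated above, not the whole specification.

```python
import xml.etree.ElementTree as ET


def check_rss_root(xml_text):
    """Return a list of problems found in the root structure of an RSS document."""
    problems = []
    # fromstring raises ParseError if the text is not well-formed XML.
    root = ET.fromstring(xml_text)
    if root.tag != "rss":
        problems.append("root element is <%s>, expected <rss>" % root.tag)
    if root.get("version") != "2.0":
        problems.append("version attribute must be '2.0'")
    children = list(root)
    if len(children) != 1 or children[0].tag != "channel":
        problems.append("<rss> must contain exactly one <channel> child")
    return problems


# Hypothetical sample feed, used only to exercise the function.
sample = '<rss version="2.0"><channel><title>demo</title></channel></rss>'
print(check_rss_root(sample))  # an empty list means no problems were found
```

An empty result only says that the three structural rules hold; it does not say that the channel content itself is complete.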
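Child elements of the RSS 2.0 channel element
Element | Description | Requirement |
title | Name of the channel | Required |
link | URL of the channel | Required |
description | Description of the channel | Required |
copyright | Copyright notice for the channel's content | Optional |
pubDate | Publication date of the channel's content, in RFC 822 format (the year may be 2 or 4 digits) | Optional |
category | One or more categories that the channel belongs to | Optional |
generator | Name of the program that generated the channel | Optional |
docs | URL of the documentation for the format used by this RSS file | Optional |
cloud | Allows processes to register with a cloud to be notified of updates to the channel, implementing a lightweight publish-subscribe protocol for RSS feeds | Optional |
ttl | Time to live: the maximum time, in minutes, that the channel may be cached | Optional |
image | A GIF, JPEG, or PNG image to display with the channel | Optional |
rating | Content rating of the channel (mainly adult, restricted, children, and so on) | Optional |
textInput | A text input box for the user to type into; its exact purpose and behavior are left undefined | Optional |
skipHours | Hint telling news aggregators which hours they can skip | Optional |
skipDays | Hint telling news aggregators which days they can skip | Optional |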
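To make the Required column concrete, here is a small Python sketch that reports which required channel elements are missing and parses pubDate when it is present. The function name check_channel and the sample feed are assumptions for illustration. The sketch treats an empty element such as <description /> as present, and relies on the standard library's email.utils.parsedate_to_datetime to read RFC 822 style dates.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# The three channel elements marked Required in the table above.
REQUIRED = ("title", "link", "description")


def check_channel(xml_text):
    """Return (missing_required_elements, parsed_pubDate_or_None)."""
    channel = ET.fromstring(xml_text).find("channel")
    if channel is None:
        raise ValueError("no <channel> element found")
    # An empty element such as <description /> still counts as present here.
    missing = [name for name in REQUIRED if channel.find(name) is None]
    pub = channel.findtext("pubDate")
    # parsedate_to_datetime raises an exception if the date cannot be parsed.
    pub_date = parsedate_to_datetime(pub) if pub else None
    return missing, pub_date


# Hypothetical sample: <description> is left out on purpose.
sample = (
    '<rss version="2.0"><channel>'
    "<title>demo</title><link>http://example.com/</link>"
    "<pubDate>Tue, 16 Apr 2013 08:00:00 GMT</pubDate>"
    "</channel></rss>"
)
missing, pub_date = check_channel(sample)
print(missing, pub_date)
```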
Below are notes on the problems I ran into with it during development.
Note 1: Added styles have no effect
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>实体营业厅</title>
    <link>http://www.sh.10086.cn</link>
    <description />
    <item>
      <description><div><span style='color:blue;'>宝山</span><div style='float:left;width:400px;'><span style='color:blue;'>购机中心-通河路店</span><br/></div></div></description>
      <author />
    </item>
  </channel>
</rss>
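At first I put some HTML directly inside the <description> node, but in the browser (Firefox) the styles never took effect: everything was rendered unstyled.
After some searching online, it turned out that Firefox has a bug for this (bug 338621): "Feed View overrides XSLT stylesheet defined in XML document".
When Firefox sees that the page at a URL is XML, it first scans (sniffs) the first 512 bytes of the document. If it finds <rss or <feed there, it decides the document is a feed, ignores the document's own styling, and renders it with Firefox's built-in feed view. So the quickest and simplest workaround is to keep <rss and <feed out of the first 512 bytes of the document. In the comments on the bug report above, a Firefox developer mentions exactly this hack: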
The emerging workaround for this problem (which isn’t new to us, since we’re using the same heuristic that IE7 betas have been using for months) is to put in a comment ranting about the evils of sniffing web content and overriding the desires of web developers which is long enough to move "<rss" or "<feed" out of the first 512 bytes, since that’s all we sniff.
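The heuristic described in that comment can be approximated in a few lines of Python. This is only a sketch of the behavior as described in the bug report, not Firefox's actual code; the function name looks_like_feed and the sample documents are made up for illustration.

```python
def looks_like_feed(document, window=512):
    """Report whether b"<rss" or b"<feed" occurs within the first `window` bytes.

    This mirrors the sniffing rule as described in the bug report; the real
    browser logic may differ in details.
    """
    head = document[:window]
    return b"<rss" in head or b"<feed" in head


# Hypothetical documents: one plain, one with a long comment before the root.
plain = b'<?xml version="1.0"?><rss version="2.0"></rss>'
padded = b'<?xml version="1.0"?><!--' + b"x" * 600 + b'--><rss version="2.0"></rss>'
print(looks_like_feed(plain), looks_like_feed(padded))
```

A document for which this returns False would, under the rule described above, be handed to the normal XML rendering path, where an xml-stylesheet processing instruction is honored.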
This explains why Sina's RSS feeds display correctly in Firefox: every Sina RSS file starts with a very long English comment, like this:
<?xml version="1.0" encoding="utf-8"?> <?xml-stylesheet type="text/xsl" title="XSL Formatting" href="/show_new_final.xsl" media="all"?> <!-- SINA Corporation (NASDAQ: SINA) is a leading online media company and value-added information service (VAS) provider for China and for Chinese communities worldwide. With a branded network of localized websites targeting Greater China and overseas Chinese, SINA provides services through five major business lines including SINA.com (online news and content), SINA Mobile (mobile value-added services), SINA Online (community-based services and games), SINA.net (search and enterprise services) and SINA E-commerce (online shopping), offering Internet users and government and business clients an array of services including online media and entertainment, online fee-based VAS/wireless VAS, and e-commerce and enterprise e-solutions. With 230 million registered users worldwide, 450 million daily page views and over 60 million active users for a variety of fee-based services, SINA is the most recognized Internet brand name in China and among Chinese communities globally. In various surveys and polls, SINA has been recognized as the most valuable brand and the most popular website in China. For 2003 and 2005, SINA was ranked the "Most Preferred Website" in China according to the Chinese Academy of Social Sciences and considered "The Most Respected Chinese Company" for three consecutive years in 2003, 2004 and 2005 by the Economic Observer and the Management Case Study Center of Beijing University. At the same time, South China Weekend in both 2003 and 2004 honored SINA with the prestigious award of the "Chinese Language Medium of the Year." --> <rss version="2.0" >
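This is the simplest fix, and it costs 512 bytes. The Mozilla Developer Center also describes two other approaches; see "Custom styles for RSS".
With the feed view out of the way, you can also attach an XSL stylesheet to make the page look better: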
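If the feed is generated by a program, the padding comment can be added automatically. The sketch below is one possible way to do it in Python, under the assumption that the first occurrence of <rss or <feed in the file is the root element's start tag; the function name pad_before_root and the filler text are hypothetical.

```python
import xml.etree.ElementTree as ET


def pad_before_root(document, window=512):
    """Insert a filler comment so the root start tag begins at or after `window`.

    Assumes the first b"<rss" or b"<feed" in the document is the root element.
    """
    positions = [p for p in (document.find(b"<rss"), document.find(b"<feed")) if p != -1]
    if not positions or min(positions) >= window:
        return document  # nothing to do
    pos = min(positions)
    # "padding " is 8 bytes long; repeat it often enough to cover the gap.
    repeats = (window - pos) // 8 + 1
    filler = b"<!-- " + b"padding " * repeats + b"-->\n"
    return document[:pos] + filler + document[pos:]


# Hypothetical minimal feed used to exercise the function.
feed = b'<?xml version="1.0" encoding="UTF-8"?>\n<rss version="2.0"><channel/></rss>'
padded = pad_before_root(feed)
print(padded.find(b"<rss") >= 512)   # root tag now starts past the sniffed window
print(ET.fromstring(padded).tag)     # the padded document is still well-formed XML
```

The comment is placed after the XML declaration and any processing instructions, which keeps the document well-formed; the filler text itself carries no meaning and can be anything that does not contain two consecutive hyphens.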
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" title="XSL Formatting" href="/sh/rss/yyt_xsl.xsl" media="all"?>
<!--sh.10086.cn Corporation (NASDAQ: SH) is a leading online media company and value-added information service (VAS) provider for China and for Chinese communities worldwide. With a branded network of localized websites targeting Greater China and overseas Chinese, SINA provides services through five major business lines including SINA.com (online news and content), SINA Mobile (mobile value-added services), SINA Online (community-based services and games), SINA.net (search and enterprise services) and SINA E-commerce (online shopping), offering Internet users and government and business clients an array of services including online media and entertainment, online fee-based VAS/wireless VAS, and e-commerce and enterprise e-solutions. With 230 million registered users worldwide, 450 million daily page views and over 60 million active users for a variety of fee-based services, SINA is the most recognized Internet brand name in China and among Chinese communities globally. In various surveys and polls, SINA has been recognized as the most valuable brand and the most popular website in China. For 2003 and 2005, SINA was ranked the "Most Preferred Website" in China according to the Chinese Academy of Social Sciences and considered "The Most Respected Chinese Company" for three consecutive years in 2003, 2004 and 2005 by the Economic Observer and the Management Case Study Center of Beijing University. At the same time, South China Weekend in both 2003 and 2004 honored SH with the prestigious award of the Chinese Language Medium of the Year.-->
<rss version="2.0">
  <channel>
    <title><![CDATA[实体营业厅]]></title>
    <link><![CDATA[http://www.sh.10086.cn]]></link>
    <description />
    <item>
      <value><![CDATA[1]]></value>
      <title><![CDATA[购机中心-通河路店]]></title>
      <address><![CDATA[通河路286--290号]]></address>
      <time><![CDATA[9:00--20:30]]></time>
      <code><![CDATA[]]></code>
      <phone><![CDATA[18621880223]]></phone>
      <busitype><![CDATA[基础业务]]></busitype>
      <quyu><![CDATA[宝山]]></quyu>
      <description><![CDATA[<div style="background-color:#eeeeee;width:280px;height:180px;float:left;margin-left:10px;text-align:left;margin-top:20px;padding:20px;"><div style="margin-top:6px;"><span style="color:blue;"><b>购机中心-通河路店</b></span></div><div style="margin-top:6px;"><span><b>地 址:</b></span>通河路286--290号</div><div style="margin-top:6px;"><span><b>营业时间:</b></span>9:00--20:30</div><div style="margin-top:6px;"><span><b>邮政编码:</b></span></div><div style="margin-top:6px;"><span><b>联系电话:</b></span>18621880223</div><div style="margin-top:6px;"><span><b>业务受理种类:</b></span>基础业务</div><div style="margin-top:6px;"><span><b>区 域:</b></span>宝山</div></div>]]></description>
      <link>http://www.sh.10086.cn?t=136609478033110793212516461462</link>
    </item>
    <item>
      <value><![CDATA[0]]></value>
      <title><![CDATA[宝山营业厅]]></title>
      <address><![CDATA[牡丹江路1512号]]></address>
      <time><![CDATA[9:00-19:00]]></time>
      <code><![CDATA[201900]]></code>
      <phone><![CDATA[13916710080]]></phone>
      <busitype><![CDATA[基础业务]]></busitype>
      <quyu><![CDATA[宝山]]></quyu>
      <description><![CDATA[<div style="background-color:#eeeeee;width:280px;height:180px;float:left;margin-left:10px;text-align:left;margin-top:20px;padding:20px;"><div style="margin-top:6px;"><span style="color:blue;"><b>宝山营业厅</b></span></div><div style="margin-top:6px;"><span><b>地 址:</b></span>牡丹江路1512号</div><div style="margin-top:6px;"><span><b>营业时间:</b></span>9:00-19:00</div><div style="margin-top:6px;"><span><b>邮政编码:</b></span>201900</div><div style="margin-top:6px;"><span><b>联系电话:</b></span>13916710080</div><div style="margin-top:6px;"><span><b>业务受理种类:</b></span>基础业务</div><div style="margin-top:6px;"><span><b>区 域:</b></span>宝山</div></div>]]></description>
      <link>http://www.sh.10086.cn?t=136609478033110793212516489061</link>
    </item>
  </channel>
</rss>
yyt_xsl.xsl
<?xml version="1.0" encoding="UTF-8"?>
<!-- Browsers implement XSLT 1.0, and nothing below needs a later version. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Page skeleton: the channel title goes into the page title. -->
  <xsl:template match="/">
    <html>
      <head>
        <title>RSS-<xsl:value-of select="rss/channel/title"/></title>
      </head>
      <body>
        <center>
          <xsl:apply-templates select="rss/channel"/>
        </center>
      </body>
    </html>
  </xsl:template>
  <!-- One card per item; an item whose value is greater than 0 also starts a new region heading. -->
  <xsl:template match="channel">
    <xsl:for-each select="item">
      <xsl:variable name="v">
        <xsl:value-of select="value"/>
      </xsl:variable>
      <xsl:if test="$v &gt; 0">
        <div style="clear:both;color:blue;font-size:28px;margin:10px 0;font-weight:bold;text-align:left;padding-left:30px;"><xsl:value-of select="quyu"/></div>
      </xsl:if>
      <div style="background-color:#eeeeee;width:280px;height:180px;float:left;margin-left:10px;text-align:left;margin-top:20px;padding:20px;">
        <div style="margin-top:6px;"><span style="color:blue;"><b><xsl:value-of select="title"/></b></span></div>
        <div style="margin-top:6px;"><span><b>地 址:</b></span><xsl:value-of select="address"/></div>
        <div style="margin-top:6px;"><span><b>营业时间:</b></span><xsl:value-of select="time"/></div>
        <div style="margin-top:6px;"><span><b>邮政编码:</b></span><xsl:value-of select="code"/></div>
        <div style="margin-top:6px;"><span><b>联系电话:</b></span><xsl:value-of select="phone"/></div>
        <div style="margin-top:6px;"><span><b>业务受理种类:</b></span><xsl:value-of select="busitype"/></div>
        <div style="margin-top:6px;"><span><b>区 域:</b></span><xsl:value-of select="quyu"/></div>
      </div>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
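Note 2: The reader shows only one item
The RSS reader picked up only one item, that is, it displayed the content of a single item even though the feed contained several.
Cause: the <link> values of the <item> nodes must not repeat. If several items share the same <link> value, only one of them is displayed. This is why each item in the feed above appends a different t parameter to its link.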
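A quick way to catch this before publishing is to scan the feed for repeated links. The following Python sketch does that with the standard library; the function name duplicate_item_links and the sample feed are made up for illustration, and items without a <link> are simply skipped.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_item_links(xml_text):
    """Return the <link> values that are shared by more than one <item>."""
    root = ET.fromstring(xml_text)
    links = [(item.findtext("link") or "").strip() for item in root.iter("item")]
    counts = Counter(link for link in links if link)  # ignore items with no link
    return sorted(link for link, n in counts.items() if n > 1)


# Hypothetical feed: the first two items share the same link.
feed = (
    '<rss version="2.0"><channel>'
    "<item><link>http://example.com/a</link></item>"
    "<item><link>http://example.com/a</link></item>"
    "<item><link>http://example.com/b</link></item>"
    "</channel></rss>"
)
print(duplicate_item_links(feed))
```

An empty list means every item has its own link; any value that is returned points at items a reader may collapse into one.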
References:
https://bugzilla.mozilla.org/show_bug.cgi?id=338621#c1
http://www.25175.com/200609/25175/25175_html/2007-04/1598.html
http://hi.baidu.com/yisqiu/item/5adb4f21c8855c3394f62b1e