Rails中html_escape和sanitize


       一般来说,通常使用input的field都会做一些filter的动作,避免被不怀好意之徒塞一些危险的 HTML code(script等)进去搞破坏。在ROR中,我们在前面加一个h()(一般不用括号?不容易看到?)即可,h在ROR中起什么作用呢?它是 html_escape的alias(别名),它会将所有的"<xx>"变成&lt;,&gt,比如:
js 代码 <script>alert('a');</script>会变成:    &lt;script&gt;alert('a');&lt;/script&gt;
这样就完全做不了乱了。因为所有的tag都被搞掉了。这样太严格了,有时候我们需要开放一些字体,颜色等tag给用户使用,用h()就不行了,正好ROR中还有个方法,sanitize()这个方法可以帮你实现这个愿望。ROR API中:
       Sanitizes the given HTML by making form and script tags into regular text, and removing all "onxxx" attributes (so that arbitrary Javascript cannot be executed). Also removes href attributes that start with "javascript:".
       
      它会砍掉script这个tag,以及onXxxx之类的attribut,你没有机会执行javascript,但是你还可以塞一些div或iframe之类的tag让你的版面烂掉。

      所以我们需要自定义一个html filter,可以自由的指定我们放行的那些tag。网上发现了这个sanitize.rb,完美的帮我们实现愿望。如何使用:
第一行:
ruby 代码
  1. def sanitize( html, okTags='a href, b, br, i, p' )  
okTags代表就是允许的tag,目前有a,b,br,i,p之类的tag,如果输入<iframe>xxx</iframe>之类的不允许的code,就会出现
xxx.不允许的结果都将被砍掉。如果想增加span,或font这样的tag,则可以:
ruby 代码
  1. def sanitize( html, okTags='a href, b, br, i, p, span, font' )  

a href之间没有用逗号隔开,是代表sanitize允许a这个tag使用href这个attribute,比如:
<a href=" http://blackanger.iteye.com" _fcksavedurl="http://lightyror.blogspot.com" target="_blank">Haha</a>
只会出现: <a href= http://blackanger.iteye.com>Haha</a>,只有href这个属性可以保留,其他的被无情的砍掉。当我们输入这样的代码:
            <a href= http://blackanger.iteye.com>Haha
会自动帮你补齐tag:

            <a href=http://blackanger.iteye.com>Haha</a>

Put the following code in your helper:
#
# $Id: sanitize.rb 3 2005-04-05 12:51:14Z dwight $
#
# Copyright (c) 2005 Dwight Shih
# A derived work of the Perl version:
# Copyright (c) 2002 Brad Choate, bradchoate.com
#
# Permission is hereby granted, free of charge, to
# any person obtaining a copy of this software and
# associated documentation files (the "Software"), to
# deal in the Software without restriction, including
# without limitation the rights to use, copy, modify,
# merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to
# whom the Software is furnished to do so, subject to
# the following conditions:
#
# The above copyright notice and this permission
# notice shall be included in all copies or
# substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY
# OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT
# LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
# COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES
# OR OTHER LIABILITY, WHETHER IN AN ACTION OF
# CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF
# OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
# OTHER DEALINGS IN THE SOFTWARE.
#

def sanitize( html, okTags='a href, b, br, i, p' )
# no closing tag necessary for these
soloTags = ["br","hr"]

# Build hash of allowed tags with allowed attributes
tags = okTags.downcase().split(',').collect!{ |s| s.split(' ') }
allowed = Hash.new
tags.each do |s|
    key = s.shift
    allowed[key] = s
end

# Analyze all <> elements
stack = Array.new
result = html.gsub( /(<.*?>)/m ) do | element |
    if element =~ /\A<\/(\w+)/ then
      # </tag>
      tag = $1.downcase
      if allowed.include?(tag) && stack.include?(tag) then
        # If allowed and on the stack
        # Then pop down the stack
        top = stack.pop
        out = "</#{top}>"
        until top == tag do
          top = stack.pop
          out << "</#{top}>"
        end
        out
      end
    elsif element =~ /\A<(\w+)\s*\/>/
      # <tag />
      tag = $1.downcase
      if allowed.include?(tag) then
        "<#{tag} />"
      end
    elsif element =~ /\A<(\w+)/ then
      # <tag ...>
      tag = $1.downcase
      if allowed.include?(tag) then
        if ! soloTags.include?(tag) then
          stack.push(tag)
        end
        if allowed[tag].length == 0 then
          # no allowed attributes
          "<#{tag}>"
        else
          # allowed attributes?
          out = "<#{tag}"
          while ( $' =~ /(\w+)=("[^"]+")/ )
            attr = $1.downcase
            valu = $2
            if allowed[tag].include?(attr) then
              out << " #{attr}=#{valu}"
            end
          end
          out << ">"
        end
      end
    end
end

# eat up unmatched leading >
while result.sub!(/\A([^<]*)>/m) { $1 } do end

# eat up unmatched trailing <
while result.sub!(/<([^>]*)\Z/m) { $1 } do end

# clean up the stack
if stack.length > 0 then
    result << "</#{stack.reverse.join('></')}>"
end

result
end

你可能感兴趣的:(JavaScript,html,Ruby,UP,Rails)