Specifying the library path for the tidy rubygem
HTML Tidy is a library used to fix invalid HTML and give the source code a reasonable layout. It was developed by Dave Raggett of W3C, and is now maintained as a Sourceforge project. There are several versions of tidy available for various operating systems. The quickest (though not always the easiest) ways to install it on various Unix systems are given below.
On Debian-based systems such as Ubuntu, use apt-get to install:
apt-get install tidy
On RPM-based systems such as Fedora or CentOS, use yum to install:
yum install tidy
On Mac OS X, use MacPorts to install:
port install tidy
To use tidy from Ruby, a rubygem is available. Just run gem install tidy to get it installed on your development machine. The gem also comes with good reference documentation.
gem install tidy
require 'tidy'
Tidy.path = '/usr/lib/tidylib.so'
html = 'Body'
xml = Tidy.open(:show_warnings=>true) do |tidy|
  tidy.options.output_xml = true
  puts tidy.options.show_warnings
  xml = tidy.clean(html)
  puts tidy.errors
  puts tidy.diagnostics
  xml
end
puts xml
While working with tidy on my Mac, I noticed that the Tidy.path value shown above did not work for me. I worked out the equivalent path to use on a Mac:
Tidy.path = '/usr/lib/libtidy.A.dylib'
The same was true of my production servers running Fedora/CentOS, where I had to change the path to:
Tidy.path = '/usr/lib/libtidy-0.99.so.0'
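Since the library name and location differ from system to system, it can help to list what is actually installed before hard-coding a path. The following is only a rough sketch: the helper name find_tidy_libs and the list of directories are my own assumptions, not part of the tidy gem, and your system may keep the library elsewhere.

```ruby
# Directories to search. This list is an assumption; adjust it for your system.
LIB_DIRS = ['/usr/lib', '/usr/lib64', '/usr/local/lib', '/opt/local/lib'].freeze

# Hypothetical helper: returns every file in the given directories whose name
# contains "tidy" and looks like a shared library (.so, .so.N or .dylib).
def find_tidy_libs(dirs = LIB_DIRS)
  dirs.flat_map do |dir|
    Dir.glob(File.join(dir, '*tidy*')).select do |file|
      File.basename(file) =~ /\.(so|dylib)(\.|\z)/
    end
  end.sort
end

# Print one candidate path per line (prints nothing if none are found).
puts find_tidy_libs
```

Any path it prints is a candidate value for Tidy.path on that machine.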
To support both my development and production environments, I replaced line 2 of the example above with:
begin
  Tidy.path = '/usr/lib/libtidy-0.99.so.0'
rescue LoadError
  Tidy.path = '/usr/lib/libtidy.A.dylib'
end
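The begin/rescue above covers two paths. If you deploy to more kinds of machines, the same idea can be stretched to a list of candidates. Here is a minimal sketch of that: first_loadable_path is a hypothetical helper of mine, not something the tidy gem provides, and it assumes (as the snippet above does) that assigning a bad path raises LoadError.

```ruby
# Candidate locations, taken from the paths mentioned in this post.
TIDY_CANDIDATES = [
  '/usr/lib/tidylib.so',
  '/usr/lib/libtidy-0.99.so.0',
  '/usr/lib/libtidy.A.dylib'
].freeze

# Hypothetical helper: yields each candidate path to the block and returns
# the first one for which the block does not raise LoadError.
# Returns nil if every candidate fails.
def first_loadable_path(candidates)
  candidates.each do |path|
    begin
      yield path
      return path
    rescue LoadError
      next
    end
  end
  nil
end

# Usage with the tidy gem (left commented out because it needs the gem
# and the native library to be installed):
#
#   require 'tidy'
#   first_loadable_path(TIDY_CANDIDATES) { |path| Tidy.path = path } or
#     raise LoadError, 'could not find the tidy library'
```

The block form keeps the helper independent of the gem, so the candidate list can be changed per environment without touching the loading logic.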
Update:
If you’re getting the error:
/opt/ruby/ruby-1.8.6/lib/ruby/gems/1.8/gems/tidy-1.1.2/lib/tidy/tidybuf.rb:40: [BUG] Segmentation fault
Apply the following patch to fix it.