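Most of our existing websites are written in plain HTML, and bringing them up to standard by editing page after page by hand is very tedious. It would be ideal to have a tool that automatically converts HTML into standards-compliant XHTML. In fact, many commercial and free tools already do this. HTML Tidy, introduced here, is a basic but very useful one: it runs on many platforms and is open source.
HTML Tidy is a program and library that fixes errors in HTML files and lays out the markup neatly (that is, indents it).
HTML Tidy was developed by Dave Raggett of the W3C and later became a SourceForge project. Its source is written in ANSI C, so executables can be compiled for different operating systems. HTML Tidy is released under the W3C license (a permissive, BSD-style license).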
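Because Tidy is available as a command-line executable, the page-by-page conversion described above can be scripted. The following is a minimal sketch, not a definitive recipe. It assumes a `tidy` executable is on the PATH and that the pages are `*.html` files under one directory. The helper names `build_tidy_command` and `convert_tree` are hypothetical and introduced only for illustration.

```python
import subprocess
from pathlib import Path


def build_tidy_command(source, target, executable="tidy"):
    """Build the argument list for converting one HTML file to XHTML.

    -asxhtml asks Tidy for XHTML output, -q suppresses non-essential
    messages, and -o names the output file.
    """
    return [executable, "-q", "-asxhtml", "-o", str(target), str(source)]


def convert_tree(src_dir, dst_dir, executable="tidy"):
    """Run Tidy on every *.html file under src_dir, mirroring it into dst_dir.

    Returns the list of source files for which Tidy reported errors.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    failed = []
    for source in sorted(src_dir.rglob("*.html")):
        target = dst_dir / source.relative_to(src_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        result = subprocess.run(build_tidy_command(source, target, executable))
        # Tidy exits with 0 on success, 1 if there were warnings, 2 on errors.
        if result.returncode >= 2:
            failed.append(source)
    return failed
```

Writing the converted pages into a separate directory, rather than modifying them in place, keeps the originals available for comparison.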
HTML Tidy fixes HTML errors according to a set of configuration options. An example configuration, one option per line:
[myFormat]
add-xml-decl=no
add-xml-space=no
alt-text=
assume-xml-procins=no
bare=no
break-before-br=no
clean=no
doctype=auto
drop-empty-paras=yes
drop-font-tags=no
drop-proprietary-attributes=no
enclose-block-text=yes
enclose-text=yes
escape-cdata=no
fix-bad-comments=yes
fix-uri=yes
hide-comments=no
hide-endtags=no
indent-cdata=no
input-xml=no
join-classes=no
join-styles=yes
logical-emphasis=yes
lower-literals=yes
ncr=yes
new-blocklevel-tags=
new-empty-tags=
new-inline-tags=
new-pre-tags=
numeric-entities=no
output-xhtml=no
output-xml=no
quote-ampersand=no
quote-marks=no
quote-nbsp=no
repeated-attributes=keep-last
replace-color=no
show-body-only=no
uppercase-attributes=no
uppercase-tags=no
word-2000=no
show-errors=6
show-warnings=yes
indent=auto
indent-attributes=no
indent-spaces=2
literal-attributes=no
markup=yes
tab-size=4
wrap=100
wrap-asp=yes
wrap-attributes=no
wrap-jste=yes
wrap-php=yes
wrap-script-literals=no
wrap-sections=yes
ascii-chars=no
char-encoding=raw
input-encoding=
language=
output-bom=auto
output-encoding=
newline=
fix-backslash=yes
force-output=no
gnu-emacs=no
quiet=no
keep-time=yes
write-back=yes
tidy-mark=no
default=0
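The block above uses `key=value` lines under a `[myFormat]` section header, whereas Tidy's own configuration files use `option: value` lines. As a hedged sketch, the conversion could look like the code below. It assumes the block is INI-style text, and it assumes that `default`, which does not appear in the option tables in this article, belongs to the host tool rather than to Tidy. The function name `ini_section_to_tidy_config` is hypothetical.

```python
import configparser

# Keys that are not listed in Tidy's option tables and are presumably
# consumed by the host tool rather than by Tidy (an assumption).
NON_TIDY_KEYS = {"default"}


def ini_section_to_tidy_config(ini_text, section="myFormat"):
    """Convert one INI section into Tidy's 'option: value' config format."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    lines = []
    for name, value in parser[section].items():
        # Skip host-tool keys and options left empty in the INI block.
        if name in NON_TIDY_KEYS or value == "":
            continue
        lines.append(f"{name}: {value}")
    return "\n".join(lines) + "\n"
```

The resulting text can be saved to a file and passed to the command-line tool with `tidy -config <file>`.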
The tables below list the default settings. The meaning of each option is easy to find with a Google search.
HTML, XHTML, XML
Diagnostics
Pretty Print
Character Encoding
Miscellaneous
HTML, XHTML, XML Options
Option | Type | Default |
add-xml-decl | Boolean | no |
add-xml-pi | Boolean | no |
add-xml-space | Boolean | no |
alt-text | String | |
assume-xml-procins | Boolean | no |
bare | Boolean | no |
break-before-br | Boolean | no |
clean | Boolean | no |
doctype | DocType | auto |
drop-empty-paras | Boolean | yes |
drop-font-tags | Boolean | no |
drop-proprietary-attributes | Boolean | no |
enclose-block-text | Boolean | no |
enclose-text | Boolean | no |
escape-cdata | Boolean | no |
fix-bad-comments | Boolean | yes |
fix-uri | Boolean | yes |
hide-comments | Boolean | no |
hide-endtags | Boolean | no |
indent-cdata | Boolean | no |
input-xml | Boolean | no |
join-classes | Boolean | no |
join-styles | Boolean | yes |
logical-emphasis | Boolean | no |
lower-literals | Boolean | yes |
ncr | Boolean | yes |
new-blocklevel-tags | Tag names | |
new-empty-tags | Tag names | |
new-inline-tags | Tag names | |
new-pre-tags | Tag names | |
numeric-entities | Boolean | no |
output-xhtml | Boolean | no |
output-xml | Boolean | no |
quote-ampersand | Boolean | yes |
quote-marks | Boolean | no |
quote-nbsp | Boolean | yes |
repeated-attributes | - | keep-last |
replace-color | Boolean | no |
show-body-only | Boolean | no |
slide-style | Name | |
split | Boolean | no |
uppercase-attributes | Boolean | no |
uppercase-tags | Boolean | no |
word-2000 | Boolean | no |
Diagnostics Options
Option | Type | Default |
error-file | String | |
force-output | Boolean | no |
gnu-emacs | Boolean | no |
quiet | Boolean | no |
show-errors | Integer | 6 |
show-warnings | Boolean | yes |
tidy-mark | Boolean | yes |
Pretty Print Options
Option | Type | Default |
indent | AutoBool | no |
indent-attributes | Boolean | no |
indent-spaces | Integer | 2 |
literal-attributes | Boolean | no |
markup | Boolean | yes |
tab-size | Integer | 4 |
wrap | Integer | 68 |
wrap-asp | Boolean | yes |
wrap-attributes | Boolean | no |
wrap-jste | Boolean | yes |
wrap-php | Boolean | yes |
wrap-script-literals | Boolean | no |
wrap-sections | Boolean | yes |
Character Encoding Options
Option | Type | Default |
ascii-chars | Boolean | yes |
char-encoding | Encoding | ascii |
input-encoding | Encoding | latin1 |
language | Language | |
output-bom | AutoBool | auto |
output-encoding | Encoding | ascii |
raw | Boolean | no |
Miscellaneous Options
Option | Type | Default |
fix-backslash | Boolean | yes |
keep-time | Boolean | yes |
write-back | Boolean | no |
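Comparing a custom configuration against these defaults shows which settings actually change Tidy's behaviour. The sketch below uses only a handful of options, with default values copied from the tables above and custom values copied from the `[myFormat]` block earlier in this article. The helper name `overrides` is hypothetical.

```python
# Defaults taken from the tables above (a subset, for illustration).
DEFAULTS = {
    "drop-empty-paras": "yes",
    "enclose-text": "no",
    "logical-emphasis": "no",
    "quote-nbsp": "yes",
    "tidy-mark": "yes",
    "indent": "no",
    "tab-size": "4",
    "wrap": "68",
    "char-encoding": "ascii",
    "write-back": "no",
}

# The same options as set in the [myFormat] block.
CUSTOM = {
    "drop-empty-paras": "yes",
    "enclose-text": "yes",
    "logical-emphasis": "yes",
    "quote-nbsp": "no",
    "tidy-mark": "no",
    "indent": "auto",
    "tab-size": "4",
    "wrap": "100",
    "char-encoding": "raw",
    "write-back": "yes",
}


def overrides(custom, defaults):
    """Return {option: (default, custom)} for options that differ."""
    return {
        name: (defaults[name], value)
        for name, value in custom.items()
        if name in defaults and defaults[name] != value
    }


if __name__ == "__main__":
    for name, (default, value) in sorted(overrides(CUSTOM, DEFAULTS).items()):
        print(f"{name}: default {default}, custom {value}")
```

Options that match their default, such as `tab-size` here, could be omitted from a configuration file without changing the result.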