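Our 3D cosmetic-surgery simulation software has to produce a professional report, so it needs to generate a PDF document after the virtual procedure. After a lot of searching online, the library most people use turned out to be PDFlib. How do you use Acrobat's standard Simplified Chinese fonts with it? I am reposting some PDF basics here in the hope that they will be useful in later development.
PDF has become a standard for electronic documents because it is secure, reliable, easy to exchange, and highly faithful to the original layout. PDFlib is a powerful, internationally popular package for generating PDF documents in bulk on the server side. Many government, tax, banking, utility, and postal organizations abroad use it to generate PDF forms and reports online.
For users in China, how to output Simplified Chinese with PDFlib is probably the question we care about most. Here I share some of my own experience. Please point out any mistakes, and if anything I say conflicts with the PDFlib manual, trust the manual. My email address is [email protected].
If you have not used PDFlib before and are interested, you can download the package from http://www.pdflib.com/products/pdflib/download/index.html (it is also available from the Tools and Resources section of VC知识库). Without a license you can still use every feature; the generated PDF documents simply carry a PDFlib watermark.
PDFlib provides language bindings for C, C++, Java, Perl, PHP, Python, Tcl, and RealBasic. All examples below use C.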
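How to use Acrobat's standard Simplified Chinese fonts
PDFlib ships with three Simplified Chinese fonts: STSong-Light, AdobeSongStd-Light-Acro, and STSongStd-Light-Acro. These three fonts are also Acrobat's standard Simplified Chinese fonts.
All three fonts support the following encodings: UniGB-UCS2-H, UniGB-UCS2-V, UniGB-UTF16-H, UniGB-UTF16-V, GB-EUC-H, GB-EUC-V, GBpc-EUC-H, GBpc-EUC-V, GBK-EUC-H, GBK-EUC-V, GBKp-EUC-H, GBKp-EUC-V, GBK2K-H, and GBK2K-V. Table 1.1 defines each encoding: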
Table 1.1
Encoding | Character set and text format |
UniGB-UCS2-H, UniGB-UCS2-V | Unicode (UCS-2) encoding for the Adobe-GB1 character collection |
UniGB-UTF16-H, UniGB-UTF16-V | Unicode (UTF-16BE) encoding for the Adobe-GB1 character collection. Contains mappings for all characters in the GB18030-2000 character set. |
GB-EUC-H, GB-EUC-V | Microsoft Code Page 936 (charset 134), GB 2312-80 character set, EUC-CN encoding |
GBpc-EUC-H, GBpc-EUC-V | Macintosh, GB 2312-80 character set, EUC-CN encoding, Script Manager code 2 |
GBK-EUC-H, GBK-EUC-V | Microsoft Code Page 936 (charset 134), GBK character set, GBK encoding |
GBKp-EUC-H, GBKp-EUC-V | Same as GBK-EUC-H, but replaces half-width Latin characters with proportional forms and maps code 0x24 to dollar ($) instead of yuan (¥). |
GBK2K-H, GBK2K-V | GB 18030-2000 character set, mixed 1-, 2-, and 4-byte encoding |
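For example, to load STSong-Light with the UniGB-UTF16-H encoding (the font name comes first, then the encoding, then the option list):
    int Font_CS;
    Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UTF16-H", "");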
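You will soon notice that fonts and encodings can be combined in a great many ways, and that PDFlib's font functions do not support every combination. The safest choice is one of the three fonts shipped with PDFlib together with one of the Unicode encodings.
Below is a C program that uses PDFlib's built-in fonts and encodings: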
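The naming pattern in Table 1.1 can be checked in code before a font is loaded. The following is a minimal sketch, not part of PDFlib: the helper names is_unicode_cmap and is_vertical_cmap are hypothetical, and the only assumption is the naming convention visible in the table (Unicode CMaps start with "UniGB-", vertical ones end in "-V").

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if `name` ends with `suffix`. */
static bool ends_with(const char *name, const char *suffix)
{
    size_t n, s;

    assert(name != NULL && suffix != NULL);
    n = strlen(name);
    s = strlen(suffix);
    return n >= s && strcmp(name + (n - s), suffix) == 0;
}

/* Hypothetical helper: true for the Unicode CMaps of Table 1.1 (UniGB-*). */
bool is_unicode_cmap(const char *encoding)
{
    assert(encoding != NULL);
    return strncmp(encoding, "UniGB-", 6) == 0;
}

/* Hypothetical helper: true for vertical-writing CMaps (names ending in -V). */
bool is_vertical_cmap(const char *encoding)
{
    return ends_with(encoding, "-V");
}
```

A suffix test is slightly stricter than searching for "-H" anywhere in the name, which matters only if an encoding name ever contained that substring in the middle.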
    /*******************************************************************/
    /* This example demonstrates the usage of PDFlib built-in fonts    */
    /* on Simplified Chinese Windows.                                  */
    /*******************************************************************/
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "pdflib.h"

    int main(void)
    {
        PDF *p = NULL;
        int i = 0, j = 0, Left = 50, Top = 800;
        int Font_E = 0, Font_CS = 0;
        /* "简体中文" (Simplified Chinese) as UTF-16LE bytes */
        char TextUnicode[] = "\x80\x7B\x53\x4F\x2D\x4E\x87\x65";
        /* The same text as code page 936 bytes */
        char TextCp936[] = "\xBC\xF2\xCC\xE5\xD6\xD0\xCE\xC4";
        char EncodingName[100];
        static const char *ChineseFont[] = {
            "STSong-Light", "AdobeSongStd-Light-Acro", "STSongStd-Light-Acro",
        };
        static const char *Encoding[] = {
            "UniGB-UCS2-H", "UniGB-UCS2-V", "UniGB-UTF16-H", "UniGB-UTF16-V",
            "GB-EUC-H", "GB-EUC-V", "GBpc-EUC-H", "GBpc-EUC-V",
            "GBK-EUC-H", "GBK-EUC-V", "GBKp-EUC-H", "GBKp-EUC-V",
            "GBK2K-H", "GBK2K-V",
        };
        const int fsize = sizeof ChineseFont / sizeof (char *);
        const int esize = sizeof Encoding / sizeof (char *);

        /* create a new PDFlib object */
        if ((p = PDF_new()) == (PDF *) 0) {
            printf("Couldn't create PDFlib object (out of memory)!\n");
            return(2);
        }

        PDF_TRY(p) {
            if (PDF_begin_document(p, "pdflib_cs1.pdf", 0, "") == -1) {
                printf("Error: %s\n", PDF_get_errmsg(p));
                return(2);
            }
            PDF_set_info(p, "Creator", "pdflib_cs1.c");
            PDF_set_info(p, "Author", "[email protected]");
            PDF_set_info(p, "Title", "Output Chinese Simplify with PDFlib builtin font");

            Font_E = PDF_load_font(p, "Helvetica-Bold", 0, "winansi", "");

            /* One page per font */
            for (i = 0; i < fsize; i++) {
                /* Start a new page. */
                Top = 800;
                PDF_begin_page_ext(p, a4_width, a4_height, "");
                PDF_setfont(p, Font_E, 24);
                PDF_show_xy(p, ChineseFont[i], Left + 50, Top);
                Top -= 30;

                /* One sample line per encoding */
                for (j = 0; j < esize; j++) {
                    Font_CS = PDF_load_font(p, ChineseFont[i], 0, Encoding[j], "");
                    PDF_setfont(p, Font_E, 12);
                    strcpy(EncodingName, "");
                    strcat(EncodingName, Encoding[j]);
                    strcat(EncodingName, ":");
                    PDF_show_xy(p, EncodingName, Left, Top);
                    PDF_setfont(p, Font_CS, 12);
                    if (strstr(Encoding[j], "-H") != NULL) {
                        /* It's horizontal encoding. */
                        Top -= 15;
                    }
                    if (strstr(Encoding[j], "UniGB") != NULL) {
                        /* It's unicode encoding. */
                        PDF_show_xy2(p, TextUnicode, 8, Left, Top);
                    } else {
                        /* It's code page 936 encoding. */
                        PDF_show_xy2(p, TextCp936, 8, Left, Top);
                    }
                    if (strstr(Encoding[j], "-H") != NULL) {
                        /* It's horizontal encoding. */
                        Top -= 25;
                    } else {
                        /* It's vertical encoding. */
                        Top -= 65;
                    }
                } /* for */

                /* End of page. */
                PDF_end_page_ext(p, "");
            } /* for */
            PDF_end_document(p, "");
        }
        PDF_CATCH(p) {
            printf("PDFlib exception occurred in pdflib_cs1 sample:\n");
            printf("[%d] %s: %s\n",
                   PDF_get_errnum(p), PDF_get_apiname(p), PDF_get_errmsg(p));
            PDF_delete(p);
            return(2);
        }
        PDF_delete(p);
        return 0;
    }
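Besides the fonts shipped with PDFlib, you can also use fonts installed on the system and other user fonts. PDFlib calls TrueType, OpenType, and PostScript fonts that are installed on Windows or Mac (that is, present in or copied into the system font directory) host fonts. PDFlib can load a host font directly by name, but the name must match exactly, including case. For example, to load the font installed on Windows as C:\WINDOWS\Fonts\SimHei.ttf:
    int Font_CS = 0;
    Font_CS = PDF_load_font(p, "SimHei", 0, "unicode", "");
Note that the font name may differ from the font file name, and the same font may even have different names on operating systems in different languages. To see a font's name on Windows, double-click the font file: the first line of the window that opens, minus the trailing "TrueType" or "OpenType", is the font name. For example, double-clicking C:\WINDOWS\Fonts\SimHei.ttf on Chinese Windows shows "黑体 TrueType" on the first line, so the font name is "黑体". To pass a multi-byte font name to PDFlib, it must be given as BOM + UTF-8. For "黑体" that is "\xEF\xBB\xBF\xE9\xBB\x91\xE4\xBD\x93".
So for the SimHei (黑体) font, on Chinese Windows we use:
    PDF_load_font(p, "\xEF\xBB\xBF\xE9\xBB\x91\xE4\xBD\x93", 0, "unicode", "");
On English Windows we should use:
    PDF_load_font(p, "SimHei", 0, "unicode", "");
(Tip: the Notepad that comes with Windows 2000/XP can produce the UTF-8 bytes. For example, type "黑体" in Notepad, save the file with UTF-8 selected in the Encoding drop-down, then open the file in a tool that can show raw bytes, such as UltraEdit, WinHex, or VC, to read off the UTF-8 string including its BOM.)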
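The BOM + UTF-8 form can also be built in code instead of being copied from a hex editor. This is a minimal sketch under the assumption that the font name is already available as a UTF-8 string; make_bom_utf8_name is a hypothetical helper, not a PDFlib function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the 3-byte UTF-8 BOM followed by utf8_name
   into out. Returns 0 on success, -1 if out is too small. */
int make_bom_utf8_name(const char *utf8_name, char *out, size_t out_size)
{
    static const char bom[] = "\xEF\xBB\xBF";
    size_t len;

    assert(utf8_name != NULL && out != NULL);
    len = strlen(utf8_name);
    if (out_size < 3 + len + 1)
        return -1;
    memcpy(out, bom, 3);
    memcpy(out + 3, utf8_name, len + 1);   /* also copies the terminating NUL */
    return 0;
}
```

Called with the UTF-8 bytes of "黑体" ("\xE9\xBB\x91\xE4\xBD\x93"), it produces the same string used in the PDF_load_font call above.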
Besides fonts installed in Windows, PDFlib can also use other user fonts, but then the path must be given. For example, to use the font C:\Program Files\Adobe\Acrobat 7.0\Resource\CIDFont\AdobeSongStd-Light.otf:
    Font_CS = PDF_load_font(p, "C:\\Program Files\\Adobe\\Acrobat 7.0\\Resource\\CIDFont\\AdobeSongStd-Light", 0, "unicode", "");
There is one exception: .ttc (TrueType Collection) fonts. A .ttc file is a collection that contains several fonts, so a font in it cannot be loaded by file name and must be loaded by its real font name. For example, we know that C:\WINDOWS\Fonts\MSGOTHIC.TTC contains three fonts named, in order, MS Gothic, MS PGothic, and MS UI Gothic. We can load them by their font names:
    int Font_E = 0;
    Font_E = PDF_load_font(p, "MS Gothic", 0, "winansi", "");    /* Use MS Gothic */
    PDF_setfont(p, Font_E, 20);
    PDF_show_xy(p, "MS Gothic font:", 50, 800);
    Font_E = PDF_load_font(p, "MS PGothic", 0, "winansi", "");   /* Use MS PGothic */
    ……
    Font_E = PDF_load_font(p, "MS UI Gothic", 0, "winansi", ""); /* Use MS UI Gothic */
Often, though, we do not know which fonts a .ttc file contains. For this case PDFlib offers another way to load them: by index. First give the font file an alias, then append a colon and a number to the alias (0 for the first font in the file, 1 for the second, and so on).
    int Font_E = 0;
    /* Give "C:\WINDOWS\Fonts\MSGOTHIC.TTC" the alias "gothic" */
    PDF_set_parameter(p, "FontOutline", "gothic=C:\\WINDOWS\\Fonts\\MSGOTHIC.TTC");
    Font_E = PDF_load_font(p, "gothic:0", 0, "winansi", ""); /* Use MS Gothic */
    Font_E = PDF_load_font(p, "gothic:1", 0, "winansi", ""); /* Use MS PGothic */
    Font_E = PDF_load_font(p, "gothic:2", 0, "winansi", ""); /* Use MS UI Gothic */
Below is a related example, a C program:
    /*******************************************************************/
    /* This example demonstrates the usage of host fonts and other     */
    /* fonts on Simplified Chinese Windows.                            */
    /*******************************************************************/
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "pdflib.h"

    int main(void)
    {
        PDF *p = NULL;
        int i = 0, j = 0, Left = 50, Top = 800;
        int Font_E = 0, Font_CS = 0;
        char fontfile[1024];
        char buf[1024];
        /* "简体中文" (Simplified Chinese) as UTF-16LE bytes */
        char TextUnicode[] = "\x80\x7B\x53\x4F\x2D\x4E\x87\x65";

        /* create a new PDFlib object */
        if ((p = PDF_new()) == (PDF *) 0) {
            printf("Couldn't create PDFlib object (out of memory)!\n");
            return(2);
        }

        PDF_TRY(p) {
            if (PDF_begin_document(p, "pdflib_cs2.pdf", 0, "") == -1) {
                printf("Error: %s\n", PDF_get_errmsg(p));
                return(2);
            }
            PDF_set_info(p, "Creator", "pdflib_cs2.c");
            PDF_set_info(p, "Author", "[email protected]");
            PDF_set_info(p, "Title", "Output Chinese Simplify with host font and others");

            /* Start a new page. */
            PDF_begin_page_ext(p, a4_width, a4_height, "");
            Font_E = PDF_load_font(p, "Helvetica-Bold", 0, "winansi", "");

            /* Using a host font: C:\WINDOWS\Fonts\SimHei.ttf.
               PDFlib expects a multi-byte font name as a BOM + UTF-8 string.
               The font name of SimHei.ttf is "黑体"; its BOM + UTF-8 string is
               "\xEF\xBB\xBF\xE9\xBB\x91\xE4\xBD\x93". */
            Font_CS = PDF_load_font(p, "\xEF\xBB\xBF\xE9\xBB\x91\xE4\xBD\x93", 0, "unicode", "");
            /* Font_CS = PDF_load_font(p, "SimHei", 0, "unicode", ""); */
            PDF_setfont(p, Font_E, 20);
            PDF_show_xy(p, "SimHei font:", Left, Top);
            PDF_setfont(p, Font_CS, 24);
            Top -= 30;
            /* Multi-byte text: pass the byte length explicitly */
            PDF_show_xy2(p, TextUnicode, 8, Left, Top);

            /* Using a disk-based font file that is not installed in the
               system directory: C:\PSFONTS\CS\gkai00mp.ttf */
            Top -= 50;
            strcpy(fontfile, "C:\\PSFONTS\\CS\\gkai00mp.ttf");
            sprintf(buf, "kai=%s", fontfile);
            /* Defines kai as alias for ..\gkai00mp.ttf */
            PDF_set_parameter(p, "FontOutline", buf);
            Font_CS = PDF_load_font(p, "kai", 0, "unicode", "");
            PDF_setfont(p, Font_E, 20);
            PDF_show_xy(p, "AR PL KaitiM GB font:", Left, Top);
            PDF_setfont(p, Font_CS, 24);
            Top -= 30;
            PDF_show_xy2(p, TextUnicode, 8, Left, Top);

            /* Using a TrueType collection font with index:
               C:\WINDOWS\Fonts\simsun.ttc */
            Top -= 50;
            strcpy(fontfile, "C:\\WINDOWS\\Fonts\\simsun.ttc");
            sprintf(buf, "simsun=%s", fontfile);
            /* Defines simsun as alias for ..\simsun.ttc.
               Declaring the alias once is sufficient to configure
               all fonts in simsun.ttc. */
            PDF_set_parameter(p, "FontOutline", buf);

            /* TTC files contain multiple separate fonts. Address the 1st font
               by appending a colon character and 0 after the alias simsun. */
            Font_CS = PDF_load_font(p, "simsun:0", 0, "unicode", "");
            PDF_setfont(p, Font_E, 20);
            PDF_show_xy(p, "simsun:0 font:", Left, Top);
            PDF_setfont(p, Font_CS, 24);
            Top -= 30;
            PDF_show_xy2(p, TextUnicode, 8, Left, Top);

            /* Address the 2nd font by appending a colon character and 1
               after the alias simsun. */
            Top -= 50;
            Font_CS = PDF_load_font(p, "simsun:1", 0, "unicode", "");
            PDF_setfont(p, Font_E, 20);
            PDF_show_xy(p, "simsun:1 font:", Left, Top);
            PDF_setfont(p, Font_CS, 24);
            Top -= 30;
            PDF_show_xy2(p, TextUnicode, 8, Left, Top);

            /* End of page. */
            PDF_end_page_ext(p, "");
            PDF_end_document(p, "");
        }
        PDF_CATCH(p) {
            printf("PDFlib exception occurred in pdflib_cs2 sample:\n");
            printf("[%d] %s: %s\n",
                   PDF_get_errnum(p), PDF_get_apiname(p), PDF_get_errmsg(p));
            PDF_delete(p);
            return(2);
        }
        PDF_delete(p);
        return 0;
    }
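The "alias=path" and "alias:index" strings used for TrueType collections can be generated with bounds checking instead of bare sprintf calls. The sketch below only formats strings; format_font_outline and format_ttc_font_name are hypothetical helper names, and rejecting an alias that contains '=' or ':' is my own precaution rather than a documented PDFlib rule.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats "alias=path" for the FontOutline parameter.
   Returns 0 on success, -1 on a bad alias or if out is too small. */
int format_font_outline(char *out, size_t out_size,
                        const char *alias, const char *path)
{
    int n;

    assert(out != NULL && alias != NULL && path != NULL);
    /* An '=' inside the alias would make the resource string ambiguous. */
    if (alias[0] == '\0' || strchr(alias, '=') != NULL)
        return -1;
    n = snprintf(out, out_size, "%s=%s", alias, path);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Hypothetical helper: formats "alias:index" to address one font in a .ttc.
   Index 0 is the first font in the collection. */
int format_ttc_font_name(char *out, size_t out_size,
                         const char *alias, int index)
{
    int n;

    assert(out != NULL && alias != NULL);
    if (index < 0 || alias[0] == '\0' || strchr(alias, ':') != NULL)
        return -1;
    n = snprintf(out, out_size, "%s:%d", alias, index);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The first result would be passed to PDF_set_parameter(p, "FontOutline", ...) and the second to PDF_load_font, exactly as the sample program does with its hand-built strings.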
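1. PDF_show
void PDF_show(PDF *p, const char *text)
void PDF_show2(PDF *p, const char *text, int len)
Outputs text at the current position in the current font and font size.
PDF_show assumes the string is terminated by a null character. If the string may contain null bytes (as multi-byte strings can), use PDF_show2.
2. PDF_show_xy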
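void PDF_show_xy(PDF *p, const char *text, double x, double y)
void PDF_show_xy2(PDF *p, const char *text, int len, double x, double y)
Outputs text at the given position in the current font and font size.
PDF_show_xy assumes the string is terminated by a null character. If the string may contain null bytes (as multi-byte strings can), use PDF_show_xy2.
3. PDF_continue_text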
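void PDF_continue_text(PDF *p, const char *text)
void PDF_continue_text2(PDF *p, const char *text, int len)
Outputs text on the next line in the current font and font size.
PDF_continue_text assumes the string is terminated by a null character. If the string may contain null bytes (as multi-byte strings can), use PDF_continue_text2.
4. PDF_fit_textline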
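void PDF_fit_textline(PDF *p, const char *text, int len, double x, double y, const char *optlist)
Outputs one line of text at the given position according to the options.
If the string is null-terminated, pass 0 for len; otherwise pass the exact number of bytes.
5. PDF_fit_textflow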
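int PDF_create_textflow(PDF *p, const char *text, int len, const char *optlist)
Creates a textflow object and preprocesses the text in preparation for the formatting step below.
If the string is null-terminated, pass 0 for len; otherwise pass the exact number of bytes.
const char *PDF_fit_textflow(PDF *p, int textflow, double llx, double lly, double urx, double ury, const char *optlist)
Outputs the text into the given rectangle.
llx, lly, urx, and ury are the x and y coordinates of the rectangle's lower-left and upper-right corners.
void PDF_delete_textflow(PDF *p, int textflow)
Deletes the textflow object and its related data structures.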
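Because PDF_fit_textflow takes the lower-left and upper-right corners while page layout code often tracks a "top" position that moves downward, a small conversion helper can avoid mixing up the four values. This is a sketch only: the FitBox type and fitbox_from_top_left are hypothetical names, and it assumes the default PDF coordinate system in which y grows upward.

```c
#include <assert.h>

/* Hypothetical value type holding the four PDF_fit_textflow coordinates. */
typedef struct {
    double llx, lly;   /* lower-left corner  */
    double urx, ury;   /* upper-right corner */
} FitBox;

/* Builds a box from its top-left corner, width, and height.
   Assumes y grows upward, so the bottom edge is top - height. */
FitBox fitbox_from_top_left(double left, double top, double width, double height)
{
    FitBox box;

    assert(width >= 0.0 && height >= 0.0);
    box.llx = left;
    box.lly = top - height;
    box.urx = left + width;
    box.ury = top;
    return box;
}
```

For a 60-point-high box whose top edge is at y = 800 and which spans x = 50 to 545, the helper yields lly = 740 and ury = 800, which can be passed to PDF_fit_textflow in that order.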
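Summary: the functions in groups 1, 2, and 3 are concise, intuitive, and easy to use. The functions in groups 4 and 5 produce more flexible text layouts through their option lists. Group 5 in particular is designed for multi-line text, with options that control alignment, character spacing, border display, rotation, and so on. Groups 4 and 5 have one limitation, however: for multi-byte strings they can only handle the Unicode encodings. In other words, they do not support cp936 encoding.
Below is a related example, a C program (the downloadable source code includes the generated PDF file, PDFlib_cs3.pdf).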
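The reason the length-taking variants (PDF_show2, PDF_show_xy2, PDF_continue_text2) exist can be seen without PDFlib at all: UTF-16 text for ASCII-range characters contains zero bytes, so strlen stops early. The sketch below uses only the C standard library; has_embedded_nul is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns nonzero if the first len bytes of text
   contain a zero byte, in which case a null-terminated API would
   silently truncate the string. */
int has_embedded_nul(const char *text, size_t len)
{
    assert(text != NULL);
    return memchr(text, '\0', len) != NULL;
}
```

For example, "AB" in UTF-16LE is the four bytes 41 00 42 00, and strlen on that buffer reports 1. The sample text "简体中文" happens to contain no zero bytes in UTF-16, so it would survive a null-terminated call, but that is an accident of the data and not something to rely on.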
    /*******************************************************************/
    /* This example demonstrates different ways to output Simplified   */
    /* Chinese text on Simplified Chinese Windows.                     */
    /*******************************************************************/
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "pdflib.h"

    int main(void)
    {
        PDF *p = NULL;
        int i = 0, j = 0, Left = 50, Top = 800, Right = 545;
        int Font_E = 0, Font_CS = 0, Font_CS2 = 0, TextFlow = 0;
        /* "简体中文" (Simplified Chinese) as UTF-16LE bytes */
        char TextUnicode[] = "\x80\x7B\x53\x4F\x2D\x4E\x87\x65";
        /* The same text as code page 936 bytes */
        char TextCp936[] = "\xBC\xF2\xCC\xE5\xD6\xD0\xCE\xC4";

        /* create a new PDFlib object */
        if ((p = PDF_new()) == (PDF *) 0) {
            printf("Couldn't create PDFlib object (out of memory)!\n");
            return(2);
        }

        PDF_TRY(p) {
            if (PDF_begin_document(p, "pdflib_cs3.pdf", 0, "") == -1) {
                printf("Error: %s\n", PDF_get_errmsg(p));
                return(2);
            }
            PDF_set_info(p, "Creator", "pdflib_cs3.c");
            PDF_set_info(p, "Author", "[email protected]");
            PDF_set_info(p, "Title", "Different Ways To Output Chinese Simplify");

            /* Start a new page. */
            PDF_begin_page_ext(p, a4_width, a4_height, "");
            Font_E = PDF_load_font(p, "Helvetica-Bold", 0, "winansi", "");
            Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UCS2-H", "");
            Font_CS2 = PDF_load_font(p, "STSong-Light", 0, "GB-EUC-H", "");

            /* Using PDF_set_text_pos and PDF_show functions. */
            PDF_setfont(p, Font_E, 20);
            PDF_set_text_pos(p, Left, Top);
            PDF_show(p, "Using PDF_set_text_pos and PDF_show to output text:");
            Top -= 30;
            PDF_set_text_pos(p, Left+20, Top);
            PDF_show(p, "UniGB-UCS2-H encoding:");
            PDF_setfont(p, Font_CS, 24);
            Top -= 30;
            PDF_set_text_pos(p, Left+20, Top);
            PDF_show2(p, TextUnicode, 8);
            Top -= 30;
            PDF_setfont(p, Font_E, 20);
            PDF_set_text_pos(p, Left+20, Top);
            PDF_show(p, "GB-EUC-H encoding:");
            PDF_setfont(p, Font_CS2, 24);
            Top -= 30;
            PDF_set_text_pos(p, Left+20, Top);
            PDF_show2(p, TextCp936, 8);

            /* Using PDF_show_xy function. */
            Top -= 50;
            PDF_setfont(p, Font_E, 20);
            PDF_show_xy(p, "Using PDF_show_xy to output text:", Left, Top);
            Top -= 30;
            PDF_show_xy(p, "UniGB-UCS2-H encoding:", Left+20, Top);
            PDF_setfont(p, Font_CS, 24);
            Top -= 30;
            PDF_show_xy2(p, TextUnicode, 8, Left+20, Top);
            Top -= 30;
            PDF_setfont(p, Font_E, 20);
            PDF_show_xy(p, "GB-EUC-H encoding:", Left+20, Top);
            Top -= 30;
            PDF_setfont(p, Font_CS2, 24);
            PDF_show_xy2(p, TextCp936, 8, Left+20, Top);

            /* Using PDF_continue_text function. */
            Top -= 30;
            PDF_setfont(p, Font_E, 20);
            PDF_set_text_pos(p, Left, Top);
            PDF_continue_text(p, "Using PDF_continue_text to output text:");
            Top -= 30;
            PDF_set_text_pos(p, Left+20, Top);
            PDF_continue_text(p, "UniGB-UCS2-H encoding:");
            PDF_setfont(p, Font_CS, 24);
            PDF_continue_text2(p, TextUnicode, 8);
            PDF_setfont(p, Font_E, 20);
            PDF_continue_text(p, "GB-EUC-H encoding:");
            PDF_setfont(p, Font_CS2, 24);
            PDF_continue_text2(p, TextCp936, 8);

            /* Using PDF_fit_textline function. */
            Top -= 140;
            PDF_setfont(p, Font_E, 20);
            PDF_fit_textline(p, "Using PDF_fit_textline to output text:", 0, Left, Top, "");
            Top -= 30;
            PDF_fit_textline(p, "UniGB-UCS2-H encoding:", 0, Left+20, Top, "");
            PDF_setfont(p, Font_CS, 24);
            Top -= 30;
            PDF_fit_textline(p, TextUnicode, 8, Left+20, Top, "");

            /* Using PDF_create_textflow, PDF_fit_textflow and
               PDF_delete_textflow functions. */
            Top -= 30;
            PDF_setfont(p, Font_E, 20);
            TextFlow = PDF_create_textflow(p,
                "Using PDF_create_textflow, PDF_fit_textflow and PDF_delete_textflow to output text:",
                0, "fontname=Helvetica-Bold fontsize=20 encoding=winansi");
            PDF_fit_textflow(p, TextFlow, Left, Top, Right, Top-60, "");
            Top -= 60;
            TextFlow = PDF_create_textflow(p, "UniGB-UCS2-H encoding:", 0,
                "fontname=Helvetica-Bold fontsize=20 encoding=winansi");
            PDF_fit_textflow(p, TextFlow, Left+20, Top, Right, Top-30, "");
            Top -= 30;
            TextFlow = PDF_create_textflow(p, TextUnicode, 8,
                "fontname=STSong-Light fontsize=24 encoding=UniGB-UCS2-H textlen=8");
            PDF_fit_textflow(p, TextFlow, Left+20, Top, Right, Top-30, "");
            PDF_delete_textflow(p, TextFlow);

            /* End of page. */
            PDF_end_page_ext(p, "");
            PDF_end_document(p, "");
        }
        PDF_CATCH(p) {
            printf("PDFlib exception occurred in pdflib_cs3 sample:\n");
            printf("[%d] %s: %s\n",
                   PDF_get_errnum(p), PDF_get_apiname(p), PDF_get_errmsg(p));
            PDF_delete(p);
            return(2);
        }
        PDF_delete(p);
        return 0;
    }
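PDFlib's textformat parameter sets the format of the input text. Its valid values are:
bytes: each byte in the string corresponds to one character. Mainly used with 8-bit encodings.
utf8: the string is UTF-8 encoded.
ebcdicutf8: the string is EBCDIC-coded UTF-8. Only used on IBM iSeries and zSeries.
utf16: the string is UTF-16 encoded. If the string starts with a Unicode byte order mark (BOM), PDFlib reads the BOM and then removes it from the start of the string. If the string has no BOM, its byte order is taken from the host's byte order. Intel x86 systems are little-endian (BOM bytes 0xFF 0xFE), while Sparc and PowerPC systems are big-endian (BOM bytes 0xFE 0xFF).
utf16be: the string is UTF-16 in big-endian byte order. The BOM gets no special treatment.
utf16le: the string is UTF-16 in little-endian byte order. The BOM gets no special treatment.
auto: for 8-bit encodings this is equivalent to "bytes"; for wide-character strings (Unicode, glyphid, UCS2 or UTF16 CMaps) it is equivalent to "utf16".
Programming languages that handle Unicode strings automatically are called Unicode-capable: COM, .NET, Java, REALbasic, Tcl, and so on. Languages that need special handling for Unicode strings are called non-Unicode-capable: C, C++, Cobol, Perl, PHP, Python, RPG, and so on.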
In non-Unicode-capable languages, the "auto" setting handles most text strings correctly.
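When a caller prefers to set textformat explicitly rather than rely on "auto", the BOM can be inspected first. The following is a caller-side sketch, not a description of PDFlib's internals; textformat_from_bom is a hypothetical helper. It reports the BOM length separately because, as the list above says, utf16be and utf16le give the BOM no special treatment, so the caller would need to skip those bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the textformat keyword implied by a leading
   BOM ("utf8", "utf16be", "utf16le"), or "auto" if there is none.
   *bom_len receives the number of BOM bytes found (0, 2, or 3). */
const char *textformat_from_bom(const unsigned char *s, size_t len,
                                size_t *bom_len)
{
    assert(s != NULL && bom_len != NULL);
    if (len >= 3 && memcmp(s, "\xEF\xBB\xBF", 3) == 0) {
        *bom_len = 3;
        return "utf8";
    }
    if (len >= 2 && memcmp(s, "\xFE\xFF", 2) == 0) {
        *bom_len = 2;
        return "utf16be";
    }
    if (len >= 2 && memcmp(s, "\xFF\xFE", 2) == 0) {
        *bom_len = 2;
        return "utf16le";
    }
    *bom_len = 0;
    return "auto";
}
```

The returned keyword could be passed to PDF_set_parameter(p, "textformat", ...), with the text pointer advanced by bom_len bytes and the length reduced accordingly.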
For Unicode-capable languages the default value of textformat is "utf16"; for non-Unicode-capable languages the default is "auto".
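Since the sample programs hard-code their UTF-16 text as byte strings, it may help to see how those bytes relate to the code points. This sketch converts 16-bit code units into bytes in either byte order; utf16_to_bytes is a hypothetical helper, and it handles only code units as given (it does not build surrogate pairs for characters outside the Basic Multilingual Plane).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes count UTF-16 code units into out as bytes,
   big-endian if big_endian is nonzero, little-endian otherwise.
   out must have room for 2 * count bytes. Returns the bytes written. */
size_t utf16_to_bytes(const unsigned short *units, size_t count,
                      int big_endian, unsigned char *out)
{
    size_t i;

    assert(units != NULL && out != NULL);
    for (i = 0; i < count; i++) {
        unsigned char hi = (unsigned char)((units[i] >> 8) & 0xFF);
        unsigned char lo = (unsigned char)(units[i] & 0xFF);

        out[2 * i]     = big_endian ? hi : lo;
        out[2 * i + 1] = big_endian ? lo : hi;
    }
    return 2 * count;
}
```

For the code units 0x7B80, 0x4F53, 0x4E2D, 0x6587 ("简体中文"), the little-endian output is 80 7B 53 4F 2D 4E 87 65, which is the TextUnicode string used throughout the samples; the big-endian output is 7B 80 4F 53 4E 2D 65 87.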
In addition, PDFlib supports the character references commonly used in SGML and HTML. This requires setting the charref parameter to true and textformat to "bytes":
    PDF_set_parameter(p, "charref", "true");
    PDF_set_parameter(p, "textformat", "bytes");
Some valid character references:
    &#173;     soft hyphen
    &#xAD;     soft hyphen
    &shy;      soft hyphen
    &#x20AC;   Euro glyph (hexadecimal)
    &#8364;    Euro glyph (decimal)
    &euro;     Euro glyph (entity name)
    &lt;       less than sign
    &gt;       greater than sign
    &amp;      ampersand sign
    &Alpha;    Greek Alpha
Below is a related example, a C program (the generated PDF file, PDFlib_cs4.pdf, is attached).
    /*******************************************************************/
    /* This example demonstrates output of Simplified Chinese text     */
    /* with different 'textformat' options on Simplified Chinese       */
    /* Windows.                                                        */
    /*******************************************************************/
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "pdflib.h"

    int main(void)
    {
        PDF *p = NULL;
        int Font_E = 0, Font_H = 0, Font_CS = 0, Left = 50, y = 800, i = 0;
        const int INCRY = 25;
        char text[128], buf[128];

        /* 1 byte text (English: "Simplified Chinese") */
        static const char byte_text[] =
            "\123\151\155\160\154\151\146\151\145\144\040\103\150\151\156\145\163\145";
        static const int byte_len = 18;
        /* Same text as a byte array without a terminating null character */
        static const char byte2_text[] = {
            0x53,0x69,0x6D,0x70,0x6C,0x69,0x66,0x69,0x65,
            0x64,0x20,0x43,0x68,0x69,0x6E,0x65,0x73,0x65};
        static const int byte2_len = 18;

        /* 2 byte text (Simplified Chinese) */
        static const unsigned short utf16_text[] = {0x7B80,0x4F53,0x4E2D,0x6587};
        static const int utf16_len = 8;
        static const unsigned char utf16be_text[] =
            "\173\200\117\123\116\055\145\207";
        static const int utf16be_len = 8;
        static const unsigned char utf16be_bom_text[] =
            "\376\377\173\200\117\123\116\055\145\207";
        static const int utf16be_bom_len = 10;
        static const unsigned char utf16le_text[] =
            "\200\173\123\117\055\116\207\145";
        static const int utf16le_len = 8;
        static const unsigned char utf16le_bom_text[] =
            "\377\376\200\173\123\117\055\116\207\145";
        static const int utf16le_bom_len = 10;
        static const unsigned char utf8_text[] =
            "\347\256\200\344\275\223\344\270\255\346\226\207";
        static const int utf8_len = 12;
        static const unsigned char utf8_bom_text[] =
            "\xEF\xBB\xBF\xE7\xAE\x80\xE4\xBD\x93\xE4\xB8\xAD\xE6\x96\x87";
        static const int utf8_bom_len = 15;
        /* The same four characters written as character references */
        static const char htmlutf16_text[] = "&#x7B80;&#x4F53;&#x4E2D;&#x6587;";
        static const int htmlutf16_len = sizeof(htmlutf16_text) - 1;

        typedef struct {
            char *textformat;
            char *encname;
            const char *textstring;
            const int *textlength;
            const char *bomkind;
        } TestCase;

        static const TestCase table_8[] = {
            { "bytes", "winansi", (const char *)byte_text, &byte_len, ""},
            { "auto", "winansi", (const char *)byte_text, &byte_len, ""},
            { "bytes", "winansi", (const char *)byte2_text, &byte2_len, ""},
        };
        static const TestCase table_16[] = {
            { "auto", "unicode", (const char *)utf16_text, &utf16_len, ""},
            { "utf16", "unicode", (const char *)utf16_text, &utf16_len, ""},
            { "auto", "unicode", (const char *)utf16be_bom_text, &utf16be_bom_len, ", UTF-16+BE-BOM"},
            { "auto", "unicode", (const char *)utf16le_bom_text, &utf16le_bom_len, ", UTF-16+LE-BOM"},
            { "utf16be", "unicode", (const char *)utf16be_text, &utf16be_len, ""},
            { "utf16le", "unicode", (const char *)utf16le_text, &utf16le_len, ""},
            { "utf8", "unicode", (const char *)utf8_text, &utf8_len, ""},
            { "auto", "unicode", (const char *)utf8_bom_text, &utf8_bom_len, ", UTF-8+BOM"},
            { "bytes", "unicode", (const char *)htmlutf16_text, &htmlutf16_len, ", HTML unicode character"},
        };
        const int tsize_8 = sizeof table_8 / sizeof (TestCase);
        const int tsize_16 = sizeof table_16 / sizeof (TestCase);

        /* create a new PDFlib object */
        if ((p = PDF_new()) == (PDF *) 0) {
            printf("Couldn't create PDFlib object (out of memory)!\n");
            return(2);
        }

        PDF_TRY(p) {
            if (PDF_begin_document(p, "pdflib_cs4.pdf", 0, "") == -1) {
                printf("Error: %s\n", PDF_get_errmsg(p));
                return(2);
            }
            PDF_set_info(p, "Creator", "pdflib_cs4.c");
            PDF_set_info(p, "Author", "[email protected]");
            PDF_set_info(p, "Title", "Output Chinese Simplify with Different textformat");

            /* Start a new page. */
            PDF_begin_page_ext(p, a4_width, a4_height, "");
            Font_H = PDF_load_font(p, "Helvetica-Bold", 0, "winansi", "");

            /* 8-bit encoding */
            Font_E = PDF_load_font(p, "Times", 0, "winansi", "");
            PDF_setfont(p, Font_H, 24);
            PDF_show_xy(p, "8-bit encoding", Left+40, y);
            y -= 2*INCRY;
            for (i = 0; i < tsize_8; ++i) {
                PDF_setfont(p, Font_H, 14);
                sprintf(text, "%s encoding, %s textformat %s: ",
                        table_8[i].encname, table_8[i].textformat, table_8[i].bomkind);
                PDF_show_xy(p, text, Left, y);
                y -= INCRY;
                PDF_set_parameter(p, "textformat", table_8[i].textformat);
                PDF_setfont(p, Font_E, 14);
                /* byte2_text has no terminating null, so pass the length */
                PDF_show_xy2(p, table_8[i].textstring, *table_8[i].textlength, Left, y);
                y -= INCRY;
            } /* for */

            /* 16-bit encoding */
            PDF_setfont(p, Font_H, 24);
            y -= 2*INCRY;
            PDF_show_xy(p, "16-bit encoding", Left+40, y);
            y -= 2*INCRY;
            PDF_set_parameter(p, "charref", "true");
            Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UCS2-H", "");
            for (i = 0; i < tsize_16; i++) {
                PDF_setfont(p, Font_H, 14);
                sprintf(text, "%s encoding, %s textformat %s: ",
                        table_16[i].encname, table_16[i].textformat, table_16[i].bomkind);
                PDF_show_xy(p, text, Left, y);
                y -= INCRY;
                PDF_setfont(p, Font_CS, 14);
                sprintf(buf, "textformat %s", table_16[i].textformat);
                PDF_fit_textline(p, table_16[i].textstring, *table_16[i].textlength,
                                 Left, y, buf);
                y -= INCRY;
            } /* for */

            /* End of page. */
            PDF_end_page_ext(p, "");
            PDF_end_document(p, "");
        }
        PDF_CATCH(p) {
            printf("PDFlib exception occurred in pdflib_cs4 sample:\n");
            printf("[%d] %s: %s\n",
                   PDF_get_errnum(p), PDF_get_apiname(p), PDF_get_errmsg(p));
            PDF_delete(p);
            return(2);
        }
        PDF_delete(p);
        return 0;
    }
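To make the numeric character reference syntax from the charref discussion concrete, here is a small stand-alone parser. It is only an illustration of the "&#NNN;" and "&#xHHH;" forms and says nothing about how PDFlib resolves references internally; parse_numeric_charref is a hypothetical helper, and named entities such as "&euro;" are deliberately out of scope.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: parses a numeric character reference at the start
   of s. On success returns the code point and stores the number of bytes
   consumed in *consumed; returns -1 if s does not start with a valid
   numeric reference. */
long parse_numeric_charref(const char *s, size_t *consumed)
{
    const char *digits;
    long value = 0;
    size_t n = 0;
    int base = 10;

    assert(s != NULL && consumed != NULL);
    if (strncmp(s, "&#", 2) != 0)
        return -1;
    digits = s + 2;
    if (*digits == 'x' || *digits == 'X') {
        base = 16;
        digits++;
    }
    for (;;) {
        int c = (unsigned char)digits[n];
        int d;

        if (c >= '0' && c <= '9')
            d = c - '0';
        else if (base == 16 && c >= 'a' && c <= 'f')
            d = c - 'a' + 10;
        else if (base == 16 && c >= 'A' && c <= 'F')
            d = c - 'A' + 10;
        else
            break;
        value = value * base + d;
        if (value > 0x10FFFF)      /* beyond the Unicode range */
            return -1;
        n++;
    }
    if (n == 0 || digits[n] != ';')
        return -1;
    *consumed = (size_t)(digits - s) + n + 1;   /* include the ';' */
    return value;
}
```

Both "&#x20AC;" and "&#8364;" parse to the same code point, U+20AC, which is why the list shows them as two spellings of the Euro glyph.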
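In general, every base font has additional fonts that vary its glyph style. The font Arial, for example, has the additional fonts Arial Bold, Arial Italic, and Arial Bold Italic. You can usually find or buy the matching additional fonts.
Sometimes, though, you need something quickly, or the exact glyph shapes are not critical. In that case you can use artificial font styles. Artificial font styles are an Acrobat feature that simulates bold, italic, and bold-italic from the base glyphs. PDFlib supports this feature and follows Acrobat's restrictions on it. At present the feature is limited to:
1. Acrobat standard fonts. For Simplified Chinese these are the three fonts shipped with PDFlib: STSong-Light, AdobeSongStd-Light-Acro, and STSongStd-Light-Acro.
2. .otf OpenType fonts that PDFlib can access, used with one of the encodings in Table 1.1 (see 《浅谈PDFlib中文输出(一)》) and with the "embedding" parameter set to false.
Below is a related example, a C program (the generated PDF file, PDFlib_cs5.pdf, is attached).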
    /*******************************************************************/
    /* This example demonstrates the usage of artificial font styles   */
    /* on Simplified Chinese Windows.                                  */
    /*******************************************************************/
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "pdflib.h"

    int main(void)
    {
        PDF *p = NULL;
        int Font_H = 0, Font_CS = 0, Left = 50, y = 700;
        const int INCRY = 25;
        /* "简体中文" (Simplified Chinese) as UTF-16LE bytes */
        const char TextUnicode[] = "\x80\x7B\x53\x4F\x2D\x4E\x87\x65";
        const int TEXTLEN = 8;

        /* create a new PDFlib object */
        if ((p = PDF_new()) == (PDF *) 0) {
            printf("Couldn't create PDFlib object (out of memory)!\n");
            return(2);
        }

        PDF_TRY(p) {
            if (PDF_begin_document(p, "pdflib_cs5.pdf", 0, "") == -1) {
                printf("Error: %s\n", PDF_get_errmsg(p));
                return(2);
            }
            PDF_set_info(p, "Creator", "pdflib_cs5.c");
            PDF_set_info(p, "Author", "[email protected]");
            PDF_set_info(p, "Title", "Usage of Artificial font styles");

            /* Start a new page. */
            PDF_begin_page_ext(p, a4_width, a4_height, "");
            Font_H = PDF_load_font(p, "Helvetica-Bold", 0, "winansi", "");
            PDF_setfont(p, Font_H, 24);
            PDF_show_xy(p, "Artificial Font Styles", Left + 100, y);

            /* Normal */
            y -= 2 * INCRY;
            PDF_setfont(p, Font_H, 14);
            PDF_show_xy(p, "Normal", Left, y);
            y -= INCRY;
            Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UCS2-H", "");
            PDF_setfont(p, Font_CS, 14);
            PDF_show_xy2(p, TextUnicode, TEXTLEN, Left, y);

            /* Italic */
            y -= 2 * INCRY;
            PDF_setfont(p, Font_H, 14);
            PDF_show_xy(p, "Italic", Left, y);
            y -= INCRY;
            Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UCS2-H", "fontstyle italic");
            PDF_setfont(p, Font_CS, 14);
            PDF_show_xy2(p, TextUnicode, TEXTLEN, Left, y);

            /* Bold */
            y -= 2 * INCRY;
            PDF_setfont(p, Font_H, 14);
            PDF_show_xy(p, "Bold", Left, y);
            y -= INCRY;
            Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UCS2-H", "fontstyle bold");
            PDF_setfont(p, Font_CS, 14);
            PDF_show_xy2(p, TextUnicode, TEXTLEN, Left, y);

            /* Bold-italic */
            y -= 2 * INCRY;
            PDF_setfont(p, Font_H, 14);
            PDF_show_xy(p, "Bold-italic", Left, y);
            y -= INCRY;
            Font_CS = PDF_load_font(p, "STSong-Light", 0, "UniGB-UCS2-H", "fontstyle bolditalic");
            PDF_setfont(p, Font_CS, 14);
            PDF_show_xy2(p, TextUnicode, TEXTLEN, Left, y);

            /* End of page. */
            PDF_end_page_ext(p, "");
            PDF_end_document(p, "");
        }
        PDF_CATCH(p) {
            printf("PDFlib exception occurred in pdflib_cs5 sample:\n");
            printf("[%d] %s: %s\n",
                   PDF_get_errnum(p), PDF_get_apiname(p), PDF_get_errmsg(p));
            PDF_delete(p);
            return(2);
        }
        PDF_delete(p);
        return 0;
    }