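Today I helped a colleague track down a file-reading bug, and it ate up the whole day.
My colleague generated a file with top -n 1 > top.txt and read it with a program written in C. Opening and reading the file both worked, and the output looked normal in the terminal. But once the program was compiled as a CGI and the file was accessed through a web page, the output turned into garbage.
Normal display (in the terminal):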
top - 16:21:39 up 7:09, 3 users, load average: 1.03, 0.97, 0.95
Tasks: 196 total, 2 running, 194 sleeping, 0 stopped, 0 zombie
Cpu(s): 6.4%us, 2.1%sy, 0.0%ni, 91.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2026356k total, 1491224k used, 535132k free, 217192k buffers
Swap: 4063228k total, 0k used, 4063228k free, 770236k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2603 root 20 0 658m 205m 33m R 8.0 10.4 25:39.29 firefox
1842 root 20 0 92460 35m 12m S 4.7 1.8 5:56.84 Xorg
14071 root 20 0 99.8m 30m 13m S 3.7 1.6 0:23.20 npviewer.bin
When accessed through the web page, this is what showed up:
[H[2J(B[mtop - 10:12:00 up 59 min, 2 users, load average: 0.70, 0.67, 0.75(B[m[39;49m[K Tasks:(B[m[39;49m(B[m 194 (B[m[39;49mtotal,(B[m[39;49m(B[m 1 (B[m[39;49mrunning,(B[m[39;49m(B[m 193 (B[m[39;49msleeping,(B[m[39;49m(B[m 0 (B[m[39;49mstopped,(B[m[39;49m(B[m 0 (B[m[39;49mzombie(B[m[39;49m[K Cpu(s):(B[m[39;49m(B[m 4.3%(B[m[39;49mus,(B[m[39;49m(B[m 1.2%(B[m[39;49msy,(B[m[39;49m(B[m 0.0%(B[m[39;49mni,(B[m[39;49m(B[m 93.1%(B[m[39;49mid,(B[m[39;49m(B[m 1.3%(B[m[39;49mwa,(B[m[39;49m(B[m 0.0%(B[m[39;49mhi,(B[m[39;49m(B[m 0.0%(B[m[39;49msi,(B[m[39;49m(B[m 0.0%(B[m[39;49mst(B[m[39;49m[K Mem: (B[m[39;49m(B[m 2026356k (B[m[39;49mtotal,(B[m[39;49m(B[m 910236k (B[m[39;49mused,(B[m[39;49m(B[m 1116120k (B[m[39;49mfree,(B[m[39;49m(B[m 53032k (B[m[39;49mbuffers(B[m[39;49m[K Swap:(B[m[39;49m(B[m 4063228k (B[m[39;49mtotal,(B[m[39;49m(B[m 0k (B[m[39;49mused,(B[m[39;49m(B[m 4063228k (B[m[39;49mfree,(B[m[39;49m(B[m 527544k (B[m[39;49mcached(B[m[39;49m[K [6;1H [7m PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND (B[m[39;49m[K (B[m 1842 root 20 0 73104 29m 10m S 3.8 1.5 1:09.30 Xorg (B[m[39;49m (B[m 3136 root 20 0 73112 14m 10m S 1.9 0.7 0:04.43 npviewer.bin (B[m[39;49m (B[m 1 root 20 0 2876 1364 1152 S 0.0 0.1 0:00.98 init (B[m[39;49m (B[m 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd (B[m[39;49m (B[m 3 root 20 0 0 0 0 S 0.0 0.0 0:00.05 ksoftirqd/0 (B[m[39;49m (B[m 4 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0 (B[m[39;49m (B[m 5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0 (B[m[39;49m (B[m 6 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/1 (B[m[39;49m
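Looking at this, I noticed that the garbage almost always sat where the lines break, so my first thought was a newline problem: a web page breaks lines with <br>, whereas C uses \n. But I ruled that out after looking up fgets(). fgets() reads data from the file pointer stream, one line at a time. The data is stored in the character array pointed to by buf, at most bufsize-1 characters per call (a terminating '\0' is stored right after the last character read), and if the line in the file is shorter than that, reading stops at the end of the line. So when fgets() hits a newline, the newline is read into the buffer as well.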
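The fgets() behavior described above can be checked with a small sketch. The helper name first_line_keeps_newline and its return convention are made up for this illustration; only fopen(), fgets(), fclose() and strlen() come from the C standard library.

```c
#include <stdio.h>
#include <string.h>

/* Reads the first line of `path` into buf (at most bufsize-1 characters,
 * followed by '\0').
 * Returns 1 if the stored text ends with '\n', 0 if it does not
 * (the line was longer than the buffer, or the file has no final newline),
 * and -1 if the file could not be opened or nothing could be read. */
int first_line_keeps_newline(const char *path, char *buf, int bufsize)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    char *got = fgets(buf, bufsize, fp);
    fclose(fp);
    if (got == NULL)
        return -1;

    size_t len = strlen(buf);
    return (len > 0 && buf[len - 1] == '\n') ? 1 : 0;
}
```

With a buffer large enough for the whole line, the '\n' ends up in buf, so the newline itself reaches the CGI output untouched; with a buffer that is too small, fgets() stops early and the rest of the line comes back on the next call.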
Then, more or less by accident, I opened top.txt itself and found that the file looked like this:
[H[2J(B[mtop - 10:12:00 up 59 min, 2 users, load average: 0.70, 0.67, 0.75(B[m[39;49m[K
Tasks:(B[m[39;49m(B[m 194 (B[m[39;49mtotal,(B[m[39;49m(B[m 1 (B[m[39;49mrunning,(B[m[39;49m(B[m 193 (B[m[39;49msleeping,(B[m[39;49m(B[m 0 (B[m[39;49mstopped,(B[m[39;49m(B[m 0 (B[m[39;49mzombie(B[m[39;49m[K
Cpu(s):(B[m[39;49m(B[m 4.3%(B[m[39;49mus,(B[m[39;49m(B[m 1.2%(B[m[39;49msy,(B[m[39;49m(B[m 0.0%(B[m[39;49mni,(B[m[39;49m(B[m 93.1%(B[m[39;49mid,(B[m[39;49m(B[m 1.3%(B[m[39;49mwa,(B[m[39;49m(B[m 0.0%(B[m[39;49mhi,(B[m[39;49m(B[m 0.0%(B[m[39;49msi,(B[m[39;49m(B[m 0.0%(B[m[39;49mst(B[m[39;49m[K
Mem: (B[m[39;49m(B[m 2026356k (B[m[39;49mtotal,(B[m[39;49m(B[m 910236k (B[m[39;49mused,(B[m[39;49m(B[m 1116120k (B[m[39;49mfree,(B[m[39;49m(B[m 53032k (B[m[39;49mbuffers(B[m[39;49m[K
Swap:(B[m[39;49m(B[m 4063228k (B[m[39;49mtotal,(B[m[39;49m(B[m 0k (B[m[39;49mused,(B[m[39;49m(B[m 4063228k (B[m[39;49mfree,(B[m[39;49m(B[m 527544k (B[m[39;49mcached(B[m[39;49m[K
[6;1H
[7m PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND (B[m[39;49m[K
(B[m 1842 root 20 0 73104 29m 10m S 3.8 1.5 1:09.30 Xorg (B[m[39;49m
(B[m 3136 root 20 0 73112 14m 10m S 1.9 0.7 0:04.43 npviewer.bin (B[m[39;49m
(B[m 1 root 20 0 2876 1364 1152 S 0.0 0.1 0:00.98 init (B[m[39;49m
(B[m 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd (B[m[39;49m
(B[m 3 root 20 0 0 0 0 S 0.0 0.0 0:00.05 ksoftirqd/0 (B[m[39;49m
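Oh, come on. The file itself already contains the garbage, so of course reading it back gives garbage too! Was this some kind of joke?
But then why did this file display normally in the terminal, with no garbage at all?
The most likely explanation, I figured, was that the terminal's encoding and the file's encoding were different. So I looked up every encoding I could find and tried them all one by one, and it still didn't work...
Then I went back to the terminal. vi top.txt: garbage. cat top.txt: displays fine. What on earth? Garbage in vi but fine with cat? Back to searching Baidu.
I found someone else who was just as stuck on this problem as I was. After filtering out floor after floor of wrong answers in the thread, I finally found the heart of the matter tucked away in a corner:
Those aren't garbled characters; they are terminal control characters that control the display format. To redirect top's output, add the -b option.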
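This also explains the vi versus cat difference: cat sends the raw bytes to the terminal, which interprets the control sequences, while vi and a browser show them as text. If you are stuck with a file that already contains such sequences, one option is to filter them out before printing. The following is only a minimal sketch, assuming the file contains the two kinds of sequences visible in the dump above (ESC [ ... and ESC ( B, where the ESC byte itself is invisible in the dump); the function name strip_ansi is made up for this example, and it is not a complete terminal-sequence parser.

```c
#include <stddef.h>

/* Copies `in` to `out` (capacity outsz, NUL-terminated when outsz > 0),
 * dropping:
 *   - CSI sequences: ESC '[', then parameter/intermediate bytes
 *     (0x20..0x3F), then one final byte in 0x40..0x7E such as 'm' or 'K'
 *   - charset designations such as ESC '(' 'B'
 * For any other escape sequence only the ESC byte is dropped.
 * Returns the number of characters written, excluding the terminator. */
size_t strip_ansi(const char *in, char *out, size_t outsz)
{
    size_t n = 0;
    if (outsz == 0)
        return 0;

    while (*in != '\0' && n + 1 < outsz) {
        if (*in != '\033') {
            out[n++] = *in++;       /* ordinary character: keep it */
            continue;
        }
        in++;                       /* skip the ESC byte */
        if (*in == '[') {
            in++;
            while (*in >= 0x20 && *in <= 0x3F)
                in++;               /* parameters such as "39;49" */
            if (*in >= 0x40 && *in <= 0x7E)
                in++;               /* final byte, e.g. 'm', 'K', 'H', 'J' */
        } else if (*in == '(' || *in == ')') {
            in++;
            if (*in != '\0')
                in++;               /* charset id, e.g. 'B' */
        }
    }
    out[n] = '\0';
    return n;
}
```

Filtering like this treats the symptom only. Asking top not to emit the sequences in the first place is the cleaner route.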
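So I looked up the usage notes for top:
top options
d Sets the delay between two screen refreshes. You can also change it with the interactive s command.
u Shows only the processes of the given user name.
p Monitors only the process with the given process ID.
n Sets the number of screen refreshes before top exits.
b Formats top's output so that it is suitable for writing to a file; this option can be used to create a process log.
q Makes top refresh without any delay. If the calling program has superuser privileges, top runs at the highest possible priority.
c Shows the full command line instead of just the command name.
S Enables cumulative mode.
s Runs top in secure mode, which removes the potential danger of the interactive commands.
i Makes top skip idle and zombie processes.
So the final fix is to use the command top -b -n 1 > top.txt.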
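For what it's worth, a CGI program could also skip the intermediate file and read top's batch output directly. This is a rough sketch only, assuming a POSIX system where popen() is available; the function names copy_command_output and emit_top_snapshot are made up here, and this is not how my colleague's program was actually written.

```c
#define _POSIX_C_SOURCE 200809L   /* for popen() and pclose() */
#include <stdio.h>

/* Runs `cmd` through the shell and copies its standard output to `out`.
 * Returns the number of fgets() reads (one per line as long as each line
 * fits in the buffer), or -1 if popen() fails. */
int copy_command_output(const char *cmd, FILE *out)
{
    FILE *pipe = popen(cmd, "r");
    if (pipe == NULL)
        return -1;

    char line[1024];
    int reads = 0;
    while (fgets(line, sizeof line, pipe) != NULL) {
        fputs(line, out);
        reads++;
    }
    pclose(pipe);
    return reads;
}

/* Writes a minimal CGI response with one batch-mode snapshot of top.
 * text/plain is an assumption for this sketch: it lets the browser show
 * the '\n' line breaks as they are, with no <br> conversion. */
int emit_top_snapshot(FILE *out)
{
    fputs("Content-Type: text/plain\r\n\r\n", out);
    return copy_command_output("top -b -n 1", out);
}
```

A CGI's main() would then simply call emit_top_snapshot(stdout). The -b flag does the real work here: without it, top writes the same control sequences into the pipe that it wrote into top.txt.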