The Python newline problem: \r\n or \n?

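This is a classic problem: different operating systems use different default line endings. In string processing this difference causes real trouble. For example, line[-2] and line.rstrip('\n') return different values depending on which line ending the data carries.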
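A small sketch of the problem, using two made-up sample strings (one with a Windows-style ending, one with a Unix-style ending). The names `windows_line`, `unix_line`, and `chomp` are illustrative only and do not come from the original post:

```python
# Two hypothetical lines holding the same text with different line endings.
windows_line = "test\r\n"
unix_line = "test\n"

# Indexing from the end lands on different characters.
second_last_windows = windows_line[-2]   # the carriage return
second_last_unix = unix_line[-2]         # the final letter of "test"

# Stripping only the line feed leaves a stray carriage return behind.
stripped_windows = windows_line.rstrip("\n")
stripped_unix = unix_line.rstrip("\n")


def chomp(line):
    # Hypothetical helper: drop any trailing CR / LF characters,
    # whichever line-ending style the data uses.
    return line.rstrip("\r\n")
```

Stripping both characters, as `chomp` does, is one way to make the result independent of where the file came from.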


Solutions:

Python 2

See PEP 278 -- Universal Newline Support (thanks to 毕勤 for the addition):

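1) If the file is not a text file, read and write it with rb and wb. Binary mode performs no newline translation, so the problem does not arise.

2) If you need the text content, read with rU (strongly recommended), i.e. the U universal newline mode. This mode converts every line ending (\r, \n, \r\n) to \n. It is supported only for reading, but that is sufficient, and it is the best option Python 2 offers.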
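As a minimal sketch of option 1, the round trip below writes and reads a made-up payload in binary mode. The `payload` value and the temporary file are assumptions for illustration; the code is written so that it should behave the same on Python 2.6+ and Python 3:

```python
import os
import tempfile

# Hypothetical payload containing a Windows-style line ending.
payload = b"test\r\ntest2"

# Use a temporary file so the sketch does not touch real data.
fd, path = tempfile.mkstemp()
os.close(fd)

try:
    # 'wb' writes the bytes exactly as given.
    with open(path, "wb") as f:
        f.write(payload)

    # 'rb' returns the bytes exactly as stored, on every platform.
    with open(path, "rb") as f:
        data = f.read()
finally:
    os.remove(path)
```

Because no translation happens in either direction, `data` is byte-for-byte what was written.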


Comparing the results of r and rU:

content = open(fn, 'r').read()
# test\r\ntest2
# the line ending here depends on the platform
content = open(fn, 'rU').read()
# test\ntest2
# every line ending is normalized, regardless of platform

Python 3

Note: the rU mode is discouraged in Python 3! The U flag is deprecated there and was removed in Python 3.11.


open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)


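In Python 3, the newline parameter of open controls universal newline mode. When reading with newline left unspecified (None), universal newline mode is enabled and every \n, \r, or \r\n is converted to \n. When writing with newline unspecified, each \n is written as the platform's default line separator (os.linesep: \r\n on Windows, \n on Unix-like systems). With newline='\n', written \n characters stay \n on every platform, so the output file is the same everywhere. newline='' disables translation for both reading and writing; on reading, lines are still split on any of the three endings.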
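The reading side of these rules can be sketched as follows. The sample bytes in `raw` and the variable names are made up for illustration; the file is written in binary mode first so that the stored bytes are known exactly:

```python
import os
import tempfile

# Hypothetical sample mixing all three line-ending styles.
raw = b"a\r\nb\rc\n"

fd, path = tempfile.mkstemp()
os.close(fd)

try:
    with open(path, "wb") as f:
        f.write(raw)

    # newline=None (the default): every ending is translated to '\n'.
    with open(path, "r", encoding="ascii", newline=None) as f:
        translated = f.readlines()

    # newline='': lines still split on any ending, but nothing is translated.
    with open(path, "r", encoding="ascii", newline="") as f:
        untranslated = f.readlines()

    # newline='\n': only '\n' ends a line, and nothing is translated.
    with open(path, "r", encoding="ascii", newline="\n") as f:
        lf_only = f.readlines()
finally:
    os.remove(path)
```

Here `translated` holds three lines that all end in '\n', `untranslated` holds three lines with their original endings, and `lf_only` holds two lines because the lone '\r' no longer counts as a line break.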

newline controls how universal newlines works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:
  • On input, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newline mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
  • On output, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
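The output rules can be checked in the same spirit. In this sketch, `written_bytes` is a hypothetical helper (not part of any library) that writes a fixed string in text mode and then reads the file back in binary mode to reveal the bytes that were actually stored:

```python
import os
import tempfile

text = "x\ny\n"


def written_bytes(newline):
    # Hypothetical helper: write `text` in text mode with the given
    # newline setting, then return the raw bytes found on disk.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", encoding="ascii", newline=newline) as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


default_bytes = written_bytes(None)    # '\n' becomes os.linesep
raw_bytes = written_bytes("")          # no translation
crlf_bytes = written_bytes("\r\n")     # '\n' becomes '\r\n'
```

Only `default_bytes` varies between platforms; passing an explicit newline value is what makes the written file identical everywhere.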

References:

PEP 278 -- Universal Newline Support

Python 3 open: 2. Built-in Functions
