UnicodeDecodeError, invalid continuation byte

When reading a .csv file with the pandas library, the following error occurs:
My Code:

import pandas as pd
df=pd.read_csv('C:\\Users\\登亮\\Desktop\\test.csv',encoding='utf-8')

Error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd3 in position 0: invalid continuation byte

Reason:
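In binary, 0xE9 looks like 1110 1001. If you read about UTF-8 on Wikipedia, you’ll see that such a byte must be followed by two bytes of the form 10xx xxxx. The byte reported in the error above, 0xD3 (1101 0011), follows the same scheme but must be followed by exactly one such byte. So, for example: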

>>> b'\xe9\x80\x80'.decode('utf-8')
u'\u9000'

But that’s just the mechanical cause of the exception. In this case, you have a string that is almost certainly encoded in Latin-1. You can see how UTF-8 and Latin-1 look different:

>>> u'\xe9'.encode('utf-8')
b'\xc3\xa9'
>>> u'\xe9'.encode('latin-1')
b'\xe9'

(Note, I'm using a mix of Python 2 and 3 representation here. The input is valid in any version of Python, but your Python interpreter is unlikely to actually show both unicode and byte strings in this way.)
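
The byte-layout rule described above can be checked with a few bit masks. This is only a minimal sketch: the helper names `continuation_bytes_needed` and `is_continuation` are made up for illustration, and it ignores the extra restrictions real UTF-8 decoders apply (overlong forms, out-of-range lead bytes).

```python
def continuation_bytes_needed(lead):
    """Return how many 10xxxxxx bytes must follow a UTF-8 lead byte.

    Hypothetical helper for illustration; it looks only at the bit
    pattern and skips the stricter checks a real decoder performs.
    """
    if lead & 0b10000000 == 0b00000000:   # 0xxxxxxx: plain ASCII
        return 0
    if lead & 0b11100000 == 0b11000000:   # 110xxxxx: 2-byte sequence
        return 1
    if lead & 0b11110000 == 0b11100000:   # 1110xxxx: 3-byte sequence
        return 2
    if lead & 0b11111000 == 0b11110000:   # 11110xxx: 4-byte sequence
        return 3
    raise ValueError("not a UTF-8 lead byte")


def is_continuation(byte):
    """True if the byte has the form 10xxxxxx."""
    return byte & 0b11000000 == 0b10000000


print(format(0xE9, "08b"), continuation_bytes_needed(0xE9))  # 11101001 2
print(format(0xD3, "08b"), continuation_bytes_needed(0xD3))  # 11010011 1
```

If the byte after 0xD3 in the file does not satisfy `is_continuation`, the decoder reports "invalid continuation byte", which matches the error message.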
Solution:
Try calling read_csv with encoding='latin1', encoding='iso-8859-1' or encoding='cp1252'; these are the various encodings found on Windows.
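
Rather than editing the encoding argument by hand each time, one could try the candidates in order on the raw bytes first. The sketch below is only an assumption about how that might be done: `first_working_encoding` is a made-up helper, and the sample bytes are invented for illustration. Note that 'latin-1' accepts every possible byte, so it should come last, and a decode that succeeds is not proof that the text is correct.

```python
def first_working_encoding(raw, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate that decodes `raw` without error.

    Hypothetical helper for illustration. Returns None if every
    candidate fails (cannot happen while 'latin-1' is in the list).
    """
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # this codec rejects the bytes, try the next one
        return name
    return None


# Invented sample: starts with 0xD3 followed by a non-continuation byte,
# so UTF-8 fails at position 0 just like in the error message.
sample = b"\xd3\xc3\xbb\xa7,1\n"
print(first_working_encoding(sample))

# Possible use with pandas (not run here; `path` is a placeholder):
#   with open(path, "rb") as f:
#       enc = first_working_encoding(f.read())
#   df = pd.read_csv(path, encoding=enc)
```

If the resulting DataFrame shows garbled characters, the file was probably written in some other encoding, and that encoding should be passed to read_csv instead.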
