Fixing the UnicodeDecodeError exception when using GAE with Python
Google App Engine SDK: 1.4.0
Python: 2.6.6
An HTML form submits parameters to GAE, and one of the parameters contains Chinese text. Saving the request to the DataStore raised a UnicodeDecodeError, as follows:
'ascii' codec can't decode byte 0xe6 in position 0: ordinal not in range(128)
Traceback (most recent call last):
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 517, in __call__
handler.post(*groups)
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 895, in put
return datastore.Put(self._entity, config=config)
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 404, in Put
return _GetConnection().async_put(config, entities, extra_hook).get_result()
File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1130, in async_put
for pbs in pbsgen:
File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 993, in __generate_pb_lists
pb = value_to_pb(value)
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 202, in entity_to_pb
return entity._ToPb()
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 762, in _ToPb
properties = datastore_types.ToPropertyPb(name, values)
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1530, in ToPropertyPb
pbvalue = pack_prop(name, v, pb.mutable_value())
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1353, in PackString
pbvalue.set_stringvalue(unicode(value).encode('utf-8'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 0: ordinal not in range(128)
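After reading the log produced by GAE in detail, I found the cause. When the GAE SDK saves a string, it first converts the string to the unicode type, as the last frame of the stack trace shows:
pbvalue.set_stringvalue(unicode(value).encode('utf-8'))
The form value arrives as a byte string (str), and unicode(value) with no encoding argument decodes it with the default codec, "ascii", so it fails as soon as it meets the bytes of a Chinese character.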
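The failure can be reproduced outside GAE. The sketch below is an illustration, not SDK code: it assumes the form value arrives as the UTF-8 bytes of one Chinese character (the name raw and the sample character are made up for the example), and it uses bytes.decode('ascii') to stand in for what unicode(value) does on Python 2 with its default codec, so that it runs on Python 2.6+ and Python 3 alike.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample: the three UTF-8 bytes of one Chinese character (U+6211).
# The first byte is 0xe6, the same byte reported in the traceback.
raw = b'\xe6\x88\x91'

# Decoding with the ASCII codec fails, because ASCII only covers bytes 0-127.
try:
    raw.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

# Decoding with the codec the bytes were actually written in succeeds
# and yields a single character.
text = raw.decode('utf-8')

print(ascii_error)
print(len(text))
```

The error text captured in ascii_error names byte 0xe6 at position 0, which matches the message in the GAE log.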
So the fix is to convert the parameter string to unicode yourself before saving it, which is simple:
content = unicode(content, 'utf-8')
With this change the exception no longer occurs.
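One caveat with the one-line fix: on Python 2, calling unicode(content, 'utf-8') on a value that is already unicode raises a TypeError. A small guard avoids that. The helper below is a sketch under that assumption; the name to_text is hypothetical and is not part of the GAE SDK. It relies on bytes being an alias for str on Python 2.6+, so the same code also runs on Python 3.

```python
# -*- coding: utf-8 -*-
def to_text(value, encoding='utf-8'):
    """Return value as a text (unicode) string.

    Byte strings are decoded with the given encoding; values that are
    already text are returned unchanged. (Hypothetical helper.)
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Bytes as they would come out of the URL-decoding step.
from_bytes = to_text(b'\xe6\x88\x91')
# A value that is already text passes through untouched.
from_text = to_text(u'\u6211')
```

Both calls produce the same one-character text string, so the result can be assigned to the datastore property either way.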
The complete code that parses the POST parameter and saves it is as follows:
import urllib

post_data = get_post_data()
# Split post data of the form a=1&b=2&c=3 into a dict and take the 'status' field
quoted = dict([x.split('=') for x in post_data.split('&')]).get('status', '')
# Undo the URL encoding applied by the browser
content = urllib.unquote_plus(quoted)
# Convert to unicode
content = unicode(content, 'utf-8')
# Save to the GAE datastore
data = Data()
data.content = content
data.put()
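The hand-written split in the code above breaks if a pair has no "=" or a value contains one, because dict() then receives an element that is not a two-item pair. A more forgiving version of the parsing step might look like the sketch below. The function name get_form_value and the sample body are hypothetical; the datastore part is left out so the snippet runs on its own, and the import fallback is only there so it works on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
try:
    from urllib import unquote_plus        # Python 2
except ImportError:
    from urllib.parse import unquote_plus  # Python 3


def get_form_value(post_data, key, default=u''):
    """Return the first value for key in an a=1&b=2 style body, as text."""
    for pair in post_data.split('&'):
        # partition() tolerates a missing '=' and splits only on the first one.
        name, _, value = pair.partition('=')
        if unquote_plus(name) != key:
            continue
        value = unquote_plus(value)
        # On Python 2 the result is still a byte string: decode it explicitly.
        if isinstance(value, bytes):
            value = value.decode('utf-8')
        return value
    return default


# Hypothetical request body: 'status' holds a percent-encoded Chinese character.
body = 'a=1&status=%E6%88%91+ok&c=3'
content = get_form_value(body, 'status')
missing = get_form_value('a=1&b', 'status')
```

The value in content is already a text string, so it could be stored in data.content without a further unicode() call.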