Python: json.loads fails to decode a string with ValueError: No JSON object could be decoded

Original link: http://www.crifan.com/python_json_loads_valueerror_no_json_object_could_be_decoded/

【Problem】

While working on:

【Unsolved】Python: json.loads fails to decode a string with ValueError: Expecting property name: line 1 column 1 (char 1)

I ran into yet another error:

LINE 106  : INFO     photoInfoJson={id:'379879-87329678',owner:'379879',ownername:'shanshu',title:'IMG_3464',description:'',bucket:'shanshu',key:'CsFzMuHz',license:0,stats_notes: 0,albums: ['379879-18
1880',],tags:[{name:'20121202', author: '379879'},{name:'澶╁钩灞辫祻绾㈡灚', author: '379879'}],owner:{id: 379879,username: 'shanshu',nickname: 'shanshu'}}
LINE 110  : INFO     photoInfoJsonAddQuote={'id':'379879-87329678','owner':'379879','ownername':'shanshu','title':'IMG_3464','description':'','bucket':'shanshu','key':'CsFzMuHz','license':0,'stats_not
es': 0,'albums': ['379879-181880',],'tags':[{'name':'20121202', 'author': '379879'},{'name':'澶╁钩灞辫祻绾㈡灚', 'author': '379879'}],'owner':{'id': 379879,'username': 'shanshu','nickname': 'shanshu'
}}
LINE 112  : INFO     photoInfoJsonDoubleQuote={"id":"379879-87329678","owner":"379879","ownername":"shanshu","title":"IMG_3464","description":"","bucket":"shanshu","key":"CsFzMuHz","license":0,"stats_
notes": 0,"albums": ["379879-181880",],"tags":[{"name":"20121202", "author": "379879"},{"name":"澶╁钩灞辫祻绾㈡灚", "author": "379879"}],"owner":{"id": 379879,"username": "shanshu","nickname": "shans
hu"}}

...

photoInfoDict = json.loads(photoInfoJsonDoubleQuote); 
File "D:\tmp\dev_install_root\Python27_x64\lib\json\__init__.py", line 326, in loads 
return _default_decoder.decode(s) 
File "D:\tmp\dev_install_root\Python27_x64\lib\json\decoder.py", line 366, in decode 
obj, end = self.raw_decode(s, idx=_w(s, 0).end()) 
File "D:\tmp\dev_install_root\Python27_x64\lib\json\decoder.py", line 384, in raw_decode 
raise ValueError("No JSON object could be decoded") 
ValueError: No JSON object could be decoded

 

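【Solution process】

1. Following my own earlier post:

【Solved】Python error when parsing a JSON file: ValueError : No JSON object could be decoded -> Python's json library does not support UTF-8 with BOM

I tried adding the encoding argument: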

import json
import logging
import re

photoInfoJsonAddQuote = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", photoInfoJson)
logging.info("photoInfoJsonAddQuote=%s", photoInfoJsonAddQuote)
photoInfoJsonDoubleQuote = photoInfoJsonAddQuote.replace("'", "\"")
logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)

#photoInfoJsonDoubleQuote = photoInfoJson.replace("'", '"')
#logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote)
# pass the encoding explicitly (second positional argument in Python 2)
photoInfoDict = json.loads(photoInfoJsonDoubleQuote, "UTF-8")
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote)
logging.info("photoInfoDict=%s", photoInfoDict)

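The problem persisted.

I knew the cause described in that post (the json library does not support UTF-8 with BOM), but here photoInfoJson is a string obtained at runtime, not a file,

so there was no way to convert it to UTF-8 without BOM using Notepad++ or a similar tool.

I had to find another approach.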
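If a BOM really were the culprit, it could be removed from an in-memory string without any editor. The following is only a minimal sketch, written in Python 3 syntax rather than the Python 2.7 used in this post; the helper name `strip_utf8_bom` and the sample input are made up for illustration.

```python
import codecs
import json


def strip_utf8_bom(raw):
    """Return the bytes without a leading UTF-8 BOM (hypothetical helper)."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


# Made-up sample: a BOM followed by a small JSON document.
with_bom = codecs.BOM_UTF8 + b'{"key": "CsFzMuHz"}'
parsed = json.loads(strip_utf8_bom(with_bom).decode("utf-8"))
print(parsed["key"])
```

If the error remains after such a step, as it does in this post, the BOM was not the cause.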

2. Next, I manually decoded and re-encoded the string:

photoInfoJsonAddQuote = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", photoInfoJson)
logging.info("photoInfoJsonAddQuote=%s", photoInfoJsonAddQuote)
photoInfoJsonDoubleQuote = photoInfoJsonAddQuote.replace("'", "\"")
logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
# round trip: UTF-8 str -> unicode -> UTF-8 str
photoInfoJsonDoubleQuoteUni = photoInfoJsonDoubleQuote.decode("UTF-8")
photoInfoJsonDoubleQuoteUtf8 = photoInfoJsonDoubleQuoteUni.encode("UTF-8")

#photoInfoJsonDoubleQuote = photoInfoJson.replace("'", '"')
#logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote, "UTF-8")
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote)
photoInfoDict = json.loads(photoInfoJsonDoubleQuoteUtf8)
logging.info("photoInfoDict=%s", photoInfoDict)

The result was still the error:

ValueError: No JSON object could be decoded

3. I then checked with code and confirmed that the string really was UTF-8 already:

print "type(photoInfoJson)=", type(photoInfoJson)  #type(photoInfoJson)=
print crifanLib.getStrPossibleCharset(photoInfoJson)  #utf-8

But I still did not know why it could not be decoded.

4. I tried the single-quoted version directly:

photoInfoJsonAddQuote = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", photoInfoJson)
logging.info("photoInfoJsonAddQuote=%s", photoInfoJsonAddQuote)
#photoInfoJsonDoubleQuote = photoInfoJsonAddQuote.replace("'", "\"")
#logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
#photoInfoJsonDoubleQuoteUni = photoInfoJsonDoubleQuote.decode("UTF-8")
#photoInfoJsonDoubleQuoteUtf8 = photoInfoJsonDoubleQuoteUni.encode("UTF-8")

#photoInfoJsonDoubleQuote = photoInfoJson.replace("'", '"')
#logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote, "UTF-8")
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote)
# parse the single-quoted string as-is
photoInfoDict = json.loads(photoInfoJsonAddQuote)
logging.info("photoInfoDict=%s", photoInfoDict)

As expected, this does produce the error:

ValueError: Expecting property name: line 1 column 1 (char 1)
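The JSON grammar only accepts double-quoted strings, which is why the single-quoted version fails on the very first key. The sketch below illustrates the difference; it is written in Python 3 syntax, and the helper name `try_loads` and the shortened sample strings are made up for illustration.

```python
import json


def try_loads(text):
    """Return (True, value) on success or (False, message) on failure."""
    try:
        return True, json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        return False, str(exc)


single_quoted = "{'id':'379879-87329678','license':0}"
double_quoted = single_quoted.replace("'", '"')

print(try_loads(single_quoted))
print(try_loads(double_quoted))
```

The exact wording of the error message differs between Python versions, so the sketch only relies on the exception type.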

5. Again following:

【Solved】Python error when parsing a JSON file: ValueError : No JSON object could be decoded -> Python's json library does not support UTF-8 with BOM

I tried converting the string to ANSI (GB18030):

photoInfoJsonAddQuote = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", photoInfoJson)
logging.info("photoInfoJsonAddQuote=%s", photoInfoJsonAddQuote)
photoInfoJsonDoubleQuote = photoInfoJsonAddQuote.replace("'", "\"")
logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
# UTF-8 str -> unicode -> GB18030 str
photoInfoJsonDoubleQuoteUni = photoInfoJsonDoubleQuote.decode("UTF-8")
photoInfoJsonAddQuoteAnsi = photoInfoJsonDoubleQuoteUni.encode("GB18030")

print "type(photoInfoJson)=", type(photoInfoJsonAddQuoteAnsi)
print crifanLib.getStrPossibleCharset(photoInfoJsonAddQuoteAnsi)

#photoInfoJsonDoubleQuote = photoInfoJson.replace("'", '"')
#logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote, "UTF-8")
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote)
photoInfoDict = json.loads(photoInfoJsonAddQuoteAnsi)
logging.info("photoInfoDict=%s", photoInfoDict)

The problem persisted.

6. The manual explains:

json.loads(s[, encoding[, cls[, object_hook[, parse_float[, parse_int[, parse_constant[, object_pairs_hook[, **kw]]]]]]]])

Deserialize s (a str or unicode instance containing a JSON document) to a Python object.

If s is a str instance and is encoded with an ASCII based encoding other than UTF-8 (e.g. latin-1), then an appropriate encoding name must be specified. Encodings that are not ASCII based (such as UCS-2) are not allowed and should be decoded to unicode first.

The other arguments have the same meaning as in load().
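In other words, the most predictable route is to decode the bytes to text yourself and hand the text to json.loads. Here is a minimal sketch of that idea in Python 3 syntax (recent Python 3 releases no longer accept an encoding argument at all); the sample data is made up for illustration.

```python
import json

# Made-up sample: a small JSON document encoded as GB18030 bytes.
raw = u'{"name": "天平山赏红枫"}'.encode("gb18030")

# Decode with the encoding the bytes really use, then parse the text.
text = raw.decode("gb18030")
parsed = json.loads(text)
print(parsed["name"] == u"天平山赏红枫")
```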

So I tried passing a unicode object instead:

photoInfoJsonAddQuote = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", photoInfoJson)
logging.info("photoInfoJsonAddQuote=%s", photoInfoJsonAddQuote)
photoInfoJsonDoubleQuote = photoInfoJsonAddQuote.replace("'", "\"")
logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
photoInfoJsonDoubleQuoteUni = photoInfoJsonDoubleQuote.decode("UTF-8")
#photoInfoJsonAddQuoteAnsi = photoInfoJsonDoubleQuoteUni.encode("GB18030")

print "type(photoInfoJson)=", type(photoInfoJsonDoubleQuoteUni)
#print crifanLib.getStrPossibleCharset(photoInfoJsonDoubleQuoteUni)

#photoInfoJsonDoubleQuote = photoInfoJson.replace("'", '"')
#logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote, "UTF-8")
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote)
#photoInfoDict = json.loads(photoInfoJsonAddQuoteAnsi)
# pass the unicode object directly
photoInfoDict = json.loads(photoInfoJsonDoubleQuoteUni)
logging.info("photoInfoDict=%s", photoInfoDict)

The problem persisted.

7. Next, I tried GB18030 together with an explicit encoding argument to see whether that would decode:

photoInfoJsonAddQuote = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", photoInfoJson)
logging.info("photoInfoJsonAddQuote=%s", photoInfoJsonAddQuote)
photoInfoJsonDoubleQuote = photoInfoJsonAddQuote.replace("'", "\"")
logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
photoInfoJsonDoubleQuoteUni = photoInfoJsonDoubleQuote.decode("UTF-8")
photoInfoJsonAddQuoteAnsi = photoInfoJsonDoubleQuoteUni.encode("GB18030")

#print "type(photoInfoJson)=",type(photoInfoJsonDoubleQuoteUni)
#print crifanLib.getStrPossibleCharset(photoInfoJsonDoubleQuoteUni)

#photoInfoJsonDoubleQuote = photoInfoJson.replace("'", '"')
#logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote, "UTF-8")
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote)
# GB18030 bytes plus the matching encoding name
photoInfoDict = json.loads(photoInfoJsonAddQuoteAnsi, "GB18030")
#photoInfoDict = json.loads(photoInfoJsonDoubleQuoteUni)
logging.info("photoInfoDict=%s", photoInfoDict)

The problem persisted.

8. I hard-coded the original string on its own to see whether it could be decoded correctly:

#debug here write fix string json to test
photoInfoJson = """{id:'379879-87329678',owner:'379879',ownername:'shanshu',title:'IMG_3464',description:'',bucket:'shanshu',key:'CsFzMuHz',license:0,stats_notes: 0,albums: ['379879-18
1880',],tags:[{name:'20121202', author: '379879'},{name:'天平山赏红枫', author: '379879'}],owner:{id: 379879,username: 'shanshu',nickname: 'shanshu'}}"""

photoInfoJsonAddQuote = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", photoInfoJson)
logging.info("photoInfoJsonAddQuote=%s", photoInfoJsonAddQuote)
photoInfoJsonDoubleQuote = photoInfoJsonAddQuote.replace("'", "\"")
logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)

#photoInfoJsonDoubleQuoteUni = photoInfoJsonDoubleQuote.decode("UTF-8")
#photoInfoJsonAddQuoteAnsi = photoInfoJsonDoubleQuoteUni.encode("GB18030")

#print "type(photoInfoJson)=",type(photoInfoJsonDoubleQuoteUni)
#print crifanLib.getStrPossibleCharset(photoInfoJsonDoubleQuoteUni)

#photoInfoJsonDoubleQuote = photoInfoJson.replace("'", '"')
#logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote, "UTF-8")
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote)
#photoInfoDict = json.loads(photoInfoJsonAddQuoteAnsi, "GB18030")
photoInfoDict = json.loads(photoInfoJsonDoubleQuote)
logging.info("photoInfoDict=%s", photoInfoDict)

This time the result was a different error:

    photoInfoDict = json.loads(photoInfoJsonDoubleQuote);
  File "D:\tmp\dev_install_root\Python27_x64\lib\json\__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "D:\tmp\dev_install_root\Python27_x64\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "D:\tmp\dev_install_root\Python27_x64\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid control character at: line 1 column 195 (char 195)

Very strange.

I looked at which character sits at position 195.

It turned out that the string contains a CR LF there:

[Image 1: screenshot of the string with the CR LF visible]
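Instead of counting characters by hand, the offset in the message can be used directly. The sketch below is one possible way to do that, written in Python 3 syntax; the helper name `show_error_context` and the sample string are made up for illustration, and the offset is read from the "(char N)" part of the message, which is an assumption about the message format rather than a documented interface.

```python
import json
import re


def show_error_context(text, width=10):
    """Return repr() of the text around the offset named in the json error.

    Returns None when the text parses or the message carries no offset.
    """
    try:
        json.loads(text)
    except ValueError as exc:
        match = re.search(r"\(char (\d+)", str(exc))
        if match is None:
            return None
        pos = int(match.group(1))
        start = max(0, pos - width)
        # repr() makes invisible characters such as \r and \n visible.
        return repr(text[start:pos + width])
    return None


broken = '{"albums": ["379879-18\r\n1880"]}'
print(show_error_context(broken))
```

Printing the repr() of the surrounding slice exposes a stray line break immediately, without opening the string in an editor.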

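9. I then removed the CR LF:

[Image 2: screenshot of the string with the CR LF removed]

and ran it again, but the problem persisted.

10. I changed the test string so that description is 'xxx' instead of an empty string: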

photoInfoJson = """{id:'379879-87329678',owner:'379879',ownername:'shanshu',title:'IMG_3464',description:'xxx',bucket:'shanshu',key:'CsFzMuHz',license:0,stats_notes: 0,albums: ['379879-181880',],tags:[{name:'20121202', author: '379879'},{name:'天平山赏红枫', author: '379879'}],owner:{id: 379879,username: 'shanshu',nickname: 'shanshu'}}"""

The problem persisted.

11. Finally, I changed the code to:

#debug here write fix string json to test
photoInfoJson = """{id:'379879-87329678',owner:'379879',ownername:'shanshu',title:'IMG_3464',description:'xxx',bucket:'shanshu',key:'CsFzMuHz',license:0,stats_notes: 0,albums: ['379879-181880'],tags:[{name:'20121202', author: '379879'},{name:'天平山赏红枫', author: '379879'}],owner:{id: 379879,username: 'shanshu',nickname: 'shanshu'}}"""

photoInfoJsonAddQuote = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", photoInfoJson)
logging.info("photoInfoJsonAddQuote=%s", photoInfoJsonAddQuote)
photoInfoJsonDoubleQuote = photoInfoJsonAddQuote.replace("'", "\"")
logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)

#photoInfoJsonDoubleQuoteUni = photoInfoJsonDoubleQuote.decode("UTF-8")
#photoInfoJsonAddQuoteAnsi = photoInfoJsonDoubleQuoteUni.encode("GB18030")

#print "type(photoInfoJson)=",type(photoInfoJsonDoubleQuoteUni)
#print crifanLib.getStrPossibleCharset(photoInfoJsonDoubleQuoteUni)

#photoInfoJsonDoubleQuote = photoInfoJson.replace("'", '"')
#logging.info("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote, "UTF-8")
#photoInfoDict = json.loads(photoInfoJsonDoubleQuote)
#photoInfoDict = json.loads(photoInfoJsonAddQuoteAnsi, "GB18030")
photoInfoDict = json.loads(photoInfoJsonDoubleQuote)
logging.info("photoInfoDict=%s", photoInfoDict)

Surprisingly, it could finally be decoded...

The change was from:

albums: ['379879-181880',]

to:

albums: ['379879-181880']

That is, removing the last, "redundant" comma inside the list value was enough.
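The effect can be reproduced with a much smaller input. The following is a minimal sketch in Python 3 syntax; the helper name `is_valid_json` is made up for illustration.

```python
import json


def is_valid_json(text):
    """Return True if json.loads accepts the text, False otherwise."""
    try:
        json.loads(text)
    except ValueError:
        return False
    return True


with_trailing_comma = '{"albums": ["379879-181880",]}'
without_trailing_comma = '{"albums": ["379879-181880"]}'

print(is_valid_json(with_trailing_comma))
print(is_valid_json(without_trailing_comma))
```

Only the second string is accepted; the two inputs differ in nothing but the comma before the closing bracket.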

 

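【Summary】

A trailing comma after the last element of a list is allowed by Python's own syntax;

as far as I remember, other languages such as C allow it as well.

The annoying part is that the JSON grammar does not allow it, and the json library in Python 2.7.3 rejects it accordingly...

Since the error message (No JSON object could be decoded) gives no hint about the comma, many people who run into a similar problem have no idea where to start...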

 

In short:

  • Cause:

The json library in Python 2.7.3 does not accept this form:

albums: ['379879-181880',]

It must be written as:

albums: ['379879-181880']

 

  • Solution:

An invalid string like the one above:

{id:'379879-87329678',owner:'379879',ownername:'shanshu',title:'IMG_3464',description:'xxx',bucket:'shanshu',key:'CsFzMuHz',license:0,stats_notes: 0,albums: ['379879-181880',],tags:[{name:'20121202', author: '379879'},{name:'天平山赏红枫', author: '379879'}],owner:{id: 379879,username: 'shanshu',nickname: 'shanshu'}}

can be processed with:

# quote the bare keys with single quotes
addedSingleQuoteJsonStr = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", originalJsonStr)
# JSON requires double quotes
doubleQuotedJsonStr = addedSingleQuoteJsonStr.replace("'", "\"")
#remove comma before end of list
removedLastCommaInList = re.sub(r",\s*?]", "]", doubleQuotedJsonStr)

which turns it into the valid string:

{"id":"379879-87329678","owner":"379879","ownername":"shanshu","title":"IMG_3464","description":"xxx","bucket":"shanshu","key":"CsFzMuHz","license":0,"stats_notes": 0,"albums": ["379879-181880"],"tags":[{"name":"20121202", "author": "379879"},{"name":"天平山赏红枫", "author": "379879"}],"owner":{"id": 379879,"username": "shanshu","nickname": "shanshu"}}
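Put together, the whole clean-up can be wrapped in one function. This is only a sketch in Python 3 syntax: the function name `loose_to_dict` and the shortened sample string are made up for illustration, and the regular expressions are the ones used in this post, so the sketch assumes that no value contains an apostrophe or a word directly followed by a colon.

```python
import json
import re


def loose_to_dict(original_json_str):
    """Parse a JavaScript-style object literal using the steps from this post."""
    # 1. Quote the bare keys with single quotes.
    added_single_quote = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", original_json_str)
    # 2. JSON only accepts double quotes.
    double_quoted = added_single_quote.replace("'", '"')
    # 3. Remove the comma before the end of a list.
    removed_last_comma = re.sub(r",\s*?]", "]", double_quoted)
    return json.loads(removed_last_comma)


sample = ("{id:'379879-87329678',license:0,albums: ['379879-181880',],"
          "owner:{id: 379879,username: 'shanshu'}}")
info = loose_to_dict(sample)
print(info["albums"], info["owner"]["id"])
```

Note that the full string in this post contains the key owner twice; json.loads keeps the last occurrence, so the nested object wins over the plain string.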
