Parsing JSON in Python


Author: 金良 ([email protected]) CSDN blog: http://blog.csdn.net/u012176591

1. Parsing a string

obj = """ {"name": "Wes", "places_lived": ["United States", "Spain", "Germany"], "pet": null, "siblings": [{"name": "Scott", "age": 25, "pet": "Zuko"}, {"name": "Katie", "age": 33, "pet": "Cisco"}] } """

import json
result = json.loads(obj)  # parse the JSON string
result  # display the value of result

Output (from Python 2, where the u prefix marks unicode strings):

{u'name': u'Wes',
 u'pet': None,
 u'places_lived': [u'United States', u'Spain', u'Germany'],
 u'siblings': [{u'age': 25, u'name': u'Scott', u'pet': u'Zuko'},
  {u'age': 33, u'name': u'Katie', u'pet': u'Cisco'}]}
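
To make the result above more concrete, here is a minimal sketch (written for Python 3) that parses a shortened copy of the same document and checks how JSON types map to Python types. The names doc, data and first_sibling are made up for this illustration and are not part of the original code.

```python
import json

# A shortened copy of the JSON text from above (hypothetical variable name).
doc = ('{"name": "Wes", "pet": null, '
       '"places_lived": ["United States", "Spain"], '
       '"siblings": [{"name": "Scott", "age": 25}]}')

data = json.loads(doc)

# JSON object -> dict, array -> list, null -> None, number -> int
first_sibling = data["siblings"][0]
print(type(data).__name__)       # dict
print(data["pet"] is None)       # True
print(data["places_lived"][1])   # Spain
print(first_sibling["age"] + 1)  # 26
```

Nested values are reached with ordinary dict and list indexing, so no JSON-specific API is needed once json.loads has returned.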

2. Reading a text file

Below is the actual content of the text file. Each outermost dictionary (JSON object) occupies one line of the file.

{ "a": "Mozilla\/5.0 (Windows NT 6.1; WOW64) AppleWebKit\/535.11 (KHTML, like Gecko) Chrome\/17.0.963.78 Safari\/535.11", "c": "US", "nk": 1, "tz": "America\/New_York", "gr": "MA", "g": "A6qOVH", "h": "wfLQtf", "l": "orofrog", "al": "en-US,en;q=0.8", "hh": "1.usa.gov", "r": "http:\/\/www.facebook.com\/l\/7AQEFzjSi\/1.usa.gov\/wfLQtf", "u": "http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/22415991", "t": 1331923247, "hc": 1331822918, "cy": "Danvers", "ll": [ 42.576698, -70.954903 ] }
{ "a": "GoogleMaps\/RochesterNY", "c": "US", "nk": 0, "tz": "America\/Denver", "gr": "UT", "g": "mwszkS", "h": "mwszkS", "l": "bitly", "hh": "j.mp", "r": "http:\/\/www.AwareMap.com\/", "u": "http:\/\/www.monroecounty.gov\/etc\/911\/rss.php", "t": 1331923249, "hc": 1308262393, "cy": "Provo", "ll": [ 40.218102, -111.613297 ] }

Read one line and display it to see exactly what was read:

path = 'filename.txt'
open(path).readline()

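The displayed content is shown below. Note that the last character is a newline; as it turns out later, json.loads parses the line correctly whether or not that newline is present: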

'{ "a": "Mozilla\\/5.0 (Windows NT 6.1; WOW64) AppleWebKit\\/535.11 (KHTML, like Gecko) Chrome\\/17.0.963.78 Safari\\/535.11", "c": "US", "nk": 1, "tz": "America\\/New_York", "gr": "MA", "g": "A6qOVH", "h": "wfLQtf", "l": "orofrog", "al": "en-US,en;q=0.8", "hh": "1.usa.gov", "r": "http:\\/\\/www.facebook.com\\/l\\/7AQEFzjSi\\/1.usa.gov\\/wfLQtf", "u": "http:\\/\\/www.ncbi.nlm.nih.gov\\/pubmed\\/22415991", "t": 1331923247, "hc": 1331822918, "cy": "Danvers", "ll": [ 42.576698, -70.954903 ] }\n'

To parse the whole file, call json.loads on each line:

import json
path = 'filename.txt'
with open(path) as f:  # the with block closes the file when done
    records = [json.loads(line) for line in f]

Result (records[0]):

{u'a': u'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.78 Safari/535.11',
 u'al': u'en-US,en;q=0.8',
 u'c': u'US',
 u'cy': u'Danvers',
 u'g': u'A6qOVH',
 u'gr': u'MA',
 u'h': u'wfLQtf',
 u'hc': 1331822918,
 u'hh': u'1.usa.gov',
 u'l': u'orofrog',
 u'll': [42.576698, -70.954903],
 u'nk': 1,
 u'r': u'http://www.facebook.com/l/7AQEFzjSi/1.usa.gov/wfLQtf',
 u't': 1331923247,
 u'tz': u'America/New_York',
 u'u': u'http://www.ncbi.nlm.nih.gov/pubmed/22415991'}
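
Two details of the result above can be checked in isolation: the \/ sequences in the file come back as a plain /, and the trailing newline does not change what json.loads returns. The following sketch (written for Python 3) uses an in-memory io.StringIO in place of a real file. The two sample records and the names raw_lines, sample and parsed are assumptions made for illustration.

```python
import io
import json

# Two line-delimited JSON records with escaped slashes, as in the file above.
# Raw strings keep the backslashes exactly as they would appear on disk.
raw_lines = [
    r'{"tz": "America\/New_York", "nk": 1}',
    r'{"tz": "America\/Denver", "nk": 0}',
]
sample = io.StringIO("\n".join(raw_lines) + "\n")

# Parse one record per line, skipping any blank lines.
parsed = [json.loads(line) for line in sample if line.strip()]
print(parsed[0]["tz"])  # America/New_York

# The trailing newline makes no difference to the parsed value.
with_newline = json.loads('{"nk": 1}\n')
without_newline = json.loads('{"nk": 1}')
print(with_newline == without_newline)  # True
```

Skipping blank lines is an addition here, not something the original loop does; it only matters if the file contains empty lines, which json.loads would otherwise reject.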

3. Reading a JSON file

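The content is one large list: the first line begins with '[' and the last character of the last line is ']'. The list elements are dictionaries, one per line, and every line except the last ends with a comma.
The JSON file looks like this: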

[{"url": "http://home.cnblogs.com/u/panpannju/", "followers": ["tandier", "611154"], "name": "panpannju"},
{"url": "http://home.cnblogs.com/u/429306/", "followers": [], "name": "429306"},
{"url": "http://home.cnblogs.com/u/jkframe/", "followers": ["AleeGreat", "koalaer"], "name": "jkframe"},
{"url": "http://home.cnblogs.com/u/graicesun/", "followers": [], "name": "graicesun"},
{"url": "http://home.cnblogs.com/u/blueshinejason/", "followers": ["overmore"], "name": "blueshinejason"},
{"url": "http://home.cnblogs.com/u/AleeGreat/", "followers": [], "name": "AleeGreat"},
{"url": "http://home.cnblogs.com/u/490449/", "followers": ["superhuake"], "name": "490449"},
{"url": "http://home.cnblogs.com/u/619865/", "followers": [], "name": "619865"},
{"url": "http://home.cnblogs.com/u/holycy/", "followers": ["graicesun"], "name": "holycy"}]

Read the entire file:

with open("data.json", "r") as text_file:  # the with block closes the file when done
    lines = text_file.readlines()

Inspect the first line, the last line, and a line in between:

lines[0]  # first line
'[{"url": "http://home.cnblogs.com/u/jinliangjiuzhuang/", "followers": [], "name": "jinliangjiuzhuang"},\n'

lines[-1]  # last line
'{"url": "http://home.cnblogs.com/u/510419/", "followers": [], "name": "510419"}]'

lines[1]  # a line that is neither first nor last
'{"url": "http://home.cnblogs.com/u/NelsonWu/", "followers": ["jinger", "346359"], "name": "NelsonWu"},\n'

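The parsing code is below. The function that does the parsing is still json.loads(); the difference from before is that each line is preprocessed to strip the leading '[' from the first line, the trailing ']' from the last line, and the trailing comma and newline from every other line.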

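records = []
for line in lines:
    if line.startswith('['):    # first line: drop the leading '[' and the trailing ',\n'
        myline = line[1:-2]
    elif line.endswith(']'):    # last line: drop the trailing ']'
        myline = line[:-1]
    else:                       # other lines: drop the trailing ',\n'
        myline = line[:-2]
    try:
        records.append(json.loads(myline))  # parse and store the parsed object
    except ValueError:
        print(myline)  # on failure, print the offending line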

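Comparing this with the input passed to json.loads in section 2 shows that a trailing newline in the input string is optional: json.loads ignores whitespace before and after the JSON value.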
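
Because the file as a whole is a single valid JSON array, an alternative to the line-by-line preprocessing is to parse the entire text in one call. This is a different technique from the loop above, shown here only for comparison. The sketch (written for Python 3) uses json.loads on the full text; json.load does the same for an open file object. The inline sample standing in for data.json and the names text, whole, by_line and piece are assumptions for illustration.

```python
import json

# Inline stand-in for the contents of data.json (hypothetical sample).
text = (
    '[{"url": "http://home.cnblogs.com/u/429306/", "followers": [], "name": "429306"},\n'
    '{"url": "http://home.cnblogs.com/u/holycy/", "followers": ["graicesun"], "name": "holycy"}]'
)

# The whole text is one JSON array, so a single call returns a list of dicts.
whole = json.loads(text)

# The same slicing logic as the loop above, applied line by line.
by_line = []
for line in text.splitlines(True):  # True keeps line endings, like readlines()
    if line.startswith('['):
        piece = line[1:-2]
    elif line.endswith(']'):
        piece = line[:-1]
    else:
        piece = line[:-2]
    by_line.append(json.loads(piece))

print(whole == by_line)       # True
print(whole[1]["followers"])  # ['graicesun']
```

The single-call form does not depend on how the array is split across lines, whereas the slicing form assumes exactly one dictionary per line with a ',\n' ending.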
