I have a JSON-formatted file that looks like this:
{
"cover": "http://p2.music.126.net/wsPS7l8JZ3EAOvlaJPWW-w==/109951163393967421.jpg?param=140y140",
"title": "2018上半年最热新歌TOP50",
"author": "网易云音乐",
"times": "1264万",
"url": "https://music.163.com/playlist?id=2303649893",
"id": "2303649893"
}
{
"cover": "http://p2.music.126.net/wpahk9cQCDtdzJPE52EzJQ==/109951163271025942.jpg?param=140y140",
"title": "你的青春里有没有属于你的一首歌?",
"author": "mayuko然",
"times": "4576万",
"url": "https://music.163.com/playlist?id=2201879658",
"id": "2201879658"
}
When I tried to read the file with pd.read_json('data.json'), it raised an error. The relevant part of the traceback is:
File "D:\python\lib\site-packages\pandas\io\json\json.py", line 853, in _parse_no_numpy
loads(json, precise_float=self.precise_float), dtype=None)
ValueError: Trailing data
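After some searching on Baidu, I found that the cause was invalid JSON. It was also the first time I learned that tools like jsonviewer exist. The file contains two top-level objects written one after the other, so the parser finishes the first object and then reports everything after it as "Trailing data". The dictionaries need to be stored as elements of a list. The corrected file looks like this: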
[{
"cover": "http://p2.music.126.net/wsPS7l8JZ3EAOvlaJPWW-w==/109951163393967421.jpg?param=140y140",
"title": "2018上半年最热新歌TOP50",
"author": "网易云音乐",
"times": "1264万",
"url": "https://music.163.com/playlist?id=2303649893",
"id": "2303649893"
},
{
"cover": "http://p2.music.126.net/wpahk9cQCDtdzJPE52EzJQ==/109951163271025942.jpg?param=140y140",
"title": "你的青春里有没有属于你的一首歌?",
"author": "mayuko然",
"times": "4576万",
"url": "https://music.163.com/playlist?id=2201879658",
"id": "2201879658"
}]
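For a larger file, wrapping the objects by hand gets tedious. Below is a minimal sketch of doing the same conversion with the standard library, assuming the file consists only of JSON objects separated by whitespace. The helper name load_concatenated_json and the shortened sample records are made up for illustration.

```python
import json


def load_concatenated_json(text):
    """Parse JSON objects written back to back into a list of dicts.

    Hypothetical helper: assumes `text` holds only JSON values
    separated by whitespace, as in the broken file above.
    """
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records


# Shortened stand-in for the original file contents.
sample = '''
{
"title": "2018上半年最热新歌TOP50",
"id": "2303649893"
}
{
"title": "你的青春里有没有属于你的一首歌?",
"id": "2201879658"
}
'''

records = load_concatenated_json(sample)

# Serialize as a single JSON array, the form that pd.read_json accepts by default.
fixed = json.dumps(records, ensure_ascii=False, indent=2)
print(fixed)
```

Writing fixed back to a file gives the list-of-dictionaries layout shown above.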
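Another approach is to put one complete dictionary on each line of the file and pass an extra argument to the function: pd.read_json('data.json', lines=True). lines defaults to False; when set to True, the file is read as one JSON object per line. The pandas.read_json documentation describes it as follows: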
lines : boolean, default False. Read the file as a json object per line. New in version 0.19.0.
The modified JSON file looks like this:
{"cover": "http://p2.music.126.net/wsPS7l8JZ3EAOvlaJPWW-w==/109951163393967421.jpg?param=140y140","title": "2018上半年最热新歌TOP50","author": "网易云音乐","times": "1264万","url": "https://music.163.com/playlist?id=2303649893","id": "2303649893"}
{"cover": "http://p2.music.126.net/wpahk9cQCDtdzJPE52EzJQ==/109951163271025942.jpg?param=140y140","title": "你的青春里有没有属于你的一首歌?","author": "mayuko然","times": "4576万","url": "https://music.163.com/playlist?id=2201879658","id": "2201879658"}
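A short sketch of reading this one-object-per-line layout. To keep the example self-contained it reads from an in-memory io.StringIO buffer instead of data.json, and the records are shortened to three fields. Both are assumptions made for illustration. dtype=False is passed so that pandas does not try to infer column types and the numeric-looking id values stay as they appear in the file.

```python
import io

import pandas as pd

# Two records, one complete JSON object per line (shortened for illustration).
sample = (
    '{"title": "2018上半年最热新歌TOP50", "times": "1264万", "id": "2303649893"}\n'
    '{"title": "你的青春里有没有属于你的一首歌?", "times": "4576万", "id": "2201879658"}\n'
)

# lines=True tells pandas to parse each line as a separate JSON object.
df = pd.read_json(io.StringIO(sample), lines=True, dtype=False)

print(df.shape)
print(df["title"].tolist())
```

With a real file, the call is the same as in the text: pd.read_json('data.json', lines=True).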