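When analyzing Journey to the West (西游记), a single character often goes by several names: in the novel, Sun Wukong (孙悟空) also appears as 美猴王, 孙行者, 齐天大圣, and so on. To analyze character relationships, the statistics collected under each of these names have to be merged.
The first step is to merge the appearance counts of the different names: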
keys = list(names.keys())  # snapshot of the keys, because entries are popped inside the loop
for i in keys:
    if i in ('美猴王', '悟空', '孙行者', '齐天大圣', '大师兄'):
        # .get(..., 0) avoids a KeyError when the canonical name has no count yet
        names['孙悟空'] = names.get('孙悟空', 0) + names.pop(i)
    elif i in ('悟净', '沙僧'):
        names['沙悟净'] = names.get('沙悟净', 0) + names.pop(i)
    elif i in ('八戒', '天蓬元帅', '猪悟能'):
        names['猪八戒'] = names.get('猪八戒', 0) + names.pop(i)
    elif i == '唐三藏':
        names['唐僧'] = names.get('唐僧', 0) + names.pop(i)
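The names dictionary stores the number of appearances of each character. Note that entries cannot be added to or removed from a dictionary while iterating over it, so the loop iterates over a list copy of the keys instead.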
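To illustrate the note about iteration, here is a minimal sketch with made-up counts. The ALIASES table and both helper functions are hypothetical names introduced for this example; they are not part of the original code. Popping keys while looping over the dictionary itself raises a RuntimeError, while looping over list(counts) works:

```python
# Made-up sample counts; the real `names` dict comes from the text analysis.
names = {'孙悟空': 10, '美猴王': 3, '悟空': 7, '唐僧': 5, '唐三藏': 2}

# Hypothetical alias table (alias -> canonical name), introduced for illustration.
ALIASES = {'美猴王': '孙悟空', '悟空': '孙悟空', '唐三藏': '唐僧'}


def merge_unsafe(counts):
    """Pops keys while iterating over the dict itself (raises RuntimeError)."""
    for key in counts:
        if key in ALIASES:
            canonical = ALIASES[key]
            counts[canonical] = counts.get(canonical, 0) + counts.pop(key)


def merge_safe(counts):
    """Iterates over a list snapshot of the keys, so popping is allowed."""
    for key in list(counts):
        if key in ALIASES:
            canonical = ALIASES[key]
            counts[canonical] = counts.get(canonical, 0) + counts.pop(key)


# Run the unsafe version on a copy and record the error it raises.
broken = dict(names)
try:
    merge_unsafe(broken)
    error = None
except RuntimeError as exc:
    error = exc

# Run the safe version on another copy.
merged = dict(names)
merge_safe(merged)
print(type(error).__name__, merged)
```

A lookup table like this also replaces the long chain of comparisons: adding another alias only means adding one entry to the table.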
Next, merge the intimacy scores that each alias (for example 孙悟空 and 美猴王) has with the other characters:
temps = list(relationships.keys())  # first pass: merge duplicate characters among the top-level keys
for i in temps:
    if i in ('美猴王', '悟空', '孙行者', '齐天大圣', '大师兄'):
        target = '孙悟空'
    elif i in ('悟净', '沙僧'):
        target = '沙悟净'
    elif i in ('八戒', '天蓬元帅', '猪悟能'):
        target = '猪八戒'
    elif i == '唐三藏':
        target = '唐僧'
    else:
        continue  # not an alias, leave this entry untouched
    # setdefault avoids a KeyError when the canonical name has no entry yet
    merged = relationships.setdefault(target, {})
    # remove the alias entry and add its scores to the canonical entry
    for n, count in relationships.pop(i).items():
        merged[n] = merged.get(n, 0) + count
relationships is a two-level (nested) dictionary that stores the intimacy score between each character and every other character, as shown below:
{'唐僧': {'玉皇大帝': 2, '太上老君': 3, '孙悟空': 403, '猪八戒': 907, '沙悟净': 480, '寿星': 9, '镇元大仙': 6, '四海龙王': 6, '黄袍怪': 2, '金角大王': 2, '银角大王': 2, '精细鬼': 1, '哪吒三太子': 5, '西海龙王': 3, '牛魔王': 9, '红孩儿': 11, '南海龙王': 2, '北海龙王': 2, '快如风': 2, '急如火': 2, '太白金星': 1, '虎力大仙': 3, '鹿力大仙': 5, '昴日星官': 2, '阎王': 10, '六耳猕猴': 2, '二十八宿': 13, '王母娘娘': 1, '托塔李天王': 1, '嫦娥': 3, '凌虚子': 3, '唐僧': 52}, '
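To make the shape of this nested dictionary concrete, the following sketch reads a small excerpt of it. The scores are copied from the sample output above; the helper name closest is a hypothetical one introduced only for this example.

```python
# Small excerpt of the nested structure: outer key = character,
# inner dict = other character -> intimacy score.
relationships = {
    '唐僧': {'孙悟空': 403, '猪八戒': 907, '沙悟净': 480, '红孩儿': 11},
}


def closest(relationships, person, top=3):
    """Return the `top` (name, score) pairs with the highest score for `person`."""
    inner = relationships.get(person, {})
    return sorted(inner.items(), key=lambda kv: kv[1], reverse=True)[:top]


top3 = closest(relationships, '唐僧')
print(top3)
```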
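The first pass merges, under one canonical name, the intimacy scores recorded for the different names of the same character. Next comes the second pass: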
temps2 = list(relationships.keys())  # second pass: merge different names of the same character in the inner dicts
for i in temps2:
    temps3 = list(relationships[i].keys())  # snapshot, because inner keys are popped below
    for n in temps3:
        if n in ('美猴王', '悟空', '孙行者', '齐天大圣', '大师兄'):
            target = '孙悟空'
        elif n in ('八戒', '天蓬元帅', '猪悟能'):
            target = '猪八戒'
        elif n in ('悟净', '沙僧'):
            target = '沙悟净'
        elif n == '唐三藏':
            target = '唐僧'
        else:
            continue  # not an alias, keep this score as it is
        # add the alias score to the canonical name and drop the alias key
        relationships[i][target] = relationships[i].get(target, 0) + relationships[i].pop(n)
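The second pass merges the different names of the same character inside the second-level dictionaries. After these passes, the analysis data for the multiple names has been merged into one entry per character. Below is the graph drawn with networkx: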
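A minimal sketch of how such a graph might be built from the merged dictionary, not necessarily the code used for the original figure. The helper to_weighted_edges is a hypothetical name, the sample scores are made up, and keeping the larger of the two directed scores for each pair is an assumption made here.

```python
def to_weighted_edges(relationships):
    """Collapse the nested dict into undirected (a, b, weight) edges."""
    weights = {}
    for person, others in relationships.items():
        for other, score in others.items():
            if person == other:
                continue  # skip self-references such as '唐僧' -> '唐僧'
            pair = tuple(sorted((person, other)))
            # Assumption: keep the larger of the two directed scores.
            weights[pair] = max(weights.get(pair, 0), score)
    return [(a, b, w) for (a, b), w in sorted(weights.items())]


# Made-up sample in the same shape as the merged dictionary.
sample = {
    '唐僧': {'孙悟空': 403, '唐僧': 52},
    '孙悟空': {'唐僧': 400, '猪八戒': 120},
}
edges = to_weighted_edges(sample)

# networkx is optional here, so the sketch still runs without it.
try:
    import networkx as nx
except ImportError:
    nx = None

if nx is not None:
    G = nx.Graph()
    G.add_weighted_edges_from(edges)  # stores each score as the 'weight' attribute
    # nx.draw(G, with_labels=True) would render the graph (requires matplotlib).
    print(G.number_of_nodes(), G.number_of_edges())
```

Summing the two directed scores instead of taking the maximum would be equally defensible, depending on how the scores were counted.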