Method: use SequenceMatcher from the difflib module
Example:
from difflib import SequenceMatcher
a = "abcdef"
b = "abcde"
s = SequenceMatcher(None, a, b)
print(s.ratio())
for tag, i1, i2, j1, j2 in s.get_opcodes():
    # Uncomment the next line to print every opcode, including 'equal':
    # print("%7s a[%d:%d] (%s) b[%d:%d] (%s)" % (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))
    if tag != 'equal':
        # Report only the spans where a and b differ.
        print("%7s a[%d:%d] (%s) b[%d:%d] (%s)" % (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))
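Output:
The similarity is computed as follows:
The matching characters are abcde, 5 characters in total, and a and b together contain 6 + 5 = 11 characters, so the similarity is 2*5/11 = 0.90909090…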
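The arithmetic above can be checked directly. The following is a minimal sketch that recomputes the ratio as 2*M/T, where M is the total size of the matching blocks and T is the combined length of both strings, and compares it with s.ratio(). The helper name ratio_by_hand is hypothetical and introduced only for illustration.

```python
from difflib import SequenceMatcher


def ratio_by_hand(a, b):
    """Hypothetical helper: recompute the similarity as 2*M/T."""
    s = SequenceMatcher(None, a, b)
    # Each matching block is (a_index, b_index, size); the final block is a
    # zero-size sentinel, so it does not affect the sum.
    matches = sum(block.size for block in s.get_matching_blocks())
    total = len(a) + len(b)
    # Two empty sequences are treated as identical.
    return 2.0 * matches / total if total else 1.0


a = "abcdef"
b = "abcde"
manual = ratio_by_hand(a, b)
library = SequenceMatcher(None, a, b).ratio()
print(manual, library)
```

For the strings in the example, M is 5 and T is 11, so both values should agree with 2*5/11.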
Reference: https://blog.csdn.net/m0_37586703/article/details/105707507