[Cybersecurity Practice II] Lab 1. Decryption Using Frequency Statistics

  • Tool: SageMath

Basic idea of the cracking algorithm

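Because plaintext follows the grammar of some natural language, the frequencies of its characters and character sequences are not uniform; they follow statistical regularities. In English, for example, four-letter sequences (quadgrams) occur with characteristic frequencies.
To rank the candidate plaintexts produced by a brute-force or search attack on the ciphertext, sum the log probabilities of the quadgrams in each candidate. For candidates of equal length, the larger the log probability (the smaller its absolute value), the closer the candidate is to natural language.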
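As a minimal sketch of this scoring idea, the snippet below sums log probabilities over overlapping n-grams. The counts in toy_counts are made up for illustration (bigrams rather than quadgrams, to keep the table small), and make_scorer is a hypothetical helper name that is not part of the lab code.

```python
from math import log10

def make_scorer(counts, n):
    """Return a function that scores text by summed n-gram log probabilities."""
    total = sum(counts.values())
    logp = {gram: log10(c / total) for gram, c in counts.items()}
    floor = log10(0.01 / total)  # penalty for n-grams missing from the table

    def score(text):
        return sum(logp.get(text[i:i + n], floor)
                   for i in range(len(text) - n + 1))

    return score

# Hypothetical bigram counts, for illustration only.
toy_counts = {"TH": 50, "HE": 40, "ER": 30, "RE": 20, "XQ": 1}
score = make_scorer(toy_counts, 2)

english_like = score("THERE")   # every bigram is in the table
gibberish = score("XQZXQ")      # two bigrams fall back to the floor value
print(english_like, gibberish)
```

Both scores are negative; the English-like string gets the larger one, which is the ordering the attack relies on.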

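Candidate plaintext scoring file ngram_score.py

The code in ngram_score.py below measures how close a candidate plaintext is to real plaintext. It assigns the candidate a score; the higher the score (the smaller its absolute value), the more likely the text is natural language, and therefore the more likely it is the plaintext.

  • Note: at run time, this file and english_quadgrams.txt must be in the same directory as the cracking code below.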
'''
Allows scoring of text using n-gram probabilities
17/07/12
'''
from math import log10

class ngram_score(object):
    def __init__(self,ngramfile,sep=' '):
        ''' load a file containing ngrams and counts, calculate log probabilities '''
        self.ngrams = {}
        with open(ngramfile) as f:   # Python 3: open() replaces file()
            for line in f:
                key,count = line.split(sep)
                self.ngrams[key] = int(count)
        self.L = len(key)
        self.N = sum(self.ngrams.values())   # Python 3: values() replaces itervalues()
        #calculate log probabilities
        for key in self.ngrams.keys():
            self.ngrams[key] = log10(float(self.ngrams[key])/self.N)
        self.floor = log10(0.01/self.N)

    def score(self,text):
        ''' compute the score of text '''
        score = 0
        ngrams = self.ngrams.__getitem__
        for i in range(len(text)-self.L+1):   # Python 3: range() replaces xrange()
            if text[i:i+self.L] in self.ngrams: score += ngrams(text[i:i+self.L])
            else: score += self.floor
        return score

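1. Breaking a monoalphabetic substitution cipher

Problem

The following ciphertext was encrypted with a monoalphabetic substitution cipher: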
UNGLCKVVPGTLVDKBPNEWNLMGVMTTLTAZXKIMJMBBANTLCMOMVTNAAMILVTMCGTHMKQTLBMVCMXPIAMTLBMVGLTCKAUILEDMGPVLDHGOMIZWNLMGBZLGKSMAZBMKOMKTWNLMGBZKTLCKAMHMIMDMVGBZLXBLCSAZTBMMOMTVPGMOMVKJLTQPXCBPNEJLBBLUILVDKJKZ
Find the corresponding plaintext (lowercase letters, no spaces between words).

Code

The code from the file subcipher.ipynb is used directly, as follows.

import random
import sys
from ngram_score import ngram_score
# Parameter initialization
ciphertext ='UNGLCKVVPGTLVDKBPNEWNLMGVMTTLTAZXKIMJMBBANTLCMOMVTNAAMILVTMCGTHMKQTLBMVCMXPIAMTLBMVGLTCKAUILEDMGPVLDHGOMIZWNLMGBZLGKSMAZBMKOMKTWNLMGBZKTLCKAMHMIMDMVGBZLXBLCSAZTBMMOMTVPGMOMVKJLTQPXCBPNEJLBBLUILVDKJKZ'
parentkey = list('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
# Placeholder that declares key as a dict; it is filled in below
key = {'A':'A'}
# Load the quadgram statistics
fitness = ngram_score('english_quadgrams.txt')
parentscore = -99e9
maxscore = -99e9
j = 0

print('---------------------------start---------------------------')
while 1:
  j = j+1
  # Randomly shuffle the parent key
  random.shuffle(parentkey)
  # Build the key as a dict mapping ciphertext letter -> plaintext letter
  for i in range(len(parentkey)):
    key[parentkey[i]] = chr(ord('A')+i)
  # Decrypt by mapping every letter through the dict
  decipher = ''.join(key[c] for c in ciphertext)
  parentscore = fitness.score(decipher)  # fitness of the starting key
  # Swap two random key elements at a time and look for a better key
  count = 0
  while count < 1000:
    a = random.randint(0,25)
    b = random.randint(0,25)
    # Swap two elements of the parent key to form a child key, then decrypt with it
    parentkey[a],parentkey[b]= parentkey[b],parentkey[a]
    key[parentkey[a]],key[parentkey[b]] = key[parentkey[b]],key[parentkey[a]]
    decipher = ''.join(key[c] for c in ciphertext)
    score = fitness.score(decipher)
    # Keep the child key if it improves the fitness
    if score > parentscore:
      parentscore = score
      count = 0
    else:
      # Otherwise undo the swap
      parentkey[a],parentkey[b]=parentkey[b],parentkey[a]
      key[parentkey[a]],key[parentkey[b]]=key[parentkey[b]],key[parentkey[a]]
    count = count+1
    #print(count,parentscore)
  # The last candidate tried was rejected, so decrypt again with the accepted key
  decipher = ''.join(key[c] for c in ciphertext)
  # Print the key and plaintext whenever the best score improves
  if parentscore > maxscore:
    maxscore = parentscore
    print ('Currrent key: '+''.join(parentkey))
    print ('Iteration total:', j)
    print ('Plaintext: ', decipher,maxscore)
    sys.stdout.flush()
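  • Note: the code provided in class was written for Python 2, so parts of it must be changed to run under Python 3. The code above has already been corrected.
    The computed candidate plaintext deviates slightly from the actual plaintext; it can be corrected by comparing it with the English translation of the poem 《再别康桥》 ("Saying Goodbye to Cambridge Again").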
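The key bookkeeping in the loop above (a list parentkey and a dict key that are updated together on every swap) can be checked in isolation. The sketch below uses a toy Atbash key; build_key, decrypt and swap are illustrative helper names that do not appear in subcipher.ipynb.

```python
import string

ALPHABET = string.ascii_uppercase

def build_key(parentkey):
    """Map ciphertext letter parentkey[i] to plaintext letter ALPHABET[i]."""
    return {c: ALPHABET[i] for i, c in enumerate(parentkey)}

def decrypt(ciphertext, key):
    """Substitute every ciphertext letter through the key dict."""
    return ''.join(key[c] for c in ciphertext)

def swap(parentkey, key, a, b):
    """Swap positions a and b in the list and mirror the change in the dict."""
    parentkey[a], parentkey[b] = parentkey[b], parentkey[a]
    key[parentkey[a]], key[parentkey[b]] = key[parentkey[b]], key[parentkey[a]]

# Toy key (Atbash): Z decrypts to A, Y to B, and so on.
parentkey = list(ALPHABET[::-1])
key = build_key(parentkey)
before = decrypt("SVOOL", key)

# After a swap, the incrementally updated dict should equal a rebuilt one.
swap(parentkey, key, 0, 1)
consistent = (key == build_key(parentkey))
print(before, consistent)
```

Updating only two dict entries per step avoids rebuilding the whole mapping for each of the many candidate keys the hill climb tries.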

Run output

  • Note: the result differs slightly from run to run.
    [Figure 1: screenshot of the run output]

Solution

buticannotsingaloudquietnessismyfarewellmusicevensummerinsectsheapsilenceformesilentiscambridgetonightveryquietlyitakemyleaveasquietlyasicameheregentlyiflickmysleevesnotevenawispofcloudwillibringaway

2. Breaking the Hill cipher

Problem

The following ciphertext was encrypted with the Hill cipher (the key is a 2×2 matrix):
RIAIXVVYYOQQBOCFHKBSQGALRWNAXJDSNGNGYLUFUJADTAHANGNGYLDBQGCHOXHAPEITCHGRQQWUSFXVVYAQHOFWMBOBUQILGCMGQTVZRUCRKXITIPLCIQOBDWSIQGHCNAAIWHILVCFSITUEJBQBJBXVEULECHQVYRQIMOELCRYRWHHUMLEDHUCHSFWHXVHKBSAQAPAVHAUEJBWNQBBBAPSUNAQPQGGJ
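Decrypt it to recover the corresponding plaintext.

Code

The code from the original file subcipher.ipynb has to be modified: the monoalphabetic substitution part is replaced with the Hill cipher. The modified code follows.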

import random
import sys
from ngram_score import ngram_score
# Parameter initialization
ciphertext ='RIAIXVVYYOQQBOCFHKBSQGALRWNAXJDSNGNGYLUFUJADTAHANGNGYLDBQGCHOXHAPEITCHGRQQWUSFXVVYAQHOFWMBOBUQILGCMGQTVZRUCRKXITIPLCIQOBDWSIQGHCNAAIWHILVCFSITUEJBQBJBXVEULECHQVYRQIMOELCRYRWHHUMLEDHUCHSFWHXVHKBSAQAPAVHAUEJBWNQBBBAPSUNAQPQGGJ'
# Load the quadgram statistics
fitness = ngram_score('english_quadgrams.txt')
score = -99e9
maxscore = -99e9
j = 0  # iteration counter

# HillCryptosystem and AlphabeticStrings are SageMath built-ins
H = HillCryptosystem(AlphabeticStrings(), 2)
C = H.encoding(ciphertext)
print('---------------------------start---------------------------')
for K in H.key_space():    # enumerate the whole key space
  if not K.is_invertible():    # skip keys whose matrix is not invertible
    continue
  decipher = H.deciphering(K, C)
  score = fitness.score(str(decipher))
  # Print the key and plaintext whenever the best score improves
  if score > maxscore:
    maxscore = score
    print ('Currrent key: ')
    print (K)
    print ('Plaintext: ', decipher)
    print ('Score: ', score)
    sys.stdout.flush()
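Exhaustive search is practical here because the key space is small. Assuming the usual condition that a 2×2 key over the integers modulo 26 is usable only when its determinant is coprime to 26, the plain-Python sketch below counts the usable keys. It does not use SageMath, and count_invertible_keys is an illustrative name.

```python
from itertools import product
from math import gcd

def count_invertible_keys(modulus=26):
    """Count 2x2 matrices [[a, b], [c, d]] that are invertible mod `modulus`."""
    count = 0
    for a, b, c, d in product(range(modulus), repeat=4):
        # Invertible exactly when the determinant is a unit mod `modulus`.
        if gcd((a * d - b * c) % modulus, modulus) == 1:
            count += 1
    return count

total_keys = 26 ** 4
usable_keys = count_invertible_keys()
print(usable_keys, "of", total_keys, "matrices are invertible")
```

The count is 157248 out of 456976 matrices, so scoring every candidate with the quadgram statistics is well within reach.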

Run output

---------------------------start---------------------------
Currrent key: 
[1 0]
[0 1]
Plaintext:  RIAIXVVYYOQQBOCFHKBSQGALRWNAXJDSNGNGYLUFUJADTAHANGNGYLDBQGCHOXHAPEITCHGRQQWUSFXVVYAQHOFWMBOBUQILGCMGQTVZRUCRKXITIPLCIQOBDWSIQGHCNAAIWHILVCFSITUEJBQBJBXVEULECHQVYRQIMOELCRYRWHHUMLEDHUCHSFWHXVHKBSAQAPAVHAUEJBWNQBBBAPSUNAQPQGGJ
Score:  -1761.0604450089113
Currrent key: 
[3 0]
[0 1]
Plaintext:  XIAIZVHYIOOQJOSFLKJSOGALXWNAZJBSNGNGILYFYJADPALANGNGILBBOGSHWXLAFEUTSHCROQQUGFZVHYAQLOTWEBWBYQULCCEGOTHZXUSRMXUTUPVCUQWBBWGIOGLCNAAIQHULHCTSUTYEDBOBDBZVKUVESHOVIROIEOKLSRIRQHLUELKDLUSHGFQHZVLKJSAQAPAVLAYEDBQNOBJBAPGUNAOPOGCJ
Score:  -1696.4826419233734
Currrent key: 
[0 3]
[1 2]
Plaintext:  ARUAJXUVGYMQEBJCQHOBAQVAWRANFXEDCNCNFYXUHUBAWTEHCNCNFYHDAQBCHOEHAPBIBCTGMQSWHSJXUVOAAHEFBMROSUHIOGUMNQDVERNCBKBIRICLAIROODISAQWHANUAFWHIEVUFBIOUDJHQDJJXEEULBCFQHYSQOMBENCHYFWCHNMHECHBCHSFWJXQHOBOAFAHAEHOUDJHWHQRBFAMSANDQAQZG
Score:  -1686.4556916705883
Currrent key: 
[0 1]
[3 4]
Plaintext:  UXIAZZWHIIMOEJLSSLIJCOLAIXANNZOBGNGNFINYRYDASPILGNGNFIXBCONSNWILKFRUNSJCMOIQHGZZWHQAWLYTLERWYYJUUCQEPOXHGXXSBMRUNUWVOURWSBKGCOKLANIAVQJUAHUTRUMYPDXOPDZZGKYVNSROLIEOYEXKXSLIVQCLVEPKCLNSHGVQZZSLIJQAPAVAILMYPDBQXORJPAWGANLOCOBC
Score:  -1666.975608158808
Currrent key: 
[4 1]
[5 0]
Plaintext:  IXIEVTYLOEQGOPFMKJSRGOLMWRANJNSHGDGDLWFAJCDIAJARGDGDLWBFGOHAXAAREFTCHARYQGUEFKVTYLQIOLWZBMBCQMLYCKGITOZFUDRSXUTCPACLQUBCWJISGOCFANIEHELYCNSXTCEGBBBSBBVTUQEZHAVCRMICOMLSRSRMHEUBLEDOUBHAFKHEVTKJSRQIPOVEAREGBBNUBSBPPOUYANPMGOJU
Score:  -1645.804727894536
Currrent key: 
[0 3]
[3 6]
Plaintext:  AXUAJZUHGIMOEJJSQLOJAOVAWXANFZEBCNCNFIXYHYBAWPELCNCNFIHBAOBSHWELAFBUBSTCMOSQHGJZUHOAALETBERWSYHUOCUENODHEXNSBMBURUCVAURWOBIGAOWLANUAFQHUEHUTBUOYDDHODDJZEKUVBSFOHISOOEBKNSHIFQCLNEHKCLBSHGFQJZQLOJOAFAHAELOYDDHQHORJFAMGANDOAOZC
Score:  -1636.6702711616374
Currrent key: 
[8 3]
[5 0]
Plaintext:  UNUUHJIHWGOMWRTWMDGBCAVIQJANDFGRCPCPVSTKDUBOAJARCPCPVSJHCALOZUARKNPOLOXGOMYSTUHJIHOOWNQRJOJEOSVUSOCUPARDYRXAZOPOFESPOAJEQBUICASJANUULSVUSRGHPOKOJDJUJDHJYEKHLOHSXUUSWOVOXAXULSYPVABUYPLOTULSHJMDGBOOFSHUARKOJDNUJUJRFSYMANFQCADM
Score:  -1622.28538113562
Currrent key: 
[8 3]
[1 4]
Plaintext:  UNGEPHURKWQSOTBUGLEVCAFMYHANTBYTYDYDLOZCPEZIIHSTYDYDLOVRCAJILESTKNNIJILWQSOOFEPHURMIWNITHIBGEOHEQIOEPALLQTXAXINIXGODOABGOVEMCAAHANGEBOHEKTSRNIIIDLVEDLPHQGWRJIXOJEKOUITIXAJEBOUDVANEUDJIFEBOPHGLEVMIVOTESTIIDLZEVEBTVOASANZYCAFS
Score:  -1509.820783666875
Currrent key: 
[10  7]
[ 1  4]
Plaintext:  QNKEZHQRIWSSGTTUKLYVMARMOHANXBOTODODBOHCZEHIWHETODODBOJRMAPIBEETINNIPIBWSSGOREZHQRUICNWTDITGYODESIGEZABLSTVAVINIVGGDGATGGVYMMAAHANKETODEITERNIWIFLJEFLZHSGCRPIVOPEIOQIXIVAPETOQDJANEQDPIRETOZHKLYVUIJOXEETWIFLHEJETTJOASANHYMARS
Score:  -1492.9242246562796
Currrent key: 
[20  1]
[ 1  4]
Plaintext:  INSETHIREWWSQTDUSLMVGAPMUHANFBUTUDUDHOXCTEXIYHCTUDUDHOLRGABIHECTENNIBIHWWSQOPETHIRKIONYTVIDGMOVEWIQETAHLWTRARINIRGQDQADGQVMMGAAHANSEDOVEETCRNIYIJLLEJLTHWGORBIROBEEOIIFIRABEDOIDLANEIDBIPEDOTHSLMVKILOFECTYIJLXELEDTLOASANXYGAPS
Score:  -1454.8276553642136
Currrent key: 
[ 2 17]
[ 3  8]
Plaintext:  CNUEBHYRSWUSYTXUALIVIAVMYHANXBMTCDCDROHCREBIMHOTCDCDROPRIAPIBEOTONFIPIJWUSQODEBHYROIKNATHILGCOLEEIAEVAHLGTBATIFIVGODEALGWVEMIAGHANUEDOLEITQRFIYIBLPEBLBHGGGRPINOTEAOUIDIBATEDOMDTAJEMDPIDEDOBHALIVOIFOHEOTYIBLFEPELTFOISANLYIAPS
Score:  -1440.5385389242883
Currrent key: 
[20  1]
[ 3  8]
Plaintext:  INCERHSRUWCSSTBUALGVGATMSHANBBWTIDIDDOPCDERIWHETIDIDDOVRGAVIREETENHIVIXWCSMOZERHSREIONATPIFGIOFEQIAETAPLYTRALIHITGEDQAFGKVQMGAYHANCEZOFEGTMRHISIRLVERLRHYGYRVINOLEAOCIZIRALEZOWDLAXEWDVIZEZORHALGVEIHOPEETSIRLHEVEFTHOGSANFYGAVS
Score:  -1405.0489191990162
Currrent key: 
[12  3]
[ 7 20]
Plaintext:  OXCETTELEEAGAPRMEJOREOTMYRANDNUHIDIDNWTAHCRISJIRIDIDNWTFEOLAZAIRUFTCLATYAGGENKTTELEISLOZHMNCMMRYMKSIROBFEDHSNUTCFAOLCUNCIJESEOCFANCETERYSNAXTCWGLBTSLBTTEQIZLALCVMYCUMFSHSVMTEABDEDOABLANKTETTEJOREIHOPEIRWGLBBUTSNPHOUYANDMEORU
Score:  -1398.6684497659912
Currrent key: 
[ 4 21]
[11 11]
Plaintext:  BNCEGHRREWASNTRURLBVEATMLHNNQBHTVDVDNOTCHERIFHVTVDVDNOGREALIZEVTHNTILITWASGONEGHRREIFNBTHINGMOREMISERAOLRTHANITIFGBDCANGVVEMEAPHNNCETOREFTNRTIWIYLTEYLGHEGVRLILOVEYOUIFIHAVETONDDADENDLINETOGHRLBVEIHOPEVTWIYLBETEATHOUSNNDYEARS
Score:  -1330.116096349177
Currrent key: 
[ 4 21]
[23 10]
Plaintext:  ONCUTJEHEGAMARRWEDOBEATIYJANDFURIPIPNSTKHUROSJIRIPIPNSTHEALOZUIRUNTOLOTGAMGSNUTJEHEOSNORHONEMSRUMOSURABDERHANOTOFEOPCANEIBEIEACJANCUTSRUSRAHTOWOLDTULDTJEEIHLOLSVUYSUOFOHAVUTSAPDADUAPLONUTSTJEDOBEOHSPUIRWOLDBUTUNRHSUMANDQEARM
Score:  -1313.3267792056824
Currrent key: 
[ 4 21]
[11 24]
Plaintext:  ONCETHEREWASATRUELOVEATMYHANDBUTIDIDNOTCHERISHITIDIDNOTREALIZEITUNTILITWASGONETHEREISNOTHINGMOREMISERABLETHANITIFGODCANGIVEMEACHANCETORESTARTIWILLTELLTHEGIRLILOVEYOUIFIHAVETOADDADEADLINETOTHELOVEIHOPEITWILLBETENTHOUSANDYEARS
Score:  -946.2853518865237

Solution

oncetherewasatrueloveatmyhandbutididnotcherishitididnotrealizeituntilitwasgonethereisnothingmoremiserablethanitifgodcangivemeachancetorestartiwilltellthegirliloveyouifihavetoaddadeadlinetotheloveihopeitwillbetenthousandyears
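The last key in the log can be checked by hand. The plain-Python sketch below assumes the convention that each pair of letters is a row vector multiplied on the right by the key matrix modulo 26 (under this convention the key reproduces the start of the ciphertext above); inverse_2x2_mod and hill_apply are illustrative names.

```python
def inverse_2x2_mod(m, modulus=26):
    """Invert a 2x2 matrix modulo `modulus` using the adjugate formula."""
    (a, b), (c, d) = m
    det = (a * d - b * c) % modulus
    det_inv = pow(det, -1, modulus)  # modular inverse, needs Python 3.8+
    return [[(d * det_inv) % modulus, (-b * det_inv) % modulus],
            [(-c * det_inv) % modulus, (a * det_inv) % modulus]]

def hill_apply(text, m, modulus=26):
    """Multiply each letter pair, as a row vector, by the matrix m."""
    out = []
    for i in range(0, len(text), 2):
        x, y = ord(text[i]) - 65, ord(text[i + 1]) - 65
        out.append(chr((x * m[0][0] + y * m[1][0]) % modulus + 65))
        out.append(chr((x * m[0][1] + y * m[1][1]) % modulus + 65))
    return ''.join(out)

key = [[4, 21], [11, 24]]          # final key printed in the run output
key_inv = inverse_2x2_mod(key)
recovered = hill_apply("RIAIXVVY", key_inv)   # first 8 ciphertext letters
reencrypted = hill_apply(recovered, key)
print(key_inv, recovered, reencrypted)
```

Decrypting the first eight ciphertext letters gives ONCETHER, and encrypting that again returns RIAIXVVY.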

3. Breaking the knapsack cryptosystem

Problem

A knapsack cryptosystem has the public key
{615436700291,415460700271,15508700231,846430100773,677471501215,139578302079,179168604148,789306608798,563224517265,364498233536,229056467022,670323428329,115934481316,44989786476,518624653302,149955258190,728568829281,796899516776,546782575075,178164449829,356328899658,712657799316,569303048254,223205396187,446410792374,892821584748,524144817108,132888933895,611875519857,877653387647,839906074973,35774353074} and the ciphertext is 6020587936087. Find the binary representation of the plaintext (in a form such as 01001010000001110101001100001110).

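Code

The code is adapted from the LLL attack. As the figure shows, one constraint is added to the original matrix A (an extra row and column); guess is the guessed number of 1 bits in the result.
[Figure 2: matrix A with the added constraint]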

pubKey = [615436700291,415460700271,15508700231,846430100773,677471501215,139578302079,179168604148,789306608798,563224517265,364498233536,229056467022,670323428329,115934481316,44989786476,518624653302,149955258190,728568829281,796899516776,546782575075,178164449829,356328899658,712657799316,569303048254,223205396187,446410792374,892821584748,524144817108,132888933895,611875519857,877653387647,839906074973,35774353074]
nbit = len(pubKey)
encoded = 6020587936087

def LLL_by_guess(guess):
    # create a matrix of 0's (dimensions are public key length + 2)
    # Matrix and ZZ are SageMath built-ins
    A = Matrix(ZZ, nbit + 2, nbit + 2)  # one extra row and column for the bit-count constraint
    # fill in the identity matrix
    print(f'start {guess}')
    for i in range(nbit):
        A[i, i] = 1
    # put the public key in column nbit
    for i in range(nbit):
        A[i, nbit] = pubKey[i]
    # put 1 in the last column of the first nbit rows (counts the 1 bits)
    for i in range(nbit):
        A[i, nbit+1] = 1
    # row nbit holds the negated encoded message
    A[nbit, nbit] = -int(encoded)
    # the last row holds -guess
    A[nbit+1, nbit+1] = -guess
    res = A.LLL(delta=1, algorithm='NTL:LLL')
    for i in range(0, nbit + 2):
        # print solution
        M = res.row(i).list()
        flag = True
        for m in M:
            if m != 0 and m != 1:
                flag = False
                break
        if flag:
            print('***************')
            print (i, M)
            print('res=', end='') # print the result bits
            for bit in M[:-2]:
                print(bit, end='')
            print('')

for i in range(1, nbit+1):
    LLL_by_guess(i)

Run output

[Figure 3: screenshot of the run output]

Solution

10111010011001110101001100001000
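The result can be verified directly, and the same check shows why the lattice contains a short 0/1 vector. The plain-Python sketch below (no SageMath) rebuilds the rows of matrix A for guess = 15, the number of 1 bits in the solution, and adds up the rows selected by the solution bits together with the two constraint rows. basis_rows is an illustrative helper name.

```python
pubKey = [615436700291, 415460700271, 15508700231, 846430100773,
          677471501215, 139578302079, 179168604148, 789306608798,
          563224517265, 364498233536, 229056467022, 670323428329,
          115934481316, 44989786476, 518624653302, 149955258190,
          728568829281, 796899516776, 546782575075, 178164449829,
          356328899658, 712657799316, 569303048254, 223205396187,
          446410792374, 892821584748, 524144817108, 132888933895,
          611875519857, 877653387647, 839906074973, 35774353074]
encoded = 6020587936087
solution = "10111010011001110101001100001000"

bits = [int(ch) for ch in solution]
nbit = len(pubKey)

# Direct check: the selected public-key elements must add up to the ciphertext.
subset_sum = sum(k for k, b in zip(pubKey, bits) if b)

def basis_rows(guess):
    """Rows of the (nbit + 2) x (nbit + 2) matrix built in LLL_by_guess."""
    rows = []
    for i in range(nbit):
        row = [0] * (nbit + 2)
        row[i] = 1                 # identity part
        row[nbit] = pubKey[i]      # public-key column
        row[nbit + 1] = 1          # bit-count column
        rows.append(row)
    rows.append([0] * nbit + [-encoded, 0])
    rows.append([0] * nbit + [0, -guess])
    return rows

rows = basis_rows(sum(bits))
coeffs = bits + [1, 1]  # take row i when bit i is 1, plus both constraint rows
combo = [sum(c * row[j] for c, row in zip(coeffs, rows))
         for j in range(nbit + 2)]
print(subset_sum == encoded, combo)
```

The combination equals the solution bits followed by two zeros, which is the short vector that the reduction step is looking for.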

Reference

Decryption using the quadgram method
