Notes on 《Web安全之机器学习入门》 (Introduction to Machine Learning for Web Security): Chapter 16, Section 16.6, Generating Common Passwords

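When people choose passwords, they tend to pick ones that are easy to remember, so common passwords share recognizable patterns. We can let an RNN learn from a list of common passwords, pick up those patterns, and then generate new passwords automatically. This section demonstrates generating common passwords with an RNN.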


1. Dataset

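The training set is the password dictionary that ships with WVS. We assume passwords are at most 10 characters long, so the sequence length is set to 10. The file, which holds one password per line, is read into a single string and then serialized into character sequences.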

path = "../data/wvs-pass.txt"
maxlen = 10

file_lines = open(path, "r", encoding='utf-8').read()
X, Y, char_idx = \
    string_to_semi_redundant_sequences(file_lines, seq_maxlen=maxlen, redun_step=3)
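
To make the serialization step concrete, here is a minimal pure-Python sketch of what a semi-redundant slicing helper of this kind does, as far as I understand it: slide a window of `seq_maxlen` characters over the text in steps of `redun_step`, and pair each window with the character that follows it, both one-hot encoded. The function name `semi_redundant_sequences` and the toy input are illustrative assumptions, not tflearn code, and the real helper returns NumPy arrays rather than nested lists.

```python
def semi_redundant_sequences(text, seq_maxlen=10, redun_step=3):
    """Slice text into overlapping windows plus the next character (one-hot)."""
    # Map every distinct character to an integer index.
    char_idx = {c: i for i, c in enumerate(sorted(set(text)))}

    windows, next_chars = [], []
    for start in range(0, len(text) - seq_maxlen, redun_step):
        windows.append(text[start:start + seq_maxlen])
        next_chars.append(text[start + seq_maxlen])

    def one_hot(ch):
        vec = [0] * len(char_idx)
        vec[char_idx[ch]] = 1
        return vec

    X = [[one_hot(ch) for ch in w] for w in windows]  # [n, seq_maxlen, vocab]
    Y = [one_hot(ch) for ch in next_chars]            # [n, vocab]
    return X, Y, char_idx


# Toy stand-in for the password file: one password per line.
sample = "admin\n123456\npassword\nqwerty\n"
X, Y, char_idx = semi_redundant_sequences(sample, seq_maxlen=10, redun_step=3)
print(len(X), len(X[0]), len(X[0][0]))  # windows, window length, vocab size
```

Because the whole file is treated as one string, windows run across line breaks. The newline is just another character in the vocabulary, which is how the model can learn where one password ends and the next begins.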

2. Building the RNN model

g = tflearn.input_data(shape=[None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, len(char_idx), activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)
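
For a rough sense of model size, the sketch below counts trainable parameters for the stack above. It assumes standard LSTM cells without peephole connections (four gates, each with input weights, recurrent weights, and a bias), which may not match every implementation detail. The vocabulary size of 60 is a made-up placeholder; the real value is `len(char_idx)` for the dictionary in use.

```python
def lstm_params(n_input, n_units):
    # Four gates, each with (n_input + n_units) weights per unit plus a bias.
    return 4 * (n_input + n_units + 1) * n_units


def dense_params(n_input, n_output):
    # Weight matrix plus one bias per output.
    return n_input * n_output + n_output


vocab = 60  # hypothetical vocabulary size; use len(char_idx) in practice
layers = {
    "lstm_1": lstm_params(vocab, 512),
    "lstm_2": lstm_params(512, 512),
    "softmax": dense_params(512, vocab),
}
for name, count in layers.items():
    print(name, count)
print("total", sum(layers.values()))
```

Under these assumptions the second LSTM layer dominates the total, since both its input and its state are 512 wide.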

3. Instantiating the sequence generator

m = tflearn.SequenceGenerator(g, dictionary=char_idx,
                              seq_maxlen=maxlen,
                              clip_gradients=5.0,
                              checkpoint_path='wvs_passwd')
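
Conceptually, a sequence generator of this kind produces text one character at a time: it feeds the most recent `seq_maxlen` characters to the network, picks a next character from the predicted distribution, appends it, and slides the window forward. The sketch below shows only that loop. `toy_predict` is a made-up stand-in for the trained network, and a greedy pick is used instead of temperature-based sampling to keep the example deterministic.

```python
def generate(predict, seed, length, seq_maxlen=10):
    """Autoregressive generation with a sliding window of seq_maxlen chars."""
    out = seed
    for _ in range(length):
        window = out[-seq_maxlen:]             # most recent context only
        probs = predict(window)                # dict: char -> probability
        next_char = max(probs, key=probs.get)  # greedy pick for simplicity
        out += next_char
    return out


def toy_predict(window):
    # Hypothetical stand-in for the trained network: strongly prefers the
    # character that follows the last one in a fixed cycle.
    cycle = "abc\n"
    last = window[-1]
    nxt = cycle[(cycle.index(last) + 1) % len(cycle)] if last in cycle else "a"
    return {c: (0.9 if c == nxt else 0.1 / 3) for c in cycle}


print(repr(generate(toy_predict, seed="a", length=7)))
```

The returned string includes the seed, followed by the generated characters.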

4. Complete code

from __future__ import absolute_import, division, print_function

import tflearn
from tflearn.data_utils import (random_sequence_from_string,
                                string_to_semi_redundant_sequences)

path = "../data/wvs-pass.txt"
maxlen = 10

file_lines = open(path, "r", encoding='utf-8').read()
X, Y, char_idx = \
    string_to_semi_redundant_sequences(file_lines, seq_maxlen=maxlen, redun_step=3)

g = tflearn.input_data(shape=[None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, len(char_idx), activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)

m = tflearn.SequenceGenerator(g, dictionary=char_idx,
                              seq_maxlen=maxlen,
                              clip_gradients=5.0,
                              checkpoint_path='wvs_passwd')


for i in range(40):
    seed = random_sequence_from_string(file_lines, maxlen)
    m.fit(X, Y, validation_set=0.1, batch_size=128,
          n_epoch=1, run_id='password')
    print("-- TESTING...")
    print("-- Test with temperature of 1.2 --")
    print(m.generate(30, temperature=1.2, seq_seed=seed))
    print("-- Test with temperature of 1.0 --")
    print(m.generate(30, temperature=1.0, seq_seed=seed))
    print("-- Test with temperature of 0.5 --")
    print(m.generate(30, temperature=0.5, seq_seed=seed))
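
The `temperature` argument controls how adventurous the sampling is. A common way to implement it, sketched below, is to raise each predicted probability to the power `1 / temperature` and renormalize before sampling: values below 1 sharpen the distribution toward the most likely character, and values above 1 flatten it. This illustrates the general technique with made-up probabilities and is not a copy of tflearn's internals.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a distribution: p_i ** (1 / T), renormalized. Needs p_i > 0."""
    scaled = [math.exp(math.log(p) / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_index(probs, temperature, rng=random):
    """Draw one index from the temperature-adjusted distribution."""
    adjusted = apply_temperature(probs, temperature)
    return rng.choices(range(len(adjusted)), weights=adjusted, k=1)[0]


probs = [0.6, 0.3, 0.1]  # made-up next-character probabilities
for t in (1.2, 1.0, 0.5):
    print(t, [round(p, 3) for p in apply_temperature(probs, t)])
```

At a temperature of 0.5 the most likely character takes a larger share of the probability mass, so the output tends to be more repetitive; at 1.2 less likely characters are drawn more often.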

5. Results

......
Training Step: 13528  | total loss: 0.58454 | time: 316.845s
| Adam | epoch: 033 | loss: 0.58454 -- iter: 52224/52452
Training Step: 13529  | total loss: 0.58885 | time: 317.170s
| Adam | epoch: 033 | loss: 0.58885 -- iter: 52352/52452
Training Step: 13530  | total loss: 0.58233 | time: 334.266s
| Adam | epoch: 033 | loss: 0.58233 | val_loss: 0.36391 -- iter: 52452/52452
--
-- TESTING...
-- Test with temperature of 1.2 --
om
doidc.com
xiao
haining
jhidc
6588888

-- Test with temperature of 1.0 --
om
doidc.com
tama_.com
tomiin@367894321$
-- Test with temperature of 0.5 --
om
doidc.com
xiidc
idc02
idc0000
0000000

 
