The goal of this post is to get the word representations produced by ELMo in the simplest way possible. I read a number of other write-ups, and in the end this is all I actually needed: I just want the word vectors it generates.
A quick introduction to ELMo: it comes from AllenNLP's NAACL 2018 Best Paper, "Deep contextualized word representations". Plugging ELMo into existing models improves results on tasks such as NLI.
So, straight to how to get it. There are TensorFlow, PyTorch, and Keras versions out there. This post uses the official standalone snippet, which gives you the embedding tensors directly without wiring ELMo into a model; I only want its word vectors, and training the full model costs both time and hardware.
First, create a new conda environment:
conda create -n allennlp python=3.6
Next, install allennlp (make sure gcc works on your machine, since compilation needs a C++ toolchain):
pip install allennlp
As long as the network doesn't drop, you're fine; there's quite a lot to pull down, including the whole PyTorch stack.
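To sanity-check that the install finished, list the package (any quick check works):
pip show allennlp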
Then download the pretrained options file and weights file that AllenNLP provides.
URL:
Save the two files locally so you can reuse them without re-downloading.
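For reference, a download sketch. The S3 paths below are my assumption about where AllenNLP hosted this model; verify them against the official ELMo page before relying on them.
wget https://allennlp.s3.amazonaws.com/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_options.json  # assumed URL, please verify
wget https://allennlp.s3.amazonaws.com/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5  # assumed URL, please verify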
Here is how to turn those two files into word vectors:
from allennlp.commands.elmo import ElmoEmbedder

options_file = "/files/elmo_2x4096_512_2048cnn_2xhighway_options.json"
weight_file = "/files/elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5"

# Load the pretrained model; pass cuda_device=0 to run on a GPU (default is CPU).
elmo = ElmoEmbedder(options_file, weight_file)

# batch_to_embeddings takes a batch of tokenized sentences, converts them to
# character ids internally, and returns (embeddings, mask).
context_tokens = [['I', 'love', 'you', '.'],
                  ['Sorry', ',', 'I', 'don', "'t", 'love', 'you', '.']]
elmo_embedding, elmo_mask = elmo.batch_to_embeddings(context_tokens)
print(elmo_embedding)
print(elmo_mask)
Embedding (shape [2, 3, 8, 1024]: 2 sentences × 3 ELMo layers × 8 time steps, zero-padded × 1024 dimensions):
tensor([[[[ 0.6923, -0.3261, 0.2283, ..., 0.1757, 0.2660, -0.1013],
[-0.7348, -0.0965, -0.1411, ..., -0.3411, 0.3681, 0.5445],
[ 0.3645, -0.1415, -0.0662, ..., 0.1163, 0.1783, -0.7290],
...,
[ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000]],
[[-1.1051, -0.4092, -0.4365, ..., -0.6326, 0.4735, -0.2577],
[ 0.0899, -0.4828, -0.5596, ..., 0.4372, 0.3840, -0.7343],
[-0.5538, -0.1473, -0.2441, ..., 0.2551, 0.0873, 0.2774],
...,
[ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000]],
[[-3.2634, -0.9448, -0.3199, ..., -1.2070, 0.6930, -0.2016],
[-0.3688, -0.7632, -0.0715, ..., 0.6294, 1.6869, -0.6655],
[-1.0870, -1.4243, -0.2445, ..., 0.0825, 0.5020, 0.2765],
...,
[ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000]]],
[[[ 0.5042, -0.6629, -0.0231, ..., -0.3084, -0.9741, -0.7230],
[ 0.1131, 0.1575, 0.1414, ..., 0.3718, -0.1432, -0.0248],
[ 0.6923, -0.3261, 0.2283, ..., 0.1757, 0.2660, -0.1013],
...,
[-0.7348, -0.0965, -0.1411, ..., -0.3411, 0.3681, 0.5445],
[ 0.3645, -0.1415, -0.0662, ..., 0.1163, 0.1783, -0.7290],
[-0.8872, -0.2004, -1.0601, ..., -0.2655, 0.2115, 0.1977]],
[[ 0.1221, -0.7032, 0.0169, ..., -0.3249, -0.4935, -0.4965],
[ 0.3399, -0.4682, 0.1888, ..., -0.0565, 0.1001, -0.0416],
[-0.8135, -0.8491, -0.3264, ..., -0.5674, 0.2638, 0.2006],
...,
[ 0.4460, -0.4475, -0.1583, ..., 0.4372, 0.3840, -0.7343],
[-0.1287, 0.0161, 0.0315, ..., 0.2551, 0.0873, 0.2774],
[-1.2373, -0.3373, 0.1098, ..., -0.0276, -0.0181, 0.0602]],
[[-0.0830, -1.5891, -0.2576, ..., -1.2944, 0.1082, 0.6745],
[-0.0724, -0.7200, 0.1463, ..., 0.6919, 0.9144, -0.1260],
[-2.3460, -1.1714, -0.7065, ..., -1.2885, 0.4679, 0.3800],
...,
[ 0.1246, -0.6929, 0.6330, ..., 0.6294, 1.6869, -0.6655],
[-0.5757, -1.0845, 0.5794, ..., 0.0825, 0.5020, 0.2765],
[-1.2392, -0.6155, -0.9032, ..., 0.0524, -0.0852, 0.0805]]]])
Mask (1 marks a real token, 0 marks padding):
tensor([[1, 1, 1, 1, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 1]])
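The embedding tensor keeps the three ELMo layers separate, and the mask flags which positions are real tokens. If you only want one vector per token, a common recipe (my addition, not part of the original snippet) is to average the layers or take the top layer, continuing from the code above:

# elmo_embedding: (batch_size, 3 layers, max_seq_len, 1024)
avg_vectors = elmo_embedding.mean(dim=1)  # average of the three layers
top_vectors = elmo_embedding[:, 2, :, :]  # top LSTM layer only

# Padded positions are already zero in this output, but multiplying by the
# mask is the general-purpose safeguard before pooling over tokens.
masked = avg_vectors * elmo_mask.unsqueeze(-1).float()

Averaging is a reasonable default; the ELMo paper instead learns task-specific weights over the three layers.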
References:
https://cstsunfu.github.io/2018/06/ELMo/
https://blog.csdn.net/sinat_26917383/article/details/81913790