torchtext—ValueError: Fan in and fan out can not be computed for tensor with fewer than 2 dimensions

vectors.unk_init = init.xavier_uniform_  # initialization for tokens not found in the pre-trained vectors

TEXT.build_vocab(train,  min_freq=5, vectors=vectors)

My PyTorch version is 1.1.0 and my torchtext version is 0.4.0. The error is raised as soon as these two lines run. The ValueError: Fan in and fan out can not be computed for tensor with fewer than 2 dimensions does not actually come from torchtext; it comes from init.xavier_uniform_. Each word vector is one-dimensional (dim=1), but init.xavier_uniform_ can only initialize tensors with at least two dimensions.
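A minimal sketch reproducing the cause (assuming only torch is installed; the dimension 300 is an arbitrary example size):

```python
import torch
from torch.nn import init

# xavier_uniform_ derives fan-in/fan-out from the tensor's shape,
# which requires at least 2 dimensions -- a 1-D word vector fails:
one_d = torch.empty(300)
try:
    init.xavier_uniform_(one_d)
    failed = False
except ValueError:
    failed = True  # "Fan in and fan out can not be computed ..."

# adding a leading dimension of size 1 makes the same call succeed:
two_d = torch.empty(1, 300)
init.xavier_uniform_(two_d)
```

This is exactly the shape mismatch that the patch below works around.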

One fix is to modify the last line of __getitem__ in the Vectors class in torchtext's vocab.py, as shown below.

class Vectors(object):

    def __init__(self, name, cache=None,
                 url=None, unk_init=None, max_vectors=None):
        """
        Arguments:
           name: name of the file that contains the vectors
           cache: directory for cached vectors
           url: url for download if vectors not found in cache
           unk_init (callback): by default, initialize out-of-vocabulary word vectors
               to zero vectors; can be any function that takes in a Tensor and
               returns a Tensor of the same size
           max_vectors (int): this can be used to limit the number of
               pre-trained vectors loaded.
               Most pre-trained vector sets are sorted
               in the descending order of word frequency.
               Thus, in situations where the entire set doesn't fit in memory,
               or is not needed for another reason, passing `max_vectors`
               can limit the size of the loaded set.
        """
        cache = '.vector_cache' if cache is None else cache
        self.itos = None
        self.stoi = None
        self.vectors = None
        self.dim = None
        self.unk_init = torch.Tensor.zero_ if unk_init is None else unk_init
        self.cache(name, cache, url=url, max_vectors=max_vectors)

    def __getitem__(self, token):
        if token in self.stoi:
            return self.vectors[self.stoi[token]]
        else:
            # return self.unk_init(torch.Tensor(self.dim))  # original line; replace it with:
            return self.unk_init(torch.Tensor(1, self.dim)).squeeze(0)
            # self.dim is a single number, so when unk_init = init.xavier_uniform_ the
            # resulting 1-D tensor raises the error. Prepend a dimension of size 1, then
            # squeeze(0) it away after initialization so the filled word vector is still 1-D.

If you would rather not modify the library code, you can also try downgrading PyTorch to 0.4.0 or 0.4.1.
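A third option that avoids both editing vocab.py and downgrading is to pass a small wrapper as unk_init instead of init.xavier_uniform_ directly. The wrapper name xavier_unk_init below is hypothetical (not part of torchtext); it applies the same unsqueeze/squeeze trick from the user side:

```python
import torch
from torch.nn import init

def xavier_unk_init(tensor):
    # hypothetical helper: lift the 1-D tensor torchtext passes in to 2-D,
    # initialize it in place with xavier_uniform_, then squeeze back to 1-D
    return init.xavier_uniform_(tensor.unsqueeze(0)).squeeze(0)

# simulate what Vectors.__getitem__ does for an unknown token of dim 100:
vec = xavier_unk_init(torch.empty(100))
```

Then set vectors.unk_init = xavier_unk_init before calling TEXT.build_vocab, and the unmodified library code works as-is.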
