Summary of problems in making code compatible with both Python 2.7 and Python 3.8

Check the version of the interpreter currently in use:

import sys

# Compare version tuples; comparing sys.version as a string breaks on 3.10+.
if sys.version_info >= (3, 8):
    import tensorflow.compat.v1 as tf
else:
    import tensorflow as tf
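The reason a tuple comparison is safer than a string comparison for this check can be shown without TensorFlow. This is a minimal sketch; the helper name at_least is made up for illustration and is not part of any library.

```python
import sys


def at_least(version_info, minimum):
    """Compare the leading components of a version tuple against a minimum."""
    return tuple(version_info[:len(minimum)]) >= tuple(minimum)


# String comparison is character by character, so "3.10" sorts before "3.8".
string_check = '3.10.4' >= '3.8'
# Tuple comparison is numeric per component, so (3, 10) is newer than (3, 8).
tuple_check = at_least((3, 10, 4), (3, 8))

# The same helper works directly on the running interpreter.
running_py38_or_newer = at_least(sys.version_info, (3, 8))

print(string_check, tuple_check, running_py38_or_newer)
```

Here string_check is False while tuple_check is True, which is why the version branch should be written against sys.version_info.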

Error 1: learn.preprocessing.VocabularyProcessor

The Python 2.7 code used learn.preprocessing.VocabularyProcessor to process the vocabulary:

  • vocab_processor = learn.preprocessing.VocabularyProcessor(max_document_length)
  • vocab_processor.fit(x_text)
  • vocab_processor.save(os.path.join(out_dir, "vocab"))
  • vocab_processor.transform(x_text)
  • vocab_processor = learn.preprocessing.VocabularyProcessor.restore(vocab_path)

learn here refers to the tensorflow.contrib.learn module, which is not available in TF 2.x. There are three ways to stay compatible:

1. Use Keras for data preprocessing;

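from keras.preprocessing.text import Tokenizer  # text to integer index encoding
from keras.preprocessing import sequence  # sequence length normalization (padding)

# oov_token is the placeholder string used for out-of-vocabulary tokens
tokenizer = Tokenizer(num_words=5000, char_level=True, oov_token='UNK')
tokenizer.fit_on_texts(texts)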

2. Use word2vec for data preprocessing;

from gensim.models.word2vec import Word2Vec

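model = Word2Vec(text)
# Load a saved model
model = Word2Vec.load(model_path)
model.build_vocab(text, update=True)  # update the vocabulary
# model.epochs replaces the older model.iter attribute, which gensim 4.0 removed
model.train(text, total_examples=model.corpus_count, epochs=model.epochs)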

3. Implement the functionality in plain Python code;

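import json
from collections import Counter
from operator import itemgetter

class VOCAB():
    def __init__(self, PAD_LEN=None):
        # PAD_LEN may be left unset here when restore() will supply it
        self.PAD_LEN = PAD_LEN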

    def restore(self, vocab_path):
        # Read-only access is enough to load the saved vocabulary
        with open(vocab_path, "r") as f:
            all_dict = json.load(f)
        self.vocabulary_ = all_dict['vocabulary']
        self.PAD_LEN = all_dict['pad_length']

    def fit(self, data, min_count=0):
        # Flatten all lines into one word list
        words = [word for line in data for word in line.split(' ')]
        counts = Counter(words)
        sorted_word_to_cnt = sorted(counts.items(), key=itemgetter(1), reverse=True)
        # Ids 0 and 1 are reserved for padding and unknown words
        sorted_words = ['PAD', 'UNK']
        for word, cnt in sorted_word_to_cnt:
            if cnt > min_count:
                sorted_words.append(word)
        word_to_id = {word: idx for idx, word in enumerate(sorted_words)}
        self.vocabulary_ = word_to_id
        return word_to_id

    def transform(self,data):
        data_to_id = []
        for line in data:
            words = line.split(' ')[:self.PAD_LEN]
            words_id = []
            for word in words:
                words_id.append(self.vocabulary_.get(word,1))
            if len(words_id) < self.PAD_LEN:
                words_id += [0]*(self.PAD_LEN-len(words_id))
            data_to_id.append(words_id)
        return data_to_id

    def save(self,vocab_path):
        all_dict = {}
        all_dict['vocabulary'] = self.vocabulary_
        all_dict['pad_length'] = self.PAD_LEN
        with open(vocab_path,'w') as f:
            json.dump(all_dict,f)
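The fit, save, restore, transform cycle of the class above can be sketched end to end in a few lines. This is a condensed, function-based illustration rather than the class itself: the names build_vocab and encode, the sample sentences, and the temporary file path are all assumptions made for the example. It keeps the same conventions as the class (id 0 for PAD, id 1 for UNK, JSON keys 'vocabulary' and 'pad_length').

```python
import json
import os
import tempfile
from collections import Counter


def build_vocab(lines, min_count=0):
    """Map each word to an id; ids 0 and 1 are reserved for PAD and UNK."""
    counts = Counter(word for line in lines for word in line.split(' '))
    words = ['PAD', 'UNK'] + [w for w, c in counts.most_common() if c > min_count]
    return {word: idx for idx, word in enumerate(words)}


def encode(lines, vocab, pad_len):
    """Truncate or right-pad every line to exactly pad_len ids."""
    rows = []
    for line in lines:
        ids = [vocab.get(w, 1) for w in line.split(' ')[:pad_len]]
        rows.append(ids + [0] * (pad_len - len(ids)))
    return rows


# "Training" side: build the vocabulary and save it as JSON.
train = ['the cat sat', 'the dog']
vocab = build_vocab(train)
path = os.path.join(tempfile.mkdtemp(), 'vocab.json')
with open(path, 'w') as f:
    json.dump({'vocabulary': vocab, 'pad_length': 4}, f)

# "Inference" side: restore the vocabulary and encode unseen text.
with open(path) as f:
    restored = json.load(f)
encoded = encode(['the bird sat on the mat', 'dog'],
                 restored['vocabulary'], restored['pad_length'])
print(encoded)
```

Words that were not seen during fitting ("bird", "on") come out as id 1, the long sentence is cut to four ids, and the short one is padded with zeros. Because the vocabulary lives in a plain JSON file, the same file can be read by the Python 2.7 and the Python 3.8 code.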

Error 2: AttributeError: module 'tensorflow.compat.v1' has no attribute 'contrib'

initializer=tf.contrib.layers.xavier_initializer())
tf.contrib.layers.l2_regularizer(l2_lambda)

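In TF2 the contrib library was removed. xavier_initializer returns a Xavier weight initializer, which is designed to keep the gradient magnitude roughly the same in every layer.
Solution:
1. From TF 2 onward, tf.contrib.layers.xavier_initializer() is replaced by the Glorot initializers in tf.keras.initializers (Xavier and Glorot are two names for the same initialization algorithm). xavier_initializer() defaults to uniform=True, so glorot_uniform() is the direct equivalent, and glorot_normal() corresponds to uniform=False. Note also that tf.keras.regularizers.l2 does not include the factor of 1/2 that tf.contrib.layers.l2_regularizer applied, so pass l2_lambda / 2 if the penalty must match numerically. Swap in the new functions:

initializer=tf.keras.initializers.glorot_uniform())
tf.keras.regularizers.l2(l2_lambda)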

2. Use the TensorFlow 2.x initializer tf.initializers.GlorotUniform()

initializer = tf.initializers.GlorotUniform(seed=1)
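What the Glorot (Xavier) uniform initializer computes can be sketched in plain Python: weights are drawn uniformly from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). The function below is an illustration of that formula only, not TensorFlow's implementation, and the same seed will not reproduce TensorFlow's values; the layer sizes 300 and 100 are arbitrary.

```python
import math
import random


def glorot_uniform(fan_in, fan_out, seed=None):
    """Return a fan_in x fan_out weight matrix drawn from U(-limit, limit)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    rng = random.Random(seed)
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


weights = glorot_uniform(fan_in=300, fan_out=100, seed=1)
limit = math.sqrt(6.0 / (300 + 100))  # about 0.122

largest = max(abs(w) for row in weights for w in row)
print(len(weights), len(weights[0]), largest <= limit)
```

The bound shrinks as the layer gets wider, which is how the initializer keeps the scale of activations and gradients comparable from layer to layer.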

Error 3: RuntimeError: tf.placeholder() is not compatible with eager execution.

x_input = tf.placeholder(tf.int32, [None, sequence_length], name="x_input")

Solution:
Add tf.disable_eager_execution() before any placeholder is created.

Error 4: ValueError: Cannot add function 'xxx' because a different function with the same name already exists

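After switching to import tensorflow.compat.v1 as tf, resuming training of a saved model (incremental learning) fails with this error, which says that a function with the same name already exists.
Adding one line fixes it: tf.disable_v2_behavior()
Problem solved!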
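The fixes for errors 3 and 4 both have to run once at startup, before any graph is built. One way to keep a single code path for old and new installations is a small guard that only calls the switches the imported module actually provides. This is a sketch under assumptions: the helper name enable_tf1_behavior and the stub class are invented for the example, and the stub exists only so the snippet runs without TensorFlow installed.

```python
def enable_tf1_behavior(tf_module):
    """Call the TF1-compatibility switches that the given module provides.

    Returns the names of the switches that were actually called.
    """
    called = []
    for name in ('disable_v2_behavior', 'disable_eager_execution'):
        switch = getattr(tf_module, name, None)
        if callable(switch):
            switch()
            called.append(name)
    return called


class _StubTF(object):
    """Stand-in for a TensorFlow module, used only to demonstrate the helper."""

    def __init__(self):
        self.eager = True

    def disable_eager_execution(self):
        self.eager = False


stub = _StubTF()
called = enable_tf1_behavior(stub)
print(called, stub.eager)
```

In real code the call would be enable_tf1_behavior(tf), placed right after the version-dependent import and before the first tf.placeholder. On a module that offers both switches the second call should be redundant, since disabling v2 behavior is expected to turn off eager execution as well.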

Summary:

Between TensorFlow 1.x and TensorFlow 2.x, the contents of the contrib module were consolidated into the following three APIs:

  • tf.keras.layers.Layer
  • tf.keras.Model
  • tf.Module

After rewriting the code, train and freeze one model per version; each can then be called as the situation requires.
