Chinese Short-Text Classification, Part 10: DeepMoji (Using millions of emoji occurrences to learn any-domain representations for...)

1. Overview

        DeepMoji (Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm), proposed by Bjarke Felbo et al., is a hybrid neural network combining Bi-LSTM layers with attention. It is good at recognizing the emotions carried by emoji, and it also performs well on ordinary text-classification tasks.

        Emotion recognition, especially on social networks in the internet era, is a colorful problem. The widespread use of internet slang (emoji, kaomoji, "Martian" text, ...) has given all sorts of special characters new meanings online, and these special symbols and emoji often express a user's true feelings and intent better than the words themselves. The DeepMoji paper starts from exactly this observation, which makes it an interesting read.

        Having tried it out, and setting the various tricks aside, DeepMoji itself is nothing special: the model can be traced back to RCNN, and with some caveats you can even view it as RCNN with the CNN part swapped for attention. Which goes to show how much an interesting choice of topic matters; that is just my personal take.

        GitHub project:

                https://github.com/yongzhuo/Keras-TextClassification/tree/master/keras_textclassification/m10_DeepMoji

 

2. The DeepMoji Model

2.1 DeepMoji model diagram

[Figure 1: DeepMoji model architecture]

2.2 DeepMoji model details

        An embedding layer is followed by two Bi-LSTM layers; the outputs of these three layers are concatenated, fed into an attention layer, and finally into a softmax. That is really all there is to it; it is a fairly simple model. The paper actually spends most of its ink on fine-tuning, proposing "the chain-thaw transfer learning approach": roughly, freeze the model's layers and then train them one at a time in sequence. I honestly cannot see much that is new in it, at least not to my eye.
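To make the chain-thaw idea concrete, here is a minimal sketch of the unfreezing schedule as I read it from the paper (the helper function and layer names are my own illustration, not the authors' code): first train only the new output layer, then each layer individually from input to output, then the whole model at once.

```python
# Sketch of the chain-thaw schedule: which layer group is trainable at
# each fine-tuning step. Layer names here are hypothetical examples.
def chain_thaw_schedule(layer_names):
    """Return the sequence of layer groups to unfreeze, in training order."""
    steps = [[layer_names[-1]]]                     # 1) the new softmax layer only
    steps += [[name] for name in layer_names[:-1]]  # 2) each layer, input to output
    steps.append(list(layer_names))                 # 3) finally, the whole model
    return steps

schedule = chain_thaw_schedule(
    ["embedding", "bi_lstm_0", "bi_lstm_1", "attlayer", "softmax"])
# first step trains only "softmax"; last step trains every layer
```

In a Keras model you would apply each step by setting `layer.trainable` accordingly and recompiling before each short training run.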

       The final evaluation only compares against FastText and a handful of other models; let us charitably assume the remaining baselines simply took too long to train. I first saw the paper on PaperWeekly, and the GitHub repo already has plenty of stars, so clearly people are drawn to interesting ideas.

 

3. DeepMoji Implementation

3.1   A very simple model: two Bi-LSTM layers plus an attention layer

3.2      Core code

    def create_model(self, hyper_parameters):
        """
            构建神经网络, a bit like RCNN, R
        :param hyper_parameters:json,  hyper parameters of network
        :return: tensor, moedl
        """
        super().create_model(hyper_parameters)
        x = self.word_embedding.output
        x = Activation('tanh')(x)

        # entire embedding channels are dropped out instead of the
        # normal Keras embedding dropout, which drops all channels for entire words
        # many of the datasets contain so few words that losing one or more words can alter the emotions completely
        x = SpatialDropout1D(self.dropout_spatial)(x)

        if self.rnn_units=="LSTM":
                layer_cell = LSTM
        elif self.rnn_units=="GRU":
                layer_cell = GRU
        elif self.rnn_units=="CuDNNLSTM":
                layer_cell = CuDNNLSTM
        elif self.rnn_units=="CuDNNGRU":
                layer_cell = CuDNNGRU
        else:
            layer_cell = GRU


        # skip-connection from embedding to output eases gradient-flow and allows access to lower-level features
        # ordering of the way the merge is done is important for consistency with the pretrained model
        lstm_0_output = Bidirectional(layer_cell(units=self.rnn_units,
                                                 return_sequences=True,
                                                 activation='relu',
                                                 kernel_regularizer=regularizers.l2(self.l2),
                                                 recurrent_regularizer=regularizers.l2(self.l2)
                                                 ), name="bi_lstm_0")(x)
        lstm_1_output = Bidirectional(layer_cell(units=self.rnn_units,
                                                 return_sequences=True,
                                                 activation='relu',
                                                 kernel_regularizer=regularizers.l2(self.l2),
                                                 recurrent_regularizer=regularizers.l2(self.l2)
                                                 ), name="bi_lstm_1")(lstm_0_output)
        x = concatenate([lstm_1_output, lstm_0_output, x])

        # if return_attention is True in AttentionWeightedAverage, an additional tensor
        # representing the weight at each timestep is returned
        weights = None
        x = AttentionWeightedAverage(name='attlayer', return_attention=self.return_attention)(x)
        if self.return_attention:
            x, weights = x

        x = Dropout(self.dropout)(x)
        # x = Flatten()(x)
        # final classification layer (softmax)
        dense_layer = Dense(self.label, activation=self.activate_classify)(x)
        output = [dense_layer]
        self.model = Model(self.word_embedding.input, output)
        self.model.summary(120)
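The `AttentionWeightedAverage` layer above is a custom layer from the repo. As a plain-NumPy sketch of what it computes (my reading of the layer, with a random projection vector standing in for the learned weights): score each timestep with a learned vector, softmax the scores over time, and return the weighted average of the timestep features.

```python
# NumPy sketch of attention-weighted average pooling over timesteps.
import numpy as np

def attention_weighted_average(x, w):
    """x: (timesteps, features); w: (features,) learned projection vector."""
    logits = x @ w                          # one attention score per timestep
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                # softmax over the time axis
    return weights @ x, weights             # pooled vector + attention weights

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))                 # 5 timesteps, 8 features
pooled, att = attention_weighted_average(x, rng.normal(size=8))
# pooled has shape (8,); att is non-negative and sums to 1 over the 5 steps
```

With `return_attention=True`, the Keras layer likewise returns both the pooled tensor and the per-timestep weights, which is handy for visualizing which tokens drove a prediction.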

Hope this helps!
