BERT (From Theory to Practice): Bidirectional Encoder Representations from Transformers [1]
Pre-trained model: A pre-trained model is a saved network that was previously trained on a large dataset, typically on a large-scale image-classification task. You either use the pre-trained model as is or use transfer learning to customize this model
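
The two options above (use the saved network as is, or adapt it with transfer learning) can be illustrated with a minimal sketch. It is a toy in plain Python, not BERT and not any particular library's API: the fixed weights `PRETRAINED_W`, the helper names, and the four-example dataset are all hypothetical, chosen only so the example runs offline. The "pre-trained" base is kept frozen and only a small new head is trained on the downstream task.

```python
import math

# Hypothetical "pre-trained" weights (3 hidden units, 2 inputs). In practice
# these would be loaded from a saved checkpoint; fixed values keep this offline.
PRETRAINED_W = ((0.5, -0.3), (0.8, 0.2), (-0.4, 0.9))


def frozen_features(x):
    """Frozen base network: its weights are read but never updated."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in PRETRAINED_W]


def predict(head_w, head_b, x):
    """New task-specific head (logistic regression) on the frozen features."""
    z = sum(w * f for w, f in zip(head_w, frozen_features(x))) + head_b
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(head_w, head_b, data):
    """Average binary cross-entropy over the dataset."""
    total = 0.0
    for x, y in data:
        p = predict(head_w, head_b, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


def train_head(data, epochs=200, lr=0.5):
    """Transfer learning by feature extraction: only the head is trained."""
    head_w = [0.0] * len(PRETRAINED_W)
    head_b = 0.0
    for _ in range(epochs):
        for x, y in data:
            feats = frozen_features(x)
            err = predict(head_w, head_b, x) - y  # gradient of the loss w.r.t. z
            head_w = [w - lr * err * f for w, f in zip(head_w, feats)]
            head_b -= lr * err
    return head_w, head_b


# Toy downstream task (made up for illustration): label 1 when x[0] > x[1].
data = [((1.0, 0.0), 1), ((0.0, 1.0), 0), ((2.0, 1.0), 1), ((1.0, 2.0), 0)]

loss_before = mean_loss([0.0, 0.0, 0.0], 0.0, data)
head_w, head_b = train_head(data)
loss_after = mean_loss(head_w, head_b, data)
print(f"loss before: {loss_before:.3f}, loss after: {loss_after:.3f}")
```

Calling `frozen_features` alone corresponds to using the pre-trained model as is, while `train_head` corresponds to customizing it. Storing the base weights in a tuple is a deliberate choice here: it makes the "frozen" part impossible to modify by accident during training.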