BERT (From Theory to Practice): Bidirectional Encoder Representations from Transformers【1】
A pre-trained model is a saved network that was previously trained on a large dataset, typically on a large-scale image-classification task. You either use the pre-trained model as is or use transfer learning to customize this model.
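The second option, transfer learning, can be sketched with a toy example. The sketch below is a hypothetical, minimal illustration (not BERT itself): a frozen random projection stands in for the pre-trained encoder, and only a small logistic-regression head is trained on the downstream task.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a pre-trained encoder: a frozen random projection.
# In practice this would be a large network such as BERT, trained
# beforehand on a big corpus; here it is just fixed weights.
W_frozen = rng.normal(size=(2, 8))  # "pretrained" weights, never updated

def encode(x):
    """Frozen feature extractor: maps 2-d raw inputs to 8-d features."""
    return np.tanh(x @ W_frozen)

# Tiny labeled dataset for the downstream task: two separable clusters.
X = np.concatenate([rng.normal(-2, 0.5, (50, 2)),
                    rng.normal(2, 0.5, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])

# Trainable head: logistic regression on top of the frozen features.
w = np.zeros(8)
b = 0.0
lr = 0.5
feats = encode(X)  # features are computed once; the encoder is frozen
for _ in range(200):
    p = 1 / (1 + np.exp(-(feats @ w + b)))  # sigmoid predictions
    grad_w = feats.T @ (p - y) / len(y)     # cross-entropy gradient
    grad_b = np.mean(p - y)
    w -= lr * grad_w  # only the head's parameters are updated;
    b -= lr * grad_b  # W_frozen stays exactly as "pretrained"

acc = np.mean(((feats @ w + b) > 0) == y)
print(f"head accuracy: {acc:.2f}")
```

The key design point is that the expensive, general-purpose part (the encoder) is reused unchanged, while only a small task-specific head is fit to the new labels; with BERT the same idea applies, except the encoder may also be fine-tuned end to end.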