Reinforcement Learning: Third-Party Library [TRL - Transformer Reinforcement Learning]
Overview
TRL is a cutting-edge library designed for post-training foundation models using advanced techniques like Supervised Fine-Tuning (SFT), Proximal Policy Optimization (PPO), and Direct Preference Optimization (DPO). Buil