NLP - Pretrained Models - 2019 - NLU+NLG: T5 [A Large-Scale Exploration of Text-to-Text Pretraining] [Fine-tuning T5 for Text Summarization]

Original paper: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

In October 2019, Google introduced T5 (Text-To-Text Transfer Transformer) in the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer". Its largest variant has 11 billion parameters, far surpassing BERT-Large, and it reached state-of-the-art results on many NLP tasks. Some have described it as a model that probes the limits of transfer learning.

Admittedly, the most striking thing about it is the brute-force scale, bigger and bigger, but working through the 34-page paper, the analysis is clearly done in earnest (and at great expense). There have been a few large-scale empirical studies of this kind: first propose a general framework, then run extensive controlled comparisons, distill a set of recommended settings, and end up with a very strong baseline. Later experiments in this area can then simply start from those settings.

The T5 paper, Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, is exactly this kind of work. Its significance lies neither in how much compute was burned nor in how many leaderboards it topped (money can buy that), and the core idea is not especially novel. Its most important contribution is giving the whole field of NLP pretraining a unified framework that casts every task in the same form, just as the paper puts it:

introducing a unified framework that converts every language problem into a text-to-text format.

In future NLP work, the job may no longer be fiddling with model architectures. Whatever the task, you take a huge pretrained model off the shelf, and the main effort becomes converting the task into a suitable text input and output format, which turns us into "data scientists" in quotation marks. A single model can then serve many tasks, and it distinguishes between them only by the way you phrase its inputs and outputs. This is reminiscent of a direction Jeff Dean has mentioned for Google: one super-model that can handle any task directly, possibly sparse internally, or locally distilled to serve individual tasks.
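A minimal sketch of this "everything is text-to-text" idea is shown below: the same t5-base checkpoint is steered to different tasks purely by the textual prefix of the input. The three prefixes are the ones listed in the T5 paper; the local model path is the one used later in this post, and greedy decoding with default settings is assumed.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained(r'D:\Pretrained_Model\t5-base')
model = AutoModelForSeq2SeqLM.from_pretrained(r'D:\Pretrained_Model\t5-base')

# The task is selected only by the text prefix; the model and weights are unchanged.
prompts = [
    "translate English to German: The house is wonderful.",   # translation
    "cola sentence: The course is jumping well.",              # grammatical acceptability
    "summarize: state authorities dispatched emergency crews tuesday to survey the damage after an onslaught of severe weather in mississippi.",  # summarization
]

for prompt in prompts:
    input_ids = tokenizer(prompt, return_tensors='pt').input_ids
    output_ids = model.generate(input_ids)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))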

2. Using the pretrained T5 model directly for text summarization

Option 1: from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# https://github.com/huggingface/transformers/blob/master/src/transformers/models/t5/modeling_t5.py
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained(r'D:\Pretrained_Model\t5-base')
model = AutoModelForSeq2SeqLM.from_pretrained(r'D:\Pretrained_Model\t5-base')

# Summarization with T5: prepend the "summarize:" task prefix to the input text
text = """
        summarize: (CNN)For the second time during his papacy, Pope Francis has announced a new group of bishops and archbishops set to become cardinals -- and they come from all over the world.
        Pope Francis said Sunday that he would hold a meeting of cardinals on February 14 "during which I will name 15 new Cardinals who, coming from 13 countries from every continent, manifest the indissoluble links between the Church of Rome and the particular Churches present in the world," according to Vatican Radio.
        New cardinals are always important because they set the tone in the church and also elect the next pope, CNN Senior Vatican Analyst John L. Allen said. They are sometimes referred to as the princes of the Catholic Church.
        The new cardinals come from countries such as Ethiopia, New Zealand and Myanmar.
        "This is a pope who very much wants to reach out to people on the margins, and you clearly see that in this set," Allen said. "You're talking about cardinals from typically overlooked places, like Cape Verde, the Pacific island of Tonga, Panama, Thailand, Uruguay."
        But for the second time since Francis' election, no Americans made the list.
        "Francis' pattern is very clear: He wants to go to the geographical peripheries rather than places that are already top-heavy with cardinals," Allen said.
        Christopher Bellitto, a professor of church history at Kean University in New Jersey, noted that Francis announced his new slate of cardinals on the Catholic Feast of the Epiphany, which commemorates the visit of the Magi to Jesus' birthplace in Bethlehem.
        "On feast of three wise men from far away, the Pope's choices for cardinal say that every local church deserves a place at the big table."
        In other words, Francis wants a more decentralized church and wants to hear reform ideas from small communities that sit far from Catholicism's power centers, Bellitto said.
        That doesn't mean Francis is the first pontiff to appoint cardinals from the developing world, though. Beginning in the 1920s, an increasing number of Latin American churchmen were named cardinals, and in the 1960s, St. John XXIII, whom Francis canonized last year, appointed the first cardinals from Japan, the Philippines and Africa.
        In addition to the 15 new cardinals Francis named on Sunday, five retired archbishops and bishops will also be honored as cardinals.
        Last year, Pope Francis appointed 19 new cardinals, including bishops from Haiti and Burkina Faso.
        CNN's Daniel Burke and Christabelle Fombu contributed to this report.
"""
# CNN/DM reference summary (highlights):
# @highlight
# The 15 new cardinals will be installed on February 14
# @highlight
# They come from countries such as Myanmar and Tonga
# @highlight
# No Americans made the list this time or the previous time in Francis' papacy

inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors='pt')

print('inputs = ', inputs)

summary_ids = model.generate(inputs['input_ids'])

print('\nsummary_ids = ', summary_ids)

print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False))

Output:

Ignored unknown kwarg option direction
inputs =  {'input_ids': tensor([[21603,    10,    41,   254, 17235,    61,  3809,     8,   511,    97,
           383,   112,     3, 16281,  4710,     6, 17384, 11065,    65,  2162,
             3,     9,   126,   563,    13, 25214,     7,    11, 11508, 11514,
         10776,     7,   356,    12,   582,   895, 10270,     7,  1636,    11,
            79,   369,    45,    66,   147,     8,   296,     5, 17384, 11065,
           243,  1771,    24,     3,    88,   133,  1520,     3,     9,  1338,
            13,   895, 10270,     7,    30,  2083,   968,    96,    26,  7920,
            84,    27,    56,   564,   627,   126, 21967,     7,   113,     6,
          1107,    45,  1179,  1440,    45,   334, 10829,     6,  6571,     8,
             3,  8482,     7, 26175,  2416,   344,     8,  2345,    13,  7332,
            11,     8,  1090,  2345,    15,     7,   915,    16,     8,   296,
           976,  1315,    12, 25770,  5061,     5,   368,   895, 10270,     7,
            33,   373,   359,   250,    79,   356,     8,  5739,    16,     8,
          2078,    11,    92, 11924,     8,   416,  2783,    15,     6, 19602,
          5523, 25770, 25224,  1079,   301,     5, 10618,   243,     5,   328,
            33,  1664,     3,  4822,    12,    38,     8, 22277,     7,    13,
             8,  6502,  2345,     5,    37,   126,   895, 10270,     7,   369,
            45,  1440,   224,    38, 22138,     6,   368,  5725,    11, 27274,
             5,    96,  3713,    19,     3,     9,  2783,    15,   113,   182,
           231,  2746,    12,  1535,    91,    12,   151,    30,     8,  6346,
             7,     6,    11,    25,  3133,   217,    24,    16,    48,   356,
           976, 10618,   243,     5,    96,  3774,    31,    60,  2508,    81,
           895, 10270,     7,    45,  3115, 20633,  1747,     6,   114,  9702,
           781,   221,     6,     8,  5824,  3368,    13,   304,  1725,     9,
             6, 21099,     6, 10508,     6, 30758,   535,   299,    21,     8,
           511,    97,   437, 11065,    31,  4356,     6,   150,  5452,   263,
             8,   570,     5,    96,   371,    52, 11389,     7,    31,  3275,
            19,   182,   964,    10,   216,  2746,    12,   281,    12,     8,
         20187,   158,  5082,    88,  2593,  1066,   145,  1747,    24,    33,
           641,   420,    18,    88, 19649,    28,   895, 10270,     7,   976,
         10618,   243,     5, 14702,  5377,   155,   235,     6,     3,     9,
          5812,    13,  2078,   892,    44,  2566,   152,   636,    16,   368,
          5092,     6,  4466,    24, 11065,  2162,   112,   126, 21079,    13,
           895, 10270,     7,    30,     8,  6502,   377, 11535,    13,     8,
         12741,  8237,    63,     6,    84, 18681,    15,     7,     8,   719,
            13,     8, 22673,    12,  1850,    31,  3879,  4687,    16, 15659,
           109,  6015,     5,    96,  7638, 18886,    13,   386,  7624,  1076,
            45,   623,   550,     6,     8, 17384,    31,     7,  3703,    21,
           895, 10270,   497,    24,   334,   415,  2078, 15314,     3,     9,
           286,    44,     8,   600,   953,   535,    86,   119,  1234,     6,
         11065,  2746,     3,     9,    72,    20, 21411,  2078,    11,  2746,
            12,  1616,  5139,   912,    45,   422,  2597,    24,  2561,   623,
            45,  6502,   159,    51,    31,     7,   579,  6881,     6,  5377,
           155,   235,   243,     5,   466,   744,    31,    17,  1243, 11065,
            19,     8,   166, 19068,  5982,    12,     3,     9,   102,  2700,
           895, 10270,     7,    45,     8,  2421,   296,     6,   713,     5,
         22738,    16,     8, 13978,     7,     6,    46,  3094,   381,    13,
          6271,   797,  2078,   904,   130,  2650,   895, 10270,     7,     6,
            11,    16,     8,  8754,     7,     6,   472,     5,  1079,     3,
             4,     4, 13671,     6,  4068, 11065,    54,   106,  1601,   336,
           215,     6,  7817,     8,   166,   895, 10270,     7,    45,  3411,
             6,     8, 12729,    11,  2648,     5,    86,   811,    12,     8,
           627,   126,   895, 10270,     7, 11065,  2650,    30,  1771,     6,
           874, 10611, 11508, 11514, 10776,     7,    11, 25214,     7,    56,
            92,    36, 13242,    38,   895, 10270,     7,     5,  2506,   215,
             6, 17384, 11065,  7817,   957,   126,   895, 10270,     7,     6,
           379, 25214,     7,    45, 22179,    11,  4152,  2917,     9,  1699,
             7,    32,     5, 19602,    31,     7,  4173, 27575,    11,  2144,
         10333,   109,   377,  8038,    76,  9859,    12,    48,   934,     5,
             1]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}

summary_ids =  tensor([[    0,   126,   895, 10270,     7,   369,    45,  1179,  1440,    45,
           334, 10829,     3,     5,    79,    33,  1664,     3,  4822,    12]])

['new cardinals come from 13 countries from every continent . they are sometimes referred to']
['new cardinals come from 13 countries from every continent . they are sometimes referred to']

Process finished with exit code 0
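Note that the generated summary stops after 20 tokens ("... they are sometimes referred to"): nothing in the call above sets a generation length, so generate() falls back to its default max_length of 20. A hedged sketch of passing explicit decoding arguments is shown below; the particular values (beam size, length limits) are illustrative choices, not settings prescribed by the paper.

# Continuing from the code above: request a longer, beam-searched summary.
summary_ids = model.generate(
    inputs['input_ids'],
    num_beams=4,          # beam search instead of greedy decoding
    min_length=30,        # require more than the 20 tokens produced by default
    max_length=100,
    early_stopping=True,
)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True))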

Option 2: from transformers import T5Tokenizer, T5ForConditionalGeneration

# https://github.com/huggingface/transformers/blob/master/src/transformers/models/t5/modeling_t5.py
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained(r'D:\Pretrained_Model\t5-base')
model = T5ForConditionalGeneration.from_pretrained(r'D:\Pretrained_Model\t5-base')

# Note: unlike Option 1, the "summarize:" prefix is omitted from the input here.
text = """
         (CNN)For the second time during his papacy, Pope Francis has announced a new group of bishops and archbishops set to become cardinals -- and they come from all over the world.
        Pope Francis said Sunday that he would hold a meeting of cardinals on February 14 "during which I will name 15 new Cardinals who, coming from 13 countries from every continent, manifest the indissoluble links between the Church of Rome and the particular Churches present in the world," according to Vatican Radio.
        New cardinals are always important because they set the tone in the church and also elect the next pope, CNN Senior Vatican Analyst John L. Allen said. They are sometimes referred to as the princes of the Catholic Church.
        The new cardinals come from countries such as Ethiopia, New Zealand and Myanmar.
        "This is a pope who very much wants to reach out to people on the margins, and you clearly see that in this set," Allen said. "You're talking about cardinals from typically overlooked places, like Cape Verde, the Pacific island of Tonga, Panama, Thailand, Uruguay."
        But for the second time since Francis' election, no Americans made the list.
        "Francis' pattern is very clear: He wants to go to the geographical peripheries rather than places that are already top-heavy with cardinals," Allen said.
        Christopher Bellitto, a professor of church history at Kean University in New Jersey, noted that Francis announced his new slate of cardinals on the Catholic Feast of the Epiphany, which commemorates the visit of the Magi to Jesus' birthplace in Bethlehem.
        "On feast of three wise men from far away, the Pope's choices for cardinal say that every local church deserves a place at the big table."
        In other words, Francis wants a more decentralized church and wants to hear reform ideas from small communities that sit far from Catholicism's power centers, Bellitto said.
        That doesn't mean Francis is the first pontiff to appoint cardinals from the developing world, though. Beginning in the 1920s, an increasing number of Latin American churchmen were named cardinals, and in the 1960s, St. John XXIII, whom Francis canonized last year, appointed the first cardinals from Japan, the Philippines and Africa.
        In addition to the 15 new cardinals Francis named on Sunday, five retired archbishops and bishops will also be honored as cardinals.
        Last year, Pope Francis appointed 19 new cardinals, including bishops from Haiti and Burkina Faso.
        CNN's Daniel Burke and Christabelle Fombu contributed to this report.
"""
# CNN/DM reference summary (highlights):
# @highlight
# The 15 new cardinals will be installed on February 14
# @highlight
# They come from countries such as Myanmar and Tonga
# @highlight
# No Americans made the list this time or the previous time in Francis' papacy

inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors='pt')

print('inputs = ', inputs)

summary_ids = model.generate(inputs['input_ids'])

print('\nsummary_ids = ', summary_ids)

print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False))

Output:

inputs =  {'input_ids': tensor([[   41,   254, 17235,    61,  3809,     8,   511,    97,   383,   112,
             3, 16281,  4710,     6, 17384, 11065,    65,  2162,     3,     9,
           126,   563,    13, 25214,     7,    11, 11508, 11514, 10776,     7,
           356,    12,   582,   895, 10270,     7,  1636,    11,    79,   369,
            45,    66,   147,     8,   296,     5, 17384, 11065,   243,  1771,
            24,     3,    88,   133,  1520,     3,     9,  1338,    13,   895,
         10270,     7,    30,  2083,   968,    96,    26,  7920,    84,    27,
            56,   564,   627,   126, 21967,     7,   113,     6,  1107,    45,
          1179,  1440,    45,   334, 10829,     6,  6571,     8,     3,  8482,
             7, 26175,  2416,   344,     8,  2345,    13,  7332,    11,     8,
          1090,  2345,    15,     7,   915,    16,     8,   296,   976,  1315,
            12, 25770,  5061,     5,   368,   895, 10270,     7,    33,   373,
           359,   250,    79,   356,     8,  5739,    16,     8,  2078,    11,
            92, 11924,     8,   416,  2783,    15,     6, 19602,  5523, 25770,
         25224,  1079,   301,     5, 10618,   243,     5,   328,    33,  1664,
             3,  4822,    12,    38,     8, 22277,     7,    13,     8,  6502,
          2345,     5,    37,   126,   895, 10270,     7,   369,    45,  1440,
           224,    38, 22138,     6,   368,  5725,    11, 27274,     5,    96,
          3713,    19,     3,     9,  2783,    15,   113,   182,   231,  2746,
            12,  1535,    91,    12,   151,    30,     8,  6346,     7,     6,
            11,    25,  3133,   217,    24,    16,    48,   356,   976, 10618,
           243,     5,    96,  3774,    31,    60,  2508,    81,   895, 10270,
             7,    45,  3115, 20633,  1747,     6,   114,  9702,   781,   221,
             6,     8,  5824,  3368,    13,   304,  1725,     9,     6, 21099,
             6, 10508,     6, 30758,   535,   299,    21,     8,   511,    97,
           437, 11065,    31,  4356,     6,   150,  5452,   263,     8,   570,
             5,    96,   371,    52, 11389,     7,    31,  3275,    19,   182,
           964,    10,   216,  2746,    12,   281,    12,     8, 20187,   158,
          5082,    88,  2593,  1066,   145,  1747,    24,    33,   641,   420,
            18,    88, 19649,    28,   895, 10270,     7,   976, 10618,   243,
             5, 14702,  5377,   155,   235,     6,     3,     9,  5812,    13,
          2078,   892,    44,  2566,   152,   636,    16,   368,  5092,     6,
          4466,    24, 11065,  2162,   112,   126, 21079,    13,   895, 10270,
             7,    30,     8,  6502,   377, 11535,    13,     8, 12741,  8237,
            63,     6,    84, 18681,    15,     7,     8,   719,    13,     8,
         22673,    12,  1850,    31,  3879,  4687,    16, 15659,   109,  6015,
             5,    96,  7638, 18886,    13,   386,  7624,  1076,    45,   623,
           550,     6,     8, 17384,    31,     7,  3703,    21,   895, 10270,
           497,    24,   334,   415,  2078, 15314,     3,     9,   286,    44,
             8,   600,   953,   535,    86,   119,  1234,     6, 11065,  2746,
             3,     9,    72,    20, 21411,  2078,    11,  2746,    12,  1616,
          5139,   912,    45,   422,  2597,    24,  2561,   623,    45,  6502,
           159,    51,    31,     7,   579,  6881,     6,  5377,   155,   235,
           243,     5,   466,   744,    31,    17,  1243, 11065,    19,     8,
           166, 19068,  5982,    12,     3,     9,   102,  2700,   895, 10270,
             7,    45,     8,  2421,   296,     6,   713,     5, 22738,    16,
             8, 13978,     7,     6,    46,  3094,   381,    13,  6271,   797,
          2078,   904,   130,  2650,   895, 10270,     7,     6,    11,    16,
             8,  8754,     7,     6,   472,     5,  1079,     3,     4,     4,
         13671,     6,  4068, 11065,    54,   106,  1601,   336,   215,     6,
          7817,     8,   166,   895, 10270,     7,    45,  3411,     6,     8,
         12729,    11,  2648,     5,    86,   811,    12,     8,   627,   126,
           895, 10270,     7, 11065,  2650,    30,  1771,     6,   874, 10611,
         11508, 11514, 10776,     7,    11, 25214,     7,    56,    92,    36,
         13242,    38,   895, 10270,     7,     5,  2506,   215,     6, 17384,
         11065,  7817,   957,   126,   895, 10270,     7,     6,   379, 25214,
             7,    45, 22179,    11,  4152,  2917,     9,  1699,     7,    32,
             5, 19602,    31,     7,  4173, 27575,    11,  2144, 10333,   109,
           377,  8038,    76,  9859,    12,    48,   934,     5,     1]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}

summary_ids =  tensor([[    0,   126,   895, 10270,     7,   369,    45,  1179,  1440,    45,
           334, 10829,     3,     5,    79,    33,   557,     3,  4822,    12]])

['new cardinals come from 13 countries from every continent . they are often referred to']
['new cardinals come from 13 countries from every continent . they are often referred to']

Process finished with exit code 0
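An alternative to calling the tokenizer and generate() by hand is the summarization pipeline, which wraps both steps. A minimal sketch, assuming the same local t5-base directory as above with the stock t5-base config:

from transformers import pipeline

# Build a summarization pipeline around the local checkpoint; the tokenizer is loaded from the same path.
summarizer = pipeline('summarization', model=r'D:\Pretrained_Model\t5-base')

# For the stock t5-base config, the "summarize: " prefix is taken from task_specific_params,
# so it does not have to be added to the text by hand.
result = summarizer(text, max_length=100, min_length=30)
print(result[0]['summary_text'])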

3. Fine-tuning T5 for text summarization (on the XSum dataset)

# https://github.com/huggingface/notebooks/blob/master/examples/summarization.ipynb
import nltk
import numpy as np
from datasets import load_dataset, load_metric
from transformers import AutoTokenizer
from transformers import AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq, Seq2SeqTrainingArguments, Seq2SeqTrainer

model_checkpoint = r"D:\Pretrained_Model\t5-base"
raw_datasets = load_dataset("xsum")
metric = load_metric("rouge")

print('raw_datasets = ', raw_datasets)
print("raw_datasets['train'][0] = ", raw_datasets['train'][0])

tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

prefix = "summarize: "


def preprocess_function(examples):
    inputs = [prefix + doc for doc in examples["document"]]
    model_inputs = tokenizer(inputs, max_length=1024, truncation=True)

    # Setup the tokenizer for targets
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(examples["summary"], max_length=128, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs


def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Replace -100 in the labels as we can't decode them.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # Rouge expects a newline after each sentence
    decoded_preds = ["\n".join(nltk.sent_tokenize(pred.strip())) for pred in decoded_preds]
    decoded_labels = ["\n".join(nltk.sent_tokenize(label.strip())) for label in decoded_labels]

    result = metric.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True)
    # Extract a few results
    result = {key: value.mid.fmeasure * 100 for key, value in result.items()}

    # Add mean generated length
    prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
    result["gen_len"] = np.mean(prediction_lens)

    return {k: round(v, 4) for k, v in result.items()}


tokenized_datasets = raw_datasets.map(preprocess_function, batched=True)

# ----------------------------------- Fine-tuning the model -----------------------------------
model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)
batch_size = 1
model_name = model_checkpoint.split("/")[-1]
args = Seq2SeqTrainingArguments(
    "finetuned-xsum",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=1,
    predict_with_generate=True,
    fp16=True,
    push_to_hub=False,
)

data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)

trainer = Seq2SeqTrainer(
    model,
    args,
    train_dataset=tokenized_datasets["test"],   # note: the smaller test split is used for training here, which keeps this demo run short
    eval_dataset=tokenized_datasets["validation"],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)

trainer.train()
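Once trainer.train() finishes, it is worth persisting the result to a standalone directory and running an explicit evaluation. A minimal sketch of that follow-up is below; the output path mirrors the 't5-base-finetuning' directory loaded in section 4, but how that directory was actually produced is not stated in the original, so treat the path as an assumption.

# Save the fine-tuned model and tokenizer so they can be reloaded later with from_pretrained().
trainer.save_model(r"D:\Pretrained_Model\t5-base-finetuning")

# Run a standalone evaluation pass (same ROUGE metrics as reported during training).
print(trainer.evaluate())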

Output:

raw_datasets =  DatasetDict({
    train: Dataset({
        features: ['document', 'summary', 'id'],
        num_rows: 204045
    })
    validation: Dataset({
        features: ['document', 'summary', 'id'],
        num_rows: 11332
    })
    test: Dataset({
        features: ['document', 'summary', 'id'],
        num_rows: 11334
    })
})
raw_datasets['train'][0] =  {'document': 'Recent reports have linked some France-based players with returns to Wales.\n"I\'ve always felt - and this is with my rugby hat on now; this is not region or WRU - I\'d rather spend that money on keeping players in Wales," said Davies.\nThe WRU provides £2m to the fund and £1.3m comes from the regions.\nFormer Wales and British and Irish Lions fly-half Davies became WRU chairman on Tuesday 21 October, succeeding deposed David Pickering following governing body elections.\nHe is now serving a notice period to leave his role as Newport Gwent Dragons chief executive after being voted on to the WRU board in September.\nDavies was among the leading figures among Dragons, Ospreys, Scarlets and Cardiff Blues officials who were embroiled in a protracted dispute with the WRU that ended in a £60m deal in August this year.\nIn the wake of that deal being done, Davies said the £3.3m should be spent on ensuring current Wales-based stars remain there.\nIn recent weeks, Racing Metro flanker Dan Lydiate was linked with returning to Wales.\nLikewise the Paris club\'s scrum-half Mike Phillips and centre Jamie Roberts were also touted for possible returns.\nWales coach Warren Gatland has said: "We haven\'t instigated contact with the players.\n"But we are aware that one or two of them are keen to return to Wales sooner rather than later."\nSpeaking to Scrum V on BBC Radio Wales, Davies re-iterated his stance, saying keeping players such as Scarlets full-back Liam Williams and Ospreys flanker Justin Tipuric in Wales should take precedence.\n"It\'s obviously a limited amount of money [available]. The union are contributing 60% of that contract and the regions are putting £1.3m in.\n"So it\'s a total pot of just over £3m and if you look at the sorts of salaries that the... guys... have been tempted to go overseas for [are] significant amounts of money.\n"So if we were to bring the players back, we\'d probably get five or six players.\n"And I\'ve always felt - and this is with my rugby hat on now; this is not region or WRU - I\'d rather spend that money on keeping players in Wales.\n"There are players coming out of contract, perhaps in the next year or so… you\'re looking at your Liam Williams\' of the world; Justin Tipuric for example - we need to keep these guys in Wales.\n"We actually want them there. 
They are the ones who are going to impress the young kids, for example.\n"They are the sort of heroes that our young kids want to emulate.\n"So I would start off [by saying] with the limited pot of money, we have to retain players in Wales.\n"Now, if that can be done and there\'s some spare monies available at the end, yes, let\'s look to bring players back.\n"But it\'s a cruel world, isn\'t it?\n"It\'s fine to take the buck and go, but great if you can get them back as well, provided there\'s enough money."\nBritish and Irish Lions centre Roberts has insisted he will see out his Racing Metro contract.\nHe and Phillips also earlier dismissed the idea of leaving Paris.\nRoberts also admitted being hurt by comments in French Newspaper L\'Equipe attributed to Racing Coach Laurent Labit questioning their effectiveness.\nCentre Roberts and flanker Lydiate joined Racing ahead of the 2013-14 season while scrum-half Phillips moved there in December 2013 after being dismissed for disciplinary reasons by former club Bayonne.', 'id': '29750031', 'summary': 'New Welsh Rugby Union chairman Gareth Davies believes a joint £3.3m WRU-regions fund should be used to retain home-based talent such as Liam Williams, not bring back exiled stars.'}
Ignored unknown kwarg option direction
  0%|          | 0/205 [00:00<?, ?ba/s]Ignored unknown kwarg option direction
  0%|          | 1/205 [00:00<01:27,  2.33ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
  1%|          | 2/205 [00:00<01:23,  2.43ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
  1%|| 3/205 [00:01<01:21,  2.49ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
  2%|| 4/205 [00:01<01:17,  2.58ba/s]Ignored unknown kwarg option direction
...
...
...
 97%|█████████▋| 199/205 [01:25<00:02,  2.07ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 98%|█████████▊| 200/205 [01:26<00:02,  2.10ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 98%|█████████▊| 201/205 [01:26<00:01,  2.09ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 99%|█████████▊| 202/205 [01:27<00:01,  2.13ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 99%|█████████▉| 203/205 [01:27<00:00,  2.14ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
100%|██████████| 205/205 [01:28<00:00,  2.32ba/s]
Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
  0%|          | 0/12 [00:00<?, ?ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
  8%|| 1/12 [00:00<00:05,  2.16ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 17%|█▋        | 2/12 [00:00<00:04,  2.07ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 25%|██▌       | 3/12 [00:01<00:04,  2.13ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 33%|███▎      | 4/12 [00:01<00:03,  2.07ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 42%|████▏     | 5/12 [00:02<00:03,  2.09ba/s]Ignored unknown kwarg option direction
 50%|█████     | 6/12 [00:02<00:02,  2.12ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 58%|█████▊    | 7/12 [00:03<00:02,  2.15ba/s]Ignored unknown kwarg option direction
 67%|██████▋   | 8/12 [00:03<00:01,  2.19ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 75%|███████▌  | 9/12 [00:04<00:01,  2.20ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 83%|████████▎ | 10/12 [00:04<00:00,  2.20ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 92%|█████████▏| 11/12 [00:05<00:00,  2.18ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
100%|██████████| 12/12 [00:05<00:00,  2.26ba/s]
Ignored unknown kwarg option direction
  0%|          | 0/12 [00:00<?, ?ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
  8%|| 1/12 [00:00<00:05,  1.85ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 17%|█▋        | 2/12 [00:01<00:05,  1.92ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 25%|██▌       | 3/12 [00:01<00:04,  1.97ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 33%|███▎      | 4/12 [00:01<00:03,  2.02ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 42%|████▏     | 5/12 [00:02<00:03,  2.09ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 50%|█████     | 6/12 [00:02<00:02,  2.10ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 58%|█████▊    | 7/12 [00:03<00:02,  2.11ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 67%|██████▋   | 8/12 [00:03<00:01,  2.11ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 75%|███████▌  | 9/12 [00:04<00:01,  2.11ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 83%|████████▎ | 10/12 [00:04<00:00,  2.11ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
 92%|█████████▏| 11/12 [00:05<00:00,  2.13ba/s]Ignored unknown kwarg option direction
Ignored unknown kwarg option direction
100%|██████████| 12/12 [00:05<00:00,  2.22ba/s]
Using amp half precision backend
The following columns in the training set  don't have a corresponding argument in `T5ForConditionalGeneration.forward` and have been ignored: document, summary, id.
C:\Program_Files_AI\Anaconda3531\lib\site-packages\transformers\optimization.py:309: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use thePyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  FutureWarning,
***** Running training *****
  Num examples = 11334
  Num Epochs = 1
  Instantaneous batch size per device = 1
  Total train batch size (w. parallel, distributed & accumulation) = 1
  Gradient Accumulation steps = 1
  Total optimization steps = 11334
  4%|| 500/11334 [02:30<1:12:22,  2.49it/s]Saving model checkpoint to finetuned-xsum\checkpoint-500
Configuration saved in finetuned-xsum\checkpoint-500\config.json
{'loss': 2.5572, 'learning_rate': 1.9124757367213695e-05, 'epoch': 0.04}
Model weights saved in finetuned-xsum\checkpoint-500\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-500\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-500\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-500\spiece.model
  9%|| 1000/11334 [05:06<1:01:50,  2.78it/s]Saving model checkpoint to finetuned-xsum\checkpoint-1000
{'loss': 2.3531, 'learning_rate': 1.8244220928180694e-05, 'epoch': 0.09}
Configuration saved in finetuned-xsum\checkpoint-1000\config.json
Model weights saved in finetuned-xsum\checkpoint-1000\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-1000\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-1000\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-1000\spiece.model
 13%|█▎        | 1500/11334 [07:38<48:28,  3.38it/s]Saving model checkpoint to finetuned-xsum\checkpoint-1500
Configuration saved in finetuned-xsum\checkpoint-1500\config.json
{'loss': 2.2812, 'learning_rate': 1.736191988706547e-05, 'epoch': 0.13}
Model weights saved in finetuned-xsum\checkpoint-1500\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-1500\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-1500\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-1500\spiece.model
 18%|█▊        | 2000/11334 [10:15<45:42,  3.40it/s]Saving model checkpoint to finetuned-xsum\checkpoint-2000
Configuration saved in finetuned-xsum\checkpoint-2000\config.json
{'loss': 2.2919, 'learning_rate': 1.648138344803247e-05, 'epoch': 0.18}
Model weights saved in finetuned-xsum\checkpoint-2000\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-2000\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-2000\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-2000\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-500] due to args.save_total_limit
 22%|██▏       | 2500/11334 [12:53<42:04,  3.50it/s]Saving model checkpoint to finetuned-xsum\checkpoint-2500
Configuration saved in finetuned-xsum\checkpoint-2500\config.json
{'loss': 2.2519, 'learning_rate': 1.5602611611081703e-05, 'epoch': 0.22}
Model weights saved in finetuned-xsum\checkpoint-2500\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-2500\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-2500\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-2500\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-1000] due to args.save_total_limit
 26%|██▋       | 3000/11334 [15:30<40:44,  3.41it/s]Saving model checkpoint to finetuned-xsum\checkpoint-3000
{'loss': 2.2395, 'learning_rate': 1.4720310569966474e-05, 'epoch': 0.26}
Configuration saved in finetuned-xsum\checkpoint-3000\config.json
Model weights saved in finetuned-xsum\checkpoint-3000\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-3000\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-3000\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-3000\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-1500] due to args.save_total_limit
 31%|███       | 3500/11334 [18:06<37:08,  3.52it/s]Saving model checkpoint to finetuned-xsum\checkpoint-3500
{'loss': 2.2298, 'learning_rate': 1.3839774130933477e-05, 'epoch': 0.31}
Configuration saved in finetuned-xsum\checkpoint-3500\config.json
Model weights saved in finetuned-xsum\checkpoint-3500\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-3500\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-3500\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-3500\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-2000] due to args.save_total_limit
 35%|███▌      | 4000/11334 [20:38<37:43,  3.24it/s]Saving model checkpoint to finetuned-xsum\checkpoint-4000
{'loss': 2.224, 'learning_rate': 1.2959237691900476e-05, 'epoch': 0.35}
Configuration saved in finetuned-xsum\checkpoint-4000\config.json
Model weights saved in finetuned-xsum\checkpoint-4000\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-4000\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-4000\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-4000\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-2500] due to args.save_total_limit
 40%|███▉      | 4500/11334 [23:12<48:04,  2.37it/s]Saving model checkpoint to finetuned-xsum\checkpoint-4500
{'loss': 2.2665, 'learning_rate': 1.207870125286748e-05, 'epoch': 0.4}
Configuration saved in finetuned-xsum\checkpoint-4500\config.json
Model weights saved in finetuned-xsum\checkpoint-4500\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-4500\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-4500\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-4500\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-3000] due to args.save_total_limit
 44%|████▍     | 5000/11334 [25:45<30:40,  3.44it/s]Saving model checkpoint to finetuned-xsum\checkpoint-5000
{'loss': 2.2154, 'learning_rate': 1.1196400211752252e-05, 'epoch': 0.44}
Configuration saved in finetuned-xsum\checkpoint-5000\config.json
Model weights saved in finetuned-xsum\checkpoint-5000\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-5000\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-5000\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-5000\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-3500] due to args.save_total_limit
 49%|████▊     | 5500/11334 [28:22<32:52,  2.96it/s]Saving model checkpoint to finetuned-xsum\checkpoint-5500
{'loss': 2.185, 'learning_rate': 1.0315863772719253e-05, 'epoch': 0.49}
Configuration saved in finetuned-xsum\checkpoint-5500\config.json
Model weights saved in finetuned-xsum\checkpoint-5500\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-5500\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-5500\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-5500\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-4000] due to args.save_total_limit
 53%|█████▎    | 6000/11334 [31:02<27:55,  3.18it/s]Saving model checkpoint to finetuned-xsum\checkpoint-6000
Configuration saved in finetuned-xsum\checkpoint-6000\config.json
{'loss': 2.2635, 'learning_rate': 9.433562731604025e-06, 'epoch': 0.53}
Model weights saved in finetuned-xsum\checkpoint-6000\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-6000\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-6000\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-6000\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-4500] due to args.save_total_limit
 57%|█████▋    | 6500/11334 [33:37<27:12,  2.96it/s]Saving model checkpoint to finetuned-xsum\checkpoint-6500
Configuration saved in finetuned-xsum\checkpoint-6500\config.json
{'loss': 2.2082, 'learning_rate': 8.553026292571027e-06, 'epoch': 0.57}
Model weights saved in finetuned-xsum\checkpoint-6500\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-6500\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-6500\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-6500\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-5000] due to args.save_total_limit
 62%|██████▏   | 7000/11334 [36:12<18:01,  4.01it/s]Saving model checkpoint to finetuned-xsum\checkpoint-7000
{'loss': 2.201, 'learning_rate': 7.670725251455797e-06, 'epoch': 0.62}
Configuration saved in finetuned-xsum\checkpoint-7000\config.json
Model weights saved in finetuned-xsum\checkpoint-7000\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-7000\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-7000\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-7000\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-5500] due to args.save_total_limit
 66%|██████▌   | 7500/11334 [38:52<16:44,  3.82it/s]Saving model checkpoint to finetuned-xsum\checkpoint-7500
{'loss': 2.1945, 'learning_rate': 6.791953414505029e-06, 'epoch': 0.66}
Configuration saved in finetuned-xsum\checkpoint-7500\config.json
Model weights saved in finetuned-xsum\checkpoint-7500\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-7500\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-7500\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-7500\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-6000] due to args.save_total_limit
 71%|███████   | 8000/11334 [41:33<16:26,  3.38it/s]Saving model checkpoint to finetuned-xsum\checkpoint-8000
{'loss': 2.1742, 'learning_rate': 5.911416975472032e-06, 'epoch': 0.71}
Configuration saved in finetuned-xsum\checkpoint-8000\config.json
Model weights saved in finetuned-xsum\checkpoint-8000\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-8000\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-8000\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-8000\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-6500] due to args.save_total_limit
 75%|███████▍  | 8500/11334 [44:16<14:14,  3.32it/s]Saving model checkpoint to finetuned-xsum\checkpoint-8500
{'loss': 2.2351, 'learning_rate': 5.029115934356803e-06, 'epoch': 0.75}
Configuration saved in finetuned-xsum\checkpoint-8500\config.json
Model weights saved in finetuned-xsum\checkpoint-8500\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-8500\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-8500\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-8500\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-7000] due to args.save_total_limit
 79%|███████▉  | 9000/11334 [47:02<13:46,  2.82it/s]Saving model checkpoint to finetuned-xsum\checkpoint-9000
{'loss': 2.2096, 'learning_rate': 4.146814893241574e-06, 'epoch': 0.79}
Configuration saved in finetuned-xsum\checkpoint-9000\config.json
Model weights saved in finetuned-xsum\checkpoint-9000\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-9000\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-9000\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-9000\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-7500] due to args.save_total_limit
 84%|████████▍ | 9500/11334 [49:46<08:49,  3.47it/s]Saving model checkpoint to finetuned-xsum\checkpoint-9500
{'loss': 2.1603, 'learning_rate': 3.2662784542085763e-06, 'epoch': 0.84}
Configuration saved in finetuned-xsum\checkpoint-9500\config.json
Model weights saved in finetuned-xsum\checkpoint-9500\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-9500\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-9500\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-9500\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-8000] due to args.save_total_limit
 88%|████████▊ | 10000/11334 [52:28<07:06,  3.13it/s]Saving model checkpoint to finetuned-xsum\checkpoint-10000
{'loss': 2.161, 'learning_rate': 2.3839774130933478e-06, 'epoch': 0.88}
Configuration saved in finetuned-xsum\checkpoint-10000\config.json
Model weights saved in finetuned-xsum\checkpoint-10000\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-10000\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-10000\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-10000\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-8500] due to args.save_total_limit
 93%|█████████▎| 10500/11334 [55:12<03:52,  3.58it/s]Saving model checkpoint to finetuned-xsum\checkpoint-10500
{'loss': 2.1606, 'learning_rate': 1.501676371978119e-06, 'epoch': 0.93}
Configuration saved in finetuned-xsum\checkpoint-10500\config.json
Model weights saved in finetuned-xsum\checkpoint-10500\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-10500\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-10500\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-10500\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-9000] due to args.save_total_limit
 97%|█████████▋| 11000/11334 [57:53<01:48,  3.08it/s]Saving model checkpoint to finetuned-xsum\checkpoint-11000
Configuration saved in finetuned-xsum\checkpoint-11000\config.json
{'loss': 2.1629, 'learning_rate': 6.211399329451209e-07, 'epoch': 0.97}
Model weights saved in finetuned-xsum\checkpoint-11000\pytorch_model.bin
tokenizer config file saved in finetuned-xsum\checkpoint-11000\tokenizer_config.json
Special tokens file saved in finetuned-xsum\checkpoint-11000\special_tokens_map.json
Copy vocab file to finetuned-xsum\checkpoint-11000\spiece.model
Deleting older checkpoint [finetuned-xsum\checkpoint-9500] due to args.save_total_limit
100%|██████████| 11334/11334 [59:38<00:00,  3.15it/s]The following columns in the evaluation set  don't have a corresponding argument in `T5ForConditionalGeneration.forward` and have been ignored: document, summary, id.
***** Running Evaluation *****
  Num examples = 11332
  Batch size = 1

  0%|          | 0/11332 [00:00<?, ?it/s]
  0%|          | 2/11332 [00:00<43:52,  4.30it/s]
  0%|          | 3/11332 [00:00<48:16,  3.91it/s]
  0%|          | 4/11332 [00:01<48:45,  3.87it/s]
  0%|          | 5/11332 [00:01<54:24,  3.47it/s]
  0%|          | 6/11332 [00:01<1:03:02,  2.99it/s]
  0%|          | 7/11332 [00:02<1:00:54,  3.10it/s]
  0%|          | 8/11332 [00:02<1:08:59,  2.74it/s]
  0%|          | 9/11332 [00:02<1:06:21,  2.84it/s]
  0%|          | 10/11332 [00:03<1:07:00,  2.82it/s]
  0%|          | 11/11332 [00:03<1:12:58,  2.59it/s]
  0%|          | 12/11332 [00:04<1:08:00,  2.77it/s]
...
...
...
100%|█████████▉| 11327/11332 [1:07:16<00:01,  2.83it/s]
100%|█████████▉| 11328/11332 [1:07:16<00:01,  2.89it/s]
100%|█████████▉| 11329/11332 [1:07:16<00:00,  3.00it/s]
100%|█████████▉| 11330/11332 [1:07:16<00:00,  3.09it/s]
100%|█████████▉| 11331/11332 [1:07:17<00:00,  3.07it/s]
                                                     
{'eval_loss': 1.9903812408447266, 'eval_rouge1': 32.2647, 'eval_rouge2': 10.6523, 'eval_rougeL': 25.628, 'eval_rougeLsum': 25.6236, 'eval_gen_len': 18.713, 'eval_runtime': 4055.0046, 'eval_samples_per_second': 2.795, 'eval_steps_per_second': 2.795, 'epoch': 1.0}
{'train_runtime': 7633.6468, 'train_samples_per_second': 1.485, 'train_steps_per_second': 1.485, 'train_loss': 2.2352082908984445, 'epoch': 1.0}
100%|██████████| 11334/11334 [2:07:13<00:00,  3.15it/s]
100%|██████████| 11332/11332 [1:07:34<00:00,  3.16it/s]
                                                       

Training completed. Do not forget to share your model on huggingface.co/models =)


100%|██████████| 11334/11334 [2:07:13<00:00,  1.48it/s]

Process finished with exit code 0
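Each checkpoint-XXXX directory written during training contains config.json, pytorch_model.bin and the tokenizer files (see the log above), so it can be loaded directly with from_pretrained(). A minimal sketch; the specific checkpoint number is just an example:

from transformers import T5Tokenizer, T5ForConditionalGeneration

# Any saved checkpoint directory can be loaded the same way as a hub model id.
tokenizer = T5Tokenizer.from_pretrained(r'finetuned-xsum\checkpoint-11000')
model = T5ForConditionalGeneration.from_pretrained(r'finetuned-xsum\checkpoint-11000')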

4. Using the fine-tuned T5

# https://github.com/huggingface/transformers/blob/master/src/transformers/models/t5/modeling_t5.py
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained(r'D:\Pretrained_Model\t5-base-finetuning')
model = T5ForConditionalGeneration.from_pretrained(r'D:\Pretrained_Model\t5-base-finetuning')

# Note: fine-tuning used the "summarize: " prefix, but the input below omits it; adding the prefix would match the training format more closely.
text = """
         (CNN)For the second time during his papacy, Pope Francis has announced a new group of bishops and archbishops set to become cardinals -- and they come from all over the world.
        Pope Francis said Sunday that he would hold a meeting of cardinals on February 14 "during which I will name 15 new Cardinals who, coming from 13 countries from every continent, manifest the indissoluble links between the Church of Rome and the particular Churches present in the world," according to Vatican Radio.
        New cardinals are always important because they set the tone in the church and also elect the next pope, CNN Senior Vatican Analyst John L. Allen said. They are sometimes referred to as the princes of the Catholic Church.
        The new cardinals come from countries such as Ethiopia, New Zealand and Myanmar.
        "This is a pope who very much wants to reach out to people on the margins, and you clearly see that in this set," Allen said. "You're talking about cardinals from typically overlooked places, like Cape Verde, the Pacific island of Tonga, Panama, Thailand, Uruguay."
        But for the second time since Francis' election, no Americans made the list.
        "Francis' pattern is very clear: He wants to go to the geographical peripheries rather than places that are already top-heavy with cardinals," Allen said.
        Christopher Bellitto, a professor of church history at Kean University in New Jersey, noted that Francis announced his new slate of cardinals on the Catholic Feast of the Epiphany, which commemorates the visit of the Magi to Jesus' birthplace in Bethlehem.
        "On feast of three wise men from far away, the Pope's choices for cardinal say that every local church deserves a place at the big table."
        In other words, Francis wants a more decentralized church and wants to hear reform ideas from small communities that sit far from Catholicism's power centers, Bellitto said.
        That doesn't mean Francis is the first pontiff to appoint cardinals from the developing world, though. Beginning in the 1920s, an increasing number of Latin American churchmen were named cardinals, and in the 1960s, St. John XXIII, whom Francis canonized last year, appointed the first cardinals from Japan, the Philippines and Africa.
        In addition to the 15 new cardinals Francis named on Sunday, five retired archbishops and bishops will also be honored as cardinals.
        Last year, Pope Francis appointed 19 new cardinals, including bishops from Haiti and Burkina Faso.
        CNN's Daniel Burke and Christabelle Fombu contributed to this report.
"""
# CNN/DM reference summary (highlights):
# @highlight
# The 15 new cardinals will be installed on February 14
# @highlight
# They come from countries such as Myanmar and Tonga
# @highlight
# No Americans made the list this time or the previous time in Francis' papacy

inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors='pt')

print('inputs = ', inputs)

summary_ids = model.generate(inputs['input_ids'])

print('\nsummary_ids = ', summary_ids)

print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False))

Output:

inputs =  {'input_ids': tensor([[   41,   254, 17235,    61,  3809,     8,   511,    97,   383,   112,
             3, 16281,  4710,     6, 17384, 11065,    65,  2162,     3,     9,
           126,   563,    13, 25214,     7,    11, 11508, 11514, 10776,     7,
           356,    12,   582,   895, 10270,     7,  1636,    11,    79,   369,
            45,    66,   147,     8,   296,     5, 17384, 11065,   243,  1771,
            24,     3,    88,   133,  1520,     3,     9,  1338,    13,   895,
         10270,     7,    30,  2083,   968,    96,    26,  7920,    84,    27,
            56,   564,   627,   126, 21967,     7,   113,     6,  1107,    45,
          1179,  1440,    45,   334, 10829,     6,  6571,     8,     3,  8482,
             7, 26175,  2416,   344,     8,  2345,    13,  7332,    11,     8,
          1090,  2345,    15,     7,   915,    16,     8,   296,   976,  1315,
            12, 25770,  5061,     5,   368,   895, 10270,     7,    33,   373,
           359,   250,    79,   356,     8,  5739,    16,     8,  2078,    11,
            92, 11924,     8,   416,  2783,    15,     6, 19602,  5523, 25770,
         25224,  1079,   301,     5, 10618,   243,     5,   328,    33,  1664,
             3,  4822,    12,    38,     8, 22277,     7,    13,     8,  6502,
          2345,     5,    37,   126,   895, 10270,     7,   369,    45,  1440,
           224,    38, 22138,     6,   368,  5725,    11, 27274,     5,    96,
          3713,    19,     3,     9,  2783,    15,   113,   182,   231,  2746,
            12,  1535,    91,    12,   151,    30,     8,  6346,     7,     6,
            11,    25,  3133,   217,    24,    16,    48,   356,   976, 10618,
           243,     5,    96,  3774,    31,    60,  2508,    81,   895, 10270,
             7,    45,  3115, 20633,  1747,     6,   114,  9702,   781,   221,
             6,     8,  5824,  3368,    13,   304,  1725,     9,     6, 21099,
             6, 10508,     6, 30758,   535,   299,    21,     8,   511,    97,
           437, 11065,    31,  4356,     6,   150,  5452,   263,     8,   570,
             5,    96,   371,    52, 11389,     7,    31,  3275,    19,   182,
           964,    10,   216,  2746,    12,   281,    12,     8, 20187,   158,
          5082,    88,  2593,  1066,   145,  1747,    24,    33,   641,   420,
            18,    88, 19649,    28,   895, 10270,     7,   976, 10618,   243,
             5, 14702,  5377,   155,   235,     6,     3,     9,  5812,    13,
          2078,   892,    44,  2566,   152,   636,    16,   368,  5092,     6,
          4466,    24, 11065,  2162,   112,   126, 21079,    13,   895, 10270,
             7,    30,     8,  6502,   377, 11535,    13,     8, 12741,  8237,
            63,     6,    84, 18681,    15,     7,     8,   719,    13,     8,
         22673,    12,  1850,    31,  3879,  4687,    16, 15659,   109,  6015,
             5,    96,  7638, 18886,    13,   386,  7624,  1076,    45,   623,
           550,     6,     8, 17384,    31,     7,  3703,    21,   895, 10270,
           497,    24,   334,   415,  2078, 15314,     3,     9,   286,    44,
             8,   600,   953,   535,    86,   119,  1234,     6, 11065,  2746,
             3,     9,    72,    20, 21411,  2078,    11,  2746,    12,  1616,
          5139,   912,    45,   422,  2597,    24,  2561,   623,    45,  6502,
           159,    51,    31,     7,   579,  6881,     6,  5377,   155,   235,
           243,     5,   466,   744,    31,    17,  1243, 11065,    19,     8,
           166, 19068,  5982,    12,     3,     9,   102,  2700,   895, 10270,
             7,    45,     8,  2421,   296,     6,   713,     5, 22738,    16,
             8, 13978,     7,     6,    46,  3094,   381,    13,  6271,   797,
          2078,   904,   130,  2650,   895, 10270,     7,     6,    11,    16,
             8,  8754,     7,     6,   472,     5,  1079,     3,     4,     4,
         13671,     6,  4068, 11065,    54,   106,  1601,   336,   215,     6,
          7817,     8,   166,   895, 10270,     7,    45,  3411,     6,     8,
         12729,    11,  2648,     5,    86,   811,    12,     8,   627,   126,
           895, 10270,     7, 11065,  2650,    30,  1771,     6,   874, 10611,
         11508, 11514, 10776,     7,    11, 25214,     7,    56,    92,    36,
         13242,    38,   895, 10270,     7,     5,  2506,   215,     6, 17384,
         11065,  7817,   957,   126,   895, 10270,     7,     6,   379, 25214,
             7,    45, 22179,    11,  4152,  2917,     9,  1699,     7,    32,
             5, 19602,    31,     7,  4173, 27575,    11,  2144, 10333,   109,
           377,  8038,    76,  9859,    12,    48,   934,     5,     1]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}

summary_ids =  tensor([[    0,    37, 17384,    65,  2650,   627,   126,   895, 10270,     7,
            45,  1179,  1440,     6,   379,     8, 12729,     6, 22179,    11]])

['The Pope has named 15 new cardinals from 13 countries, including the Philippines, Haiti and']
['The Pope has named 15 new cardinals from 13 countries, including the Philippines, Haiti and']

Process finished with exit code 0
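To put a number on the difference between the zero-shot and fine-tuned summaries, the same rouge metric used during fine-tuning can be applied to a single example. In the sketch below, the reference string is the CNN/DM highlights from the comment above joined into one text, which is an illustrative choice rather than the official evaluation protocol.

from datasets import load_metric

metric = load_metric("rouge")

prediction = "The Pope has named 15 new cardinals from 13 countries, including the Philippines, Haiti and"
reference = ("The 15 new cardinals will be installed on February 14. "
             "They come from countries such as Myanmar and Tonga. "
             "No Americans made the list this time or the previous time in Francis' papacy.")

result = metric.compute(predictions=[prediction], references=[reference], use_stemmer=True)
print({k: round(v.mid.fmeasure * 100, 2) for k, v in result.items()})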



References:
T5: a model that explores the limits of transfer learning
The T5 model: a large-scale exploration of Text-to-Text pretraining for NLP
Google's pretrained language model T5
Using Transformers pretrained models: text summarization (Summarization)
