Why does the bart-large-cnn summarization model give funny output with different length settings?

Posted by juud5qan on 2023-03-21 in Python


I have a text of 4226 characters (316 words + special characters).
I am trying different combinations of min_length and max_length to get a summary:

print(summarizer(INPUT, max_length = 1000, min_length=500, do_sample=False))

The code is:

from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

INPUT = """We see ChatGPT as an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. As ChatGPT stated, large language models can be put to work as a communication engine in a variety of applications across a number of vertical markets. Glaringly absent in its answer is the use of ChatGPT in search engines. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The use of a large language model enables more complex and more natural searches and extract deeper meaning and better context from source material. This is ultimately expected to deliver more robust and useful results. Is AI coming for your job? Every wave of new and disruptive technology has incited fears of mass job losses due to automation, and we are already seeing those fears expressed relative to AI generally and ChatGPT specifically. The year 1896, when Henry Ford rolled out his first automobile, was probably not a good year for buggy whip makers. When IBM introduced its first mainframe, the System/360, in 1964, office workers feared replacement by mechanical brains that never made mistakes, never called in sick, and never took vacations. There are certainly historical cases of job displacement due to new technology adoption, and ChatGPT may unseat some office workers or customer service reps. However, we think AI tools broadly will end up as part of the solution in an economy that has more job openings than available workers. However, economic history shows that technology of any sort (i.e., manufacturing technology, communications technology, information technology) ultimately makes productive workers more productive and is net additive to employment and economic growth. How big is the opportunity? The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data. We expect the market to grow by 20% CAGR to reach USD 90bn by 2025. 
Given the relatively early monetization stage of conversational AI, we estimate that the segment accounted for 10% of the broader AI’s addressable market in 2020, predominantly from enterprise and consumer subscriptions. That said, user adoption is rapidly rising. ChatGPT reached its first 1 million user milestone in a week, surpassing Instagram to become the quickest application to do so. Similarly, we see strong interest from enterprises to integrate conservational AI into their existing ecosystem. As a result, we believe conversational AI’s share in the broader AI’s addressable market can climb to 20% by 2025 (USD 18–20bn). Our estimate may prove to be conservative; they could be even higher if conversational AI improvements (in terms of computing power, machine learning, and deep learning capabilities), availability of talent, enterprise adoption, spending from governments, and incentives are stronger than expected. How to invest in AI? We see artificial intelligence as a horizontal technology that will have important use cases across a number of applications and industries. From a broader perspective, AI, along with big data and cybersecurity, forms what we call the ABCs of technology. We believe these three major foundational technologies are at inflection points and should see faster adoption over the next few years as enterprises and governments increase their focus and investments in these areas. Conservational AI is currently in its early stages of monetization and costs remain high as it is expensive to run. Instead of investing directly in such platforms, interested investors in the short term can consider semiconductor companies, and cloud-service providers that provides the infrastructure needed for generative AI to take off. In the medium to long term, companies can integrate generative AI to improve margins across industries and sectors, such as within healthcare and traditional manufacturing. 
Outside of public equities, investors can also consider opportunities in private equity (PE). We believe the tech sector is currently undergoing a new innovation cycle after 12–18 months of muted activity, which provides interesting and new opportunities that PE can capture through early-stage investments."""

print(summarizer(INPUT, max_length = 1000, min_length=500, do_sample=False))

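My questions are:

Q1: What does the following warning message mean? `Your max_length is set to 1000, ...`

Your max_length is set to 1000, but you input_length is only 856. You might consider decreasing max_length manually, e.g. summarizer('...', max_length=428)

Q2: After the above message, it publishes a summary of 2211 characters. How did it get that?

Q3: Of the above 2211 characters, the first 933 characters are valid content from the text, but then it publishes text like

For confidential support call the Samaritans on 08457 90 90 90 or visit a local Samaritans branch, see www.samaritans.org for details.

Q4: How do min_length and max_length actually work (it does not seem to follow the given limits)?

Q5: What is the maximum input that I can actually give to this summarizer?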


Source: https://stackoverflow.com/questions/75795474/why-did-the-bart-large-cnn-summarization-model-giving-funny-output-with-differen

1 Answer

5cg8jx4n1#

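Q2: After the above message, it publishes a summary of 2211 characters. How did it get that?

A: The length the model sees is not the number of characters, so Q2 is a little out of scope. It is more appropriate to check whether the model's output is shorter than the input in terms of subword tokens.

How we humans count words is a bit different from how the model counts tokens, i.e.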

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")

text = """We see ChatGPT as an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. As ChatGPT stated, large language models can be put to work as a communication engine in a variety of applications across a number of vertical markets. Glaringly absent in its answer is the use of ChatGPT in search engines. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The use of a large language model enables more complex and more natural searches and extract deeper meaning and better context from source material. This is ultimately expected to deliver more robust and useful results. Is AI coming for your job? Every wave of new and disruptive technology has incited fears of mass job losses due to automation, and we are already seeing those fears expressed relative to AI generally and ChatGPT specifically. The year 1896, when Henry Ford rolled out his first automobile, was probably not a good year for buggy whip makers. When IBM introduced its first mainframe, the System/360, in 1964, office workers feared replacement by mechanical brains that never made mistakes, never called in sick, and never took vacations. There are certainly historical cases of job displacement due to new technology adoption, and ChatGPT may unseat some office workers or customer service reps. However, we think AI tools broadly will end up as part of the solution in an economy that has more job openings than available workers. However, economic history shows that technology of any sort (i.e., manufacturing technology, communications technology, information technology) ultimately makes productive workers more productive and is net additive to employment and economic growth. How big is the opportunity? The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data. We expect the market to grow by 20% CAGR to reach USD 90bn by 2025. 
Given the relatively early monetization stage of conversational AI, we estimate that the segment accounted for 10% of the broader AI’s addressable market in 2020, predominantly from enterprise and consumer subscriptions. That said, user adoption is rapidly rising. ChatGPT reached its first 1 million user milestone in a week, surpassing Instagram to become the quickest application to do so. Similarly, we see strong interest from enterprises to integrate conservational AI into their existing ecosystem. As a result, we believe conversational AI’s share in the broader AI’s addressable market can climb to 20% by 2025 (USD 18–20bn). Our estimate may prove to be conservative; they could be even higher if conversational AI improvements (in terms of computing power, machine learning, and deep learning capabilities), availability of talent, enterprise adoption, spending from governments, and incentives are stronger than expected. How to invest in AI? We see artificial intelligence as a horizontal technology that will have important use cases across a number of applications and industries. From a broader perspective, AI, along with big data and cybersecurity, forms what we call the ABCs of technology. We believe these three major foundational technologies are at inflection points and should see faster adoption over the next few years as enterprises and governments increase their focus and investments in these areas. Conservational AI is currently in its early stages of monetization and costs remain high as it is expensive to run. Instead of investing directly in such platforms, interested investors in the short term can consider semiconductor companies, and cloud-service providers that provides the infrastructure needed for generative AI to take off. In the medium to long term, companies can integrate generative AI to improve margins across industries and sectors, such as within healthcare and traditional manufacturing. 
Outside of public equities, investors can also consider opportunities in private equity (PE). We believe the tech sector is currently undergoing a new innovation cycle after 12–18 months of muted activity, which provides interesting and new opportunities that PE can capture through early-stage investments."""

tokenized_text = tokenizer(text)

print(len(tokenized_text['input_ids']))

[out]:

800

We see that the input text in the example has 800 input subword tokens, not 300 words.
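To make the distinction concrete, here is a minimal sketch that measures the same text in three ways. The helper name `length_report` and the toy regex tokenizer are assumptions for illustration only; the toy tokenizer is not BART's and will give different counts.

```python
import re
from typing import Callable, Dict, List


def toy_tokenize(text: str) -> List[str]:
    # Illustrative stand-in only: splits into word chunks and punctuation.
    # It is NOT the BART tokenizer and will give different counts.
    return re.findall(r"\w+|[^\w\s]", text)


def length_report(
    text: str, tokenize: Callable[[str], List[str]] = toy_tokenize
) -> Dict[str, int]:
    """Measure one text in characters, whitespace-separated words, and tokens."""
    return {
        "characters": len(text),
        "words": len(text.split()),
        "tokens": len(tokenize(text)),
    }


sample = "Microsoft, which is an investor in OpenAI, is integrating ChatGPT."
print(length_report(sample))
```

With the real tokenizer loaded above, something like `length_report(text, tokenizer.tokenize)` should report the subword count, excluding the `<s>` and `</s>` tokens that `tokenizer(text)` adds.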

Q1: What does the following mean? `Your max_length is set to 1000 ...`

The warning message is as follows:
Your max_length is set to 1000, but you input_length is only 856. You might consider decreasing max_length manually, e.g. summarizer(‘…’, max_length=428)
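A rough sketch of the check behind this warning follows. It is inferred from the messages quoted in this thread (856 leading to a suggested 428, and 800 leading to 400), so the halving rule and the function name `check_max_length` are assumptions, not the library's actual code. The point is that both numbers are token counts: `max_length` caps the summary, and a cap larger than the input itself rarely makes sense for summarization.

```python
from typing import Optional


def check_max_length(input_length: int, max_length: int) -> Optional[str]:
    """Return a warning string when max_length exceeds the input token count.

    Hypothetical re-creation of the pipeline's sanity check; the suggested
    value (half the input length) is inferred from the quoted messages.
    """
    if max_length > input_length:
        suggestion = input_length // 2
        return (
            f"max_length={max_length} is larger than input_length={input_length}; "
            f"consider something like max_length={suggestion}"
        )
    return None


print(check_max_length(input_length=856, max_length=1000))
print(check_max_length(input_length=856, max_length=142))
```

The warning is only advisory: generation still runs with the `max_length` that was passed in.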

Let's first try feeding the input to the model directly (without the pipeline) and see how many tokens it outputs.

[code]:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

text = """We see ChatGPT as an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. As ChatGPT stated, large language models can be put to work as a communication engine in a variety of applications across a number of vertical markets. Glaringly absent in its answer is the use of ChatGPT in search engines. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The use of a large language model enables more complex and more natural searches and extract deeper meaning and better context from source material. This is ultimately expected to deliver more robust and useful results. Is AI coming for your job? Every wave of new and disruptive technology has incited fears of mass job losses due to automation, and we are already seeing those fears expressed relative to AI generally and ChatGPT specifically. The year 1896, when Henry Ford rolled out his first automobile, was probably not a good year for buggy whip makers. When IBM introduced its first mainframe, the System/360, in 1964, office workers feared replacement by mechanical brains that never made mistakes, never called in sick, and never took vacations. There are certainly historical cases of job displacement due to new technology adoption, and ChatGPT may unseat some office workers or customer service reps. However, we think AI tools broadly will end up as part of the solution in an economy that has more job openings than available workers. However, economic history shows that technology of any sort (i.e., manufacturing technology, communications technology, information technology) ultimately makes productive workers more productive and is net additive to employment and economic growth. How big is the opportunity? The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data. We expect the market to grow by 20% CAGR to reach USD 90bn by 2025. 
Given the relatively early monetization stage of conversational AI, we estimate that the segment accounted for 10% of the broader AI’s addressable market in 2020, predominantly from enterprise and consumer subscriptions. That said, user adoption is rapidly rising. ChatGPT reached its first 1 million user milestone in a week, surpassing Instagram to become the quickest application to do so. Similarly, we see strong interest from enterprises to integrate conservational AI into their existing ecosystem. As a result, we believe conversational AI’s share in the broader AI’s addressable market can climb to 20% by 2025 (USD 18–20bn). Our estimate may prove to be conservative; they could be even higher if conversational AI improvements (in terms of computing power, machine learning, and deep learning capabilities), availability of talent, enterprise adoption, spending from governments, and incentives are stronger than expected. How to invest in AI? We see artificial intelligence as a horizontal technology that will have important use cases across a number of applications and industries. From a broader perspective, AI, along with big data and cybersecurity, forms what we call the ABCs of technology. We believe these three major foundational technologies are at inflection points and should see faster adoption over the next few years as enterprises and governments increase their focus and investments in these areas. Conservational AI is currently in its early stages of monetization and costs remain high as it is expensive to run. Instead of investing directly in such platforms, interested investors in the short term can consider semiconductor companies, and cloud-service providers that provides the infrastructure needed for generative AI to take off. In the medium to long term, companies can integrate generative AI to improve margins across industries and sectors, such as within healthcare and traditional manufacturing. 
Outside of public equities, investors can also consider opportunities in private equity (PE). We believe the tech sector is currently undergoing a new innovation cycle after 12–18 months of muted activity, which provides interesting and new opportunities that PE can capture through early-stage investments."""

tokenized_text = tokenizer(text, return_tensors="pt")

outputs = model.generate(tokenized_text['input_ids'])

tokenizer.decode(outputs[0], skip_special_tokens=True)

[stderr]:

/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py:1288: 

UserWarning: Using `max_length`'s default (142) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.

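[stdout]:

ChatGPT is an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data.

Check the number of output tokens: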

print(outputs.shape)

print(len(tokenizer.decode(outputs[0], skip_special_tokens=True)))

[out]:

torch.Size([1, 73])
343

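So the model summarizes the 800-subword-token input into an output of 73 subwords, made up of 343 characters.

I'm not sure how you got the 2k+ characters of output, so let's try the pipeline.

[code]: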

from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

text = """We see ChatGPT as an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. As ChatGPT stated, large language models can be put to work as a communication engine in a variety of applications across a number of vertical markets. Glaringly absent in its answer is the use of ChatGPT in search engines. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The use of a large language model enables more complex and more natural searches and extract deeper meaning and better context from source material. This is ultimately expected to deliver more robust and useful results. Is AI coming for your job? Every wave of new and disruptive technology has incited fears of mass job losses due to automation, and we are already seeing those fears expressed relative to AI generally and ChatGPT specifically. The year 1896, when Henry Ford rolled out his first automobile, was probably not a good year for buggy whip makers. When IBM introduced its first mainframe, the System/360, in 1964, office workers feared replacement by mechanical brains that never made mistakes, never called in sick, and never took vacations. There are certainly historical cases of job displacement due to new technology adoption, and ChatGPT may unseat some office workers or customer service reps. However, we think AI tools broadly will end up as part of the solution in an economy that has more job openings than available workers. However, economic history shows that technology of any sort (i.e., manufacturing technology, communications technology, information technology) ultimately makes productive workers more productive and is net additive to employment and economic growth. How big is the opportunity? The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data. We expect the market to grow by 20% CAGR to reach USD 90bn by 2025. 
Given the relatively early monetization stage of conversational AI, we estimate that the segment accounted for 10% of the broader AI’s addressable market in 2020, predominantly from enterprise and consumer subscriptions. That said, user adoption is rapidly rising. ChatGPT reached its first 1 million user milestone in a week, surpassing Instagram to become the quickest application to do so. Similarly, we see strong interest from enterprises to integrate conservational AI into their existing ecosystem. As a result, we believe conversational AI’s share in the broader AI’s addressable market can climb to 20% by 2025 (USD 18–20bn). Our estimate may prove to be conservative; they could be even higher if conversational AI improvements (in terms of computing power, machine learning, and deep learning capabilities), availability of talent, enterprise adoption, spending from governments, and incentives are stronger than expected. How to invest in AI? We see artificial intelligence as a horizontal technology that will have important use cases across a number of applications and industries. From a broader perspective, AI, along with big data and cybersecurity, forms what we call the ABCs of technology. We believe these three major foundational technologies are at inflection points and should see faster adoption over the next few years as enterprises and governments increase their focus and investments in these areas. Conservational AI is currently in its early stages of monetization and costs remain high as it is expensive to run. Instead of investing directly in such platforms, interested investors in the short term can consider semiconductor companies, and cloud-service providers that provides the infrastructure needed for generative AI to take off. In the medium to long term, companies can integrate generative AI to improve margins across industries and sectors, such as within healthcare and traditional manufacturing. 
Outside of public equities, investors can also consider opportunities in private equity (PE). We believe the tech sector is currently undergoing a new innovation cycle after 12–18 months of muted activity, which provides interesting and new opportunities that PE can capture through early-stage investments."""

output = summarizer(text)

print(output)

[out]:

[{'summary_text': 'ChatGPT is an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data.'}]

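Check the size of the output:

print(len(output[0]['summary_text']))

[out]:

343

This is consistent with how we used the model without the pipeline: a 343-character summary.

Q: So is there no need to set `max_new_tokens`?

Right, you don't need to do anything, since the summary is already shorter than the input text.

Q: What does setting `max_new_tokens` do?

We know that the default output summary gives us 73 tokens. Let's see what happens if we set it down to 30 tokens!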

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

text = """We see ChatGPT as an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. As ChatGPT stated, large language models can be put to work as a communication engine in a variety of applications across a number of vertical markets. Glaringly absent in its answer is the use of ChatGPT in search engines. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The use of a large language model enables more complex and more natural searches and extract deeper meaning and better context from source material. This is ultimately expected to deliver more robust and useful results. Is AI coming for your job? Every wave of new and disruptive technology has incited fears of mass job losses due to automation, and we are already seeing those fears expressed relative to AI generally and ChatGPT specifically. The year 1896, when Henry Ford rolled out his first automobile, was probably not a good year for buggy whip makers. When IBM introduced its first mainframe, the System/360, in 1964, office workers feared replacement by mechanical brains that never made mistakes, never called in sick, and never took vacations. There are certainly historical cases of job displacement due to new technology adoption, and ChatGPT may unseat some office workers or customer service reps. However, we think AI tools broadly will end up as part of the solution in an economy that has more job openings than available workers. However, economic history shows that technology of any sort (i.e., manufacturing technology, communications technology, information technology) ultimately makes productive workers more productive and is net additive to employment and economic growth. How big is the opportunity? The broad AI hardware and services market was nearly USD 36bn in 2020, based on IDC and Bloomberg Intelligence data. We expect the market to grow by 20% CAGR to reach USD 90bn by 2025. 
Given the relatively early monetization stage of conversational AI, we estimate that the segment accounted for 10% of the broader AI’s addressable market in 2020, predominantly from enterprise and consumer subscriptions. That said, user adoption is rapidly rising. ChatGPT reached its first 1 million user milestone in a week, surpassing Instagram to become the quickest application to do so. Similarly, we see strong interest from enterprises to integrate conservational AI into their existing ecosystem. As a result, we believe conversational AI’s share in the broader AI’s addressable market can climb to 20% by 2025 (USD 18–20bn). Our estimate may prove to be conservative; they could be even higher if conversational AI improvements (in terms of computing power, machine learning, and deep learning capabilities), availability of talent, enterprise adoption, spending from governments, and incentives are stronger than expected. How to invest in AI? We see artificial intelligence as a horizontal technology that will have important use cases across a number of applications and industries. From a broader perspective, AI, along with big data and cybersecurity, forms what we call the ABCs of technology. We believe these three major foundational technologies are at inflection points and should see faster adoption over the next few years as enterprises and governments increase their focus and investments in these areas. Conservational AI is currently in its early stages of monetization and costs remain high as it is expensive to run. Instead of investing directly in such platforms, interested investors in the short term can consider semiconductor companies, and cloud-service providers that provides the infrastructure needed for generative AI to take off. In the medium to long term, companies can integrate generative AI to improve margins across industries and sectors, such as within healthcare and traditional manufacturing. 
Outside of public equities, investors can also consider opportunities in private equity (PE). We believe the tech sector is currently undergoing a new innovation cycle after 12–18 months of muted activity, which provides interesting and new opportunities that PE can capture through early-stage investments."""

tokenized_text = tokenizer(text, return_tensors="pt")

outputs = model.generate(tokenized_text['input_ids'], max_new_tokens=30)

[stderr]:

ValueError                                Traceback (most recent call last)
<ipython-input-26-665cd5fbe802> in <module>
      3 tokenized_text = tokenizer(text, return_tensors="pt")
      4 
----> 5 model.generate(tokenized_text['input_ids'], max_new_tokens=30)

1 frames
/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py in generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, **kwargs)
   1304 
   1305         if generation_config.min_length is not None and generation_config.min_length > generation_config.max_length:
-> 1306             raise ValueError(
   1307                 f"Unfeasible length constraints: the minimum length ({generation_config.min_length}) is larger than"
   1308                 f" the maximum length ({generation_config.max_length})"

ValueError: Unfeasible length constraints: the minimum length (56) is larger than the maximum length (31)

Aha, there is a minimum length that the model expects for the summary output!
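The numbers in that error can be reproduced with a little arithmetic. The sketch below assumes, based on the traceback (30 new tokens giving a maximum length of 31), that for this encoder-decoder model the effective `max_length` is `max_new_tokens` plus the single start token the decoder begins with. The helper names are hypothetical.

```python
DEFAULT_MIN_LENGTH = 56   # minimum length reported in the error above
DECODER_PROMPT_LEN = 1    # assumed: generation starts from one decoder token


def effective_max_length(max_new_tokens: int, prompt_len: int = DECODER_PROMPT_LEN) -> int:
    """Total output length allowed once the starting token is counted."""
    return max_new_tokens + prompt_len


def validate_lengths(max_new_tokens: int, min_length: int = DEFAULT_MIN_LENGTH) -> int:
    """Mimic the feasibility check: min_length must not exceed max_length."""
    max_length = effective_max_length(max_new_tokens)
    if min_length > max_length:
        raise ValueError(
            f"Unfeasible length constraints: the minimum length ({min_length}) "
            f"is larger than the maximum length ({max_length})"
        )
    return max_length


for budget in (30, 60):
    try:
        print(budget, "->", validate_lengths(budget))
    except ValueError as err:
        print(budget, "->", err)
```

Under these assumptions, a budget of 30 fails while 60 passes; the other way out would be to pass a smaller `min_length` together with the smaller `max_new_tokens`.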

So let's try setting it to 60:

tokenized_text = tokenizer(text, return_tensors="pt")

outputs = model.generate(tokenized_text['input_ids'], max_new_tokens=60)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

[out]:

ChatGPT is an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. The broad AI hardware and services market was nearly USD 36bn

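We see that the summary output is now shorter than the default output of 73 tokens and fits within the limit of 60 max_new_tokens that we set.

If we check print(len(outputs[0])), we get 61 subword tokens. The one token on top of max_new_tokens is the special token that generation starts from: if you print outputs, you will see that the first token id is 2, which is the </s> token.
When you specify skip_special_tokens=True, it removes the </s> tokens as well as the start-of-sentence token <s>.

Q4: How do min_length and max_length actually work (it does not seem to follow the given limits)?

In the example above, min_length is actually hard to pin down, since the model has to decide the minimum number of subword tokens it needs to produce a good summary. Remember the Unfeasible length constraints: the minimum length (56) ... error? That 56 is the default minimum that comes with the model, just as 142 is its default max_length.

Q5: What is the maximum input that I can actually give to this summarizer?

A sensible max_length, or more appropriately max_new_tokens, will most probably be lower than your input length. If there are UI limitations or compute/latency limitations, it is best to keep it low and close to whatever length you actually need.
That is, to set max_new_tokens, just make sure it is lower than the number of tokens in the input text and sensible enough for your application. If you want a ballpark number, run the model without setting any limit, see whether the summary output is the behavior you expect, and then adjust accordingly.
Like seasoning while cooking, ***"add/reduce max_new_tokens as desired"***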
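As for the largest input itself, here is a minimal sketch, assuming the limit is whatever `tokenizer.model_max_length` reports (1024 tokens for BART models, as far as I know). The helper `plan_summary_lengths` and its default ratio are made up for illustration; they simply keep the output budget below the input length and at or above the minimum of 56 seen in the error earlier.

```python
from typing import Dict


def plan_summary_lengths(
    input_tokens: int,
    model_limit: int = 1024,   # assumed input limit; check tokenizer.model_max_length
    ratio: float = 0.25,       # hypothetical target: summary ~ a quarter of the input
    min_length: int = 56,      # default minimum seen in the error message above
) -> Dict[str, object]:
    """Suggest generation settings for an input of `input_tokens` tokens."""
    needs_truncation = input_tokens > model_limit
    usable = min(input_tokens, model_limit)
    # Keep the budget at or above min_length and strictly below the input length.
    max_new_tokens = max(min_length, int(usable * ratio))
    max_new_tokens = min(max_new_tokens, max(usable - 1, 1))
    return {
        "needs_truncation": needs_truncation,
        "max_new_tokens": max_new_tokens,
        "min_length": min(min_length, max_new_tokens),
    }


print(plan_summary_lengths(800))
print(plan_summary_lengths(3000))
```

If `needs_truncation` comes back true, the text would have to be truncated (for example by calling the tokenizer with `truncation=True`) or split into chunks before summarizing.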

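Q3: Of the above 2211 characters, the first 933 characters are valid content from the text, but then it publishes text like ...

When you set min_length to some arbitrarily large number, much larger than the model's default output of 73 subwords,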

print(summarizer(text, max_length=900, min_length=300, do_sample=False))

print(summarizer(text, max_length=900, min_length=500, do_sample=False))

it will warn you:
[stderr]:

Your max_length is set to 900, but you input_length is only 800. You might consider decreasing max_length manually, e.g. summarizer('...', max_length=400)

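And it will start to hallucinate beyond the first 300 subword tokens. Possibly, the model considers that beyond 300 subwords nothing else in the input text is important.
The output looks something like: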

[{'summary_text': 'ChatGPT is an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. ... They recommend semiconductor companies, cloud-service providers that provides the infrastructure needed for generative AI to take off, and private equity firms that provide the infrastructure for cloud-based services. They also suggest investors can consider opportunities in private equity (PE) to invest in AI platforms in the short-term and in the medium to long-term.'}]

[{'summary_text': "ChatGPT is an engine that will eventually power human interactions with computer systems in a familiar, natural, and intuitive way. Microsoft, which is an investor in OpenAI, is integrating ChatGPT into its Bing search engine. ... They say AI tools broadly will end up as part of the solution in an economy that has more job openings than available workers. The technology of any sort (i.e., manufacturing technology, communications technology, information technology) ultimately makes productive workers more productive and is net additive to employment and economic growth, they say. The authors believe the tech sector is currently undergoing a new innovation cycle after 12–18 months of muted activity, which provides interesting and new opportunities that PE can capture through early-stage investments. They recommend semiconductor companies, cloud-service providers that provides the infrastructure needed for generative AI to take off, and private equity firms that provide the infrastructure for cloud-based services. They also suggest investors can consider opportunities in private equity (PE) to invest in AI platforms in the short-term and in the medium to long-term, such as within healthcare and traditional manufacturing. The author's firm is based in New York and they have worked with Microsoft, Google, Facebook, and others on AI projects in the past. The firm has also worked with Google, Microsoft, Facebook and others to develop AI products and services in the U.S. and abroad. For confidential support, call the National Suicide Prevention Lifeline at 1-800-273-8255 or visit http://www.suicidepreventionlifeline.org/. For confidential. support on suicide matters call the Samaritans on 08457 90 90 90 or visit a local Samaritans branch or click here for details. 
In the UK, contact Samaritans at 08457 909090 or visit\xa0the Samaritans’\xa0online helpline at http:// www.samaritans.org\xa0or\xa0click\xa0here for details on how to get involved in the UK’s national suicide prevention Lifeline (in the UK or the UK). For confidential help in the United States, call\xa0the National suicide Prevention Line at\xa0800\xa0273\xa08255."}]

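Q: Why does the model start to hallucinate beyond 300 subwords?

Good question, and it is also an active research area; see https://aclanthology.org/2022.naacl-main.387/, and there is a lot more work in this area.

[Opinion]: Personally, my hunch is that for most of the data the model learned from, the texts are 800+ subwords and the summaries it was trained on are between 80 and 300 subwords. And the training data points that do have 300-500 subwords in the summary always contain the SOS helpline. So the model starts to overfit whenever it is pushed to a min_length above 300.

To test the hunch, try another random text of 800+ subwords and set min_length to 500 again; it will most probably hallucinate the SOS sentences again beyond 300 subwords.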
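One way to run that experiment without reading every summary by eye is a small check for helpline-style phrases that appear in the summary but not in the source. This is a sketch only: the marker list is taken from the hallucinated output above, and the function name is hypothetical.

```python
from typing import List

# Phrases lifted from the hallucinated output shown earlier in this answer.
HELPLINE_MARKERS = ("samaritans", "suicide", "confidential support", "lifeline")


def hallucinated_markers(source: str, summary: str) -> List[str]:
    """Return marker phrases present in the summary but absent from the source."""
    src, out = source.lower(), summary.lower()
    return [m for m in HELPLINE_MARKERS if m in out and m not in src]


source = "We see ChatGPT as an engine that will eventually power human interactions."
good = "ChatGPT is an engine that will eventually power human interactions."
bad = good + " For confidential support call the Samaritans on 08457 90 90 90."

print(hallucinated_markers(source, good))
print(hallucinated_markers(source, bad))
```

Running it over the outputs of `summarizer(other_text, max_length=900, min_length=500, do_sample=False)` for a few unrelated texts would show how often the helpline text reappears.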

Answered on 2023-03-21
