Other differences in CTranslate2's beam search implementation?

Posted by 8aqjt8rx 2 months ago in Other


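To make sure this issue doesn't get overlooked, I'm repeating my reply here:
I ran into the same problem. However, after reading this reply, I assumed I would get identical results as long as I didn't use any special generation parameters (such as no_repeat_ngram_size). Unfortunately, there seem to be other differences in the beam search implementation, or am I missing something?
To reproduce:
Package versions: transformers==4.34.0, ctranslate2==3.20.0 (the versions used here)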

Convert the model:

ct2-transformers-converter --model "google/flan-t5-base" --output_dir "ct2-t5-base"

Code snippet:

import torch

from transformers import T5ForConditionalGeneration, AutoTokenizer
import ctranslate2

device = torch.device("cuda")

model_name = "google/flan-t5-base"
hf_model = T5ForConditionalGeneration.from_pretrained(model_name).eval().to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)

fast_model =  ctranslate2.Translator("ct2-t5-base", device="cuda")

text = "translate English to German: physician assistants are medical providers who are licensed to diagnose and treat illness and disease and to prescribe medication"

def get_out(inp, model):
    inputs = tokenizer(inp, return_tensors="pt")
    ids = model.generate(**inputs.to(device),
                         num_beams=3,
                         min_length=0,
                         max_length=1024,
                         )
    return tokenizer.batch_decode(ids, skip_special_tokens=True)[0]

def get_out_fast(inp, model):
    source = tokenizer.encode(inp)
    source = tokenizer.convert_ids_to_tokens(source)
    results = model.translate_batch([source],
                                    beam_size=3,
                                    min_decoding_length=0,
                                    max_decoding_length=1024)
    target = results[0].hypotheses[0]
    return tokenizer.decode(tokenizer.convert_tokens_to_ids(target), skip_special_tokens=True)

res_vanilla = get_out(text, hf_model)
res_fast = get_out_fast(text, fast_model)

print("Vanilla output:", res_vanilla)
print("Ctranslate output:", res_fast)

Output:

Vanilla output: physician assistants sind medical providers, die zu Diagnose und Behandlung von Krankheiten und Krankheiten und zu Verknüpfen von Medikamenten zu ermitteln.
Ctranslate output: physician assistants sind medical providers, die zu Diagnose und Behandlung von Krankheiten und Krankheiten und zu Verknüpfen von Medikamenten zu kaufen sind.
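
When comparing two decoders it helps to know where the outputs first part ways, since everything after that point is conditioned on a different prefix. The helper below is a minimal sketch for that; `first_divergence` is a hypothetical name, not part of transformers or CTranslate2, and it compares whitespace-separated words from the two outputs above rather than model tokens.

```python
from itertools import zip_longest


def first_divergence(seq_a, seq_b):
    """Return (index, item_a, item_b) at the first mismatch, or None if equal.

    If one sequence is a strict prefix of the other, the missing item is None.
    """
    for i, (x, y) in enumerate(zip_longest(seq_a, seq_b)):
        if x != y:
            return i, x, y
    return None


# The two outputs reported above, copied verbatim.
vanilla = ("physician assistants sind medical providers, die zu Diagnose und "
           "Behandlung von Krankheiten und Krankheiten und zu Verknüpfen von "
           "Medikamenten zu ermitteln.")
fast = ("physician assistants sind medical providers, die zu Diagnose und "
        "Behandlung von Krankheiten und Krankheiten und zu Verknüpfen von "
        "Medikamenten zu kaufen sind.")

result = first_divergence(vanilla.split(), fast.split())
print(result)  # (20, 'ermitteln.', 'kaufen')
```

Here the two outputs agree for the first 20 words and differ only at the very end. The same function works on the token lists returned by either library if a token-level comparison is preferred.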


Source: https://github.com/OpenNMT/CTranslate2/issues/1740

5 answers


t5zmwmid1#

What happens if you run the same thing with beam=1?

2 months ago

o2rvlv0m2#

In that case the outputs are identical:

Vanilla output: physician assistants sind medizinische Versorgungsträger, die ärztlichen Versorgungskräfte benötigen, um Krankheiten und Krankheiten zu diagnostischen und zu behandeln und zu prescriben Medikamenten.
Ctranslate output: physician assistants sind medizinische Versorgungsträger, die ärztlichen Versorgungskräfte benötigen, um Krankheiten und Krankheiten zu diagnostischen und zu behandeln und zu prescriben Medikamenten.

2 months ago

nzrxty8p3#

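If you have time, you could test CT2 2.24, which predates these changes:
CTranslate2/CHANGELOG.md, lines 551 to 552 in 39f48f2:
* Remove option normalize_scores: the scores are now always divided by pow(length, length_penalty), with length_penalty defaulting to 1
* Remove option allow_early_exit: the beam search now exits early only when no penalties are used
Then test with and without allow_early_exit and length_penalty.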
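
To make the normalization in that changelog entry concrete, here is a small sketch of how dividing a cumulative log-probability by pow(length, length_penalty) can change which hypothesis ranks first. The hypothesis names, scores, and lengths are made up for illustration, and `normalized_score` and `best` are hypothetical helpers, not CTranslate2 functions.

```python
def normalized_score(cum_logprob, length, length_penalty=1.0):
    # Score normalization as described in the changelog entry:
    # cumulative log-probability divided by pow(length, length_penalty).
    return cum_logprob / pow(length, length_penalty)


# Made-up finished hypotheses: name -> (cumulative log-prob, length in tokens).
hypotheses = {
    "short": (-2.0, 4),
    "long": (-3.0, 8),
}


def best(length_penalty):
    # Pick the hypothesis with the highest normalized score.
    return max(
        hypotheses,
        key=lambda name: normalized_score(*hypotheses[name], length_penalty),
    )


for lp in (0.0, 1.0, 3.0):
    print(f"length_penalty={lp}: best={best(lp)}")
```

With length_penalty=0 the raw sums are compared and the shorter hypothesis wins (-2.0 against -3.0). With length_penalty=1 the per-token averages are compared and the longer one wins (-0.375 against -0.5). A framework that normalizes at a different point in the search, or counts length differently, can therefore return a different top hypothesis from the same model.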

2 months ago

nle07wnf4#

OK, I had to use a different model because ctranslate2==2.24.0 does not support T5. Here is what I got in my experiments:
Model conversion:

ct2-transformers-converter --model "beogradjanka/bart_finetuned_keyphrase_extraction" --output_dir "ct2-bart"

New code snippet:

from itertools import product

import torch

from transformers import BartForConditionalGeneration, AutoTokenizer
import ctranslate2

device = torch.device("cuda")

model_name = "beogradjanka/bart_finetuned_keyphrase_extraction"
hf_model = BartForConditionalGeneration.from_pretrained(model_name).eval().to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)

fast_model =  ctranslate2.Translator("ct2-bart", device="cuda")


text = (
    "The core CTranslate2 implementation is framework agnostic. The logic that is specific to each framework is moved "
    "to a conversion step that loads supported models into a unified representation. The weights are then optionally "
    "quantized and saved into an optimized binary format."
)

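# allow_early_exit is accepted only so both helpers share a call signature;
# it is not passed to generate().
def get_out(inp, model, length_penalty=1.0, allow_early_exit=None):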
    inputs = tokenizer(inp, return_tensors="pt")
    ids = model.generate(**inputs.to(device),
                         num_beams=5,
                         min_length=0,
                         max_length=1024,
                         length_penalty=length_penalty
                         )
    return tokenizer.batch_decode(ids, skip_special_tokens=True)[0]

def get_out_fast(inp, model, length_penalty=1.0, allow_early_exit=False):
    source = tokenizer.encode(inp)
    source = tokenizer.convert_ids_to_tokens(source)
    results = model.translate_batch([source],
                                    beam_size=5,
                                    min_decoding_length=0,
                                    max_decoding_length=1024,
                                    length_penalty=length_penalty,
                                    allow_early_exit=allow_early_exit,
                                    )
    target = results[0].hypotheses[0]
    return tokenizer.decode(tokenizer.convert_tokens_to_ids(target), skip_special_tokens=True)

for lp, aes in product([1.0, 3.0], [False, True]):
    res_vanilla = get_out(text, hf_model, lp, aes)
    res_fast = get_out_fast(text, fast_model, lp, aes)

    print("Predictions are equal:", res_vanilla == res_fast, f", when length_penalty={lp} and allow_early_exit={aes}")
    print("Vanilla output:", res_vanilla)
    print("Ctranslate output:", res_fast)
    print("========================================")

Output:

Predictions are equal: False , when length_penalty=1.0 and allow_early_exit=False
Vanilla output: ctranslate2, framework agnostic, platform agnostic
Ctranslate output: ctranslate2, framework agnostic, framework agnostic
========================================
Predictions are equal: False , when length_penalty=1.0 and allow_early_exit=True
Vanilla output: ctranslate2, framework agnostic, platform agnostic
Ctranslate output: ctranslate2, framework agnostic, framework agnostic
========================================
Predictions are equal: False , when length_penalty=3.0 and allow_early_exit=False
Vanilla output: ctranslate2, framework agnostic, model validation, model conversion
Ctranslate output: ctranslate2, framework agnostic, framework agnostic, model conversion
========================================
Predictions are equal: False , when length_penalty=3.0 and allow_early_exit=True
Vanilla output: ctranslate2, framework agnostic, model validation, model conversion
Ctranslate output: ctranslate2, ctranslate2, framework agnostic, platform agnostic, framework agnostic

Please correct me if I misunderstood what you meant by "with/without allow_early_exit and length_penalty".

2 months ago

hiz5n14c5#

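As Guillaume mentioned earlier, beam search implementations usually differ in subtle ways between frameworks. In my opinion this can lead to slightly different outputs, and the output looks fine in both cases.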
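
As an illustration of how such a subtle difference can change the result, here is a toy beam search with two stopping rules. This is a sketch under stated assumptions, not the algorithm of either transformers or CTranslate2: `TOY_MODEL`, its probabilities, and `toy_beam_search` are all made up, and hypothesis length here counts the end token.

```python
import math

EOS = "</s>"

# Hypothetical toy "model": next-token probabilities keyed by the prefix so far.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {EOS: 0.6, "b": 0.4},
    ("b",): {"a": 0.8, EOS: 0.2},
    ("a", "b"): {EOS: 1.0},
    ("b", "a"): {EOS: 1.0},
}


def toy_beam_search(beam_size, length_penalty, early_exit, max_steps=10):
    beams = [((), 0.0)]  # unfinished hypotheses: (tokens, cumulative log-prob)
    finished = []
    for _ in range(max_steps):
        # Expand every unfinished hypothesis by every possible next token.
        candidates = [
            (tokens + (tok,), score + math.log(p))
            for tokens, score in beams
            for tok, p in TOY_MODEL[tokens].items()
        ]
        # Keep the beam_size best candidates by cumulative log-prob.
        candidates.sort(key=lambda c: c[1], reverse=True)
        candidates = candidates[:beam_size]
        # Rule A: stop as soon as the top candidate has ended.
        if early_exit and candidates[0][0][-1] == EOS:
            return candidates[0][0]
        finished += [c for c in candidates if c[0][-1] == EOS]
        beams = [c for c in candidates if c[0][-1] != EOS]
        if not beams:
            break
    # Rule B: rank all finished hypotheses by length-normalized score.
    return max(finished, key=lambda c: c[1] / len(c[0]) ** length_penalty)[0]


print(toy_beam_search(2, 1.0, early_exit=True))
print(toy_beam_search(2, 1.0, early_exit=False))
```

With the same toy model and beam size, the early-exit rule returns ("a", "</s>") because that hypothesis tops the beam when it ends, while the exhaustive rule returns ("b", "a", "</s>") because its per-token average is better. Both are reasonable beam search results; they simply follow different stopping and ranking conventions.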

2 months ago