Other differences in the CTranslate2 beam search implementation?

8aqjt8rx · posted 2 months ago · in Other

To make sure this doesn't get overlooked, I am repeating my answer here:
I ran into the same problem. However, when I read that reply, I assumed I would get identical results as long as I did not use any special generation parameters (such as no_repeat_ngram_size). Unfortunately, there seem to be other differences in the beam search implementation, or am I missing something?
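For context, this is the kind of generation parameter meant above. Both libraries appear to expose it under the same name, so enabling it would look roughly like this (an illustrative fragment only; it reuses the inputs/source variables from the reproduction below and assumes the option is available in the installed versions):

# Illustrative: block repeated 3-grams in both backends.
ids = hf_model.generate(**inputs.to(device), num_beams=3, no_repeat_ngram_size=3)
results = fast_model.translate_batch([source], beam_size=3, no_repeat_ngram_size=3)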
To reproduce:
Package versions: transformers==4.34.0, ctranslate2==3.20.0 (the versions used here)

  1. Convert the model:
ct2-transformers-converter --model "google/flan-t5-base" --output_dir "ct2-t5-base"
  2. Code snippet:
import torch

from transformers import T5ForConditionalGeneration, AutoTokenizer
import ctranslate2

device = torch.device("cuda")

model_name = "google/flan-t5-base"
hf_model = T5ForConditionalGeneration.from_pretrained(model_name).eval().to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)

fast_model = ctranslate2.Translator("ct2-t5-base", device="cuda")

text = "translate English to German: physician assistants are medical providers who are licensed to diagnose and treat illness and disease and to prescribe medication"

def get_out(inp, model):
    # Decode with Hugging Face beam search (num_beams=3).
    inputs = tokenizer(inp, return_tensors="pt")
    ids = model.generate(**inputs.to(device),
                         num_beams=3,
                         min_length=0,
                         max_length=1024,
                         )
    return tokenizer.batch_decode(ids, skip_special_tokens=True)[0]

def get_out_fast(inp, model):
    # Decode with CTranslate2 using matching beam settings;
    # translate_batch expects string tokens, so convert the ids first.
    source = tokenizer.encode(inp)
    source = tokenizer.convert_ids_to_tokens(source)
    results = model.translate_batch([source],
                                    beam_size=3,
                                    min_decoding_length=0,
                                    max_decoding_length=1024)
    target = results[0].hypotheses[0]
    return tokenizer.decode(tokenizer.convert_tokens_to_ids(target), skip_special_tokens=True)

res_vanilla = get_out(text, hf_model)
res_fast = get_out_fast(text, fast_model)

print("Vanilla output:", res_vanilla)
print("Ctranslate output:", res_fast)

Output:

Vanilla output: physician assistants sind medical providers, die zu Diagnose und Behandlung von Krankheiten und Krankheiten und zu Verknüpfen von Medikamenten zu ermitteln.
Ctranslate output: physician assistants sind medical providers, die zu Diagnose und Behandlung von Krankheiten und Krankheiten und zu Verknüpfen von Medikamenten zu kaufen sind.
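
One way to dig deeper than comparing decoded strings is to compare the hypothesis scores on both sides; a minimal sketch, assuming objects analogous to those inside the two functions above, and relying on return_scores in translate_batch and output_scores/return_dict_in_generate in generate (note the two libraries may normalize scores differently):

# Print the best hypothesis score from each backend.
out = hf_model.generate(**inputs.to(device), num_beams=3, max_length=1024,
                        return_dict_in_generate=True, output_scores=True)
print("HF sequence score:", out.sequences_scores[0].item())

results = fast_model.translate_batch([source], beam_size=3,
                                     max_decoding_length=1024, return_scores=True)
print("CT2 hypothesis score:", results[0].scores[0])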

t5zmwmid1#

What happens if you run the same thing with beam=1?
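
(Concretely, that would mean changing only the two decoding calls in the snippet above, along these lines:)

ids = model.generate(**inputs.to(device), num_beams=1, min_length=0, max_length=1024)  # greedy
results = model.translate_batch([source], beam_size=1, min_decoding_length=0, max_decoding_length=1024)  # greedy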


o2rvlv0m2#

The outputs are then identical:

Vanilla output: physician assistants sind medizinische Versorgungsträger, die ärztlichen Versorgungskräfte benötigen, um Krankheiten und Krankheiten zu diagnostischen und zu behandeln und zu prescriben Medikamenten.
Ctranslate output: physician assistants sind medizinische Versorgungsträger, die ärztlichen Versorgungskräfte benötigen, um Krankheiten und Krankheiten zu diagnostischen und zu behandeln und zu prescriben Medikamenten.

nzrxty8p3#

If you have time, you could test CT2 2.24, which predates these changes:
CTranslate2/CHANGELOG.md, lines 551 to 552 at 39f48f2:

* Removed option normalize_scores: the scores are now always divided by pow(length, length_penalty), with length_penalty defaulting to 1
* Removed option allow_early_exit: the beam search now only exits early when no penalties are used

and test with and without allow_early_exit and length_penalty.
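
For reference, the normalization described in that changelog entry amounts to the following (a sketch of the formula only, not of CTranslate2 internals):

def normalized_score(sum_log_probs, length, length_penalty=1.0):
    # Cumulative log-probability divided by pow(length, length_penalty),
    # with length_penalty defaulting to 1, as stated in the changelog.
    return sum_log_probs / (length ** length_penalty)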


nle07wnf4#

OK, I had to use another model because ctranslate2==2.24.0 does not support T5. Here is what I got in my experiments:
Converting the model:

ct2-transformers-converter --model "beogradjanka/bart_finetuned_keyphrase_extraction" --output_dir "ct2-bart"

New code snippet:

from itertools import product

import torch

from transformers import BartForConditionalGeneration, AutoTokenizer
import ctranslate2

device = torch.device("cuda")

model_name = "beogradjanka/bart_finetuned_keyphrase_extraction"
hf_model = BartForConditionalGeneration.from_pretrained(model_name).eval().to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)

fast_model = ctranslate2.Translator("ct2-bart", device="cuda")


text = (
    "The core CTranslate2 implementation is framework agnostic. The logic that is specific to each framework is moved "
    "to a conversion step that loads supported models into a unified representation. The weights are then optionally "
    "quantized and saved into an optimized binary format."
)

def get_out(inp, model, length_penalty=1.0, allow_early_exit=None):
    # allow_early_exit exists only to keep the two signatures symmetric;
    # Hugging Face generate() has no such option, so it is ignored here.
    inputs = tokenizer(inp, return_tensors="pt")
    ids = model.generate(**inputs.to(device),
                         num_beams=5,
                         min_length=0,
                         max_length=1024,
                         length_penalty=length_penalty
                         )
    return tokenizer.batch_decode(ids, skip_special_tokens=True)[0]

def get_out_fast(inp, model, length_penalty=1.0, allow_early_exit=False):
    source = tokenizer.encode(inp)
    source = tokenizer.convert_ids_to_tokens(source)
    results = model.translate_batch([source],
                                    beam_size=5,
                                    min_decoding_length=0,
                                    max_decoding_length=1024,
                                    length_penalty=length_penalty,
                                    allow_early_exit=allow_early_exit,
                                    )
    target = results[0].hypotheses[0]
    return tokenizer.decode(tokenizer.convert_tokens_to_ids(target), skip_special_tokens=True)

for lp, aes in product([1.0, 3.0], [False, True]):
    res_vanilla = get_out(text, hf_model, lp, aes)
    res_fast = get_out_fast(text, fast_model, lp, aes)

    print("Predictions are equal:", res_vanilla == res_fast, f", when length_penalty={lp} and allow_early_exit={aes}")
    print("Vanilla output:", res_vanilla)
    print("Ctranslate output:", res_fast)
    print("========================================")

Output:

Predictions are equal: False , when length_penalty=1.0 and allow_early_exit=False
Vanilla output: ctranslate2, framework agnostic, platform agnostic
Ctranslate output: ctranslate2, framework agnostic, framework agnostic
========================================
Predictions are equal: False , when length_penalty=1.0 and allow_early_exit=True
Vanilla output: ctranslate2, framework agnostic, platform agnostic
Ctranslate output: ctranslate2, framework agnostic, framework agnostic
========================================
Predictions are equal: False , when length_penalty=3.0 and allow_early_exit=False
Vanilla output: ctranslate2, framework agnostic, model validation, model conversion
Ctranslate output: ctranslate2, framework agnostic, framework agnostic, model conversion
========================================
Predictions are equal: False , when length_penalty=3.0 and allow_early_exit=True
Vanilla output: ctranslate2, framework agnostic, model validation, model conversion
Ctranslate output: ctranslate2, ctranslate2, framework agnostic, platform agnostic, framework agnostic

Please correct me if I have misunderstood what you meant by "with/without allow_early_exit and length_penalty".


hiz5n14c5#

As Guillaume mentioned earlier, there are usually subtle differences between the beam search implementations of different frameworks. In my opinion this can lead to slightly different outputs, and both look fine here.
