ludwig: <PAD> is not used as the padding token when using the tagger decoder

pcrecxhr posted 2 months ago in Other

Describe the bug

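When using the generator decoder, the predicted output contains <PAD> tokens, which are easy to strip out. When using the tagger decoder, however, a character from the input strings is used as the padding token, which makes it hard to tell actual data apart from padding.
Because of #1130 I cannot use the latest version and have to use version 0.2.2.8, which does not produce the error that 0.3.3 gives.
Here is my model_definition.yaml file: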

training:
    epochs: 50
    early_stop: 10
    batch_size: 128

input_features:
    -
        name: column2
        type: text
        level: word
        encoder: rnn
        cell_type: lstm
        num_layers: 4
        reduce_output: null
        preprocessing:
            word_tokenizer: space
            padding_symbol: <PAD>

output_features:
    -
        name: column1
        type: text
        level: word
        decoder: tagger
        cell_type: lstm
        loss:
            type: sampled_softmax_cross_entropy

A small subset of my training dataset:

column1,column2
k k klk k hjkj hg k kg h k jlk k kj kg hk k k k k k k k klk kjh jkj hg ghk kj kh khgh hg,N S SHL S LHHL LL H SL H H LHL S SL HL HH S S S S S S S SHL SLL HHL LL SHH SL HL HLLH SL
hk lk kh klk l lmlk lml mn m klm mn m ml lj kl klk kjhj jh h h h klm l lm l l l l lk lmkl lkk hjh h h klk kj k klm mlkj kl ml lk lk m lk jkjh jh k k hkh hg hk lm kj gh hg hjk jh,NH HL SL HHL H SHLL HHL HH L LHH SH L SL SL HH LHL SLLH SL S S S HHH L SH L S S S SL HHLH SLS LHL S S HHL SL H SHH SLLL HH HL SL HL H LL LHLL HL H S LHL SL HH HH LL LH SL HHH LL
kj klkjkjh ghg hj j j jh jk hj ghjh hg g fg g g g hjhg hjh gf gh hkjklkjh hjhg hg,NL HHLLHLL LHL HH S S SL HH LH LHHL SL S LH S S S HHLL HHL LL HH SHLHHLLL SHLL HL
g j k l k h k k g g g k k kj g hk k kj h kj h g g,N H H H L L H S L S S H S SL L HH S SL L HL L L S
hkj k k k k kkk kh kl kmlkjk kj h hl l l lk kmlm jlkk j hl l l lk k k kmlm k k k kmlk k k k k kml k kkk hjkjhj jh jkl ljl lmlkj kjhg hg kk hk h kkkh jkl lmlk lkj,NHL H S S S SSS SL HH LHLLLH SL L SH S S SL SHLH LHLS L LH S S SL S S SHLH L S S SHLL S S S S SHL L SSS LHHLLH SL HHH SLH SHLLL HLLL HL HS LH L HSSL HHH SHLL HLL
e e e e de dcded edb cb dc de dc bcdc cb c ac c c bcdcb d dededc bc dcbc ba,N S S S LH LLHHL HLL HL HL HH LL LHHL SL H LH S S LHHLL H SHLHLL LH HLLH LL
g j k l l l k h l k j g h g f gh h k k jk h h g g,N H H H S S L L H L L L H L L HH S H S LH L S L S
f fedf d dfe fg g g ggf ed df ef d dhj h hgf g f efg fe de dd c ed d cd d df ghgfg e,N SLLH L SHL HH S S SSL LL SH LH L SHH L SLL H L LHH LL LH LS L HL S LH S SH HHLLH L
d fgh g g g g gh g g g ge g fgh fe dfc de fefed dc f ghg g gh g g gh g ghkjh jkjh g ghjhgh hg h gf g h kl kjkl lk h gf hkj klk kjh jh hg hg gh g g g fgh fe dfc de fefed g ghgf gh g ghkjh jkjh ghjhgh hg,N HHH L S S S SH L S S SL H LHH LL LHL HH HLHLL SL H HHL S SH L S SH L SHHLL HHLL L SHHLLH SL H LL H H HH LLHH SL L LL HHL HHL SLL HL SL HL SH L S S LHH LL LHL HH HLHLL H SHLL HH L SHHLL HHLL LHHLLH SL

My training command:

ludwig train --experiment_name tagger_model --data_csv training_file.csv --model_definition_file model_definition.yaml --output_directory results

My test command:

ludwig test --data_csv small_test.csv --model_path results\\tagger_model_run\\model --output_directory results\\prediction\\tagger_model_run

The resulting predictions from the above:

k,g,h,kl,l,k,l,m,l,k,l,l,j,l,j,k,hg,hg,fg,hk,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h

g,j,k,lm,l,mn,m,l,k,lm,l,l,kl,kj,hg,hg,f,h,kh,j,h,g,g,g,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h

f,f,f,e,g,h,f,fe,dc,c,f,f,f,f,e,g,h,h,hg,gf,f,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h

ghk,h,gf,gh,g,g,k,k,jh,k,l,lm,lm,kj,lm,h,jkl,k,j,kj,h,gf,f,g,g,g,gh,g,fe,def,fe,f,fg,gf,fe,de,d,d,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h,h

As you can see, the character h is used as padding here.
But if I use the generator decoder with the following model:

training:
   epochs: 50
   early_stop: 30
   batch_size: 128

input_features:
   -
       name: column2
       type: text
       level: word
       encoder: rnn
       cell_type: lstm
       num_layers: 4
       reduce_output: null
       preprocessing:
           word_tokenizer: space
           padding_symbol: <PAD>

output_features:
   -
       name: column1
       type: text
       level: word
       decoder: generator
       attention: bahdanau
       cell_type: lstm
       loss:
           type: sampled_softmax_cross_entropy

then the prediction output uses the correct <PAD> token, as shown in the generated output:

k,g,h,kl,l,k,l,m,l,k,l,l,h,l,j,k,hg,hg,fg,hk,h,h

g,j,k,lm,l,mn,m,k,lm,m,l,l,l,j,l,j,k,hg,f,hk,kj,h,g,g

f,f,f,e,g,h,f,fe,dc,c,f,f,f,f,e,g,h,h,hg,gf,f

ghk,h,gf,gh,g,g,h,klm,l,l,k,lm,l,kj,hk,k,k,k,l,k,lm,l,kj,klk,h,h,h,h,k,l,k,jk,lk,h,gf,gh,h,g,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>,<PAD>
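
For the generator output above, removing the padding can be sketched in a few lines of Python. This is only a minimal illustration that assumes each prediction is one comma-separated line, as in the listing; the helper name strip_padding is hypothetical and is not part of Ludwig.

```python
PAD = "<PAD>"


def strip_padding(prediction_line, pad_token=PAD, sep=","):
    """Split one prediction line and drop the trailing pad tokens.

    strip_padding is a hypothetical helper, not a Ludwig function.
    """
    tokens = prediction_line.strip().split(sep)
    # Only trailing pad tokens are removed, so the real tokens keep their order.
    while tokens and tokens[-1] == pad_token:
        tokens.pop()
    return tokens


if __name__ == "__main__":
    line = "ghk,h,gf,gh,<PAD>,<PAD>,<PAD>"
    print(strip_padding(line))
```

The same approach does not work for the tagger output, because the padding there is an ordinary token (h) that also occurs in the real data.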
cetgtptt1#

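Regarding the tagger decoder: you do not have to rely on <PAD> to determine the sequence length. When predictions are returned, the length of each sequence is returned as the second element of the tuple. For the tagger, use the length tensor to determine the output sequence.
Hope this answers your question.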
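
The idea of cutting each predicted sequence at its true length can be sketched as follows. This is a hedged illustration, not Ludwig code: it assumes the predictions are already available as lists of tokens and the lengths as a plain list of integers (for example, converted from the length tensor mentioned above). The function name trim_to_lengths and the sample values are hypothetical.

```python
def trim_to_lengths(predictions, lengths):
    """Cut every padded prediction down to its true sequence length.

    predictions: list of token lists, all padded to the same size.
    lengths: one integer per prediction (assumed to come from the
             length tensor; how it is obtained is not shown here).
    """
    if len(predictions) != len(lengths):
        raise ValueError("need exactly one length per prediction")
    return [tokens[:n] for tokens, n in zip(predictions, lengths)]


if __name__ == "__main__":
    # Hypothetical padded tagger output, where "h" doubles as padding.
    padded = [
        ["k", "g", "h", "kl", "h", "h", "h"],
        ["g", "j", "k", "h", "h", "h", "h"],
    ]
    lengths = [4, 3]  # hypothetical true lengths
    for row in trim_to_lengths(padded, lengths):
        print(",".join(row))
```

Because the cut is made by position rather than by token value, a real h inside the sequence is kept while the h tokens used as padding are dropped.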

5hcedyr02#

Thanks Jim, but it looks like the sequence length can only be seen through the Ludwig Python API. I am running Ludwig from the console. Is it possible to get this from the command line, or do I have to use the API?
Thanks

txu3uszq3#

At the moment you need to use the API.

gywdnpxw4#

@farazk86, checking in: has this been resolved using @jimthompson5802's suggestion?
