Keras: how do I set the output shape of the BERT preprocessing layer from TensorFlow Hub?

fruv7luv, posted 2022-11-13 in Other

I am building a simple BERT model for text classification using TensorFlow Hub.

import tensorflow as tf
import tensorflow_hub as tf_hub

bert_preprocess = tf_hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
bert_encoder = tf_hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4")

text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
preprocessed_text = bert_preprocess(text_input)
encoded_input = bert_encoder(preprocessed_text)

l1 = tf.keras.layers.Dropout(0.3, name="dropout1")(encoded_input['pooled_output'])
l2 = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(l1)

model = tf.keras.Model(inputs=[text_input], outputs = [l2])

model.summary()

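When inspecting the output of the bert_preprocess step, I noticed that the arrays it produces have length 128. My texts are on average much shorter than 128 tokens, so I would like to reduce this length so that preprocessing produces arrays of length 40 only. However, I cannot figure out how to pass a max_length or output_shape parameter to bert_preprocess.
Printed model summary: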

__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 text (InputLayer)              [(None,)]            0           []                               
                                                                                                  
 keras_layer_16 (KerasLayer)    {'input_word_ids':   0           ['text[0][0]']                   
                                (None, 128),                                                      
                                 'input_type_ids':                                                
                                (None, 128),                                                      
                                 'input_mask': (Non                                               
                                e, 128)}                                                          
                                                                                                  
 keras_layer_17 (KerasLayer)    {'sequence_output':  109482241   ['keras_layer_16[0][0]',         
                                 (None, 128, 768),                'keras_layer_16[0][1]',         
                                 'default': (None,                'keras_layer_16[0][2]']         
                                768),                                                             
                                 'encoder_outputs':                                               
                                 [(None, 128, 768),                                               
                                 (None, 128, 768),                                                
                                 (None, 128, 768),                                                
                                 (None, 128, 768),                                                
                                 (None, 128, 768),                                                
                                 (None, 128, 768),                                                
                                 (None, 128, 768),                                                
...
Total params: 109,483,010
Trainable params: 769
Non-trainable params: 109,482,241

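While reading the documentation, I found that tf_hub.KerasLayer has an output_shape argument, so I tried passing it in two different forms.
In both cases, however, the following line raises an error: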

bert_preprocess(["we have a very sunny day today don't you think so?"])

Error:

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_23952\4048288771.py in <module>
----> 1 bert_preprocess("we have a very sunny day today don't you think so?")

~\AppData\Roaming\Python\Python37\site-packages\keras\utils\traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

c:\Users\username\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_hub\keras_layer.py in call(self, inputs, training)
    237       result = smart_cond.smart_cond(training,
    238                                      lambda: f(training=True),
--> 239                                      lambda: f(training=False))
    240 
    241     # Unwrap dicts returned by signatures.

c:\Users\username\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_hub\keras_layer.py in <lambda>()
    237       result = smart_cond.smart_cond(training,
    238                                      lambda: f(training=True),
--> 239                                      lambda: f(training=False))
    240 
    241     # Unwrap dicts returned by signatures.
...
  Keyword arguments: {}

Call arguments received:
  • inputs="we have a very sunny day today don't you think so?"
  • training=False
Answer 1 (vngu2lb8)

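You need to go one level lower to achieve this. What you want is shown on the preprocessing model's page, but it is not properly introduced there.
You can wrap your intention in a custom TF layer: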

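import tensorflow as tf
import tensorflow_hub as tf_hub
import tensorflow_text  # registers the custom ops used by the preprocessing model

class ModifiedBertPreprocess(tf.keras.layers.Layer):
    def __init__(self, max_len, **kwargs):
        super().__init__(**kwargs)

        preprocessor = tf_hub.load(
            "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")

        # Step 1: turn raw strings into WordPiece token ids (ragged output).
        self.tokenizer = tf_hub.KerasLayer(preprocessor.tokenize, name="tokenizer")

        # Step 2: add special tokens, then truncate/pad every example to max_len.
        self.prep_layer = tf_hub.KerasLayer(
            preprocessor.bert_pack_inputs,
            arguments={"seq_length": max_len})

    def call(self, inputs, training=None):
        # inputs is a list of string tensors, one per text segment.
        tokenized = [self.tokenizer(seq) for seq in inputs]
        return self.prep_layer(tokenized)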

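Basically, you tokenize and pack the inputs yourself. The preprocessor exposes a method called bert_pack_inputs, which lets you specify the output length through its seq_length argument (max_len in the layer above).
The layer takes its input as a list because bert_pack_inputs expects a list of tokenized segments, which is most likely what allows it to accept more than one input text per example.
Your model should then look like this: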
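To make the packing step less opaque, here is a minimal pure-Python sketch of what it does for a single text segment: truncate, add [CLS] and [SEP], and pad to seq_length. This is an illustration only, not the TF Hub implementation. pack_single_segment is a hypothetical helper, the token ids in the example are made up, and the special-token ids (101, 102, 0) are assumed to be those of the uncased English BERT vocabulary.

```python
# Illustrative re-implementation of the packing step for ONE segment.
# Not the TF Hub code; pack_single_segment is a hypothetical helper.
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # assumed ids from the uncased English BERT vocab


def pack_single_segment(token_ids, seq_length):
    """Truncate, add [CLS]/[SEP], and pad one tokenized text to seq_length."""
    if seq_length < 2:
        raise ValueError("seq_length must leave room for [CLS] and [SEP]")
    # Reserve two positions for the special tokens, drop the overflow.
    body = list(token_ids)[: seq_length - 2]
    word_ids = [CLS_ID] + body + [SEP_ID]
    n_real = len(word_ids)
    padding = seq_length - n_real
    return {
        "input_word_ids": word_ids + [PAD_ID] * padding,
        "input_mask": [1] * n_real + [0] * padding,  # 1 = real token, 0 = padding
        "input_type_ids": [0] * seq_length,          # single segment: all zeros
    }


# Made-up token ids, for illustration only.
packed = pack_single_segment([2057, 2031, 1037, 2200, 11559, 2154], seq_length=10)
print(packed["input_word_ids"])
print(packed["input_mask"])
```

The three keys match the ones shown in the model summaries, and every value has exactly seq_length entries, which is why changing seq_length changes the shape seen by the encoder.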

bert_encoder = tf_hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4")

text_input = [tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')]

bert_seq_changed = ModifiedBertPreprocess(max_len=40)

encoder_inputs = bert_seq_changed(text_input)

encoded_input = bert_encoder(encoder_inputs)

l1 = tf.keras.layers.Dropout(0.3, name="dropout1")(encoded_input['pooled_output'])
l2 = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(l1)

model = tf.keras.Model(inputs=[text_input], outputs = [l2])

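Note that the text_input layer is now wrapped in a list, because the custom layer iterates over a list of inputs before tokenizing each one.
Here is the model summary: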

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 text (InputLayer)              [(None,)]            0           []                               
                                                                                                  
 modified_bert_preprocess (Modi  {'input_type_ids':   0          ['text[0][0]']                   
 fiedBertPreprocess)            (None, 40),                                                       
                                 'input_word_ids':                                                
                                (None, 40),                                                       
                                 'input_mask': (Non                                               
                                e, 40)}                                                           
                                                                                                  
 keras_layer (KerasLayer)       {'encoder_outputs':  109482241   ['modified_bert_preprocess[0][0]'
                                 [(None, 40, 768),               , 'modified_bert_preprocess[0][1]
                                 (None, 40, 768),                ',                               
                                 (None, 40, 768),                 'modified_bert_preprocess[0][2]'
                                 (None, 40, 768),                ]                                
                                 (None, 40, 768),                                                 
                                 (None, 40, 768),                                                 
                                 (None, 40, 768),                                                 
                                 (None, 40, 768),                                                 
                                 (None, 40, 768),                                                 
                                 (None, 40, 768),                                                 
                                 (None, 40, 768),                                                 
                                 (None, 40, 768)],                                                
                                 'default': (None,                                                
                                768),                                                             
                                 'pooled_output': (                                               
                                None, 768),                                                       
                                 'sequence_output':                                               
                                 (None, 40, 768)}                                                 
                                                                                                  
 dropout1 (Dropout)             (None, 768)          0           ['keras_layer[0][13]']           
                                                                                                  
 output (Dense)                 (None, 1)            769         ['dropout1[0][0]']               
                                                                                                  
==================================================================================================
Total params: 109,483,010
Trainable params: 769
Non-trainable params: 109,482,241

Calling the custom preprocessing layer:

bert_seq_changed([tf.convert_to_tensor(["we have a very sunny day today don't you think so?"], dtype=tf.string)])

Note that the input should be inside a list. The model itself can be called in two ways:

model([tf.convert_to_tensor(["we have a very sunny day today don't you think so?"], dtype=tf.string)])

model(tf.convert_to_tensor(["we have a very sunny day today don't you think so?"], dtype=tf.string))
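If you would rather derive the length from your data than settle on 40 by eye, one rough approach is to look at the distribution of token counts and take a high quantile. The sketch below assumes you already have per-text WordPiece token counts (for example, collected from the tokenizer output). suggest_seq_length and the sample counts are hypothetical, and the extra 2 accounts for [CLS] and [SEP].

```python
import math


def suggest_seq_length(token_counts, coverage=0.95):
    """Smallest length that fits `coverage` of the texts, plus [CLS] and [SEP]."""
    if not token_counts:
        raise ValueError("token_counts must not be empty")
    ordered = sorted(token_counts)
    # Nearest-rank quantile: the rank-th shortest text still fits untruncated.
    rank = max(1, math.ceil(coverage * len(ordered)))
    return ordered[rank - 1] + 2  # +2 for the [CLS] and [SEP] tokens


# Hypothetical token counts for ten texts, for illustration only.
counts = [12, 9, 15, 22, 18, 31, 11, 14, 38, 16]
print(suggest_seq_length(counts, coverage=1.0))  # fits every text in the sample
```

Texts longer than the chosen value are simply truncated by the packing step, so a coverage below 1.0 trades a little information on the longest texts for shorter sequences.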
