I am building a simple BERT model for text classification using tensorflow hub.
import tensorflow as tf
import tensorflow_hub as tf_hub
bert_preprocess = tf_hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
bert_encoder = tf_hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4")
text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
preprocessed_text = bert_preprocess(text_input)
encoded_input = bert_encoder(preprocessed_text)
l1 = tf.keras.layers.Dropout(0.3, name="dropout1")(encoded_input['pooled_output'])
l2 = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(l1)
model = tf.keras.Model(inputs=[text_input], outputs = [l2])
model.summary()
While inspecting the outputs of the bert_preprocess step, I noticed that they are arrays of length 128. My texts are on average much shorter than 128 tokens, so I want to reduce this length parameter so that preprocessing produces arrays of length 40. However, I don't know how to pass such a max_length or output_shape parameter to bert_preprocess.
The printed model summary:
__________________________________________________________________________________________________
 Layer (type)                  Output Shape                  Param #     Connected to
==================================================================================================
 text (InputLayer)             [(None,)]                     0           []

 keras_layer_16 (KerasLayer)   {'input_word_ids':            0           ['text[0][0]']
                                (None, 128),
                                'input_type_ids':
                                (None, 128),
                                'input_mask': (None, 128)}

 keras_layer_17 (KerasLayer)   {'sequence_output':           109482241   ['keras_layer_16[0][0]',
                                (None, 128, 768),                         'keras_layer_16[0][1]',
                                'default': (None, 768),                   'keras_layer_16[0][2]']
                                'encoder_outputs':
                                [(None, 128, 768),
                                 (None, 128, 768),
                                 ...
==================================================================================================
Total params: 109,483,010
Trainable params: 769
Non-trainable params: 109,482,241
Looking through the docs, I saw that tf_hub.KerasLayer has an output_shape argument, so I tried passing it in two different ways.
However, in both cases the following line raises an error:
bert_preprocess(["we have a very sunny day today don't you think so?"])
Error:
ValueError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_23952\4048288771.py in <module>
----> 1 bert_preprocess("we have a very sunny day today don't you think so?")
~\AppData\Roaming\Python\Python37\site-packages\keras\utils\traceback_utils.py in error_handler(*args, **kwargs)
65 except Exception as e: # pylint: disable=broad-except
66 filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
69 del filtered_tb
c:\Users\username\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_hub\keras_layer.py in call(self, inputs, training)
237 result = smart_cond.smart_cond(training,
238 lambda: f(training=True),
--> 239 lambda: f(training=False))
240
241 # Unwrap dicts returned by signatures.
c:\Users\username\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_hub\keras_layer.py in <lambda>()
237 result = smart_cond.smart_cond(training,
238 lambda: f(training=True),
--> 239 lambda: f(training=False))
240
241 # Unwrap dicts returned by signatures.
...
Keyword arguments: {}
Call arguments received:
• inputs="we have a very sunny day today don't you think so?"
• training=False
1 Answer
You need to go a level lower to achieve this. What you are after is shown on the preprocessing model's page, but without a proper walkthrough.
You can wrap this into a custom TF layer:
Basically, you tokenize and pack the inputs yourself. The preprocessor has a method called bert_pack_inputs that lets you specify the max_len (seq_length) of the inputs. For some reason, self.tokenizer expects its input as a list, which most likely lets it accept multiple text segments. Your model should look like this:
Note that the text_input layer now sits inside a list, because self.tokenizer's input signature expects one. In the resulting model summary, the preprocessing outputs have shape (None, 40).
Calling the custom preprocessing layer:
Note that the input should be in a list. The model itself can then be called in two ways: via model.predict, or by invoking the model directly on the input.
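The two invocation styles can be sketched with any Keras model that takes a string input; the stand-in below is hypothetical, and the same two calls apply unchanged to the BERT model built earlier:

```python
import tensorflow as tf

# Hypothetical stand-in with the same string-input signature as the BERT
# model above; it just maps each string to its byte length.
text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
lengths = tf.keras.layers.Lambda(tf.strings.length)(text_input)
model = tf.keras.Model(inputs=[text_input], outputs=[lengths])

sample = [tf.constant(["we have a very sunny day today don't you think so?"])]
pred = model.predict(sample)  # one way: predict()
pred = model(sample)          # or: call the model directly
```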