Keras input processing with a DataFrame: variable-length lists of strings

Posted by wmomyfyw on 2023-01-17 in Other

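I am trying to build a TF/Keras model that takes both sequential features and scalar features. The training data comes from a Pandas DataFrame. Within one example, the sequential feature can be viewed as a list of strings (words), and the length of that list varies between examples. The words themselves can be treated as categorical, and the number of unique words is limited. What is the right order and method for processing this kind of data? Possible steps include mapping the strings to integers and padding/truncating to a fixed length.
I plan to convert the sequential and scalar features to Tensors following https://www.tensorflow.org/tutorials/structured_data/preprocessing_layers, then feed the sequential feature into an LSTM and the scalar features into an MLP, and merge their outputs with an FCN.
I tried using keras.layers.StringLookup to convert the list-of-strings feature into a list of integers, but it complains that the NumPy array cannot be converted to a Tensor. Should I first convert the list of strings to a string Tensor and then convert that to an integer Tensor?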

keras

Source: https://stackoverflow.com/questions/75123551/keras-input-process-with-dataframe-variable-length-list-of-strings

1 answer

g0czyy6m1#

Yes, you can first convert your list of strings to a Tensor. To convert a list of strings to a Tensor, you can use the tf.constant function. For example:

import tensorflow as tf
s = ["dog", "cat"]
ts = tf.constant(s)
print(ts)

This prints:

tf.Tensor([b'dog' b'cat'], shape=(2,), dtype=string)
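
Note that tf.constant only accepts rectangular input, so it works for a single list but not for a DataFrame column whose rows have different lengths. One possible way around this is sketched below, assuming a hypothetical DataFrame with a "words" column, a hypothetical MAX_LEN of 4, and a vocabulary collected directly from the data: build a ragged tensor with tf.ragged.constant, pad or truncate it to a fixed length, and then apply StringLookup to the dense result.

```python
import pandas as pd
import tensorflow as tf

# Hypothetical DataFrame: "words" holds variable-length lists of strings.
df = pd.DataFrame({
    "words": [["dog", "cat"], ["bird"], ["cat", "dog", "fish"]],
    "score": [0.5, 1.2, 3.0],
})

MAX_LEN = 4  # assumed fixed length; longer rows are truncated, shorter ones padded

# tf.ragged.constant accepts rows of different lengths, unlike tf.constant.
ragged = tf.ragged.constant(df["words"].tolist(), dtype=tf.string)

# Pad with empty strings (and truncate) to a dense [batch, MAX_LEN] string tensor.
dense = ragged.to_tensor(default_value="", shape=[None, MAX_LEN])

# Vocabulary taken from the data itself (an assumption for this sketch).
vocab = sorted({word for row in df["words"] for word in row})

# mask_token="" maps the padding string to index 0; index 1 is reserved for
# out-of-vocabulary words; known words start at index 2.
lookup = tf.keras.layers.StringLookup(vocabulary=vocab, mask_token="")
ids = lookup(dense)

print(ids.numpy())
```

Reserving index 0 for padding lets a later Embedding layer with mask_zero=True ignore the padded positions.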

Then you can use string lookup and category encoding, as in the get_category_encoding_layer() function in the tutorial at https://www.tensorflow.org/tutorials/structured_data/preprocessing_layers#categorical_columns
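
For the two-branch model described in the question (LSTM for the sequence, MLP for the scalars, merged by dense layers), a minimal sketch could look like the following. It assumes the sequence feature has already been mapped to integer ids and padded with 0; the layer sizes, the input names "words" and "score", the toy arrays, and the binary label are all made up for illustration.

```python
import numpy as np
import tensorflow as tf

MAX_LEN = 4     # assumed fixed sequence length after padding/truncation
VOCAB_SIZE = 6  # assumed: padding (0) + OOV (1) + 4 known words

# Hypothetical, already-encoded toy data.
seq_ids = np.array([[4, 3, 0, 0], [2, 0, 0, 0], [3, 4, 5, 0]], dtype="int64")
scalars = np.array([[0.5], [1.2], [3.0]], dtype="float32")
labels = np.array([0.0, 1.0, 1.0], dtype="float32")

seq_in = tf.keras.Input(shape=(MAX_LEN,), dtype="int64", name="words")
scalar_in = tf.keras.Input(shape=(1,), dtype="float32", name="score")

# Sequence branch: embed the ids, mask the padding (id 0), summarize with an LSTM.
x = tf.keras.layers.Embedding(VOCAB_SIZE, 8, mask_zero=True)(seq_in)
x = tf.keras.layers.LSTM(16)(x)

# Scalar branch: a small MLP.
y = tf.keras.layers.Dense(8, activation="relu")(scalar_in)

# Merge both branches and finish with dense layers.
merged = tf.keras.layers.Concatenate()([x, y])
merged = tf.keras.layers.Dense(8, activation="relu")(merged)
out = tf.keras.layers.Dense(1, activation="sigmoid")(merged)

model = tf.keras.Model(inputs=[seq_in, scalar_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy")

# One epoch on the toy data, just to show that the pieces fit together.
model.fit([seq_ids, scalars], labels, epochs=1, verbose=0)
preds = model.predict([seq_ids, scalars], verbose=0)
print(preds.shape)
```

Keeping the string-to-integer step outside the model, as here, is only one option; the lookup layer can also be applied inside a tf.data pipeline.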

2023-01-17
