keras: Reducing the third dimension of a 3D tensor

Posted by ivqmmu1c on 2023-06-23 in Other


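I am working on land cover classification and testing pretrained models. I started with the RGB JPG images of the EuroSAT dataset and had no problems. Now, however, I am having trouble with the TIF files, which give me images with 13 bands. I cannot feed them into the pretrained model because its input_shape is restricted to 3 channels, while I have 13. Is there a way to solve this? One idea is to use PCA to reduce the dimensionality, but the tf.pca function only accepts 2D tensors as input. In my case, I want to reduce the third dimension, going from an input shape of (64, 64, 13) to (64, 64, 3). If you have other ideas for solving this, they are very welcome.
I split the data into training and test sets from a dataset that stores the image paths and labels: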

ds_train = tf.data.Dataset.from_tensor_slices((df_train_strat['images_path'], df_train_strat["encoded_label"]))
ds_train = ds_train.map(lambda x, y : tf.py_function(parse_image, [x, y], [tf.float32, tf.int64]))
ds_train = ds_train.map(_fixup_shape)
ds_train = ds_train.batch(BATCH_SIZE)

ds_test = tf.data.Dataset.from_tensor_slices((df_test_strat['images_path'], df_test_strat["encoded_label"]))
ds_test = ds_test.map(lambda x, y : tf.py_function(parse_image, [x, y], [tf.float32, tf.int64]))
ds_test = ds_test.map(_fixup_shape)
ds_test = ds_test.batch(BATCH_SIZE)

and parse the images with the following function:

def parse_image(img_path: str, label: str):
    # Cast the Tensor to numpy and decode the string
    img_path = img_path.numpy().decode('utf-8')

    with rasterio.open(img_path) as src:
        img = src.read()
        # Channels last
        img = np.moveaxis(img, 0, 2)
        # Images normalization
        array_min, array_max = img.min(), img.max()
        img = (img - array_min)/(array_max - array_min)

    return img, label

This gives me the tensors shown in inputs 1 and 2:
My inputs 1
My inputs 2
The tensor shape is (64, 64, 64, 13) because I batch the images in groups of 64.
Then I simply call a pretrained model:

base_model = EfficientNetB0(weights='imagenet', include_top=False, input_shape=(64, 64, 3))

Unfortunately, I cannot change input_shape here; doing so throws an error.

model = models.Sequential()
model.add(base_model)
model.add(layers.GlobalMaxPooling2D())

model.add(layers.Dense(512, activation='relu', kernel_initializer="he_normal"))
model.add(layers.BatchNormalization())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(128, activation='relu', kernel_initializer="he_normal"))
model.add(layers.BatchNormalization())
model.add(layers.Dropout(0.5))

model.add(layers.Dense(10, activation = 'softmax', kernel_initializer="glorot_normal"))

model.summary()

early_stop = callbacks.EarlyStopping(monitor = 'val_loss', mode = 'min',
            patience = 3, restore_best_weights = True, verbose = 1)

reduce_lr = callbacks.ReduceLROnPlateau(monitor = 'val_loss', mode = 'min',
            patience = 2, factor = 0.5, min_lr = 1e-06, verbose = 1)

    
model.compile(optimizer=adam, loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(ds_train, validation_data=ds_test, epochs=20, batch_size = 64,
                    callbacks=[reduce_lr, early_stop])

Then I fit the model, but I get the traceback shown in the links below.
My outputs 1
My outputs 2
whole traceback

keras

Source: https://stackoverflow.com/questions/76461763/reducing-third-dimension-of-3d-tensor

2 Answers


xwmevbvl1#

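As the question implies, you have a tensor of shape (64, 64, 13), where the first two dimensions are the height and width and the last is the number of layers or bands in the image, and you are feeding it to a model with a predefined input shape of (64, 64, 3).
You have two options:
1. Slice the last dimension of the tensor. Have a look at this tutorial: https://www.tensorflow.org/guide/tensor_slicing.
For your current example, you may already know which slices (image bands) are likely to matter for the prediction task. Take those and ignore the others.
2. As you said, use PCA. PCA also works on data with more than 2 dimensions, so don't worry about that. You can reduce data from 4, 5, 6 or any number of dimensions down to any smaller number, but this comes with a trade-off.
You are relying on the dimensionality reduction mechanism to extract the most useful features from the dataset. You may need another technique, such as cumulative explained variance, to determine how many features you really need.
Option 2 produces input that your model is not set up to accept, so option 1 is the way to go.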
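For reference, option 2 can be sketched in plain NumPy by treating every pixel as one sample with 13 features. This is only an illustration under stated assumptions: `pca_reduce_channels` is a hypothetical helper, the image is synthetic random data, and the projection is fitted on a single image, whereas in practice you would probably fit it once on the training set and reuse it.

```python
import numpy as np

def pca_reduce_channels(img, n_components=3):
    """Project an (H, W, C) image onto its top principal components along C."""
    h, w, c = img.shape
    # Flatten the spatial dimensions: each pixel becomes one sample with C features.
    flat = img.reshape(-1, c).astype(np.float64)
    # Center each band so the SVD yields principal axes.
    flat -= flat.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(flat, full_matrices=False)
    reduced = flat @ vt[:n_components].T
    # Fraction of the total variance captured by each component.
    explained = (s ** 2) / np.sum(s ** 2)
    return reduced.reshape(h, w, n_components).astype(np.float32), explained

# Synthetic stand-in for one 13-band image (assumption: channels last).
rng = np.random.default_rng(0)
img = rng.random((64, 64, 13)).astype(np.float32)

reduced, explained = pca_reduce_channels(img)
print(reduced.shape)
# Cumulative explained variance helps decide how many components to keep.
print(np.cumsum(explained)[:3])
```

Note that the resulting three channels are linear mixtures of all bands rather than red, green and blue, which is one reason ImageNet weights may transfer less well to them.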

2023-06-23

k4ymrczo2#

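The images you are using are probably Sentinel-2 images:
Multispectral data with 13 bands in the visible, near-infrared and short-wave infrared parts of the spectrum
So you need to work out which of these 13 bands belong to the visible spectrum, i.e. RGB. The R, G and B bands are at band indices 3, 2 and 1 respectively (see the link for details), so you can get an RGB image like this: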

#Get RGB image
rgb = img[[3,2,1]].transpose(1,2,0)
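The indexing above assumes the array is still in the (bands, height, width) layout returned by `src.read()`. If the bands have already been moved to the last axis, as in the question's `parse_image`, the selection has to happen on the last axis instead. A small sketch on synthetic data (the array names are illustrative only):

```python
import numpy as np

# Synthetic stand-in for src.read(): shape (bands, height, width).
chw = np.arange(13 * 64 * 64, dtype=np.float32).reshape(13, 64, 64)

# Channels-first: pick the bands on axis 0, then move them to the end.
rgb_from_chw = chw[[3, 2, 1]].transpose(1, 2, 0)

# Channels-last (after np.moveaxis): pick the bands on the last axis.
hwc = np.moveaxis(chw, 0, 2)
rgb_from_hwc = hwc[..., [3, 2, 1]]

print(rgb_from_chw.shape, rgb_from_hwc.shape)
```

Both routes give the same (64, 64, 3) array, so the only thing to watch is which layout the array is in when the bands are selected.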

Overall, the parse_image function for getting an RGB image should look like this:

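import numpy as np
import rasterio

def parse_image(img_path, label):
    # Cast the Tensor to numpy and decode the string
    img_path = img_path.numpy().decode('utf-8')

    with rasterio.open(img_path) as src:
        # Shape is (bands, height, width); cast so the output matches tf.float32
        img = src.read().astype(np.float32)

    # Clip values which have very high pixel values
    img = np.clip(img / 10000, 0, 1)

    # Get RGB channels
    rgb = img[[3, 2, 1]]

    # For better-looking images, clip the range of pixel values for each channel:
    # keep only the values from 0 to 0.3 and scale them back to between 0 and 1
    rgb = np.clip(rgb, 0, 0.3) / 0.3

    # Channels last: (height, width, 3)
    rgb = rgb.transpose(1, 2, 0)

    return rgb, label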
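To check the scaling arithmetic without a GeoTIFF on disk, the same steps can be run on synthetic data. In this sketch `stretch_rgb`, `scale` and `upper` are hypothetical names, and the random `uint16` array only stands in for what `src.read()` might return.

```python
import numpy as np

def stretch_rgb(bands_chw, scale=10000.0, upper=0.3):
    """Turn a (13, H, W) band stack into a contrast-stretched (H, W, 3) RGB image."""
    # Scale the raw values to roughly [0, 1] and clip the very bright pixels.
    img = np.clip(bands_chw.astype(np.float32) / scale, 0.0, 1.0)
    # Band indices 3, 2, 1 are taken as R, G, B, as in the answer.
    rgb = img[[3, 2, 1]]
    # Keep only the range 0..upper and rescale it to 0..1.
    rgb = np.clip(rgb, 0.0, upper) / upper
    # Channels last, as Keras expects.
    return rgb.transpose(1, 2, 0)

# Synthetic stand-in for a 13-band, 64x64 tile.
rng = np.random.default_rng(0)
raw = rng.integers(0, 12000, size=(13, 64, 64), dtype=np.uint16)

rgb = stretch_rgb(raw)
print(rgb.shape, rgb.dtype, float(rgb.min()), float(rgb.max()))
```

The output is a float32 array of shape (64, 64, 3) with values between 0 and 1, which matches the `input_shape=(64, 64, 3)` used for the pretrained model in the question.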

2023-06-23
