TensorFlow DSP overflow: high pixel values are clamped when running on DSP

2w3kk1z5  posted 4 months ago in Other

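Hello,
I trained a Keras model to extract grayscale segmentation maps.
I converted the model to TFLite and quantized it.
When the pixel values are not very high (<<1), the quantized model produces similar results on CPU and on DSP hardware.
When the segmentation map contains high predicted pixel values close to 1, those values appear to be clamped when running on the DSP, whereas the CPU produces a reasonable map.
I repeated the test with versions 2.2, 2.4 and 2.7, and the results were the same.
How can I change this behavior?
Should I change the dynamic range of the map?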
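
Since the question is about dynamic range, it may help to recall how int8 affine quantization saturates. The sketch below is not taken from the thread and does not reproduce the DSP behavior. It only illustrates, in plain Python, the standard mapping real ≈ scale * (q - zero_point) and how a value near 1 is clamped when the representable range of a tensor tops out lower. The scale and zero-point values are made-up assumptions for illustration, not values read from the model.

```python
# Illustrative only: affine int8 quantization, real ≈ scale * (q - zero_point).
# The scale / zero_point values below are hypothetical, not read from the model.

QMIN, QMAX = -128, 127  # int8 limits


def quantize(x, scale, zero_point):
    """Map a float to int8, saturating at the int8 limits."""
    q = round(x / scale) + zero_point
    return max(QMIN, min(QMAX, q))


def dequantize(q, scale, zero_point):
    """Map an int8 value back to a float."""
    return scale * (q - zero_point)


def representable_range(scale, zero_point):
    """Smallest and largest floats the quantized tensor can represent."""
    return (dequantize(QMIN, scale, zero_point),
            dequantize(QMAX, scale, zero_point))


if __name__ == "__main__":
    # Hypothetical range of roughly [0, 1]: scale = 1/255, zero_point = -128.
    full = dict(scale=1 / 255, zero_point=-128)
    # Hypothetical range of roughly [0, 0.5]: same zero_point, half the scale.
    narrow = dict(scale=0.5 / 255, zero_point=-128)

    for name, p in (("full", full), ("narrow", narrow)):
        lo, hi = representable_range(**p)
        y = dequantize(quantize(0.95, **p), **p)
        print(f"{name}: range=[{lo:.3f}, {hi:.3f}], 0.95 -> {y:.3f}")
```

With the narrower hypothetical range, 0.95 comes back as the top of that range, which is the kind of saturation a too-small calibrated range would produce. Whether this is what happens on the DSP here is an open question.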

ffscu2ro #1

To speed up troubleshooting, could you provide the complete code and dataset needed to reproduce the issue reported here?

3duebb1j #2

  1. I saved the Keras model, the TFLite file and the quantized TFLite file here: https://drive.google.com/drive/folders/1ZIsPCiBPCzDmFcTbVAokjM7CtzLJO8y7?usp=sharing
  2. Conversion code
    # Quantization script
    # Read an HDF5 Keras model and convert it to TFLite with quantization.

    import tensorflow as tf
    import numpy as np
    import cv2
    import sys
    import os


    def rep_data_gen0():
        # Load 128 calibration images stored as raw float32, scaled by 5.
        a = []
        for i in range(128):
            img = 5*np.fromfile('norm_images/image_'+str(i)+".bin", dtype=np.float32)
            img = img.reshape(224, 224, 3)
            a.append(img)

        a = np.array(a)
        print(a.shape)
        img = tf.data.Dataset.from_tensor_slices(a).batch(1)
        # Yield one batch of shape (1, 224, 224, 3) per calibration step.
        for i in img.take(128):
            yield [i]


    def convert_and_quantize(model):
        converter = tf.lite.TFLiteConverter.from_keras_model(model)
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
        converter.representative_dataset = rep_data_gen0
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

        print("Converting to TfLite using full integer quantization")
        quant_model = converter.convert()
        return quant_model


    def main_func():
        print(f'Reading keras model from {sys.argv[1]}')
        keras_model = tf.keras.models.load_model(sys.argv[1], compile=False)
        # Fix the input shape to a single 224x224 RGB image.
        input_name = keras_model.input_names[0]
        index = keras_model.input_names.index(input_name)
        keras_model.inputs[index].set_shape([1, 224, 224, 3])
        keras_model.trainable = False
        quant_model = convert_and_quantize(keras_model)

        print(f'Writing quantized tflite model to {sys.argv[2]}')
        with open(sys.argv[2], 'wb') as f:
            f.write(quant_model)


    if __name__ == "__main__":
        print(tf.__version__)
        if len(sys.argv) != 3:
            print('Wrong command line arguments')
            sys.exit(1)
        main_func()
        print("Done")

  3. Example dataset (I train on 5M images, so I uploaded just 400 images as an example). My task is a segmentation task and the trained maps are binary maps.
    https://drive.google.com/drive/folders/1Dn0olSmZ9HFGwAIViq1cfl9kfuvC5Qs_?usp=sharing
  4. Representative dataset, saved as .bin files in
    https://drive.google.com/drive/folders/1Q0iqM1qHdWFdW_PyA0BEQtzde9dmU7S5?usp=sharing
  5. Main training code details:
    trainDataset = tf.data.Dataset.from_tensor_slices(dataPaths['train'])  # same for validation
    batchSize = 32
    model = Unet('mobilenetv2', input_shape=inputShape, encoder_weights='imagenet', encoder_freeze=False,
                 decoder_block_type='transpose', decoder_filters=decoderFilters)  # Deconvolution
    model = utils.set_regularization(model, kernel_regularizer=keras.regularizers.l2(0.001),
                                     bias_regularizer=keras.regularizers.l2(0.001))
    lr = keras.callbacks.LearningRateScheduler(scheduler)
    loss = losses.binary_focal_dice_loss
    checkpoint = keras.callbacks.ModelCheckpoint(weightsPath, monitor='val_loss', verbose=0,
                                                 save_best_only=True, save_weights_only=False,
                                                 mode='auto', save_freq='epoch')
    callbacks_list = [lr, checkpoint, historyLogger]

    # compile

    adam = keras.optimizers.Adam(lr=0.01, beta_1=0.9, beta_2=0.999, epsilon=1e-07, decay=0.0, amsgrad=False)
    model.compile(loss=loss, optimizer=adam)
    history = model.fit(x=trainDataset,
                        batch_size=batchSize,
                        epochs=numEpochs,
                        verbose=1,  # show progress bar
                        validation_data=valDataset,
                        initial_epoch=initialEpoch,
                        steps_per_epoch=stepsPerEpoch,
                        validation_steps=validationSteps,
                        callbacks=callbacks_list)
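
The conversion script above calibrates on `norm_images/image_<i>.bin` files multiplied by 5. As a rough sanity check (a sketch, not part of the original post), one could summarize the value range of those raw float32 files and compare it, after the same scaling, with what the model is fed at inference time, since post-training quantization derives activation ranges from the calibration data. The helper name `bin_value_range` and its `scale` argument are assumptions made for illustration. Only the standard library is used.

```python
import array
import sys


def bin_value_range(path, scale=1.0):
    """Return (min, max, count) of a raw float32 file after multiplying by scale.

    Assumes the file holds native-endian float32 values with no header,
    which is what np.fromfile(path, dtype=np.float32) would read.
    """
    values = array.array("f")
    with open(path, "rb") as f:
        values.frombytes(f.read())
    if not values:
        raise ValueError(f"{path} contains no float32 values")
    lo, hi = min(values) * scale, max(values) * scale
    # A negative scale would swap the bounds, so order them.
    return min(lo, hi), max(lo, hi), len(values)


if __name__ == "__main__":
    # Usage: python check_range.py norm_images/image_0.bin [more files...]
    for p in sys.argv[1:]:
        lo, hi, n = bin_value_range(p, scale=5.0)
        print(f"{p}: {n} values, range after x5 = [{lo:.4f}, {hi:.4f}]")
```

If the range printed here differs a lot from the range of the inputs used on the device, that mismatch would be one thing to rule out before looking at the DSP delegate itself.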
