TensorFlow custom loss function: InvalidArgumentError: Matrix size-incompatible: In[0]: [1,3], In[1]: [64,2] [Op:MatMul]

mspsb9vt · asked on 2023-03-19 · in: Other

I'm trying to write a custom loss function with a custom gradient. Although I haven't implemented the gradient yet, TensorFlow has trouble with the output of the loss function (is it because of the shape?). Here is the error message:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [1,3], In[1]: [64,2] [Op:MatMul]
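For context, tf.custom_gradient expects the decorated function to return a pair (result, grad_fn), where grad_fn receives an upstream gradient shaped like the result and must return one tensor per input, each shaped like that input. A minimal sketch of that contract (a toy function, not the fidelity loss):

import tensorflow as tf

@tf.custom_gradient
def square_sum(x):
    def grad(upstream):
        # upstream is shaped like the (scalar) output; the returned
        # gradient must be shaped like the input x.
        return upstream * 2.0 * x
    return tf.reduce_sum(x * x), grad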
Here is the (incomplete) training loop:

import numpy as np
import tensorflow as tf


def main():
    inputs = tf.keras.Input(shape=(2,))
    x1 = tf.keras.layers.Dense(64, activation="relu")(inputs)
    x2 = tf.keras.layers.Dense(64, activation="relu")(x1)
    outputs = tf.keras.layers.Dense(2)(x2)

    model = tf.keras.Model(inputs=inputs, outputs=outputs, name="pulse_model")

    # Input: lists of 2 floats
    # Output: lists of 2 complex numbers
    data = gen_hadamard_data(10)

    # Arbitrary batch size
    data = data.batch(batch_size=3)

    epochs = 2
    for epoch in range(epochs):
        print(f"\nStart of Epoch {epoch}")

        for step, (x_batch_train, y_batch_train) in enumerate(data):

            print(f"{x_batch_train=}")
            print(f"{y_batch_train=}")

            with tf.GradientTape() as tape:
                logits = model(tf.constant(x_batch_train.numpy().tolist()), training=True)

                loss_fn = make_fidelity_cost(x_batch_train)

                loss_value = loss_fn(logits, y_batch_train)

            grads = tape.gradient(loss_value, model.trainable_weights)

            print(f"Prediction: {logits}")
            print(f"Loss value: {loss_value}")
            print(f"Gradients: {grads}")

And here is the loss function:

import numpy as np
from qiskit.quantum_info import Statevector, state_fidelity
from qiskit.providers.fake_provider import FakeArmonk

def make_fidelity_cost(initial_states, backend=FakeArmonk()):
    @tf.custom_gradient
    def fidelity_cost(y_pred, y_actual):
        fidelity_list = []
        for in_state, pred, actual in zip(initial_states.numpy(),
                                          y_pred.numpy(),
                                          y_actual.numpy()):
            init_state = [np.cos(in_state[0] / 2),
                          np.exp(in_state[1] * 1.j) * np.sin(in_state[0] / 2)]
            job = run_gaussian(duration=16,
                               amp=pred[0],
                               sigma=pred[1],
                               init_state=init_state,
                               backend=backend)
            result = job.result()
            sv = result.get_statevector()
            actual_sv = Statevector(actual.tolist() + [0])
            # This is the actual calculation that gets returned as the loss
            # state_fidelity returns a scalar
            fidelity_list.append(state_fidelity(sv, actual_sv))

        def grad(upstream):
            # Don't know what I need to do here quite yet
            print(f"{upstream=}")
            return upstream, upstream
        return tf.Variable([fidelity_list]), grad
    return fidelity_cost

A few notes:

  • I posted about this before, but realized that post was practically unreadable, so I've stripped it down to the essentials of what's happening
  • The main loss output comes from state_fidelity, which returns a scalar; each scalar is appended to a list, and that list is then wrapped in a tf.Variable and returned
  • Although the code inside the for loop comes from less common libraries (Qiskit), the only line that really matters is the one where the state fidelity is appended to the fidelity list
  • I'm not sure what matrix sizes the function expects, so if someone could also explain that, I'd be grateful (see the sketch after this list)
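On the matrix-size question: the loss returned above has shape [1, 3] (a tf.Variable wrapping a list of three fidelities), and grad hands the same upstream tensor back for both inputs, so during backprop TensorFlow appears to end up multiplying that [1, 3] gradient against the last Dense layer's [64, 2] kernel, which matches the reported MatMul error. The contract is that the loss should usually be a scalar, and grad must return one gradient per input, each shaped like that input. A minimal shape-consistent skeleton (the forward pass and the zero gradients below are placeholders, not the real fidelity math, and real-valued float32 tensors are assumed):

@tf.custom_gradient
def fidelity_cost(y_pred, y_actual):
    # Placeholder forward pass producing a scalar loss; the real version
    # would reduce the per-sample fidelities from the loop instead.
    loss = tf.reduce_mean(tf.square(y_pred - y_actual))

    def grad(upstream):
        # One gradient per input, each shaped like that input; the zeros
        # here only demonstrate the shape contract.
        return (upstream * tf.zeros_like(y_pred),
                upstream * tf.zeros_like(y_actual))

    return loss, grad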

qaxu7uf2 · Answer #1

I tried to follow the code you provided, and one conceptual change I made is that the real and imaginary parts can be handled separately and vectorized.

for in_state, pred, actual in zip(initial_states, y_pred, y_actual):
    real = np.cos(in_state[0] / 2)
    # Build the imaginary part separately, e.g. [0.+0.j ... 0.+0.j]
    imag = np.exp(1j * np.asarray(in_state[1], dtype=np.csingle)) * np.sin(in_state[0] / 2)
    init_state = [real, imag]
    # filter_result = gaussian_filter(np.asarray(real, dtype=np.float64),
    #                                 sigma=5, truncate=0)
    # job = run_gaussian(duration=16,
    #                    amp=pred[0],
    #                    sigma=pred[1],
    #                    init_state=init_state,
    #                    backend=backend)

    # The calculation does not support complex numbers directly; handling
    # the real and imaginary parts separately lets you build a vector from
    # their results.
    # Append the state vector fidelity:
    fidelity_list.append(state_fidelity(current_vector, actual_vector))

loss_fn = make_fidelity_cost(image)
# Convert to a numpy array before calling the loss
loss_value = loss_fn(model, label)

(The gaussian_filter idea comes from image processing, where the same technique is used for blur and contrast applications.)
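A vectorized version of the state preparation described above might look like this (a sketch; prepare_init_states is a hypothetical helper, and initial_states is assumed to be an (N, 2) NumPy array of (theta, phi) angle pairs):

import numpy as np

def prepare_init_states(initial_states):
    # Real and imaginary construction done column-wise over the whole batch.
    theta = initial_states[:, 0]
    phi = initial_states[:, 1]
    return np.stack([np.cos(theta / 2),
                     np.exp(1j * phi) * np.sin(theta / 2)], axis=1)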
