How do I mask for 2-D MultiHeadAttention in TensorFlow?

Posted by m1m5dgzv on 2022-11-30 in Other

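Can anyone help me understand masking a 3D input (technically 4D) in MultiHeadAttention?
My original dataset consists of time series of the following form:
Inputs: (samples, horizon, features) ~> (8, 4, 2) ~> K, V, Q during inference
Targets: (samples, horizon, features) ~> (8, 4, 2) ~> Q during training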
Labels: (1, horizon, features) ~> (1, 4, 2)
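Essentially, I take 8 samples of time series data and ultimately output 1 sample in the same format. The Targets are the horizon-shifted values of the Inputs and are fed into an encoder-only Transformer model (as the Q, K, V shown above).
To best approximate the single output sample (which is identical to the last sample in Targets), I need full attention across each sample's horizon and causal attention between samples. Once the data has run through the encoder, it is sent to an EinsumDense layer, which reduces the (8, 4, 2) encoder output to (1, 4, 2). For all of this to work, I need to inject a fourth dimension into my data, so the Inputs and Targets have the format (1, 8, 4, 2).
So my actual question is: how do I generate a mask for the encoder? After digging through some errors, I noticed that the tensor MHA uses to mask the softmax has shape (1, 1, 8, 4, 8, 4), which leads me to believe it is (B, H, TS, TH, SS, SH), where:
B = batch
H = heads
TS = target samples
TH = target horizon
SS = source samples
SH = source horizon
I got this notion from the docs only because of the description of attention_output:
...where T stands for the target sequence shapes
Assuming that is the case, is the following a reasonable mask, or is there a more appropriate approach: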

sample_mask = tf.linalg.band_part(tf.ones((samples, samples)), -1, 0)
horizon_mask = tf.ones((horizon, horizon))

encoder_mask = (
    sample_mask[:, tf.newaxis, :, tf.newaxis]
    * horizon_mask[tf.newaxis, :, tf.newaxis, :]
)

tensorflow

Source: https://stackoverflow.com/questions/74586701/how-do-i-mask-for-2-d-multiheadattention-in-tensorflow

1 Answer

fslejnso1#

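This is masking, and there is more than one valid way to lay out the data for it. Here I try the built-in TensorFlow approach, the Keras Masking layer, and compare the results: the output has the same dimensions as the input.
Sample: the code below builds an input with the target shape, prints the mask from the question, applies a Masking layer with a fixed mask value, and plots the input next to the Masking output so the two can be compared by eye.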

import tensorflow as tf
import matplotlib.pyplot as plt

n_samples = 8
n_horizon = 4
n_features = 2

# Build a toy (8, 4, 2) input as the product of three factors.
start = 3
limit = 25
delta = 3
sample = tf.range(start, limit, delta, dtype=tf.int64)  # 8 values: 3, 6, ..., 24
sample = tf.reshape(sample, (n_samples, 1))

horizon = tf.random.uniform(shape=[1, n_horizon], minval=5, maxval=10, dtype=tf.int64)
features = tf.random.uniform(shape=[1, 1, n_features], minval=-5, maxval=+5, dtype=tf.int64)

temp = tf.math.multiply(sample, horizon)    # (8, 4)
temp = tf.expand_dims(temp, axis=2)         # (8, 4, 1)
inputs = tf.math.multiply(temp, features)   # (8, 4, 2); renamed so the built-in input() is not shadowed

print( "input: " )
print( inputs )

# Mask from the question: lower-triangular (causal) across samples, all ones across the horizon.
sample_mask = tf.linalg.band_part(tf.ones((n_samples, n_samples)), -1, 0)
horizon_mask = tf.ones((n_horizon, n_horizon))

encoder_mask = (
    sample_mask[:, tf.newaxis, :, tf.newaxis]
    * horizon_mask[tf.newaxis, :, tf.newaxis, :]
)  # shape (8, 4, 8, 4)

print( encoder_mask )

# Keras Masking layer: a timestep is masked only when all of its features equal mask_value.
masking_layer = tf.keras.layers.Masking(mask_value=50, input_shape=(n_horizon, n_features))
masked = masking_layer(inputs)
print( masked )

# Render the same feature (index 0) of the input and of the Masking output as single-channel images.
img_1 = tf.keras.preprocessing.image.array_to_img(
        tf.reshape( inputs[:,:,0], (n_samples, n_horizon, 1) ),
        data_format=None,
        scale=True
    )

img_2 = tf.keras.preprocessing.image.array_to_img(
        tf.reshape( masked[:,:,0], (n_samples, n_horizon, 1) ),
        data_format=None,
        scale=True
    )

plt.figure(figsize=(4, 4))
plt.subplot(1, 2, 1)
plt.xticks([])
plt.yticks([])
plt.grid(False)
plt.imshow(img_1)
plt.xlabel("Input (8, 4, 2), left")

plt.subplot(1, 2, 2)
plt.xticks([])
plt.yticks([])
plt.grid(False)
plt.imshow(img_2)
plt.xlabel("Masking output (8, 4, 2), right")

plt.show()

Output: the input tensor we created to match the (samples, horizon, features) shape.

[[ -960     0]
  [-1080     0]
  [ -960     0]
  [ -960     0]]], shape=(8, 4, 2), dtype=int64)

Output: the mask built with the method from the question.

[[1. 1. 1. 1.]
   [1. 1. 1. 1.]
   [1. 1. 1. 1.]
   ...
   [1. 1. 1. 1.]
   [1. 1. 1. 1.]
   [1. 1. 1. 1.]]]], shape=(8, 4, 8, 4), dtype=float32)
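
To check the question's reading of the mask shape, here is a minimal sketch that passes this mask to tf.keras.layers.MultiHeadAttention. It assumes the layer is built with attention_axes=(1, 2), so that attention runs jointly over the sample and horizon axes of a (1, 8, 4, 2) input. The values num_heads=1 and key_dim=2 and the random tensor x are illustrative choices, not taken from the question. The mask is given a leading batch axis, and as far as I can tell the layer then broadcasts it over the head axis itself.

```python
import tensorflow as tf

n_samples, n_horizon, n_features = 8, 4, 2

# Mask from the question: causal across samples, full across the horizon.
sample_mask = tf.linalg.band_part(tf.ones((n_samples, n_samples)), -1, 0)
horizon_mask = tf.ones((n_horizon, n_horizon))
encoder_mask = (
    sample_mask[:, tf.newaxis, :, tf.newaxis]
    * horizon_mask[tf.newaxis, :, tf.newaxis, :]
)  # (TS, TH, SS, SH) = (8, 4, 8, 4)

# Add the batch axis: (B, TS, TH, SS, SH) = (1, 8, 4, 8, 4).
attention_mask = tf.cast(encoder_mask[tf.newaxis, ...], tf.bool)

# Assumption: attention over axes 1 and 2 (samples and horizon) at once.
mha = tf.keras.layers.MultiHeadAttention(
    num_heads=1, key_dim=2, attention_axes=(1, 2)
)

# Illustrative random input with the (1, 8, 4, 2) layout from the question.
x = tf.random.normal((1, n_samples, n_horizon, n_features))
out, scores = mha(
    x, x, attention_mask=attention_mask, return_attention_scores=True
)

print(out.shape)     # attention output, same layout as the query
print(scores.shape)  # (B, H, TS, TH, SS, SH)

# Largest weight that sample 0 puts on any later sample; the mask should force it to 0.
print(float(tf.reduce_max(scores[0, 0, 0, :, 1:, :])))
```

If the scores come back with shape (1, 1, 8, 4, 8, 4) and sample 0 puts zero weight on every later sample, that supports the (B, H, TS, TH, SS, SH) reading: the mask is causal across samples while the whole horizon of every visible sample stays available.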

Output: masking_layer = tf.keras.layers.Masking(mask_value=50, input_shape=(n_horizon, n_features))

[[ -840     0]
  [ -945     0]
  [ -840     0]
  [ -840     0]]

 [[ -960     0]
  [-1080     0]
  [ -960     0]
  [ -960     0]]], shape=(8, 4, 2), dtype=int64)
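
Note that the Masking output is identical to the input. That is consistent with the documented rule for the layer: a timestep is masked only when every feature at that step equals mask_value. Every value in this input is a multiple of 3 (the sample factor), so no step can equal 50. The sketch below mimics that rule in NumPy to show when a step would actually be zeroed. It is not the Keras implementation; masking_like is a hypothetical helper and the values in x are made up for illustration.

```python
import numpy as np

def masking_like(x, mask_value):
    """Mimic the documented Masking rule: zero a step only if ALL its features equal mask_value."""
    keep = np.any(x != mask_value, axis=-1, keepdims=True)  # (samples, steps, 1)
    return x * keep, keep[..., 0]

# Hypothetical array with 1 sample, 3 steps, 2 features.
x = np.array([[[50, 50],     # all features equal 50: masked
               [50, 7],      # only one feature equals 50: kept
               [-960, 0]]])  # no feature equals 50: kept

masked, keep = masking_like(x, mask_value=50)
print(keep)
print(masked)
```

This per-timestep mask is a different thing from the (8, 4, 8, 4) attention mask in the question, which states which (sample, horizon) positions may attend to which.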

Answered 2022-11-30
