tensorflow tf.math.cumsum() propagates Inf or NaN when axis is ragged

jogvjijk · posted 5 months ago in Other

Issue type

Bug

Source

Binary

TensorFlow version

tf 2.11

Custom code

OS platform and distribution

Linux Ubuntu 22.04

Mobile device

No response

Python version

3.10.6

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

CUDA 11.7, cuDNN 8.2.4

GPU model and memory

RTX 3090 24GiB

Current behavior?

When using `tf.math.cumsum(ragged_tensor, axis=axis)`, if `axis` is ragged and `ragged_tensor` contains `inf` or `nan`, the output will be `nan` for *all* the following flat values, even for those that are not supposed to be summed with those `nan`s. This doesn't happen for regular `Tensor`s. 

Moreover, the behavior persists when `exclusive=True` or `reverse=True` is passed to `cumsum()`. In the latter case, the additional `nan`s appear before the problematic value instead of after it.

Standalone code to reproduce the issue

import tensorflow as tf 
import numpy as np 

# RaggedTensor without NaNs or Infs
rt = tf.ragged.constant([[3, 1, 4], [1, 5], [9, 2], [6, 5, 3]], dtype=tf.float32)
print('tf.math.cumsum(rt, axis=-1) = ', tf.math.cumsum(rt, axis=-1))

# RaggedTensor with Inf
rt2 = tf.ragged.constant([[3, 1, 4], [1, np.inf], [9, 2], [6, 5, 3]], dtype=tf.float32)
print('tf.math.cumsum(rt2, axis=-1) = ', tf.math.cumsum(rt2, axis=-1))
print('tf.math.cumsum(rt2.to_tensor(), axis=-1) = ', tf.math.cumsum(rt2.to_tensor(), axis=-1))
print('tf.math.cumsum(rt2, axis=-1, exclusive=True) = ', tf.math.cumsum(rt2, axis=-1, exclusive=True))
print('tf.math.cumsum(rt2.to_tensor(), axis=-1, exclusive=True) = ', tf.math.cumsum(rt2.to_tensor(), axis=-1, exclusive=True))
print('tf.math.cumsum(rt2, axis=-1, reverse=True) = ', tf.math.cumsum(rt2, axis=-1, reverse=True))
print('tf.math.cumsum(rt2.to_tensor(), axis=-1, reverse=True) = ', tf.math.cumsum(rt2.to_tensor(), axis=-1, reverse=True))

"""Expected output
tf.math.cumsum(rt, axis=-1) =  <tf.RaggedTensor [[3.0, 4.0, 8.0], [1.0, 6.0], [9.0, 11.0], [6.0, 11.0, 14.0]]>
tf.math.cumsum(rt2, axis=-1) =  <tf.RaggedTensor [[3.0, 4.0, 8.0], [1.0, inf],  [9.0, 11.0], [6.0, 11.0, 14.0]]>
tf.math.cumsum(rt2.to_tensor(), axis=-1) =  tf.Tensor(
[[ 3.  4.  8.]
[ 1. inf inf]
[ 9. 11. 11.]
[ 6. 11. 14.]], shape=(4, 3), dtype=float32)
tf.math.cumsum(rt2, axis=-1, exclusive=True) =  <tf.RaggedTensor [[0.0, 3.0, 4.0], [0.0, 1.0], [0.0, 9.0], [0.0, 6.0, 11.0]]>
tf.math.cumsum(rt2.to_tensor(), axis=-1, exclusive=True) =  tf.Tensor(
[[ 0.  3.  4.]
[ 0.  1. inf]
[ 0.  9. 11.]
[ 0.  6. 11.]], shape=(4, 3), dtype=float32)
tf.math.cumsum(rt2, axis=-1, reverse=True) =  <tf.RaggedTensor [[8.0, 5.0, 4.0], [inf, inf], [11.0, 2.0], [14.0, 8.0, 3.0]]>
tf.math.cumsum(rt2.to_tensor(), axis=-1, reverse=True) =  tf.Tensor(
[[ 8.  5.  4.]
[inf inf  0.]
[11.  2.  0.]
[14.  8.  3.]], shape=(4, 3), dtype=float32)
"""

"""Actual output
tf.math.cumsum(rt, axis=-1) =  <tf.RaggedTensor [[3.0, 4.0, 8.0], [1.0, 6.0], [9.0, 11.0], [6.0, 11.0, 14.0]]>
tf.math.cumsum(rt2, axis=-1) =  <tf.RaggedTensor [[3.0, 4.0, 8.0], [1.0, inf], [nan, nan], [nan, nan, nan]]>
tf.math.cumsum(rt2.to_tensor(), axis=-1) =  tf.Tensor(
[[ 3.  4.  8.]
[ 1. inf inf]
[ 9. 11. 11.]
[ 6. 11. 14.]], shape=(4, 3), dtype=float32)
tf.math.cumsum(rt2, axis=-1, exclusive=True) =  <tf.RaggedTensor [[0.0, 3.0, 4.0], [0.0, 1.0], [nan, nan], [nan, nan, nan]]>
tf.math.cumsum(rt2.to_tensor(), axis=-1, exclusive=True) =  tf.Tensor(
[[ 0.  3.  4.]
[ 0.  1. inf]
[ 0.  9. 11.]
[ 0.  6. 11.]], shape=(4, 3), dtype=float32)
tf.math.cumsum(rt2, axis=-1, reverse=True) =  <tf.RaggedTensor [[nan, nan, nan], [inf, inf], [11.0, 2.0], [14.0, 8.0, 3.0]]>
tf.math.cumsum(rt2.to_tensor(), axis=-1, reverse=True) =  tf.Tensor(
[[ 8.  5.  4.]
[inf inf  0.]
[11.  2.  0.]
[14.  8.  3.]], shape=(4, 3), dtype=float32)
"""

Relevant log output

No response

ybzsozfc1#

My guess is that, when computing cumsum over a ragged dimension, the sum is actually taken across the entire range of flat values, and then some subtraction based on the row_splits converts that long cumsum list into per-row results. To explain with an example:

rt = tf.ragged.constant([[3, 1, 4], [1, 5], [9, 2], [6, 5, 3]], dtype=tf.float32)  
# flat values [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
# row splits [0, 3, 5, 7, 10] 

tf.math.cumsum(rt, axis=-1)  
# first step: cum sum all flat values [3, 4, 8, 9, 14, 23, 25, 31, 36, 39]
# second step: for each row, subtract the last element of the previous row
# [3, 4, 8], [9-8, 14-8], [23-14, 25-14], [31-25, 36-25, 39-25]
# => [3, 4, 8], [1, 6], [9, 11], [6, 11, 14]

This way, when an inf shows up, we get inf - inf => nan.
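The hypothesized mechanism above can be reproduced in plain NumPy (this is an assumption about the implementation, not the actual TF source) and it yields exactly the buggy output reported in the issue:

```python
import numpy as np

# Suspected mechanism: cumsum over ALL flat values, then subtract
# each row's starting offset derived from row_splits.
flat = np.array([3., 1., 4., 1., np.inf, 9., 2., 6., 5., 3.], dtype=np.float32)
splits = [0, 3, 5, 7, 10]

total = np.cumsum(flat)  # running sum crosses row boundaries
rows = []
for s, e in zip(splits[:-1], splits[1:]):
    offset = total[s - 1] if s > 0 else 0.0  # last running sum of previous row
    rows.append(total[s:e] - offset)
# rows[2] and rows[3] become all-nan: their offset is already inf,
# and inf - inf = nan, matching the reported actual output.
```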


x6yk4ghg2#

Hi,
Any updates? I would also like to know a recommended workaround that does not require calling rt.to_tensor().
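Until there is an official fix, one idea is to apply cumsum within each row independently, so an inf or nan can never leak across a row boundary. A minimal NumPy sketch of that idea (the per-row loop illustrates the logic, not an optimized TF kernel, and `ragged_cumsum` is a hypothetical helper name):

```python
import numpy as np

def ragged_cumsum(flat, splits):
    """Cumsum each row of a (flat_values, row_splits) pair independently.

    Because no running total crosses a row boundary, an inf/nan in one
    row cannot contaminate the other rows.
    """
    out = np.empty_like(flat)
    for s, e in zip(splits[:-1], splits[1:]):
        out[s:e] = np.cumsum(flat[s:e])
    return out

flat = np.array([3., 1., 4., 1., np.inf, 9., 2., 6., 5., 3.], dtype=np.float32)
out = ragged_cumsum(flat, [0, 3, 5, 7, 10])
# rows stay [3, 4, 8], [1, inf], [9, 11], [6, 11, 14] -- no nan propagation
```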


t3irkdon3#

@MKimiSH I tried to reproduce this issue on Colab with TF v2.11 and did not encounter the reported error.
Please take a look at the gist here and let me know if I am missing anything needed to replicate the issue.
Thanks!


cyvaqqii4#

@MKimiSH I tried to reproduce this issue on Colab with TF v2.11 and did not encounter the reported error. Please take a look at the gist here and let me know if I am missing anything needed to replicate the issue. Thanks!
Thanks for looking into this. It looks like you got the same output as I did, which actually confirms the bug.
I don't think I used the term "expected output" correctly: what the original description labeled "expected output" was the output that actually contains the error. I have updated the issue, renaming "expected output" to "actual output" and inserting a new "expected output" section based on the corresponding non-ragged Tensor op's output.


sg2wtvxw5#

My guess is that, when computing cumsum over a ragged dimension, the sum is actually taken across the entire range of flat values, and then some subtraction based on the row_splits converts that long cumsum list into per-row results. To explain with an example:

rt = tf.ragged.constant([[3, 1, 4], [1, 5], [9, 2], [6, 5, 3]], dtype=tf.float32)  
# flat values [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
# row splits [0, 3, 5, 7, 10] 

tf.math.cumsum(rt, axis=-1)  
# first step: cum sum all flat values [3, 4, 8, 9, 14, 23, 25, 31, 36, 39]
# second step: for each row, subtract the last element of the previous row
# [3, 4, 8], [9-8, 14-8], [23-14, 25-14], [31-25, 36-25, 39-25]
# => [3, 4, 8], [1, 6], [9, 11], [6, 11, 14]

This way, when an inf shows up, we get inf - inf => nan.
After more checking, I don't think it is quite that simple. If it really were just a cumsum over the flat_values, there would also be overflow problems. But that looks fine (actually it isn't, see the update):

import tensorflow as tf 
import numpy as np

rt = tf.ragged.constant([[2147483647, 1, 2, 3], [1, 5], [9, 2], [6, 5, 3]], dtype=tf.int32)
print(tf.math.cumsum(rt, axis=-1))
"""output: int32 overflow wraps around within the first row, but does not leak into the other rows.
<tf.RaggedTensor [[2147483647, -2147483648, -2147483646, -2147483643], [1, 6], [9, 11],
[6, 11, 14]]>
"""

rt = tf.ragged.constant([[2147483647, 1, 2, 3], [1, 5], [9, 2], [6, 5, 3]], dtype=tf.float32)
print(tf.math.cumsum(rt, axis=-1))
"""output: float32 is fine (NO!)
UPDATE: actually not fine, no cumsum is done for the latter three sub-lists!
<tf.RaggedTensor [[2147483600.0, 2147483600.0, 2147483600.0, 2147483600.0], [1.0, 5.0],
[9.0, 2.0], [6.0, 5.0, 3.0]]>
"""

rt = tf.ragged.constant([[2147483647, 1, 2, np.inf], [1, 5], [9, 2], [6, 5, 3]], dtype=tf.float32)
print(tf.math.cumsum(rt, axis=-1))
"""output: Inf causing problems
<tf.RaggedTensor [[2147483600.0, 2147483600.0, 2147483600.0, inf], [nan, nan], [nan, nan],
[nan, nan, nan]]>
"""

UPDATE: wait, float32 does seem to be broken.
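A side note on the float32 run above: the first row printing as four nearly identical 2147483600.0 values is expected float32 behavior rather than part of this bug. Near 2**31 the gap between adjacent float32 values is 256, so adding 1, 2, or 3 is simply rounded away. A quick NumPy check:

```python
import numpy as np

# Near 2**31, float32 can only represent multiples of 256, so small
# addends are rounded away ("absorption").
x = np.float32(2147483647)     # rounds to 2**31 in float32
assert x + np.float32(3) == x  # the addend is absorbed

c = np.cumsum(np.array([2147483647, 1, 2, 3], dtype=np.float32))
# all four partial sums collapse to the same float32 value
```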


ffx8fchx6#

@sachinprasadhs I was able to reproduce the issue; please check the attached gist here.
Thanks!
