2 answers
ccgok5k5 #1
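In general, the DataStream API is very flexible about record types. POJO types are probably the most convenient. You can use basically any Java class, but you need to check which TypeInformation is extracted via reflection. Sometimes it has to be overridden manually. For Row, you will always have to provide the types manually, because reflection cannot do much based on the class signature. GenericRowData should be avoided: it is an internal class with many caveats (strings must be StringData, array handling is not straightforward). In addition, GenericRowData becomes BinaryRowData after deserialization. TL;DR: this type is meant for the SQL engine.
e4yzc0pl #2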
The docs are actually helpful here, I was confused too. The section at the top titled "All Known Implementing Classes" lists all the implementations. RowData and GenericRowData are described as internal data structures. If you can use a POJO, then great. But if you need something that implements RowData, take a look at BinaryRowData, BoxedWrapperRowData, ColumnarRowData, NestedRowData, or any of the implementations there that aren't listed as internal.
I'm personally using NestedRowData to map a DataStream[Row] into a DataStream[RowData] and I'm not at all sure that's a good idea :) Especially since I can't seem to add a string attribute.
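Both answers lean towards POJOs, and the first one notes that a Row needs explicit type information because reflection cannot learn much from its class signature. A rough plain-Java illustration of that point follows. It has no Flink dependency and is not Flink's actual type extraction; Person, LooseRow and describe are hypothetical names made up for this sketch.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

public class ReflectionSketch {

    // Hypothetical POJO: field names and types are part of the class signature.
    public static class Person {
        public String name;
        public int age;
    }

    // Hypothetical stand-in for a generic row: the signature only says
    // "an array of Object", so the per-column types are invisible.
    public static class LooseRow {
        public Object[] fields;
    }

    // Collects the declared, non-static fields of a class with their static
    // types, sorted by field name. This is roughly the information a
    // reflection-based extractor has to work with.
    public static Map<String, String> describe(Class<?> clazz) {
        Map<String, String> result = new TreeMap<>();
        for (Field f : clazz.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue;
            }
            result.put(f.getName(), f.getType().getSimpleName());
        }
        return result;
    }

    public static void main(String[] args) {
        // The POJO exposes one typed entry per field.
        System.out.println(describe(Person.class));
        // The row-like container exposes a single Object[] entry.
        System.out.println(describe(LooseRow.class));
    }
}
```

For the POJO, reflection sees a String field and an int field. For the row-like container it sees only a single Object[], so the column types have to come from somewhere else. In Flink that usually means passing explicit type information for the Row, for example one built with Types.ROW_NAMED and handed to returns(...) on the mapped stream.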