I am creating a PySpark DataFrame by reading messages from a Kafka topic. Each message is a complex JSON document like this:
{
"paymentEntity": {
"id": 3081458,
"details": {
"values": [
{
"CardType": "VisaDebit"
},
{
"CardNumber": "********8759"
},
{
"WorldPayMasterId": "c670b980c50eb50373f66a1fe2bf8e70d320a0f7"
}
]
}}}
After reading it into a DataFrame, the schema and data look like this:
root
|-- details: struct (nullable = true)
| |-- values: array (nullable = true)
| | |-- element: struct (containsNull = true)
| | | |-- CardNumber: string (nullable = true)
| | | |-- CardType: string (nullable = true)
| | | |-- WorldPayMasterId: string (nullable = true)
|-- id: long (nullable = true)
+-----------------------------------------------------------------------------------+-------+
|details |id |
+-----------------------------------------------------------------------------------+-------+
|[[[, VisaDebit,], [********8759,,], [,, c670b980c50eb50373f66a1fe2bf8e70d320a0f7]]]|3081458|
+-----------------------------------------------------------------------------------+-------+
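If I transform it with the code below:
from pyspark.sql.functions import col, explode

# explode() emits one output row per element of the values array
jsonDF = jsonDF.withColumn("paymentEntity-details-values", explode(col("paymentEntity.details.values"))) \
    .withColumn("id", col("paymentEntity.id")).drop("paymentEntity")
then the output is as follows: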
root
|-- paymentEntity-details-values: struct (nullable = true)
| |-- CardNumber: string (nullable = true)
| |-- CardType: string (nullable = true)
| |-- WorldPayMasterId: string (nullable = true)
|-- id: long (nullable = true)
+---------------------------------------------+-------+
|paymentEntity-details-values |id |
+---------------------------------------------+-------+
|[, VisaDebit,] |3081458|
|[********8759,,] |3081458|
|[,, c670b980c50eb50373f66a1fe2bf8e70d320a0f7]|3081458|
+---------------------------------------------+-------+
I would like to process it and transform the DataFrame into the output below, without exploding the array field:
+------------+---------+---------------------------------------------------+-------+
|cardnumber |CardType |WorldPayMasterId |id |
+------------+---------+---------------------------------------------------+-------+
|********8759|VisaDebit|c670b980c50eb50373f66a1fe2bf8e70d320a0f7 |3081458|
+------------+---------+---------------------------------------------------+-------+
Can anyone suggest how to achieve this? Any help is appreciated.
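One possible approach, offered as a minimal sketch rather than a confirmed answer: selecting a field through an array of structs yields an array of that field's values (mostly null here), so the Spark SQL higher-order function `filter` (available from Spark 2.4) can keep the single non-null entry per field without calling `explode`. The helper names `first_non_null_expr` and `flatten_payment` are hypothetical, and the sketch assumes the top-level column is still named `paymentEntity`, as in the code above, and that each field is set in at most one array element.

```python
# Fields expected inside each element of paymentEntity.details.values.
FIELDS = ["CardNumber", "CardType", "WorldPayMasterId"]


def first_non_null_expr(field, array_path="paymentEntity.details.`values`"):
    """Build a Spark SQL expression that picks the first non-null value of
    `field` across the array of structs at `array_path`.

    `<array_path>.<field>` evaluates to an array of that field's values;
    filter(...) drops the nulls and [0] takes the first remaining entry.
    `values` is backtick-quoted because VALUES is also a SQL keyword.
    """
    return f"filter({array_path}.{field}, v -> v IS NOT NULL)[0] AS {field}"


def flatten_payment(df):
    """Return one row per message with a column per field plus the id.

    `df` is assumed to be a PySpark DataFrame with a `paymentEntity` struct
    column; selectExpr() takes SQL expression strings, so no extra
    pyspark.sql.functions imports are needed here.
    """
    exprs = [first_non_null_expr(name) for name in FIELDS]
    return df.selectExpr(*exprs, "paymentEntity.id AS id")
```

Usage would look like `flatten_payment(jsonDF).show(truncate=False)`. Note that with ANSI mode enabled, indexing `[0]` into an empty array may raise an error instead of returning null, so messages that lack one of the fields might need extra handling.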