This question already has an answer here:
Spark Scala Dataframe: merge multiple columns into a single column (1 answer)
Closed 7 months ago.
I have a Dataframe:
+------------------+-------------------+--------------------+
| name| sku| description|
+------------------+-------------------+--------------------+
| Mary Rodriguez| hand-couple-manage|Senior word socia...|
| Jose Henderson| together-table-oil|Apply girl treatm...|
| Karen Villegas| child-somebody|Every tell serve....|
| Olivia Lynch|forget-matter-avoid|Perhaps environme...|
| Whitney Wiley| side-blue-dream|Quickly short soc...|
| Brittany Johnson| east-pretty|Indicate view sim...|
| Paul Morris| radio-window-us|Society month sho...|
| Jason Patterson| night-art-be-act|Entire around pla...|
|      Kiara Gentry|   compare-politics|Air my kind staff...|
+------------------+-------------------+--------------------+
Desired schema:
root
|-- sku: string (nullable = true)
|-- name_description: array (nullable = true)
| |-- element: string (containsNull = true)
How can I group by the sku column and, from the name and description columns, build a name_description column whose values are JSON strings in the format [{"name":..., "description":...}, {"name":..., "description":...}, ....] for each sku value, in PySpark?
1 Answer

bbuxkriu1:
Check the code below.