Merge multiple Dataset rows into a single row as a Map

mrphzbgm  posted on 2021-07-13  in  Spark

I have an incoming dataset.

Source Schema:
---------------------------------
 - Id:String 
 - date:Date 
 - time:Long 
 - values:List<Object>

Output Schema
----------------------------------------
 - Id: String 
 - valuesMap: Map<Key, List<Object>>

The Key object contains the date and the time.
Can this be done with Spark? With Spark SQL?

Incoming data

Id     Date        Time    Array of Data Objects
--------------------------------------------------------------------------
1001   12/12/2019  121234  ["123", "ASD", "RET", "affg", "455"],["445","RR","hhggvv", "980"]
1001   12/12/2019  233667  ["125", "AID", "RUT", "akfg", "451"],["440","HH","hhffgvv", "1002"]
1001   12/15/2019  122212  ["129", "ALD", "DDT", "akfg", "458"],["441","JJ","hhffgvv", "1002"]
1002   01/01/2019  766612  ["129", "ALD", "DDT", "akfg", "458"],["441","JJ","hhffgvv", "1002"]
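To pin down the reshape being asked for, here is a minimal plain-Python sketch (not Spark) applied to the sample rows above. It assumes each row arrives as an `(Id, date, time, values)` tuple and that the map key is simply the `(date, time)` pair; the names `SAMPLE_ROWS` and `merge_rows_local` are made up for illustration.

```python
from collections import defaultdict

# Sample rows copied from the incoming data: (Id, date, time, values)
SAMPLE_ROWS = [
    ("1001", "12/12/2019", 121234,
     [["123", "ASD", "RET", "affg", "455"], ["445", "RR", "hhggvv", "980"]]),
    ("1001", "12/12/2019", 233667,
     [["125", "AID", "RUT", "akfg", "451"], ["440", "HH", "hhffgvv", "1002"]]),
    ("1001", "12/15/2019", 122212,
     [["129", "ALD", "DDT", "akfg", "458"], ["441", "JJ", "hhffgvv", "1002"]]),
    ("1002", "01/01/2019", 766612,
     [["129", "ALD", "DDT", "akfg", "458"], ["441", "JJ", "hhffgvv", "1002"]]),
]


def merge_rows_local(rows):
    """Group rows by Id into {Id: {(date, time): values}}.

    If two rows share the same Id, date and time, the later row
    overwrites the earlier one (an assumption; the question does not
    say how duplicate keys should be handled).
    """
    merged = defaultdict(dict)
    for row_id, date, time, values in rows:
        merged[row_id][(date, time)] = values
    return dict(merged)


merged = merge_rows_local(SAMPLE_ROWS)
for row_id, values_map in merged.items():
    for key, values in values_map.items():
        print(row_id, key, values)
```

Because the key includes the time as well as the date, Id 1001 ends up with one map entry per incoming row.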

Outgoing data

Id      Map Key Date  Map Key Time  Map Values
--------------------------------------------------------------------------
1001    12/12/2019    121234        ["123","ASD","RET","affg","455"],["445","RR","hhggvv","980"]
        12/15/2019    122212        ["129","ALD","DDT","akfg","458"],["441","JJ","hhffgvv", "1002"]
1002    01/01/2019    766612        ["129", "ALD", "DDT", "akfg", "458"],["441","JJ","hhffgvv", "1002"]
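One possible way to express this in Spark SQL is to group by `Id` and build the map with `collect_list` plus `map_from_entries`. The sketch below is only a starting point under stated assumptions: Spark 2.4 or later (where `map_from_entries` is available), column names taken from the source schema, and a temp view called `incoming`, which is a hypothetical name. The helper `merge_rows_sql` is likewise made up for illustration; it expects an existing `SparkSession`.

```python
# Spark SQL text for the reshape. Column names come from the source schema
# (Id, date, time, values); they are backtick-quoted because date, time and
# values are also SQL keywords.
MERGE_SQL_TEMPLATE = """
SELECT Id,
       map_from_entries(
         collect_list(
           named_struct(
             'key',   named_struct('date', `date`, 'time', `time`),
             'value', `values`
           )
         )
       ) AS valuesMap
FROM {view}
GROUP BY Id
"""


def merge_rows_sql(spark, view="incoming"):
    """Run the merge query against a registered temp view.

    `spark` is an existing SparkSession; `view` is a hypothetical view
    name and must be a plain identifier.
    """
    if not view.isidentifier():
        raise ValueError(f"not a plain view name: {view!r}")
    return spark.sql(MERGE_SQL_TEMPLATE.format(view=view))
```

Assuming the incoming Dataset is held in a variable `df`, usage would look like `df.createOrReplaceTempView("incoming")` followed by `merge_rows_sql(spark).show(truncate=False)`. The map key here is a struct of date and time, so each distinct (date, time) pair for an Id becomes its own entry. If the same pair can occur twice for one Id, the query probably needs an extra aggregation step first, since recent Spark versions reject duplicate map keys by default.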

No answers yet.
