I have an incoming dataset.
Source Schema:
---------------------------------
- Id: String
- date: Date
- time: Long
- values: List<Object>
Output Schema:
---------------------------------
- Id: String
- valuesMap: Map<Key, List<Object>>
The Key object contains the date and the time.
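To make the target shape concrete, here is a minimal plain-Python sketch (no Spark involved) of the structure I am after. The names `Key` and `to_values_map` are hypothetical, and the date is kept as text for brevity even though the source schema uses a Date type.

```python
from collections import defaultdict
from typing import NamedTuple


class Key(NamedTuple):
    """Hypothetical map key made of date + time."""
    date: str  # kept as text here; the source schema uses a Date type
    time: int


def to_values_map(rows):
    """Group (Id, date, time, values) rows into {Id: {Key: values}}."""
    out = defaultdict(dict)
    for id_, date, time, values in rows:
        out[id_][Key(date, time)] = values
    return dict(out)


rows = [
    ("1001", "12/12/2019", 121234,
     [["123", "ASD", "RET", "affg", "455"], ["445", "RR", "hhggvv", "980"]]),
    ("1001", "12/15/2019", 122212,
     [["129", "ALD", "DDT", "akfg", "458"], ["441", "JJ", "hhffgvv", "1002"]]),
    ("1002", "01/01/2019", 766612,
     [["129", "ALD", "DDT", "akfg", "458"], ["441", "JJ", "hhffgvv", "1002"]]),
]

result = to_values_map(rows)
for id_, values_map in result.items():
    print(id_, list(values_map.keys()))
```

Each Id ends up with one map whose keys are the (date, time) pairs and whose values are the lists from the matching rows.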
Can this be done with Spark? With Spark SQL?
Incoming data
Id    Date        Time    Array of Data Objects
--------------------------------------------------------------------------
1001  12/12/2019  121234  ["123", "ASD", "RET", "affg", "455"], ["445", "RR", "hhggvv", "980"]
1001  12/12/2019  233667  ["125", "AID", "RUT", "akfg", "451"], ["440", "HH", "hhffgvv", "1002"]
1001  12/15/2019  122212  ["129", "ALD", "DDT", "akfg", "458"], ["441", "JJ", "hhffgvv", "1002"]
1002  01/01/2019  766612  ["129", "ALD", "DDT", "akfg", "458"], ["441", "JJ", "hhffgvv", "1002"]
Outgoing data
Id    Map Key Date  Map Key Time  Map Values
--------------------------------------------------------------------------
1001  12/12/2019    766612        ["123", "ASD", "RET", "affg", "455"], ["445", "RR", "hhggvv", "980"]
      12/15/2019    122212        ["129", "ALD", "DDT", "akfg", "458"], ["441", "JJ", "hhffgvv", "1002"]
1002  01/01/2019    766612        ["129", "ALD", "DDT", "akfg", "458"], ["441", "JJ", "hhffgvv", "1002"]