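I'm new to Spark and Scala, and I'm trying to increment the value of a key-value pair in one column using the value from another column.
Below is the input DataFrame.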
val inputDF = Seq(
(1, "Visa", 1, None),
(2, "MC", 2, Some("Visa -> 1")),
(3, "Amex", 1, None),
(4, "Amex", 3, Some("Visa -> 1, MC -> 1")),
(5, "Amex", 4, Some("Visa -> 2, MC -> 1")),
(6, "MC", 1, None),
(7, "Visa", 5, Some("Visa -> 2, MC -> 1, Amex -> 1")),
(8, "Visa", 6, Some("Visa -> 2, MC -> 2, Amex -> 1")),
(9, "MC", 1, None),
(10, "MC", 2, Some("Amex -> 1"))).toDF("person_id", "card_type", "number_of_cards", "card_type_details")
+---------+---------+---------------+-----------------------------+
|person_id|card_type|number_of_cards|card_type_details            |
+---------+---------+---------------+-----------------------------+
|1        |Visa     |1              |null                         |
|2        |MC       |2              |Visa -> 1                    |
|3        |Amex     |1              |null                         |
|4        |Amex     |3              |Visa -> 1, MC -> 1           |
|5        |Amex     |4              |Visa -> 2, MC -> 1           |
|6        |MC       |1              |null                         |
|7        |Visa     |5              |Visa -> 2, MC -> 1, Amex -> 1|
|8        |Visa     |6              |Visa -> 2, MC -> 2, Amex -> 1|
|9        |MC       |1              |null                         |
|10       |MC       |2              |Amex -> 1                    |
+---------+---------+---------------+-----------------------------+
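Given the input above, if card_type_details is null, take the value from card_type and append -> 1 (as in the first row).
If card_type_details is not null, check whether card_type already exists as a key in card_type_details. If it does, increment that key's value by 1; otherwise, add a new key-value pair with a value of 1 (as in the second and seventh rows).
Here is the expected output: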
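For reference, the two rules can be written in plain Scala, outside Spark, as in the sketch below. The names CardDetails, parse and increment are hypothetical, and the sketch assumes the details are a comma-separated string of "key -> count" entries, as in the sample data.

```scala
object CardDetails {
  // Parse "Visa -> 2, MC -> 1" into ordered (key, count) pairs.
  // A missing (None) value yields an empty list.
  def parse(details: Option[String]): List[(String, Int)] =
    details
      .map(_.split(",").toList)
      .getOrElse(Nil)
      .map(_.trim)
      .filter(_.nonEmpty)
      .map { entry =>
        val Array(key, count) = entry.split("->").map(_.trim)
        key -> count.toInt
      }

  // Increment the count for cardType if it is already a key,
  // otherwise append "cardType -> 1" at the end.
  def increment(cardType: String, details: Option[String]): String = {
    val pairs = parse(details)
    val updated =
      if (pairs.exists(_._1 == cardType))
        pairs.map { case (k, v) => if (k == cardType) (k, v + 1) else (k, v) }
      else
        pairs :+ (cardType -> 1)
    updated.map { case (k, v) => s"$k -> $v" }.mkString(", ")
  }
}
```

Existing keys keep their original order and a new key is appended at the end, which matches the ordering shown in the sample rows.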
val expectedOutputDF = Seq(
(1, "Visa", 1, Some("Visa -> 1")),
(2, "MC", 2, Some("Visa -> 1, MC -> 1")),
(3, "Amex", 1, Some("Amex -> 1")),
(4, "Amex", 3, Some("Visa -> 1, MC -> 1, Amex -> 1")),
(5, "Amex", 4, Some("Visa -> 2, MC -> 1, Amex -> 1")),
(6, "MC", 1, Some("MC -> 1")),
(7, "Visa", 5, Some("Visa -> 3, MC -> 1, Amex -> 1")),
(8, "Visa", 6, Some("Visa -> 3, MC -> 2, Amex -> 1")),
(9, "MC", 1, Some("MC -> 1")),
(10, "MC", 2, Some("Amex -> 1, MC -> 1"))).toDF("person_id", "card_type", "number_of_cards", "card_type_details")
+---------+---------+---------------+-----------------------------+
|person_id|card_type|number_of_cards|card_type_details            |
+---------+---------+---------------+-----------------------------+
|1        |Visa     |1              |Visa -> 1                    |
|2        |MC       |2              |Visa -> 1, MC -> 1           |
|3        |Amex     |1              |Amex -> 1                    |
|4        |Amex     |3              |Visa -> 1, MC -> 1, Amex -> 1|
|5        |Amex     |4              |Visa -> 2, MC -> 1, Amex -> 1|
|6        |MC       |1              |MC -> 1                      |
|7        |Visa     |5              |Visa -> 3, MC -> 1, Amex -> 1|
|8        |Visa     |6              |Visa -> 3, MC -> 2, Amex -> 1|
|9        |MC       |1              |MC -> 1                      |
|10       |MC       |2              |Amex -> 1, MC -> 1           |
+---------+---------+---------------+-----------------------------+
Any suggestions on how to achieve this?
1 Answer
Assuming card_type_details is of type map, check the code below. Create the expression.
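One way such an expression could look is sketched below; it is not necessarily the answerer's exact code. It assumes Spark 3.0 or later (for transform_values), a card_type_details column of type map<string,int>, and a non-null card_type. The object name IncrementCardCounts and the three sample rows are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

object IncrementCardCounts {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("increment-card-counts")
      .getOrCreate()
    import spark.implicits._

    // Assumption: the details column is a real map, not a "k -> v" string.
    val df = Seq(
      (1, "Visa", Option.empty[Map[String, Int]]),
      (2, "MC", Some(Map("Visa" -> 1))),
      (7, "Visa", Some(Map("Visa" -> 2, "MC" -> 1, "Amex" -> 1)))
    ).toDF("person_id", "card_type", "card_type_details")

    // Three cases: null map, key already present, key missing.
    val updatedDetails = expr("""
      CASE
        WHEN card_type_details IS NULL
          THEN map(card_type, 1)
        WHEN array_contains(map_keys(card_type_details), card_type)
          THEN transform_values(card_type_details,
                                (k, v) -> IF(k = card_type, v + 1, v))
        ELSE map_concat(card_type_details, map(card_type, 1))
      END
    """)

    df.withColumn("card_type_details", updatedDetails).show(false)

    spark.stop()
  }
}
```

The explicit array_contains check keeps map_concat from ever seeing the same key twice, which Spark 3.x rejects under its default duplicate-key policy.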