I want to compute the mean of the profit column and show it next to every row.
I am using Spark 2.4.4.
The DataFrame looks like this:
+-------------+-------------+------+
|     Customer|CustomerCount|profit|
+-------------+-------------+------+
| Customer_162|            8|  0.28|
|Customer_2634|            1|  0.31|
|Customer_1482|            8|  0.28|
+-------------+-------------+------+
Code:
newdf.select("Customer", "CustomerCount", "profit")
  .agg(sum("profit").alias("sum"),
       count("CustomerCount").alias("count"))
  .withColumn("Mean", round(col("sum") / sum("count").over(), 2))
  .show()
The current output looks like this:
+---+-----+----+
|sum|count|Mean|
+---+-----+----+
But I want output like this:
+-------------+-------------+------+----+
|     Customer|CustomerCount|profit|Mean|
+-------------+-------------+------+----+
| Customer_162|            8|  0.28|0.29|
|Customer_2634|            1|  0.31|0.29|
|Customer_1482|            8|  0.28|0.29|
+-------------+-------------+------+----+
Regards
1 Answer
The code below may help.
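The answer's original code is not preserved here, so what follows is only a minimal sketch of one way to get the desired output, not necessarily what the answerer posted. In the question's code, `agg` without a `groupBy` collapses the DataFrame to a single row, which is why the Customer columns disappear. Applying `avg` over an empty window instead keeps every row and attaches the overall mean to each one. The object name `MeanProfit` and the helper `withMeanProfit` are hypothetical names chosen for illustration; the sample rows are copied from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, col, round}

object MeanProfit {
  // Hypothetical helper: adds a "Mean" column holding the overall mean of
  // "profit", rounded to 2 decimals, without dropping any existing column.
  def withMeanProfit(df: DataFrame): DataFrame =
    // over() with no arguments is an empty window spanning the whole DataFrame
    df.withColumn("Mean", round(avg(col("profit")).over(), 2))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run for demonstration
      .appName("MeanProfit")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the question
    val newdf = Seq(
      ("Customer_162", 8, 0.28),
      ("Customer_2634", 1, 0.31),
      ("Customer_1482", 8, 0.28)
    ).toDF("Customer", "CustomerCount", "profit")

    withMeanProfit(newdf).show()

    spark.stop()
  }
}
```

With the three sample rows, the mean is (0.28 + 0.31 + 0.28) / 3 = 0.29, so every row gets Mean = 0.29, as in the desired output. Note that an empty window moves all rows into a single partition, and Spark logs a warning about this. For a large DataFrame, computing the mean once with `agg(avg("profit"))` and attaching it with `crossJoin` may scale better.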