I have a dataset similar to the one below:
df =
+------------+----------+-------+
| date       | delivery | Value |
+------------+----------+-------+
| 01/01/2018 | yes      | 0     |
| 02/01/2018 | no       | 3     |
| 03/01/2018 | yes      | 3     |
| 04/01/2018 | no       | 0     |
| 01/02/2018 | yes      | 3     |
| 02/02/2018 | yes      | 0     |
| 03/02/2018 | yes      | 0     |
| 04/02/2018 | yes      | 2     |
| 01/03/2018 | no       | 0     |
| 02/03/2018 | yes      | 0     |
| 03/03/2018 | no       | 3     |
| 04/03/2018 | no       | 2     |
+------------+----------+-------+
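Every time I run my code, I want its output to be inserted as new rows, stamped with the current timestamp. This is what I am currently trying: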
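from pyspark.sql import functions as F

total = df.count()  # total number of rows in df
df2 = df.filter(df.Value == 0).groupBy("delivery") \
    .agg(F.count("*").alias("cnt_grp")) \
    .withColumn("percent", (F.col("cnt_grp") / total) * 100) \
    .withColumn("date", F.current_timestamp())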
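But every time I run this I only get two rows in total, instead of two new rows per run. My expected output should look something like this: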
+---------+----------+---------------+------------+
| date    | delivery | valuewithzero | percentage |
+---------+----------+---------------+------------+
| 19/2021 | yes      | 4             | 33.3%      |
| 19/2021 | no       | 2             | 16.6%      |
| 20/2021 | yes      | 4             | 33.3%      |
| 20/2021 | no       | 2             | 16.6%      |
| 21/2021 | yes      | 4             | 33.3%      |
| 21/2021 | no       | 2             | 16.6%      |
+---------+----------+---------------+------------+
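One likely reason only two rows ever appear is that df2 is rebuilt from scratch on each run, so nothing from earlier runs survives. A possible way to get the accumulating output above is to write each run's summary to persistent storage in append mode. The sketch below is only an illustration under assumptions: the function name append_zero_summary, the out_path parameter, and the choice of Parquet as the storage format are hypothetical and not part of the original code.

```python
def append_zero_summary(df, out_path):
    """Summarise rows with Value == 0 per delivery and append them to out_path.

    `df` is expected to be a PySpark DataFrame with `delivery` and `Value`
    columns. `out_path` is a hypothetical storage location (assumption).
    """
    total = df.count()  # total number of rows, used as the denominator

    summary = (
        df.filter(df.Value == 0)
        .groupBy("delivery")
        .count()  # produces a column named "count"
        .withColumnRenamed("count", "valuewithzero")
        .selectExpr(
            "current_timestamp() AS date",  # timestamp of this run
            "delivery",
            "valuewithzero",
            f"round(valuewithzero / {total} * 100, 1) AS percentage",
        )
    )

    # mode("append") adds these rows to whatever is already stored at
    # out_path instead of replacing it, so each run contributes new rows.
    summary.write.mode("append").parquet(out_path)
    return summary
```

Calling append_zero_summary(df, "/some/path") once per run and then reading the location back with spark.read.parquet("/some/path") should show two additional rows for every run, assuming the path is the same each time.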