I have 3 Spark DataFrames. I want to join them on the BAQ column and create 2 new columns from the divisions dtfAvgEnd("avg(AAG)")/dtfAvgWeek("avg(AAG)") and dtfAvgLong("avg(AAG)")/dtfAvgWeek("avg(AAG)").
scala> dtfAvgWeek.filter("BAQ='3310101041401034198668'").show(10,false)
+----------------------+-----------------+
|BAQ |avg(AAG) |
+----------------------+-----------------+
|3310101041401034198668|147.6660606060606|
+----------------------+-----------------+
scala> dtfAvgEnd.filter("BAQ='3310101041401034198668'").show(10,false)
+----------------------+------------------+
|BAQ |avg(AAG) |
+----------------------+------------------+
|3310101041401034198668|58.360833333333325|
+----------------------+------------------+
scala> dtfAvgLong.filter("BAQ='3310101041401034198668'").show(10,false)
+----------------------+------------------+
|BAQ |avg(AAG) |
+----------------------+------------------+
|3310101041401034198668|121.46857142857144|
+----------------------+------------------+
scala> val dtfRatiConsSing=dtfAvgWeek.
| filter("BAQ='3310101041401034198668'").
| join(dtfAvgEnd,Seq("BAQ"),"inner").
| join(dtfAvgLong,Seq("BAQ"),"inner").
| withColumn("Rati_End",dtfAvgEnd("avg(AAG)")/dtfAvgWeek("avg(AAG)")).
| withColumn("Rati_long",dtfAvgLong("avg(AAG)")/dtfAvgWeek("avg(AAG)"));
dtfRatiConsSing: org.apache.spark.sql.DataFrame = [BAQ: string, avg(AAG): double ... 4 more fields]
This is what I get: one of the ratios works out, but the other does not (Rati_End comes back as 1.0). I don't understand what went wrong.
scala> dtfRatiConsSing.
| show(20,false);
+----------------------+------------------+------------------+------------------+--------+------------------+
|BAQ |avg(AAG) |avg(AAG) |avg(AAG) |Rati_End|Rati_long |
+----------------------+------------------+------------------+------------------+--------+------------------+
|3310101041401034198668|147.66606060606063|58.360833333333346|121.46857142857142|1.0     |0.8225896386077629|
+----------------------+------------------+------------------+------------------+--------+------------------+
1 Answer

njthzxwz1#

I renamed the avg columns and it worked.
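For reference, the renaming fix might look like the sketch below. It assumes the DataFrames from the transcript above and an active SparkSession; the names avgWeek, avgEnd, and avgLong are illustrative choices, not from the original answer. After the joins, all three averages were named avg(AAG), so dtfAvgEnd("avg(AAG)") and dtfAvgWeek("avg(AAG)") could resolve to the same column in the joined result, which is why Rati_End came out as 1.0. Renaming before the join makes each reference unambiguous:

```scala
import org.apache.spark.sql.functions.col

// Give each average a distinct name before joining, so the two
// divisions cannot resolve to the same column.
val week = dtfAvgWeek.withColumnRenamed("avg(AAG)", "avgWeek")
val end  = dtfAvgEnd.withColumnRenamed("avg(AAG)", "avgEnd")
val long = dtfAvgLong.withColumnRenamed("avg(AAG)", "avgLong")

val dtfRatiConsSing = week.
  join(end, Seq("BAQ"), "inner").
  join(long, Seq("BAQ"), "inner").
  withColumn("Rati_End",  col("avgEnd")  / col("avgWeek")).
  withColumn("Rati_long", col("avgLong") / col("avgWeek"))
```

With distinct column names, Rati_End should now be avgEnd/avgWeek (roughly 0.395 for the BAQ shown above) instead of 1.0.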