Get the first value from a spark.sql.Row

Posted by klr1opcd on 2023-03-19 in Apache

I have the following JSON:

{"Request": {"TrancheList": {"Tranche": [{"TrancheId": "500192163","OwnedAmt": "26500000",    "Curr": "USD" }, {  "TrancheId": "500213369", "OwnedAmt": "41000000","Curr": "USD"}]},"FxRatesList": {"FxRatesContract": [{"Currency": "CHF","FxRate": "0.97919983706115"},{"Currency": "AUD", "FxRate": "1.2966804979253"},{ "Currency": "USD","FxRate": "1"},{"Currency": "SEK","FxRate": "8.1561012531034"},{"Currency": "NOK", "FxRate": "8.2454981641398"},{"Currency": "JPY","FxRate": "111.79999785344"},{"Currency": "HKD","FxRate": "7.7568025218916"},{"Currency": "GBP","FxRate": "0.69425159677867"}, {"Currency": "EUR","FxRate": "0.88991723769689"},{"Currency": "DKK", "FxRate": "6.629598372301"}]},"isExcludeDeals": "true","baseCurrency": "USD"}}

The JSON is read from HDFS:

val hdfsRequest = spark.read.json("hdfs://localhost/user/request.json")
val baseCurrency = hdfsRequest.select("Request.baseCurrency").map(_.getString(0)).collect.headOption
var fxRates = hdfsRequest.select("Request.FxRatesList.FxRatesContract")
val fxRatesDF = fxRates.select(explode(fxRates("FxRatesContract"))).toDF("FxRatesContract").select("FxRatesContract.Currency", "FxRatesContract.FxRate").filter($"Currency"===baseCurrency.get)
fxRatesDF.show()

The output I get for fxRatesDF is:

fxRatesDF: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [Currency: string, FxRate: string]
+--------+------+
|Currency|FxRate|
+--------+------+
|     USD|     1|
+--------+------+

How do I get the value in the first row of the FxRate column?

z3yyvxxp #1

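You can use the following. This is PySpark syntax; a Scala Row has no field-style accessor, so in Scala you would call getString(0) or getAs[String]("FxRate") on the row instead:

fxRatesDF.select(col("FxRate")).first().FxRate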
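
For readers working in Scala, as in the question, the same lookup might look like the sketch below. It assumes a local SparkSession and a one-row stand-in for fxRatesDF; the names FirstFxRate, byIndex and byName are illustrative and do not come from the original post.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object FirstFxRate {
  // Project the FxRate column, then read the single field by position.
  def byIndex(df: DataFrame): String =
    df.select("FxRate").first().getString(0)

  // Read the field by column name; rows coming from a DataFrame carry a schema.
  def byName(df: DataFrame): String =
    df.first().getAs[String]("FxRate")

  def main(args: Array[String]): Unit = {
    // Assumption: a local session and inline data stand in for the HDFS-backed DataFrame.
    val spark = SparkSession.builder().master("local[1]").appName("first-fx-rate").getOrCreate()
    import spark.implicits._

    val fxRatesDF = Seq(("USD", "1")).toDF("Currency", "FxRate")
    println(byIndex(fxRatesDF))
    println(byName(fxRatesDF))

    spark.stop()
  }
}
```

Both helpers return the rate as a String, because the JSON in the question stores FxRate as a quoted value.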

x6h2sr28 #2

The function you need is first(). Use it like this (PySpark):

fxRatesDF.first().FxRate
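
In Scala, field-style access such as .FxRate is available on a typed Dataset rather than on a Row. A minimal sketch, assuming a hypothetical case class FxRateRow whose fields correspond to the DataFrame's columns; the object and method names are also illustrative:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical case class mirroring the two columns of fxRatesDF.
final case class FxRateRow(Currency: String, FxRate: String)

object TypedFirst {
  // Convert the untyped DataFrame to Dataset[FxRateRow], then read the field directly.
  def firstRate(spark: SparkSession, df: DataFrame): String = {
    import spark.implicits._ // provides the encoder for FxRateRow
    df.as[FxRateRow].first().FxRate
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session and inline data, for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("typed-first").getOrCreate()
    import spark.implicits._

    val fxRatesDF = Seq(("USD", "1")).toDF("Currency", "FxRate")
    println(firstRate(spark, fxRatesDF))

    spark.stop()
  }
}
```

The case class is declared at the top level so that Spark can derive an encoder for it.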

qmelpv7a #3

Perhaps one of these (PySpark):

fxRatesDF.take(1)[0][1]

fxRatesDF.collect()[0][1]

fxRatesDF.first()[1]

ix0qys7i #4

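I know this is an old post, but I got it working with fxRatesDF.first()[0]. Note that in the question's DataFrame index 0 is the Currency column; FxRate is at index 1.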

5vf7fwbs #5

A simple approach is to pick out the row and the column by index. Given this DataFrame:

+-----+
|count|
+-----+
|    0|
+-----+

Code:

count = df.collect()[0][0]
print(count)
if count == 0:
    print("First row and First column value is 0")

Output:

0
First row and First column value is 0

jm2pwxwz #6

An update to one of the answers:

from pyspark.sql.functions import col
fxRatesDF.select(col("FxRate")).first()[0]

x6h2sr28 #7

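It should be as simple as this (display is a Databricks notebook function; it renders the value rather than returning it):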

display(fxRatesDF.select($"FxRate").limit(1))

bq8i3lrv #8

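This requirement can be met in one line with a single call:

fxRatesDF.first()(1)

Or in one line with two calls, which returns a String instead of Any:

fxRatesDF.first().getString(1)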
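
Note that first() throws a NoSuchElementException when the DataFrame is empty, for example when no currency matches baseCurrency. Below is a hedged sketch of an empty-safe variant that also parses the string rate into a Double. The names SafeFirst and firstRateOption are hypothetical, and the code assumes every non-null FxRate value is a valid number.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SafeFirst {
  // Returns None for an empty DataFrame or a null rate instead of throwing.
  def firstRateOption(df: DataFrame): Option[Double] =
    df.select("FxRate")
      .take(1)                       // Array[Row] holding zero or one row
      .headOption
      .filter(row => !row.isNullAt(0))
      .map(_.getString(0).toDouble)  // assumes the string is a valid number

  def main(args: Array[String]): Unit = {
    // Assumption: a local session and inline data, for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("safe-first").getOrCreate()
    import spark.implicits._

    val fxRatesDF = Seq(("USD", "1")).toDF("Currency", "FxRate")
    val emptyDF = Seq.empty[(String, String)].toDF("Currency", "FxRate")

    println(firstRateOption(fxRatesDF))
    println(firstRateOption(emptyDF))

    spark.stop()
  }
}
```

Returning an Option keeps the style of the question, which already reads baseCurrency through headOption.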

5rgfhyps #9

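You can try the following. FxRate is a string column here, so the pattern has to match a String rather than an Int:

import org.apache.spark.sql.Row
fxRatesDF.select("FxRate").rdd.map { case Row(s: String) => s }.first()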
