Problem transforming a tuple with the map function in Scala

uxhixvfz  posted on 2021-05-29  in  Hadoop

I have this tuple:

val tuple_test = ("NCA-15","select count(*) from table")

I want to transform the tuple so that the first value, NCA-15, is kept and the query select count(*) from table is executed. This is the result I want:

(NCA-15,8)

where 8 is the result of the query.
I tried this:

val resultat = tuple_test
    .productIterator
    .map {
       case(x: String, y: String) => (x, spark.sql(y.toString))
    }

But it returns:

resultat = non-empty iterator

drkbr07n1#

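select "NCA-15",count(*) from table will give NCA-15,8 as a DataFrame. Calling .rdd on it produces an RDD[Row], and from each Row you can build a tuple.
See my donut example below. Since I don't have Hive, I simulate the table with a temp view.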

package com.examples

import org.apache.log4j.Level
import org.apache.spark.sql.{Row, SparkSession}

/**
  * Created by Ram Ghadiyaram
  */
object RDDOfTupleExample {
  org.apache.log4j.Logger.getLogger("org").setLevel(Level.ERROR)

  def main(args: Array[String]): Unit = {

    val spark = SparkSession.builder
      .master("local")
      .appName(this.getClass.getName)
      .getOrCreate()

    val donuts = Seq(
      ("plain donut", 1.50), ("plain donut", 1.50),
      ("vanilla donut", 2.0), ("vanilla donut", 2.0),
      ("glazed donut", 2.50))
    val df = spark
      .createDataFrame(donuts)
      .toDF("Donut_Name", "Price")
    // Suppose this is your Hive table; since I don't have Hive, I simulate it with a temp view.
    df.createOrReplaceTempView("mydonuts")
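    // The literal label becomes column 0 and the count (a Long) becomes column 1.
    spark.sql("select \"NCA-15\" as mylabel, count(Donut_Name) as mydonutcount from mydonuts")
      .rdd.map((x: Row) => (x.getString(0), x.getLong(1)))
      .collect() // bring the single result row to the driver before printing
      .foreach(println)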
  }
}

Result:

(NCA-15,5)
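
As for why the attempt in the question prints "non-empty iterator": productIterator walks the tuple's elements one at a time as Any, and map on an iterator is lazy, so the pattern case (x: String, y: String) is never even evaluated (and it would throw a MatchError if it were forced, because each element is a single String, not a pair). If you would rather keep the original tuple, pattern matching on the tuple itself is probably simpler. The following is only a minimal sketch: transform and runCount are hypothetical names, and runCount stands in for the Spark call, which under the assumption that the query returns a single count row could be q => spark.sql(q).first().getLong(0).

```scala
object TupleMapSketch {

  // Keep the label and replace the query string with the value returned by runCount.
  // runCount is a hypothetical stand-in for running the query, e.g. with Spark:
  //   q => spark.sql(q).first().getLong(0)   // assumes a single-row count query
  def transform(t: (String, String))(runCount: String => Long): (String, Long) =
    t match {
      case (label, query) => (label, runCount(query))
    }

  def main(args: Array[String]): Unit = {
    val tupleTest = ("NCA-15", "select count(*) from table")

    // Fake query runner so the sketch runs without a Spark session.
    val fakeCount: String => Long = _ => 8L

    println(transform(tupleTest)(fakeCount)) // prints (NCA-15,8)

    // productIterator yields the two elements separately, each typed as Any,
    // which is why a pattern for a pair never matches them.
    println(tupleTest.productIterator.toList)
  }
}
```

With a real SparkSession in scope, passing the Spark-based function instead of fakeCount should yield the label together with the actual count.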
