Modifying Scala code to replace the customer ID with a row count

Posted by 5gfr0r5j on 2021-05-29 in Spark

import org.apache.spark.mllib.linalg.distributed.IndexedRow

val indexedRowRDD = dfFeaturesUpdated
      .select(customerIdColName, "mergedAllFeatures")
      .rdd
      .map { row =>
        // The raw customer ID is used directly as the row index.
        val accountIdRow = row.getAs[Integer](customerIdColName).toLong
        val featuresRow = row.getAs[org.apache.spark.mllib.linalg.Vector]("mergedAllFeatures")
        IndexedRow(accountIdRow, featuresRow)
      }

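In this code, val accountIdRow = row.getAs[Integer](customerIdColName).toLong reads the value of the customer ID column. The problem is that even when there are only two rows, with customer IDs "144500001" and "14450002", the code uses those numbers as they are, which causes problems in the next computation. What I need is for the values to go only up to 2 when there are two rows, that is, up to the number of rows in the DataFrame. Any suggestions for modifying the code are welcome.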
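One way to read the requirement is that each row should get a consecutive position (0, 1, ..., n-1) instead of its customer ID. The sketch below illustrates that remapping with plain Scala collections only. CustomerFeatures, IndexedFeatures, and reindex are hypothetical names introduced for illustration, not part of Spark. On an RDD, zipWithIndex produces the same (element, index) pairing, so the idea should carry over to the map step in the question, although that has not been tested here.

```scala
// Hypothetical stand-in for one input row: a customer ID plus its features.
final case class CustomerFeatures(customerId: Int, features: Vector[Double])

// Hypothetical stand-in for Spark's IndexedRow: a Long index plus a vector.
final case class IndexedFeatures(index: Long, features: Vector[Double])

object DenseIndexSketch {
  // Replaces sparse customer IDs with positions 0..n-1 and returns a
  // lookup from each position back to the original customer ID.
  def reindex(rows: Seq[CustomerFeatures]): (Seq[IndexedFeatures], Map[Long, Int]) = {
    val withIndex = rows.zipWithIndex
    val indexed = withIndex.map { case (row, i) =>
      IndexedFeatures(i.toLong, row.features)
    }
    val indexToCustomer = withIndex.map { case (row, i) =>
      i.toLong -> row.customerId
    }.toMap
    (indexed, indexToCustomer)
  }
}
```

Keeping the lookup map means results computed on the dense indices can still be translated back to customer IDs afterwards. Note that the indices here start at 0, so two rows yield 0 and 1; add 1 if the later computation expects 1-based values.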

Java scala apache-spark apache-spark-mllib

Source: https://stackoverflow.com/questions/62501567/code-modification-scala-from-customerid-to-replacing-with-count-of-rows
