Scala: how do I sum multiple fields of a class?

wtzytmuj  posted on 2022-11-09  in  Scala

I have a class Dimensions(Int, Int, Int) and a Shape (holding a String name), and I put them together in a Tuple (Shape, Dimensions).
My dataset is:

(Cube, Dimensions(5,5,5))
(Sphere, Dimensions(5,10,15))
(Cube, Dimensions(3,3,3))

I need to return this:

(Cube, Dimensions(8,8,8))
(Sphere, Dimensions(5,10,15))

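That is, I group by the shape's name and then add up all of the dimension values. Right now I can map to (name, Int, Int, Int), but I'm not sure how to wrap the result back into a Dimensions object.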

data.map(_._2.map(x => (x.length,x.width,x.height)))

Any help would be appreciated.

fykwrbwg #1

Assuming there are no unusual edge cases and that you have an RDD, all you need is aggregateByKey:

import org.apache.spark.rdd.RDD

case class Dimensions(i1: Int, i2: Int, i3: Int)

val initialRdd: RDD[(Shape, Dimensions)] = ???

def combineDimensions(dimensions1: Dimensions, dimensions2: Dimensions): Dimensions =
  Dimensions(
    dimensions1.i1 + dimensions2.i1,
    dimensions1.i2 + dimensions2.i2,
    dimensions1.i3 + dimensions2.i3
  )

val finalRdd: RDD[(Shape, Dimensions)] =
  initialRdd
    .aggregateByKey(Dimensions(0, 0, 0))(
      { case (accDimensions, dimensions) =>
        combineDimensions(accDimensions, dimensions)
      },
      { case (partitionDimensions1, partitionDimensions2) =>
        combineDimensions(partitionDimensions1, partitionDimensions2)
      }
    )
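
If the data is a local collection rather than an RDD, the same group-and-sum idea can be sketched with the standard library alone. This is only an illustration under assumptions: the `Shape` and `Dimensions` definitions below are hypothetical stand-ins (field names borrowed from the question's snippet), the `+` method and `sumByShape` helper are invented for this example, and `groupMapReduce` requires Scala 2.13 or later.

```scala
// Hypothetical stand-ins for the question's types (not the asker's actual definitions).
final case class Shape(name: String)

final case class Dimensions(length: Int, width: Int, height: Int) {
  // Field-wise sum of two Dimensions values.
  def +(other: Dimensions): Dimensions =
    Dimensions(length + other.length, width + other.width, height + other.height)
}

object SumDimensions {
  // Group by the Shape key, keep the Dimensions value, and reduce each group with +.
  // groupMapReduce is available from Scala 2.13.
  def sumByShape(data: Seq[(Shape, Dimensions)]): Map[Shape, Dimensions] =
    data.groupMapReduce(_._1)(_._2)(_ + _)

  def main(args: Array[String]): Unit = {
    val data = Seq(
      Shape("Cube")   -> Dimensions(5, 5, 5),
      Shape("Sphere") -> Dimensions(5, 10, 15),
      Shape("Cube")   -> Dimensions(3, 3, 3)
    )
    sumByShape(data).foreach(println)
  }
}
```

Because the zero value `Dimensions(0, 0, 0)` and the two combine functions in the RDD version all perform the same field-wise addition, `reduceByKey` with a single combine function should express the same aggregation on an RDD; `aggregateByKey` is mainly needed when the accumulator type differs from the value type.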
