scala: How to use ScalaMock to verify that a function was called with a specific Spark DataFrame argument, and get useful output

66bbxpm5  posted on 2021-07-12  in  Spark

I have been looking at:
http://scalamock.org/user-guide/advanced_topics/
https://scalamock.org/user-guide/matching/
https://scalamock.org/quick-start/
but I haven't got the result I want yet. Basically, I wrote this test:

scenario("myFunction reads parquet and writes to db") {
          var mockUtil: UtilitiesService = stub[UtilitiesService]
          val service = new myService(mockUtil)

          val expectedParquetDf = Seq(
              (999, "testData")
          ).toDF("number", "word")

          (mockUtil.getDataFrameFromParquet _).when("myParquetPath") returns Right(expectedParquetDf)
          service.publishToDb()
          (mockUtil.insertDataFrameIntoPostgres _).verify(expectedParquetDf, "myTable").once()
      }

But if the test fails (because the DataFrames don't match), the output is not ideal. In short, it looks like this:

[info]   Expected:
[info]   inAnyOrder {
[info]     <stub-4> UtilitiesService.getDataFrameFromParquet(path) any number of times (called once)
[info]     <stub-4> UtilitiesService.insertDataFrameIntoPostgres[number: int, word: string] once (never called - UNSATISFIED)
[info]   }
[info]   
[info]   Actual:
[info]     <stub-4> UtilitiesService.getDataFrameFromParquet(oath)
[info]     <stub-4> UtilitiesService.insertDataFrameIntoPostgres([number: int, word: string], "myTable" (myFile.scala:28)

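The string part is spot on, but the DataFrame part is only useful when a column has been dropped; it is much less useful when there are bad rows and the like. Is there a good way to improve this?
So far my rabbit hole has led me to the code below. It still doesn't work, and making the "assert" functions return true just so the "&&" part works makes me feel there must be a better way. Is there some comparator function I can override in the standard verify?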

def assertStringsAreEqual(expectedPath:String, actualPath:String) : Boolean = {
          assert(actualPath == expectedPath)
          true
      }

      def assertDataFramesAreEqual(expected: DataFrame, actual: DataFrame) : Boolean = {
          AssertHelpers.assertDataEqual(expected, actual) // verbose info, asserts on each row etc.
          true
      }

      scenario("myFunction reads parquet and writes to db") {
          var mockUtil: UtilitiesService = stub[UtilitiesService]
          val service = new myService(mockUtil)
          val expectedParquetDf = Seq(
              (999, "testData"),
              (898, "wrongData"),
              (999, "extraRow")
          ).toDF("number", "word")

          val incorrectExample = Seq(
              (999, "testData"),
              (999, "testData")
          ).toDF("number", "word")

          (mockUtil.getDataFrameFromParquet _).when("myParquetPath") returns Right(incorrectExample) // deliberately wrong data for now
          (mockUtil.insertDataFrameIntoPostgres _)
              .expects(where {
                  (actualDf, path) => assertDataFramesAreEqual(expectedParquetDf, actualDf) && assertStringsAreEqual("ExpectedTable", path)
              })
              .once()

          service.publishToDb()

      }

For reference, my goal is to get something like this to show up somewhere:

Expected:
Dataframe:
[number, word]
[999, "testData"]
[898, "wrongData"]
[999, "extraRow"]

Actual:
Dataframe
[number, word]
[999, "testData"]
[999, "testData"]

sg3maiej1#

So this is still not ideal, but using "expects" together with "onCall" gives the desired output:

scenario("myFunction reads parquet and writes to db") {
      var mockUtil: UtilitiesService = mock[UtilitiesService]
      val service = new myService(mockUtil)
      val expectedParquetDf = Seq(
          (999, "testData"),
          (898, "wrongData"),
          (999, "extraRow")
      ).toDF("number", "word")

      val incorrectExample = Seq(
          (999, "testData"),
          (999, "testData")
      ).toDF("number", "word")

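      // set up expectations (a mock needs them before the code under test runs)
      (mockUtil.insertDataFrameIntoPostgres _).expects(*, "ExpectedTable").onCall { (df: DataFrame, path: String) =>
        AssertHelpers.assertDataEqual(expectedParquetDf, df) // fails with row-level detail on mismatch
        Right(sxDbData) // sxDbData is not defined in this snippet
      }

      // mock[...] functions take expects(...), not the when(...) used with stub[...]
      (mockUtil.getDataFrameFromParquet _).expects("myParquetPath").returning(Right(incorrectExample)) // deliberately wrong data for now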

      service.publishToDb()

  }
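
The AssertHelpers.assertDataEqual helper used above is not shown in the thread. As a rough sketch only, a helper that produces the "Expected / Actual" layout asked for in the question could look like the following. Everything here (the DataFrameDiff object and its method names) is hypothetical and not part of ScalaMock or Spark. It assumes the rows have already been collected into plain Scala sequences, and it compares them in order.

```scala
// Hypothetical helper, for illustration only: not part of ScalaMock or Spark.
object DataFrameDiff {

  // Strings are quoted, everything else is printed as-is (null prints as "null").
  private def fmt(value: Any): String = value match {
    case s: String => "\"" + s + "\""
    case other     => s"$other"
  }

  /** Renders a label, the column header, and one "[a, b]" line per row. */
  def render(label: String, columns: Seq[String], rows: Seq[Seq[Any]]): String = {
    val header = columns.mkString("[", ", ", "]")
    val body   = rows.map(_.map(fmt).mkString("[", ", ", "]"))
    (Seq(s"$label:", "Dataframe:", header) ++ body).mkString("\n")
  }

  /** None when the rows match (order-sensitive), otherwise a readable message. */
  def diff(columns: Seq[String],
           expected: Seq[Seq[Any]],
           actual: Seq[Seq[Any]]): Option[String] =
    if (expected == actual) None
    else Some(
      render("Expected", columns, expected) + "\n\n" +
        render("Actual", columns, actual)
    )

  /** Throws an AssertionError carrying the rendered diff when the rows differ. */
  def assertRowsEqual(columns: Seq[String],
                      expected: Seq[Seq[Any]],
                      actual: Seq[Seq[Any]]): Unit =
    diff(columns, expected, actual).foreach(msg => throw new AssertionError(msg))
}
```

Inside the onCall handler, the rows could be obtained with something like df.collect().map(_.toSeq).toSeq and the header with df.columns.toSeq, then passed to assertRowsEqual. If row order is not guaranteed, both sides would need sorting first.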

Hopefully someone has a cleaner solution.
