scala - How to use ScalaMock to verify that a function was called with a specific Spark DataFrame argument, and get useful output

66bbxpm5  posted on 2021-07-12  in Spark

I have been reading:
http://scalamock.org/user-guide/advanced_topics/
https://scalamock.org/user-guide/matching/
https://scalamock.org/quick-start/
but I still haven't got the result I want. Basically, I wrote this test:

    scenario("myFunction reads parquet and writes to db") {
      val mockUtil: UtilitiesService = stub[UtilitiesService]
      val service = new myService(mockUtil)
      val expectedParquetDf = Seq(
        (999, "testData")
      ).toDF("number", "word")
      // Stub the read so the service receives a known DataFrame
      (mockUtil.getDataFrameFromParquet _).when("myParquetPath").returns(Right(expectedParquetDf))
      service.publishToDatabase()
      // Verify the write was called once with that DataFrame and table name
      (mockUtil.insertDataFrameIntoDb _).verify(expectedParquetDf, "myTable").once()
    }

However, if the test fails (because the DataFrames don't match), the output is not ideal. It simply looks like this:

    [info] Expected:
    [info] inAnyOrder {
    [info] <stub-4> UtilitiesService.getDataFrameFromParquet(path) any number of times (called once)
    [info] <stub-4> UtilitiesService.insertDataFrameIntoPostgres[number: int, word: string] once (never called - UNSATISFIED)
    [info] }
    [info]
    [info] Actual:
    [info] <stub-4> UtilitiesService.getDataFrameFromParquet(oath)
    [info] <stub-4> UtilitiesService.insertDataFrameIntoPostgres([number: int, word: string], "myTable" (myFile.scala:28)

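The string part is spot on, but the DataFrame part only shows the schema. That helps if a column has been dropped, and not much if there are bad rows and so on. Is there a good way to improve this?
So far my rabbit hole has led me to the code below. It still doesn't work, and making the "assert" functions return true just so the "&&" part works makes me feel there must be a better way. Is there some comparator function I can override in the standard verify?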

    def assertStringsAreEqual(expectedPath: String, actualPath: String): Boolean = {
      assert(actualPath == expectedPath)
      true // returned only so the call can be chained with && inside where { ... }
    }
    def assertDataFramesAreEqual(expected: DataFrame, actual: DataFrame): Boolean = {
      AssertHelpers.assertDataEqual(expected, actual) // verbose info, asserts on each row etc.
      true
    }
    scenario("myFunction reads parquet and writes to db") {
      val mockUtil: UtilitiesService = stub[UtilitiesService]
      val service = new myService(mockUtil)
      val expectedParquetDf = Seq(
        (999, "testData"),
        (898, "wrongData"),
        (999, "extraRow")
      ).toDF("number", "word")
      val incorrectExample = Seq(
        (999, "testData"),
        (999, "testData")
      ).toDF("number", "word")
      (mockUtil.getDataFrameFromParquet _).when("myParquetPath").returns(Right(incorrectExample)) // forced to incorrect for now
      (mockUtil.insertDataFrameIntoPostgres _)
        .expects(where { (actualDf: DataFrame, path: String) =>
          assertDataFramesAreEqual(expectedParquetDf, actualDf) && assertStringsAreEqual("ExpectedTable", path)
        })
        .once()
      service.publishToDb()
    }

For reference, my goal is to have something like this show up somewhere:

    Expected:
    Dataframe:
    [number, word]
    [999, "testData"]
    [898, "wrongData"]
    [999, "extraRow"]
    Actual:
    Dataframe
    [number, word]
    [999, "testData"]
    [999, "testData"]
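
A message in that layout could come from a small formatting helper like the sketch below. This is only an illustration: `DataFrameDiff`, `describe` and `assertRowsEqual` are hypothetical names, not part of ScalaMock or Spark, and the helper works on plain row sequences so that it runs without a Spark session.

```scala
// Hypothetical helper (not from ScalaMock or Spark): renders two row sets
// in an "Expected / Actual" layout and fails with that text on mismatch.
object DataFrameDiff {
  // Quote strings so that "999" and 999 can be told apart in the output.
  private def render(value: Any): String = value match {
    case s: String => "\"" + s + "\""
    case other     => s"$other"
  }

  // One titled block: title, "Dataframe:", the column list, then one line per row.
  private def block(title: String, columns: Seq[String], rows: Seq[Seq[Any]]): Seq[String] =
    Seq(title, "Dataframe:", columns.mkString("[", ", ", "]")) ++
      rows.map(_.map(render).mkString("[", ", ", "]"))

  // Full message containing both row sets.
  def describe(columns: Seq[String], expected: Seq[Seq[Any]], actual: Seq[Seq[Any]]): String =
    (block("Expected:", columns, expected) ++ block("Actual:", columns, actual)).mkString("\n")

  // Throws an AssertionError carrying the full message when the rows differ.
  // The comparison is order-sensitive.
  def assertRowsEqual(columns: Seq[String], expected: Seq[Seq[Any]], actual: Seq[Seq[Any]]): Unit =
    if (expected != actual) throw new AssertionError(describe(columns, expected, actual))
}
```

With real DataFrames, the inputs would presumably come from `df.columns.toSeq` and `df.collect().map(_.toSeq).toSeq`. Because the comparison is order-sensitive and Spark does not guarantee row order, both sides would need to be sorted first.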
sg3maiej  #1

So this is still not ideal, but using "expects(...).onCall" gives the desired output:

    scenario("myFunction reads parquet and writes to db") {
      val mockUtil: UtilitiesService = mock[UtilitiesService]
      val service = new myService(mockUtil)
      val expectedParquetDf = Seq(
        (999, "testData"),
        (898, "wrongData"),
        (999, "extraRow")
      ).toDF("number", "word")
      val incorrectExample = Seq(
        (999, "testData"),
        (999, "testData")
      ).toDF("number", "word")
      // Set up expectations: accept any DataFrame, then assert on it inside onCall
      // so that a mismatch is reported by AssertHelpers with row-level detail
      (mockUtil.insertDataFrameIntoPostgres _).expects(*, "ExpectedTable").onCall { (df: DataFrame, path: String) =>
        AssertHelpers.assertDataEqual(expectedParquetDf, df)
        Right(sxDbData) // sxDbData is not defined in this snippet
      }
      // A mock (unlike a stub) is configured with expects/returning rather than when
      (mockUtil.getDataFrameFromParquet _).expects("myParquetPath").returning(Right(incorrectExample)) // forced to incorrect for now
      service.publishToDb()
    }

Hopefully someone has a cleaner solution.
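
One possible alternative is to record the arguments and assert on them after the call, so that the rich assertion runs in the test body rather than inside the mock. The sketch below uses a hand-written recording fake instead of ScalaMock. `UtilitiesService[D]`, `RecordingUtilities` and `MyService` are simplified, hypothetical stand-ins for the classes in the question, made generic in the data type `D` so that the example runs without Spark.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical, simplified stand-in for the UtilitiesService in the question.
trait UtilitiesService[D] {
  def getDataFrameFromParquet(path: String): Either[String, D]
  def insertDataFrameIntoDb(data: D, table: String): Either[String, Unit]
}

// Hand-written fake: returns canned data and records every insert call.
class RecordingUtilities[D](canned: D) extends UtilitiesService[D] {
  val inserts: ListBuffer[(D, String)] = ListBuffer.empty

  def getDataFrameFromParquet(path: String): Either[String, D] = Right(canned)

  def insertDataFrameIntoDb(data: D, table: String): Either[String, Unit] = {
    inserts += ((data, table)) // remember what the service tried to write
    Right(())
  }
}

// Hypothetical service under test: reads from parquet, then writes to the db.
class MyService[D](util: UtilitiesService[D]) {
  def publishToDb(): Either[String, Unit] =
    util.getDataFrameFromParquet("myParquetPath")
      .flatMap(data => util.insertDataFrameIntoDb(data, "myTable"))
}
```

After `service.publishToDb()`, the test can take the recorded pair from `fake.inserts` and pass its first element to whatever verbose comparison is available (for example `AssertHelpers.assertDataEqual`), so a failure surfaces as an ordinary assertion error with row-level detail. ScalaMock's user guide also describes argument capture, which may serve the same purpose without a hand-written fake.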
