Dataproc Spark job fails with IOException: unable to initialize SparkContext when the history server tries to access a GCS bucket

mw3dktmi · asked 2021-05-17 · in Spark

We created a GCP Dataproc cluster with image 1.5-debian10, and cluster creation succeeded.
The Spark version is 2.4.7, running on a Dataproc cluster with the image above.
Dataproc cluster software configuration (cluster properties):

```yaml
softwareConfig:
  imageVersion: 1.5.23-debian10
  properties:
    capacity-scheduler:yarn.scheduler.capacity.maximum-am-resource-percent: '0.9'
    capacity-scheduler:yarn.scheduler.capacity.resource-calculator: org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
    capacity-scheduler:yarn.scheduler.capacity.root.default.ordering-policy: fair
    core:fs.gs.block.size: '134217728'
    core:fs.gs.metadata.cache.enable: 'false'
    core:hadoop.ssl.enabled.protocols: TLSv1,TLSv1.1,TLSv1.2
    dataproc:dataproc.conscrypt.provider.enable: 'false'
    dataproc:job.history.to-gcs.enabled: 'true'
    mapred:mapreduce.jobhistory.done-dir: gs://<bucketName>/mapreduce/mapreduce-job-history/done
    mapred:mapreduce.jobhistory.intermediate-done-dir: gs://<bucketName>/mapreduce-job-history/intermediate-don
    spark:spark.eventLog.dir: gs://<bucketname>/spark-job-history/events
    spark:spark.executor.cores: '8'
    spark:spark.executor.instances: '2'
    spark:spark.executor.memory: 8379m
    spark:spark.executorEnv.OPENBLAS_NUM_THREADS: '1'
    spark:spark.history.fs.logDirectory: gs://<bucketname>/spark-job-history
    spark:spark.scheduler.mode: FAIR
    spark:spark.sql.cbo.enabled: 'true'
```
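For reference, a cluster with properties like the ones above would be created with a `gcloud` command roughly like the following sketch. The cluster name, region, and service options are illustrative placeholders, not taken from the original report; only a few of the properties are shown.

```shell
# Hypothetical reconstruction of the cluster-create command.
# Cluster name and region are placeholders; <bucketname> is left as in the question.
gcloud dataproc clusters create my-cluster \
  --region=us-central1 \
  --image-version=1.5.23-debian10 \
  --properties="\
dataproc:dataproc.conscrypt.provider.enable=false,\
dataproc:job.history.to-gcs.enabled=true,\
spark:spark.eventLog.dir=gs://<bucketname>/spark-job-history/events,\
spark:spark.history.fs.logDirectory=gs://<bucketname>/spark-job-history"
```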

As a workaround, setting `spark.eventLog.enabled` to `false` in the spark-submit command lets the job run normally. But then we will not be able to use the Spark history server later to analyze the logs.
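The workaround above looks roughly like this in a spark-submit invocation; the main class and jar path are illustrative placeholders:

```shell
# Workaround sketch: disable the event log so SparkContext does not touch
# the GCS event-log directory at startup. Class and jar are placeholders.
spark-submit \
  --conf spark.eventLog.enabled=false \
  --class com.example.MyJob \
  gs://<bucketname>/jars/my-job.jar
```

Note that with the event log disabled, the job's run will not appear in the Spark history server, which is exactly the trade-off described above.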
Error:

```
java.io.IOException: Error accessing gs://<bucketname>/spark-job-history/events
	at com.google.cloud.hadoop.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getObject(GoogleCloudStorageImpl.java:1959)
	at com.google.cloud.hadoop.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.getFileInfo(GoogleCloudStorageFileSystem.java:1083)
	at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.getFileStatus(GoogleHadoopFileSystemBase.java:1079)
	at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:97)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:523)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520)
	at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$1(SparkSession.scala:930)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	... 25 more
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
	at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:456)
	at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:323)
	at sun.security.validator.Validator.validate(Validator.java:271)
	at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:638)
	... 45 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
	at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
	at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
	at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280)
	at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:451)
```

No answers yet.
