Spark reading from HDFS: SIMPLE authentication is not enabled

dgiusagp  posted on 2022-12-09  in HDFS

I am trying to read a file from HDFS into Spark and do some data processing on it. I am running this from a pod in Kubernetes.
This is the code I run:

spark = SparkSession.builder.appName('myAppName').getOrCreate()
spark.conf.set('spark.yarn.keytab', '/data/conf/my.keytab')
spark.conf.set('spark.yarn.principal', 'datascience@Snir.LOCAL')
cat_df = spark.read.load("hdfs://data-main1.snir.network/process/extra/folder")

I get the error: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]

Py4JJavaError: An error occurred while calling o35.load.
: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException:
 SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1667)
    at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1582)
    at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594)
    at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1700)
    at o
... 23 more

I have also tried

set("hadoop.security.authentication", "kerberos")
but with no luck.
Has anyone dealt with this?

Answer #1 (imzjd6km)

These configuration keys have been renamed:

* Spark 2.x: spark.yarn.keytab, spark.yarn.principal
* Spark 3.x: spark.kerberos.keytab, spark.kerberos.principal

In addition, they need to be set before the Spark session is initialized, i.e.

spark = (SparkSession
  .builder
  .appName('myAppName')
  .config('spark.yarn.keytab', '/data/conf/my.keytab')
  .config('spark.yarn.principal', 'datascience@Snir.LOCAL')
  .getOrCreate()
)
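
For a setup that may run on either Spark 2.x or 3.x, the two points above can be combined in one place. The following is only a minimal sketch: the helper names `kerberos_conf` and `build_session` are hypothetical (not part of PySpark), and it assumes the major version reported by `pyspark.__version__` matches the Spark runtime in the pod.

```python
def kerberos_conf(spark_version, keytab, principal):
    """Return the keytab/principal settings under the key names
    used by the given Spark major version (hypothetical helper)."""
    major = int(spark_version.split(".")[0])
    # Spark 3.x uses spark.kerberos.*, Spark 2.x uses spark.yarn.*
    prefix = "spark.kerberos" if major >= 3 else "spark.yarn"
    return {
        f"{prefix}.keytab": keytab,
        f"{prefix}.principal": principal,
    }


def build_session(app_name, keytab, principal):
    """Create a SparkSession with the Kerberos settings applied
    on the builder (hypothetical helper)."""
    # Imported lazily so kerberos_conf() works without PySpark installed.
    import pyspark
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    # Set the values on the builder, before getOrCreate() starts the session.
    conf = kerberos_conf(pyspark.__version__, keytab, principal)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

With the values from the question, this would be called as `spark = build_session('myAppName', '/data/conf/my.keytab', 'datascience@Snir.LOCAL')`, followed by the same `spark.read.load(...)` call as before.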
