I'm trying to read a file from HDFS into Spark and do some data processing on it. I'm trying to do this from a pod in Kubernetes.
This is the code I run:
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('myAppName').getOrCreate()
spark.conf.set('spark.yarn.keytab', '/data/conf/my.keytab')
spark.conf.set('spark.yarn.principal', 'datascience@Snir.LOCAL')
cat_df = spark.read.load("hdfs://data-main1.snir.network/process/extra/folder")
And I get the error: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
Py4JJavaError: An error occurred while calling o35.load.
: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException:
SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1667)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1582)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594)
at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1700)
... 23 more
I also tried setting
spark.conf.set("hadoop.security.authentication", "kerberos")
but with no luck.
Has anyone dealt with this?
1 Answer
These configs were renamed:
* Spark 2.x:
spark.yarn.keytab
and spark.yarn.principal
* Spark 3.x:
spark.kerberos.keytab
and spark.kerberos.principal
Additionally, they need to be set before the Spark session is initialized: calling spark.conf.set() after getOrCreate() is too late, because the driver has already tried to authenticate by then.
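A minimal sketch of the Spark 3.x form, reusing the keytab path, principal, and HDFS URL from the question (adjust them to your environment):

```python
from pyspark.sql import SparkSession

# Kerberos settings under their Spark 3.x names. These must be passed to the
# builder BEFORE getOrCreate(); spark.conf.set() on an existing session has
# no effect on authentication.
kerberos_conf = {
    "spark.kerberos.keytab": "/data/conf/my.keytab",       # was spark.yarn.keytab in 2.x
    "spark.kerberos.principal": "datascience@Snir.LOCAL",  # was spark.yarn.principal in 2.x
}

builder = SparkSession.builder.appName("myAppName")
for key, value in kerberos_conf.items():
    builder = builder.config(key, value)

spark = builder.getOrCreate()
cat_df = spark.read.load("hdfs://data-main1.snir.network/process/extra/folder")
```

The same keys can instead be passed on the command line with `spark-submit --conf spark.kerberos.keytab=... --conf spark.kerberos.principal=...`, which avoids hardcoding credentials in the script.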