java.lang.NoSuchMethodError: org.apache.hadoop.security.ProviderUtils.excludeIncompatibleCredentialProviders while reading from Azure Blob Storage

Posted by 5vf7fwbs on 2021-05-27 in Spark


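I am trying to read a CSV file stored in an Azure storage account. To do this, I installed Spark on a virtual machine and tried to read the CSV file into a DataFrame from PySpark.
Following steps I had read elsewhere, I copied the latest hadoop-azure and azure-storage jar files into my /jar directory. That produced this error:
java.lang.NoClassDefFoundError: org/apache/hadoop/fs/StreamCapabilities
I searched for this error and found that I should reference hadoop-azure-2.8.5.jar instead of the latest hadoop-azure jar. So I replaced the latest hadoop-azure jar with hadoop-azure-2.8.5.jar and ran the PySpark code again.
After running the code, I got a different error:
java.lang.NoSuchMethodError: org.apache.hadoop.security.ProviderUtils.excludeIncompatibleCredentialProviders(Lorg/apache/hadoop/conf/Configuration;Ljava/lang/Class;)Lorg/apache/hadoop/conf/Configuration;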
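Both errors are the kind that usually show up when the hadoop-azure jar and the hadoop-common jar on the classpath come from different Hadoop releases, so one quick check is to list the versions of every hadoop-* jar that Spark loads. The following is a minimal sketch of that check, assuming the jars follow the usual `<artifact>-<version>.jar` naming; the function names `hadoop_jar_versions` and `mismatched_jars` are hypothetical helpers written for this illustration, not part of Spark or Hadoop.

```python
import re
from pathlib import Path

# Matches file names such as "hadoop-common-2.7.4.jar" or "hadoop-azure-2.8.5.jar".
# Assumption: the artifact name has no digits and the version starts with a digit.
HADOOP_JAR = re.compile(r"^(hadoop-[a-z-]+)-(\d[\w.-]*)\.jar$")


def hadoop_jar_versions(jars_dir):
    """Map each hadoop-* artifact found in jars_dir to its version string."""
    versions = {}
    for jar in sorted(Path(jars_dir).glob("hadoop-*.jar")):
        match = HADOOP_JAR.match(jar.name)
        if match:
            versions[match.group(1)] = match.group(2)
    return versions


def mismatched_jars(versions, reference="hadoop-common"):
    """Return the artifacts whose version differs from the reference artifact."""
    expected = versions.get(reference)
    if expected is None:
        # Without the reference jar there is nothing to compare against.
        return {}
    return {name: ver for name, ver in versions.items() if ver != expected}


# Example use (the directory is whatever jars directory your Spark install uses):
#   versions = hadoop_jar_versions("/path/to/spark/jars")
#   print(mismatched_jars(versions))
```

If hadoop-azure appears in the result, a hadoop-azure jar with the same version as hadoop-common is probably the one to try.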
Here is my PySpark code:

from pyspark import SparkContext
from pyspark.sql import SparkSession
from pyspark.sql import Window
from pyspark.sql.types import *
from pyspark.sql.functions import *

# Reuse the active session or create a new one.
spark = SparkSession.builder.getOrCreate()
storage_account_name = "<storage_account_name>"
storage_account_access_key = "<storage_account_access_key>"
spark.conf.set("fs.azure.account.key." + storage_account_name + ".blob.core.windows.net", storage_account_access_key)

# Register the WASB file system and pass the same account key to the Hadoop configuration.
spark._jsc.hadoopConfiguration().set("fs.wasbs.impl", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
spark._jsc.hadoopConfiguration().set("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
spark._jsc.hadoopConfiguration().set("fs.azure.account.key." + storage_account_name + ".blob.core.windows.net", storage_account_access_key)

# Read the CSV file from the container and infer the column types.
df = spark.read.format("csv").option("inferSchema", "true").load("wasbs://<container_name>@<storage_account_name>.blob.core.windows.net/<path_to_csv>/sample_file.csv")
df.show()
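The account key property name and the wasbs:// URL above are both assembled from the same account name, so a typo in one of them is easy to miss. As a small sketch, they could be built by two helpers; `account_key_property` and `wasbs_url` are hypothetical names introduced for this illustration, and the account and container values in the comments are placeholders.

```python
def account_key_property(storage_account_name):
    """Build the Hadoop property name that holds the key for one storage account."""
    return "fs.azure.account.key." + storage_account_name + ".blob.core.windows.net"


def wasbs_url(container_name, storage_account_name, path):
    """Build a wasbs:// URL for a blob path inside a container."""
    return "wasbs://{}@{}.blob.core.windows.net/{}".format(
        container_name, storage_account_name, path.lstrip("/")
    )


# Example use with the session from the question (values are placeholders):
#   spark._jsc.hadoopConfiguration().set(
#       account_key_property("myaccount"), storage_account_access_key)
#   df = spark.read.format("csv").load(
#       wasbs_url("mycontainer", "myaccount", "data/sample_file.csv"))
```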

apache-spark pyspark Azure azure-storage-account

Source: https://stackoverflow.com/questions/63598186/java-lang-nosuchmethoderror-org-apache-hadoop-security-providerutils-excludeinc