Spark: Could not load native gpl library

zbwhf8kr, posted 2021-06-04 in Hadoop

I get the following error when trying to run a very simple Spark job (logistic regression with SGD from MLlib):

  ERROR GPLNativeCodeLoader: Could not load native gpl library
  java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
      at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1738)
      at java.lang.Runtime.loadLibrary0(Runtime.java:823)
      at java.lang.System.loadLibrary(System.java:1028)
      at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
      at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
      at java.lang.Class.forName0(Native Method)
      at java.lang.Class.forName(Class.java:247)
      at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1659)
      at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1624)
      at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:128)
      at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:175)
      at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
      at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
      at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
      at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:155)
      at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:187)
      at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:181)
      at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:93)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
      at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
      at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
      at org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
      at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
      at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
      at org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
      at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
      at org.apache.spark.scheduler.Task.run(Task.scala:51)
      at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      at java.lang.Thread.run(Thread.java:662)
  14/08/06 20:32:11 ERROR LzoCodec: Cannot load native-lzo without native-hadoop

Here is the command I used to submit the job:

  ~/spark/spark-1.0.0-bin-hadoop2/bin/spark-submit \
    --class com.jk.sparktest.Test \
    --master yarn-cluster \
    --num-executors 40 \
    ~/sparktest-0.0.1-SNAPSHOT-jar-with-dependencies.jar

The actual java command it runs is:

  /usr/java/latest/bin/java -cp /apache/hadoop/share/hadoop/common/hadoop-common-2.2.0.2.0.6.0-61.jar:/apache/hadoop/lib/hadoop-lzo-0.6.0.jar::/home/jilei/spark/spark-1.0.0-bin-hadoop2/conf:/home/jilei/spark/spark-1.0.0-bin-hadoop2/lib/spark-assembly-1.0.0-hadoop2.2.0.jar:/home/jilei/spark/spark-1.0.0-bin-hadoop2/lib/datanucleus-rdbms-3.2.1.jar:/home/jilei/spark/spark-1.0.0-bin-hadoop2/lib/datanucleus-core-3.2.2.jar:/home/jilei/spark/spark-1.0.0-bin-hadoop2/lib/datanucleus-api-jdo-3.2.1.jar:/apache/hadoop/conf:/apache/hadoop/conf \
    -XX:MaxPermSize=128m \
    -Djava.library.path= \
    -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit \
    --class com.jk.sparktest.Test \
    --master yarn-cluster \
    --num-executors 40 \
    ~/sparktest-0.0.1-SNAPSHOT-jar-with-dependencies.jar

It seems that -Djava.library.path is left empty. I also ran the java command above by hand, supplying the native lib directory to java.library.path, but I still got the same error. These are the contents of the native lib directory:

  ls /apache/hadoop/lib/native/
  libgplcompression.a   libgplcompression.so       libgplcompression.so.0.0.0  libhadooppipes.a
  libhadoop.so.1.0.0    libhdfs.a                  libhdfs.so.0.0.0            libsnappy.so.1
  libgplcompression.la  libgplcompression.so.0     libhadoop.a                 libhadoop.so
  libhadooputils.a      libhdfs.so                 libsnappy.so                libsnappy.so.1.1.4
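For context, what I understand I should be doing is passing that directory through spark-submit so it reaches both the driver and the executors, rather than patching the java command by hand. A sketch of what I tried, assuming the standard Spark 1.x options (the native path is the one listed above; whether spark.executor.extraLibraryPath behaves this way on YARN in 1.0.0 is exactly what I am unsure about):

```shell
# Driver side: --driver-library-path prepends the directory to the
# driver JVM's java.library.path.
# Executor side: spark.executor.extraLibraryPath (set in
# conf/spark-defaults.conf) should do the same for executor JVMs.
echo "spark.executor.extraLibraryPath /apache/hadoop/lib/native/" \
  >> ~/spark/spark-1.0.0-bin-hadoop2/conf/spark-defaults.conf

~/spark/spark-1.0.0-bin-hadoop2/bin/spark-submit \
  --class com.jk.sparktest.Test \
  --master yarn-cluster \
  --num-executors 40 \
  --driver-library-path /apache/hadoop/lib/native/ \
  ~/sparktest-0.0.1-SNAPSHOT-jar-with-dependencies.jar
```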

Any idea what is going wrong? Thanks.
