HiveContext causes a NullPointerException on Spark

bpzcxfmw · posted 2021-06-28 in Hive

I am trying to run a foreach loop over a DataFrame in a Spark job. I submit the job with the following command:

    spark-submit --class Hive_Cis.DataAnalyze --master local --deploy-mode client --executor-memory 1g --name DataAnalyze --conf "spark.app.id=DataAnalyze" Hive_CIS-1.0-SNAPSHOT-jar-with-dependencies.jar

The driver class for the Spark job is as follows:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.hive.HiveContext;

    public class DataAnalyze {

        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("DataAnalyze").setMaster("local");
            JavaSparkContext sc = new JavaSparkContext(conf);
            HiveContext hiveContext = new HiveContext(sc);
            DataFrame dataSrcsTable = hiveContext.sql("SELECT * FROM default.data_tables_metrics");
            dataSrcsTable.show();
            dataSrcsTable.foreach(new DataTableReader(hiveContext));
        }//END OF MAIN
    }//END OF CLASS

The class that extends AbstractFunction1 is:

    import java.io.Serializable;
    import java.sql.Timestamp;

    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.hive.HiveContext;

    import scala.runtime.AbstractFunction1;
    import scala.runtime.BoxedUnit;

    public class DataTableReader extends AbstractFunction1 implements Serializable {

        private HiveContext hiveConnection;
        private static final long serialVersionUID = 1919222653470174456L;

        public DataTableReader(HiveContext hiveData) {
            this.hiveConnection = hiveData;
        }

        @Override
        public BoxedUnit apply(Object o) {
            Row row = (Row) o;
            DataFrame examinetable;
            String tableName;
            String query;
            Timestamp lastCurrentDate;
            long count;

            if (!row.isNullAt(0)) {
                tableName = row.getString(0);
                count = row.getLong(1);
                lastCurrentDate = row.getTimestamp(2);
                System.out.println("\n" + tableName + "\n");

                query = new String("SELECT * from default.");
                query = query + tableName;
                System.out.println("\n" + query + "\n");

                //TODO this creates a null pointer issue
                //Not serializing correctly maybe ?
                examinetable = hiveConnection.sql(query);
                try {
                    if (examinetable == null) {
                        System.out.println("\n\n Error input is null \n\n");
                    }
                    if (count < examinetable.count()) {
                        System.out.println("Count is low");
                    }
                } catch (NullPointerException e) {
                    System.out.println("\n Null pointer at get table \n");
                }
            }
            return BoxedUnit.UNIT;
        }
    }//END OF CLASS
At the TODO comment I get a NullPointerException with the following stack trace:

    16/09/01 07:35:38 ERROR executor.Executor: Exception in task 0.0 in stage 1.0 (TID 1)
    java.lang.NullPointerException
        at org.apache.spark.sql.SQLConf.setConf(SQLConf.scala:588)
        at org.apache.spark.sql.SQLContext.setConf(SQLContext.scala:128)
        at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:555)
        at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:553)
        at org.apache.spark.sql.hive.HiveContext.parseSql(HiveContext.scala:333)
        at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
        at Hive_Cis.DataTableReader.apply(DataTableReader.java:53)
        at Hive_Cis.DataTableReader.apply(DataTableReader.java:16)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1869)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1869)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
    16/09/01 07:35:38 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, localhost): java.lang.NullPointerException
        at org.apache.spark.sql.SQLConf.setConf(SQLConf.scala:588)
        at org.apache.spark.sql.SQLContext.setConf(SQLContext.scala:128)
        at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:555)
        at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:553)
        at org.apache.spark.sql.hive.HiveContext.parseSql(HiveContext.scala:333)
        at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
        at Hive_Cis.DataTableReader.apply(DataTableReader.java:53)
        at Hive_Cis.DataTableReader.apply(DataTableReader.java:16)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
        at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1869)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1869)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

However, I can see that the job connects to the Hive metastore:

    16/09/01 07:35:37 INFO hive.HiveContext: Initializing execution hive, version 1.1.0
    16/09/01 07:35:37 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0-cdh5.7.2
    16/09/01 07:35:37 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0-cdh5.7.2
    16/09/01 07:35:38 INFO hive.metastore: Trying to connect to metastore with URI thrift://*************:9083
    16/09/01 07:35:38 INFO hive.metastore: Opened a connection to metastore, current connections: 2
    16/09/01 07:35:38 INFO hive.metastore: Connected to metastore.

Any help would be greatly appreciated.
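One restructuring I am considering (a minimal sketch only, assuming the metrics table is small enough to collect to the driver and that lastCurrentDate is not needed for this check) is to drop the foreach entirely and issue the per-table queries from the driver, since the HiveContext does not seem usable inside executor tasks:

    // Sketch: would replace the dataSrcsTable.foreach(...) call in main(),
    // reusing the driver-side hiveContext and dataSrcsTable created above.
    Row[] metricRows = dataSrcsTable.collect();   // pull the small metadata table to the driver
    for (Row row : metricRows) {
        if (row.isNullAt(0)) {
            continue;
        }
        String tableName = row.getString(0);
        long count = row.getLong(1);
        // The per-table query now runs through the driver-side HiveContext,
        // not through a copy shipped to an executor.
        DataFrame examineTable = hiveContext.sql("SELECT * FROM default." + tableName);
        if (count < examineTable.count()) {
            System.out.println("Count is low for " + tableName);
        }
    }

This keeps all SQL parsing on the driver, the same way the initial data_tables_metrics query works, but I am not sure whether it is the right structure or whether the HiveContext can somehow be made to work inside the foreach.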
