Hive to Elasticsearch ingestion issue

cu6pst1q  posted on 2022-11-05  in Hive

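I am using Elasticsearch version 6.8.0.
A single malformed JSON record causes the entire Hive job to fail. I tried setting 'es.write.rest.error.handler.es.return.default' to 'PASS' and to 'HANDLED', but neither worked.
Reference: https://www.elastic.co/guide/en/elasticsearch/hadoop/master/errorhandlers.html
Below is the DDL script I run at the Hive prompt for the ingestion: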

  ADD JAR /home/smrafi/elasticsearch-hadoop-6.8.0/dist/elasticsearch-hadoop-6.8.0.jar;
  CREATE EXTERNAL TABLE hive_es_with_handler10 (data STRING)
  STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
  TBLPROPERTIES (
    'es.resource' = 'test_eshadoop/healthCareProvider',
    'es.nodes' = 'xyzpqr',
    'es.input.json' = 'yes',
    'es.index.auto.create' = 'true',
    'es.write.operation' = 'upsert',
    'es.nodes.wan.only' = 'true',
    'es.port' = '443',
    'es.net.ssl' = 'true',
    'es.batch.size.entries' = '1',
    'es.mapping.id' = 'id',
    'es.batch.write.retry.count' = '-1',
    'es.batch.write.retry.wait' = '60s',
    'es.write.data.error.handlers' = 'es',
    'es.write.rest.error.handler.es.client.nodes' = 'vpc-pid-pre-prod-es-cluster-b7thvqfj3tp45arxl34gge3yyi.us-east-2.es.amazonaws.com',
    'es.write.rest.error.handler.es.client.port' = '443',
    'es.write.rest.error.handler.es.client.resource' = 'error_es_index',
    'es.write.rest.error.handler.es.return.default' = 'PASS',
    'es.write.rest.error.handler.es.return.error' = 'PASS'
  );
  INSERT INTO hive_es_with_handler10 SELECT * FROM provider;

Below is the exception stack trace. The job fails with an error saying the error handler index is not present:

  Caused by: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Could not locate write resource for ES error handler.
      at org.elasticsearch.hadoop.util.Assert.hasText(Assert.java:30)
      at org.elasticsearch.hadoop.handler.impl.elasticsearch.ElasticsearchHandler.init(ElasticsearchHandler.java:145)
      at org.elasticsearch.hadoop.serialization.handler.write.impl.DelegatingErrorHandler.init(DelegatingErrorHandler.java:40)
      at org.elasticsearch.hadoop.handler.impl.AbstractHandlerLoader.loadHandlers(AbstractHandlerLoader.java:114)
      at org.elasticsearch.hadoop.serialization.bulk.BulkEntryWriter.<init>(BulkEntryWriter.java:56)
      at org.elasticsearch.hadoop.rest.RestRepository.lazyInitWriting(RestRepository.java:138)
      at org.elasticsearch.hadoop.rest.RestRepository.writeProcessedToIndex(RestRepository.java:185)
      at org.elasticsearch.hadoop.hive.EsHiveOutputFormat$EsHiveRecordWriter.write(EsHiveOutputFormat.java:64)
      at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:762)
      at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
      at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
      at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
      at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
      at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:148)
      at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
      ... 9 more

5fjcxozz  #1

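The following configuration correctly collects all the errors for bad JSON records. There is still an issue on the Hive side, though: Hive does not support malformed JSON records. See the related question "ElasticSearch hive SerializationError handler".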

  ADD JAR /home/smrafi/elasticsearch-hadoop-6.8.0/dist/elasticsearch-hadoop-6.8.0.jar;
  CREATE EXTERNAL TABLE hive_es_with_handler32 (data STRING)
  STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
  TBLPROPERTIES (
    'es.resource' = 'test_eshadoop/healthCareProvider',
    'es.nodes' = 'xyz',
    'es.input.json' = 'yes',
    'es.index.auto.create' = 'true',
    'es.write.operation' = 'upsert',
    'es.nodes.wan.only' = 'true',
    'es.port' = '443',
    'es.net.ssl' = 'true',
    'es.batch.size.entries' = '1',
    'es.mapping.id' = 'id',
    'es.batch.write.retry.count' = '-1',
    'es.batch.write.retry.wait' = '60s',
    'es.write.rest.error.handlers' = 'es, ignoreBadRecords',
    'es.write.data.error.handlers' = 'log, customLog, badJsonHandler',
    'es.write.data.error.handler.customLog' = 'com.xyz.elshandler.CustomLogOnError',
    'es.write.data.error.handler.badJsonHandler' = 'com.xyz.elshandler.BadJsonHandler',
    'es.write.rest.error.handler.es.client.resource' = "error_es_index/error",
    'es.write.rest.error.handler.es.return.default' = 'HANDLED',
    'es.write.rest.error.handler.log.logger.name' = 'BulkErrors',
    'es.write.data.error.handler.log.logger.name' = 'SerializationErrors',
    'es.write.rest.error.handler.ignoreBadRecords' = 'com.xyz.elshandler.IgnoreBadRecordHandler',
    'es.write.rest.error.handler.es.return.error' = 'HANDLED'
  );
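
One difference from the configuration in the question seems worth calling out. There the 'es' handler was registered under 'es.write.data.error.handlers', while its settings were given under the 'es.write.rest.error.handler.es.*' prefix. The stack trace goes through the serialization handler loader, so the handler was most likely looking for its resource under the 'es.write.data.error.handler.es.*' prefix and not finding one. That is a reading of the trace, not something verified against the connector source. A minimal sketch of the matching-prefix idea, where the table name hive_es_handler_sketch is a made-up placeholder and the other values are copied from the configuration above:

```sql
-- Sketch only: hive_es_handler_sketch is a hypothetical table name,
-- and 'xyz' is the same placeholder host used above.
CREATE EXTERNAL TABLE hive_es_handler_sketch (data STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource' = 'test_eshadoop/healthCareProvider',
  'es.nodes' = 'xyz',
  'es.input.json' = 'yes',
  -- The handler list and the per-handler settings use the same prefix:
  --   es.write.rest.error.handlers  <->  es.write.rest.error.handler.es.*
  'es.write.rest.error.handlers' = 'es',
  'es.write.rest.error.handler.es.client.resource' = 'error_es_index/error',
  'es.write.rest.error.handler.es.return.default' = 'HANDLED',
  'es.write.rest.error.handler.es.return.error' = 'HANDLED'
);
```

If the 'es' handler should also receive serialization errors, the same settings would presumably need to be repeated under the 'es.write.data.error.handler.es.*' prefix. I have not tested that variant.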