Setup
I am using elasticsearch-spark to move data from Elasticsearch 2.4 to Elasticsearch 2.4 (don't ask me why I'm still on Elasticsearch 2.4 :p).
Below is the relevant snippet from my build.gradle. I am using:
sparkVersion=2.4.4 && scalaVersion=2.11
implementation "org.elasticsearch:elasticsearch-spark-20_${scalaVersion}:7.6.2"
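I created the mapping before starting the Spark job, so that the source and the destination have exactly the same mapping.
Problem
I have a mapping like the following: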
"last_found_results": {
"type": "date",
"format": "dateOptionalTime"
}
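Sample data in this field: 2017-11-18T22:41:06Z
When Spark tries to write, it seems to automatically convert the date to an epoch timestamp, and the write fails with the following error.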
Error sample (first [5] error messages):
org.elasticsearch.hadoop.rest.EsHadoopRemoteException: mapper_parsing_exception: failed to parse [last_found_results];org.elasticsearch.hadoop.rest.EsHadoopRemoteException: illegal_argument_exception: Invalid format: "1566573460000" is malformed at "0000"
Bailing out...
at org.elasticsearch.hadoop.rest.bulk.BulkProcessor.flush(BulkProcessor.java:519)
at org.elasticsearch.hadoop.rest.bulk.BulkProcessor.add(BulkProcessor.java:113)
at org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:192)
at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:172)
at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:74)
at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:101)
at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:101)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
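To make the mismatch concrete, here is a minimal sketch in plain Java (java.time only, not the connector's own code) showing what the rejected value "1566573460000" represents and what the same instant looks like in the ISO-8601 form that the sample data in this field already uses. The class name `EpochToIso` and its two methods are hypothetical names made up for this illustration.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not part of elasticsearch-spark.
class EpochToIso {
    // Format epoch milliseconds as an ISO-8601 UTC string, e.g. 2017-11-18T22:41:06Z.
    static String toIso(long epochMillis) {
        return DateTimeFormatter.ISO_INSTANT.format(Instant.ofEpochMilli(epochMillis));
    }

    // Parse an ISO-8601 UTC string back into epoch milliseconds.
    static long toEpochMillis(String iso) {
        return Instant.parse(iso).toEpochMilli();
    }

    public static void main(String[] args) {
        // The value from the error message, interpreted as epoch milliseconds.
        long rejected = 1566573460000L;
        System.out.println(toIso(rejected)); // prints 2019-08-23T15:17:40Z

        // Round trip of the sample value from the field.
        String sample = "2017-11-18T22:41:06Z";
        System.out.println(toIso(toEpochMillis(sample))); // prints 2017-11-18T22:41:06Z
    }
}
```

So the value being sent looks like a valid instant that is simply serialized as a number instead of a date string. In the Spark job the equivalent would presumably be to format the timestamp column as a string before the write, or to read the source with the connector setting `es.mapping.date.rich` set to `false` so that dates stay as strings. Neither has been tested against this setup.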
Any help with solving this problem would be greatly appreciated.