Sqoop job fails via Dataproc

rqcrx0a6 posted 2021-06-03 in Sqoop

This question already has an answer here:

Sqoop on Dataproc fails to export data in Avro format (1 answer)
Closed last year.
I submitted a Sqoop job through a GCP Dataproc cluster with the --as-avrodatafile configuration parameter set, but it fails with the following error:

19/08/12 22:34:34 INFO impl.YarnClientImpl: Submitted application application_1565634426340_0021
19/08/12 22:34:34 INFO mapreduce.Job: The url to track the job: http://sqoop-gcp-ingest-mzp-m:8088/proxy/application_1565634426340_0021/
19/08/12 22:34:34 INFO mapreduce.Job: Running job: job_1565634426340_0021
19/08/12 22:34:40 INFO mapreduce.Job: Job job_1565634426340_0021 running in uber mode : false
19/08/12 22:34:40 INFO mapreduce.Job:  map 0% reduce 0%
19/08/12 22:34:45 INFO mapreduce.Job: Task Id : attempt_1565634426340_0021_m_000000_0, Status : FAILED
Error: org.apache.avro.reflect.ReflectData.addLogicalTypeConversion(Lorg/apache/avro/Conversion;)V
19/08/12 22:34:50 INFO mapreduce.Job: Task Id : attempt_1565634426340_0021_m_000000_1, Status : FAILED
Error: org.apache.avro.reflect.ReflectData.addLogicalTypeConversion(Lorg/apache/avro/Conversion;)V
19/08/12 22:34:55 INFO mapreduce.Job: Task Id : attempt_1565634426340_0021_m_000000_2, Status : FAILED
Error: org.apache.avro.reflect.ReflectData.addLogicalTypeConversion(Lorg/apache/avro/Conversion;)V
19/08/12 22:35:00 INFO mapreduce.Job:  map 100% reduce 0%
19/08/12 22:35:01 INFO mapreduce.Job: Job job_1565634426340_0021 failed with state FAILED due to: Task failed task_1565634426340_0021_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

19/08/12 22:35:01 INFO mapreduce.Job: Counters: 11
    Job Counters 
        Failed map tasks=4
        Launched map tasks=4
        Other local map tasks=4
        Total time spent by all maps in occupied slots (ms)=41976
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=13992
        Total vcore-milliseconds taken by all map tasks=13992
        Total megabyte-milliseconds taken by all map tasks=42983424
    Map-Reduce Framework
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
19/08/12 22:35:01 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
19/08/12 22:35:01 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 30.5317 seconds (0 bytes/sec)
19/08/12 22:35:01 INFO mapreduce.ImportJobBase: Retrieved 0 records.
19/08/12 22:35:01 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@61baa894
19/08/12 22:35:01 ERROR tool.ImportTool: Import failed: Import job failed!
19/08/12 22:35:01 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@10.25.42.52:1521/uataca.aaamidatlantic.com/GCPREADER
Job output is complete

Without the --as-avrodatafile option, the job runs fine.

du7egjpx1#

To fix this, set the mapreduce.job.classloader property to true when submitting the job:

gcloud dataproc jobs submit hadoop --cluster="${CLUSTER_NAME}" \
    --class="org.apache.sqoop.Sqoop" \
    --properties="mapreduce.job.classloader=true" \
    . . .
    -- \
    --as-avrodatafile \
    . . .
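The NoSuchMethodError on ReflectData.addLogicalTypeConversion is a classpath conflict: the older Avro jar on Hadoop's classpath shadows the newer Avro that Sqoop's Avro support needs, and mapreduce.job.classloader=true gives the MapReduce tasks an isolated classloader so the job's own jars take precedence. As a minimal sketch of a full submission (the jar path, connection string, table, and output location below are placeholder assumptions, not values from the question), the same property can also be passed as a generic Hadoop -D option in the Sqoop argument list:

```shell
# Sketch only: bucket, jar, host, table, and target dir are hypothetical placeholders.
gcloud dataproc jobs submit hadoop --cluster="${CLUSTER_NAME}" \
    --class="org.apache.sqoop.Sqoop" \
    --jars="gs://your-bucket/sqoop-1.4.7-hadoop260.jar" \
    -- \
    import \
    -Dmapreduce.job.classloader=true \
    --connect="jdbc:oracle:thin:@db-host:1521/service" \
    --username="GCPREADER" \
    --table="SOME_TABLE" \
    --as-avrodatafile \
    --target-dir="gs://your-bucket/sqoop-output"
```

Note that Sqoop requires generic Hadoop options such as -D to appear immediately after the tool name (import here), before the tool-specific arguments.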
