I have a Hive table in ORC format, where:
The table is partitioned by: Year + Month + Day
Total file size in HDFS: 10 TB
Total records: 21 billion
Number of data nodes in the HDP cluster: 8
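For readers who want a concrete picture of the layout described above, a table like this could be declared roughly as follows. This is only a sketch: the `timestamp` (bigint) and `url` (string) types come from the schema printed in the log, while the type of `id` and the names and types of the partition columns (`year`, `month`, `day`) are assumptions, since the actual DDL is not shown here.

```sql
-- Hypothetical DDL; partition column names/types and the type of `id` are assumed.
CREATE TABLE events (
  id          STRING,   -- assumed type
  `timestamp` BIGINT,   -- type taken from the schema in the log output
  url         STRING    -- type taken from the schema in the log output
)
PARTITIONED BY (`year` INT, `month` INT, `day` INT)  -- assumed names and types
STORED AS ORC;
```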
I run the following query through Beeline:
SELECT `timestamp`, url FROM events WHERE id='0ef3c9ba6cb5' ORDER BY `timestamp` DESC;
However, this query fails with:
INFO : Compiling command(queryId=hive_20200605073915_22eb45aa-25f6-419a-9b55-57a0d98e3dac): select `timestamp`, url from events where partyid='0:3pu60uagp0:db698229-272e-4a1c-a18a-0ef3c9ba6cb5' order by `timestamp` desc
INFO : Warning: Map Join MAPJOIN[16][bigTable=?] in task 'Map 1' is a cross product
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:timestamp, type:bigint, comment:null), FieldSchema(name:url, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20200605073915_22eb45aa-25f6-419a-9b55-57a0d98e3dac); Time taken: 0.724 seconds
INFO : Executing command(queryId=hive_20200605073915_22eb45aa-25f6-419a-9b55-57a0d98e3dac): select `timestamp`, url from events where partyid='0:3pu60uagp0:db698229-272e-4a1c-a18a-0ef3c9ba6cb5' order by `timestamp` desc
INFO : Query ID = hive_20200605073915_22eb45aa-25f6-419a-9b55-57a0d98e3dac
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: hive_20200605073915_22eb45aa-25f6-419a-9b55-57a0d98e3dac
INFO : Tez session hasn't been created yet. Opening session
INFO : Dag name: select `timestamp`, url fr...desc (Stage-1)
INFO : Setting tez.task.scale.memory.reserve-fraction to 0.30000001192092896
INFO : Status: Running (Executing on YARN cluster with App id application_1586459578755_0105)
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
VERTICES: 00/00 [>>--------------------------] 0% ELAPSED TIME: 9428.98 s
----------------------------------------------------------------------------------------------
ERROR : Status: Failed------------------------] 0% ELAPSED TIME: 9426.66 s
ERROR : Application application_1586459578755_0105 failed 2 times due to ApplicationMaster for attempt appattempt_1586459578755_0105_000002 timed out. Failing the application.
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Application application_1586459578755_0105 failed 2 times due to ApplicationMaster for attempt appattempt_1586459578755_0105_000002 timed out. Failing the application.
INFO : Completed executing command(queryId=hive_20200605073915_22eb45aa-25f6-419a-9b55-57a0d98e3dac); Time taken: 9433.73 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Application application_1586459578755_0105 failed 2 times due to ApplicationMaster for attempt appattempt_1586459578755_0105_000002 timed out. Failing the application. (state=08S01,code=2)
I'm using hdp-3.0.1.0 on Ambari. This is my current configuration:
Tez Container Size: 3072 MB
HiveServer2 Heap Size: 4096 MB
Memory: 819.2 MB
Data per Reducer: 2042.9 MB
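To confirm what the session actually uses, the values can be printed from Beeline. The sketch below assumes that Ambari's "Tez Container Size" corresponds to `hive.tez.container.size` and "Data per Reducer" to `hive.exec.reducers.bytes.per.reducer`; the override value is purely illustrative, and a cluster may restrict which properties can be changed at runtime.

```sql
-- Print the effective values for the current Beeline session.
SET hive.tez.container.size;
SET hive.exec.reducers.bytes.per.reducer;

-- Illustrative per-session override; 4096 is an arbitrary example, not a recommendation.
SET hive.tez.container.size=4096;
```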
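Is this error caused by a resource configuration problem? Are there any suggestions on which settings to change to get the best performance?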