I'm new to Hive and I'm having a lot of trouble loading a large (1 TB) HDFS file into a partitioned, Hive-managed table. Can you help me work through this? I suspect I have a bad configuration somewhere, because I cannot get the reducer stage to complete.
My query is:
DROP TABLE IF EXISTS ts_managed;
SET hive.enforce.sorting = true;
CREATE TABLE IF NOT EXISTS ts_managed (
svcpt_id VARCHAR(20),
usage_value FLOAT,
read_time SMALLINT)
PARTITIONED BY (read_date INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS ORC
TBLPROPERTIES("orc.compress"="snappy","orc.create.index"="true","orc.bloom.filter.columns"="svcpt_id");
SET hive.vectorized.execution.enabled = true;
SET hive.vectorized.execution.reduce.enabled = true;
SET hive.cbo.enable=true;
SET hive.tez.auto.reducer.parallelism=true;
SET hive.exec.reducers.max=20000;
SET yarn.nodemanager.pmem-check-enabled = true;
SET hive.optimize.sort.dynamic.partition=true;
SET hive.exec.max.dynamic.partitions=10000;
INSERT OVERWRITE TABLE ts_managed
PARTITION (read_date)
SELECT svcpt_id, usage, read_time, read_date
FROM ts_raw
DISTRIBUTE BY svcpt_id
SORT BY svcpt_id;
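(Editorial aside: a dynamic-partition insert like the one above also needs dynamic partitioning enabled and set to nonstrict mode; since the insert gets as far as launching reducers, these were presumably already set in the session. The per-node limit below is an illustrative value, not one taken from the question.)

```sql
-- Required for INSERT ... PARTITION (read_date) with no static partition value:
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
-- Illustrative per-node cap; pairs with hive.exec.max.dynamic.partitions above:
SET hive.exec.max.dynamic.partitions.pernode = 2000;
```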
My cluster specs are:
VM cluster
4 nodes total
4 data nodes
32 cores
140 GB RAM
Hortonworks HDP 3.0
Apache Tez as the default Hive execution engine
I am the only user of the cluster
My YARN configuration is:
yarn.nodemanager.resource.memory-mb = 32GB
yarn.scheduler.minimum-allocation-mb = 512MB
yarn.scheduler.maximum-allocation-mb = 8192MB
yarn-heapsize = 1024MB
My Hive configuration is:
hive.tez.container.size = 682MB
hive.heapsize = 4096MB
hive.metastore.heapsize = 1024MB
hive.exec.reducer.bytes.per.reducer = 1GB
hive.auto.convert.join.noconditionaltask.size = 2184.5MB
hive.tez.auto.reducer.parallelism = True
hive.tez.dynamic.partition.pruning = True
My Tez configuration is:
tez.am.resource.memory.mb = 5120MB
tez.grouping.max-size = 1073741824 Bytes
tez.grouping.min-size = 16777216 Bytes
tez.grouping.split-waves = 1.7
tez.runtime.compress = True
tez.runtime.compress.codec = org.apache.hadoop.io.compress.SnappyCodec
I have tried countless configurations, including:
Partitioning on date
Partitioning on date, clustered and bucketed by svcpt_id
Partitioning on date with a bloom filter on svcpt_id, sorted by svcpt_id
Partitioning on date with a bloom filter on svcpt_id, distributed and sorted by svcpt_id
I can get my map vertex to run, but I have never gotten my first reducer vertex to complete. Here is the most recent example from the query above:
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 1043 1043 0 0 0 0
Reducer 2 container RUNNING 9636 0 0 9636 1 0
Reducer 3 container INITED 9636 0 0 9636 0 0
----------------------------------------------------------------------------------------------
VERTICES: 01/03 [=>>-------------------------] 4% ELAPSED TIME: 6804.08 s
----------------------------------------------------------------------------------------------
The error was:
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1537061583429_0010_2_01, diagnostics=[Task failed, taskId=task_1537061583429_0010_2_01_000070, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: unable to create new native thread
I either get this OOM error, which I cannot seem to get past, or the datanodes go offline and my replication factor requirement can no longer be satisfied.
At this point I have been troubleshooting for over two weeks. Contact details for any professional consultant I could pay to solve this problem would also be much appreciated.
Thanks in advance!
1 Answer
I ended up solving this after speaking with a Hortonworks tech. It turned out I was over-partitioning my table. Instead of partitioning by day across roughly 4 years of data, I partitioned by month, and it worked great.
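As a sketch of that fix (table and column names follow the question; treating read_date as a YYYYMMDD integer and deriving a read_month partition key are assumptions), partitioning by month instead of by day cuts about 1,460 partitions down to about 48 over 4 years:

```sql
-- Same table as in the question, but partitioned by month instead of day;
-- read_date becomes an ordinary column and read_month the partition key.
DROP TABLE IF EXISTS ts_managed;
CREATE TABLE ts_managed (
  svcpt_id VARCHAR(20),
  usage_value FLOAT,
  read_time SMALLINT,
  read_date INT)
PARTITIONED BY (read_month INT)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="snappy",
               "orc.create.index"="true",
               "orc.bloom.filter.columns"="svcpt_id");

-- Derive the month key from read_date (assumed YYYYMMDD, e.g. 20180915 -> 201809);
-- the dynamic partition column must come last in the SELECT list.
INSERT OVERWRITE TABLE ts_managed
PARTITION (read_month)
SELECT svcpt_id, usage, read_time, read_date,
       CAST(read_date / 100 AS INT) AS read_month
FROM ts_raw
DISTRIBUTE BY svcpt_id
SORT BY svcpt_id;
```

Fewer partitions means far fewer reducer output streams and ORC writers open at once, which is consistent with the "unable to create new native thread" OOM going away.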