Hive ACID update and delete errors

I am testing with Hive 1.2.1 and Tez 0.7, but I run into problems when I update or delete rows in an ACID table. The table is defined as follows:

CREATE EXTERNAL TABLE IF NOT EXISTS working.dw_items_w
(
column definition
)
CLUSTERED BY (id) into 5000 buckets
STORED AS ORC
LOCATION '/sys/edw/working/dw_items_w2'
TBLPROPERTIES ("transactional"="true");

The update query is:

update working.dw_items_w
set
PROCESS_FLAG = (case
    when (
        (TGT_LSTG_STATUS_ID = 1 and (to_date(SALE_END) - to_date(TGT_AUCT_END_DT)) <> 0)
        or (TGT_LSTG_STATUS_ID in (1,2) and NEW_LSTG_STATUS_ID in (0,4))
    ) then 'D'
    when (
        (TGT_LSTG_STATUS_ID = 1 and NEW_LSTG_STATUS_ID = 1 and datediff(to_date(SALE_END), to_date(TGT_AUCT_END_DT)) = 0)
        or (TGT_LSTG_STATUS_ID = 2 and NEW_LSTG_STATUS_ID = 1)
    ) then 'X'
    else PROCESS_FLAG
end),
NEW_LSTG_STATUS_ID = (case
    when TGT_LSTG_STATUS_ID = 0
        and NEW_LSTG_STATUS_ID = 0
        and to_date(SALE_END) < date_sub(to_date(from_unixtime(unix_timestamp(),'yyyy-MM-dd')), 92)
        and to_date(SALE_END) <> to_date('1969-12-31')
    then 1
    else NEW_LSTG_STATUS_ID
end)
where PROCESS_FLAG = 'U';

The error is:
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1650)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":19,"bucketid":471,"rowid":0}},"value": (omitted)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:249)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
... 14 more

Add the following to hive-site.xml:

<property>
    <name>hive.enforce.bucketing</name>
    <value>true</value>
</property>
<property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
</property>
<property>
    <name>hive.support.concurrency</name>
    <value>true</value>
</property>
<property>
    <name>hive.compactor.worker.threads</name>
    <value>1</value>
</property>
<property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
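
For a quick test before editing hive-site.xml, the client-side settings from the list above can also be tried per session. This is only a sketch: it assumes your deployment allows these properties to be overridden at session level, and it leaves out the two hive.compactor.* properties because those are read by the metastore service, not by the client session.

```sql
-- Session-level sketch of the client-side ACID settings listed above.
-- Assumption: the cluster permits overriding these properties per session.
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Print the effective value to confirm the override took effect.
SET hive.txn.manager;
```

If the update works with these session settings, moving them into hive-site.xml makes them permanent.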

Then make sure you are creating an ORC table that is bucketed on the column used in the predicate:

CREATE TABLE IF NOT EXISTS foo.tableinfo (
    schema_name varchar(32),
    table_name varchar(64),
    department varchar(64),
    country varchar(64),
    state varchar(64),
    city varchar(64),
    granularity int,
    notes varchar(256)
)
CLUSTERED BY (table_name) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ("orc.compress"="ZLIB", 'transactional'="true");

After that, the following will work:

DELETE FROM foo.tableinfo WHERE table_name='foo';
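
An update against the same table follows the same pattern. The statement below is a sketch, not part of the original answer: the value 'reviewed' is a made-up placeholder, and the SET clause deliberately targets notes because Hive does not allow the bucketing column (table_name here) to be updated.

```sql
-- Hypothetical example: update a non-bucketing column on the ACID table.
-- 'reviewed' is a placeholder value chosen for illustration.
UPDATE foo.tableinfo
SET notes = 'reviewed'
WHERE table_name = 'foo';

-- Optional checks: list open or aborted transactions and compaction activity.
SHOW TRANSACTIONS;
SHOW COMPACTIONS;
```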
