Hive ACID update and delete error

Posted by mfuanj7w on 2021-05-29 in Hadoop

I am testing with Hive 1.2.1 and Tez 0.7, but updates and deletes against an ACID table fail. The table is defined as follows:

  CREATE EXTERNAL TABLE IF NOT EXISTS working.dw_items_w
  (
    -- column definitions
  )
  CLUSTERED BY (id) INTO 5000 BUCKETS
  STORED AS ORC
  LOCATION '/sys/edw/working/dw_items_w2'
  TBLPROPERTIES ("transactional"="true");

The update query is:

  update working.dw_items_w
  set
    PROCESS_FLAG = (case when (
        (TGT_LSTG_STATUS_ID = 1 and (to_date(SALE_END) - to_date(TGT_AUCT_END_DT) ) <> 0 )
        or (TGT_LSTG_STATUS_ID in (1,2) and NEW_LSTG_STATUS_ID in (0,4) )
      ) then 'D'
      when
        ((TGT_LSTG_STATUS_ID = 1 and NEW_LSTG_STATUS_ID = 1 and datediff(to_date(SALE_END), to_date(TGT_AUCT_END_DT)
        ) = 0 )
        or (TGT_LSTG_STATUS_ID = 2 and NEW_LSTG_STATUS_ID = 1)) then 'X' else PROCESS_FLAG end ),
    NEW_LSTG_STATUS_ID = (case when TGT_LSTG_STATUS_ID = 0 AND NEW_LSTG_STATUS_ID = 0 AND to_date(SALE_END)
      < date_sub(to_date( from_unixtime(unix_timestamp(),'yyyy-MM-dd') ), 92)
      AND to_date(SALE_END) <> to_date('1969-12-31') then 1 else NEW_LSTG_STATUS_ID end)
  where PROCESS_FLAG = 'U';

The error is:
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
  at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1650)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":19,"bucketid":471,"rowid":0}},"value": <unreadable in source>}
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302)
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:249)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
  ... 14 more

cgfeq70w #1

Add the following to hive-site.xml:

  <property>
    <name>hive.enforce.bucketing</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.compactor.worker.threads</name>
    <value>1</value>
  </property>
  <property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
  </property>
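
If editing hive-site.xml on the client is not convenient, the client-side properties from the list above can probably be set per session instead. This is only a sketch, assuming a Hive 1.2.x CLI or Beeline session. The two hive.compactor.* properties are read by the metastore service, so they still belong in the metastore's hive-site.xml and are not included here.

```sql
-- Client-side ACID settings for the current session only (assumption: Hive 1.2.x).
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Print the effective value to confirm the transaction manager was picked up.
SET hive.txn.manager;
```

Session-level settings disappear when the session ends, so the hive-site.xml entries remain the durable fix.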

Then make sure you are creating an ORC table that is bucketed on the column used in the predicate:

  CREATE TABLE IF NOT EXISTS foo.tableinfo (
    schema_name varchar(32),
    table_name varchar(64),
    department varchar(64),
    country varchar(64),
    state varchar(64),
    city varchar(64),
    granularity int,
    notes varchar(256)
  )
  CLUSTERED BY (table_name) INTO 4 BUCKETS
  STORED AS ORC
  TBLPROPERTIES ("orc.compress"="ZLIB", 'transactional'="true");

After that, the following works:

  DELETE FROM foo.tableinfo WHERE table_name = 'foo';
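
An update against the same table should behave the same way once the settings are in place. The statements below are a hedged sketch and not part of the original answer: the value 'reviewed' is a made-up placeholder, and the SET clause deliberately leaves table_name alone because Hive does not support updating a bucketing column.

```sql
-- Confirm the table really carries transactional=true in its table parameters.
DESCRIBE FORMATTED foo.tableinfo;

-- Update a non-bucketing column; 'reviewed' is a hypothetical value.
UPDATE foo.tableinfo
SET notes = 'reviewed'
WHERE table_name = 'foo';

-- Inspect open transactions and any compactions queued by the initiator.
SHOW TRANSACTIONS;
SHOW COMPACTIONS;
```

If SHOW COMPACTIONS never lists anything after many updates and deletes, that may indicate the hive.compactor.* properties have not reached the metastore.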
