Hive: modifying data in a partitioned table

Posted by gpnt7bae on 2021-06-02 in Hadoop


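Problem: one column's value is empty. It should be 'ab', but unfortunately I wrote '' instead of 'ab'.
My table is a partitioned table. Is there any way to fix this?
I found the approach below, but it seems inefficient:
Create a temporary table like my table.
Use INSERT OVERWRITE: read the data from the old table and write it to the new table, using a CASE expression to change '' to 'ab'.
Then rename the temporary table to the original table's name.
I am looking for a solution along the lines of updating a partition and running MSCK. Is there any way to do that?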
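For reference, the temporary-table workaround described above might look roughly like the following HiveQL. This is only a sketch: the names my_table, my_table_tmp, my_table_old, col1, col2, col3 and part_col are placeholders that do not appear in the question, and a single partition column is assumed.

```sql
-- Hypothetical table and column names; adjust to the real schema.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- 1. Empty copy of the original table (same columns and partitioning).
CREATE TABLE my_table_tmp LIKE my_table;

-- 2. Copy every partition, replacing '' with 'ab' on the way.
INSERT OVERWRITE TABLE my_table_tmp PARTITION (part_col)
SELECT
  CASE WHEN col1 = '' THEN 'ab' ELSE col1 END AS col1,
  col2,
  col3,
  part_col          -- dynamic partition column goes last
FROM my_table;

-- 3. Swap the tables; the old one stays around under a new name as a backup.
ALTER TABLE my_table RENAME TO my_table_old;
ALTER TABLE my_table_tmp RENAME TO my_table;
```

Because step 2 rewrites every partition, the cost grows with the size of the whole table even when only a few partitions contain bad rows.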

hadoop Hive

Source: https://stackoverflow.com/questions/41820514/hive-modify-partitioned-table-data

2 Answers


xfb7svmp #1

One possible solution is to run an UPDATE on the table, provided the column is neither a partitioning column nor a bucketing column.

UPDATE tablename SET column = (CASE WHEN column = '' THEN 'ab' else column END) [WHERE expr if any];

Update: settings to enable ACID operations in Hive:

SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.compactor.initiator.on=true;
SET hive.compactor.worker.threads=1;

Note: this only works with Hive >= 0.14.
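UPDATE is only accepted on a transactional (ACID) table. As a rough sketch, and assuming a Hive version before 3.0 (where ACID tables also had to be bucketed and stored as ORC), such a table and a narrower form of the update might look like this. The table name, column names, bucket count and partition value are made up for illustration.

```sql
-- Hypothetical ACID table: ORC storage, bucketed, flagged as transactional.
CREATE TABLE my_acid_table (
  col1 STRING,
  col2 STRING,
  col3 STRING
)
PARTITIONED BY (part_col STRING)
CLUSTERED BY (col2) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- Touch only the bad rows, and only in the affected partition.
-- col1 is neither a partition column nor the bucketing column.
UPDATE my_acid_table
SET col1 = 'ab'
WHERE col1 = ''
  AND part_col = 'your_partition_value';
```

If the existing table is not transactional, the UPDATE will be rejected, in which case overwriting the affected partitions is likely the simpler route.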

Posted 2021-06-03

mnemlml8 #2

You can overwrite a single partition like this:

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

insert overwrite table target_table partition (part_col)
select 
case when column ='' then 'ab' else column end as column ,
col2,    --select all the columns in the same order
col3,
part_col --partition column is the last one
from target_table where part_col='your_partition_value';
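When only one known partition needs fixing, a static partition specification is another option, and it does not depend on the dynamic-partition settings. The sketch below assumes the same placeholder names as above, with col1 standing in for the affected column; the trailing query is just one way to check the result.

```sql
-- Static partition: the partition value is given in the PARTITION clause.
INSERT OVERWRITE TABLE target_table PARTITION (part_col = 'your_partition_value')
SELECT
  CASE WHEN col1 = '' THEN 'ab' ELSE col1 END AS col1,
  col2,
  col3              -- the partition column is not selected with a static partition
FROM target_table
WHERE part_col = 'your_partition_value';

-- Sanity check: expected to return 0 once the partition has been rewritten.
SELECT COUNT(*)
FROM target_table
WHERE part_col = 'your_partition_value'
  AND col1 = '';
```

Either form rewrites only the named partition, so the other partitions of the table are left untouched.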

Posted 2021-06-02
