hive面临的挑战

bvjveswy  于 2021-05-30  发布在  Hadoop
关注(0)|答案(4)|浏览(439)

有谁能指导我在做动态分区时哪里做错了。
--暂存表:

create table staging_peopledata
    (
    firstname string, 
    secondname string, 
    salary float, 
    country string
    state string
    ) 
    row format delimited fields terminated by ',' lines terminated by '\n';

--暂存表的数据:

John,David,30000,RUS,tnRUS
    John,David,30000,RUS,tnRUS
    Mary,David,5000,AUS,syAUS
    Mary,David,5000,AUS,syAUS
    Mary,David,5000,AUS,weAUS
    Pierre,Cathey,6000,RUS,kaRUS
    Pierre,Cathey,6000,RUS,kaRUS
    Ahmed,Talib,11000,US,bcUS
    Ahmed,Talib,11000,US,onUS
    Ahmed,Talib,11000,US,onUS
    kris,David,80000,UK,lnUK
    kris,David,80000,UK,soUK

--生产表:

create table Production_peopledata
    (
    firstname string, 
    lastname string, 
    salary float) 
    partitioned by (country string, state string) 
    row format delimited fields terminated by ',' lines terminated by '\n';

    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    insert overwrite table Production_peopledata 
    partition(country,state) 
    select firstname, secondname, salary, country, state from staging_peopledata;

如果我执行上述命令,我得到如下错误。

FAILED: SemanticException [Error 10096]: Dynamic partition strict mode
     requires atleast one static partition column. To turn this off set
     hive.exec.dynamic.partition.mode=nonstrict

谁能告诉我我在哪里犯的错吗。

qmelpv7a

qmelpv7a1#

根据错误,似乎模式仍然严格,对于动态分区,需要将其设置为下面的非严格使用命令
hive>set hive.exec.dynamic.partition.mode=nonstrict;

sz81bmfz

sz81bmfz2#

您需要设置以下属性:

set hive.exec.dynamic.partition=true;  
set hive.exec.dynamic.partition.mode=nonstrict;

要分区的列名不应是表定义的一部分。因为分区列是动态生成的。在分区表中填充数据时,分区列应该来自源表。
假设我们有 EMP 以及 EMP1 table。 EMP1 将从中获取数据的分区表 EMP table。最初这两个表是相同的。所以首先我们需要创建一个分区列,即。 salpart . 然后我们将在源表中添加此列 EMP . 成功运行后,我们可以在user/hive/warehouse位置看到分区文件。以上说明如下:

load data local inpath '/home/cloudera/myemployeedata.txt' overwrite into table emp;

CREATE TABLE IF NOT EXISTS emp ( eid int, name String,
salary String, destination String,salpart string)
COMMENT "Employee details"
ROW FORMAT DELIMITED
FIELDS TERMINATED BY "\t"
LINES TERMINATED BY "\n"
STORED AS TEXTFILE;

CREATE TABLE IF NOT EXISTS emp1 ( eid int, name String,
salary String, destination String)
COMMENT "Employee details"
partitioned by (salpart string)  {this column will values will come from a seperate table }
ROW FORMAT DELIMITED
FIELDS TERMINATED BY "\t"
LINES TERMINATED BY "\n"
STORED AS TEXTFILE;

Dynamic Partition:
set hive.exec.dynamic.partition=true;  
set hive.exec.dynamic.partition.mode=nonstrict;  

insert overwrite table emp1 partition(salpart) select eid,name,salary,destination,salpart from emp;
8yoxcaq7

8yoxcaq73#

再次尝试设置hive.exec.dynamic.partition.mode=nonstrict有时在hive中,即使设置此属性,也会发生这种情况,因为它认为是严格模式,因此我建议您再次设置此属性

o8x7eapl

o8x7eapl4#

你能在Hive外壳上运行下面的命令吗。

hive>set hive.exec.dynamic.partition.mode=nonstrict;

相关问题