Hadoop: partitioning a Hive table on a middle column of another external table

9gm1akwq  posted on 2021-06-02  in  Hadoop

I created an external table as follows:

create external table if not exists complaints (date_received string, product string, sub_product string, issue string, sub_issue string, consumer_complaint_narrative string, state string, company_public_response string, company varchar(50), zipcode int, tags string, consumer_consent_provided string, submitted_via string, date_sent_company string, company_response string, timely_response string, consumer_disputed string, complaint_id int) row format delimited fields terminated by ',' stored as textfile location 'hdfs://hostname:8020/complaints/';

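Now I want to create a new table that is partitioned by state and contains all the data from the table above. How can this be done?
I tried the following: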

create external table if not exists complaints_new (date_received string, product string, sub_product string, issue string, sub_issue string, consumer_complaint_narrative string, company_public_response string, company varchar(50), zipcode int, tags string, consumer_consent_provided string, submitted_via string, date_sent_company string, company_response string, timely_response string, consumer_disputed string, complaint_id int) partitioned by (state varchar(20)) row format delimited fields terminated by ',' stored as textfile location 'hdfs://hostname:8020/complaints/';

SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.mapred.mode = nonstrict;

insert into table complaints_new partition(state) select * from complaints;

The query fails.

n7taea2i  #1

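You have a couple of problems. First, both tables point to the same location, which means you would be reading from that location and writing into it at the same time. Second, Hive expects the partition column to be the last element in the select list, which means you cannot use select *. Instead, you have to select the fields one by one and put state at the end of the select statement.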
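
The two fixes above could look roughly like the sketch below. It is untested against your cluster; the directory 'hdfs://hostname:8020/complaints_new/' is a hypothetical path chosen only so that the new table no longer shares a location with complaints, and everything else is taken from the statements in the question.

```sql
-- Assumption: complaints_new gets its own, initially empty directory
-- (the path below is hypothetical; any location other than /complaints/ works).
create external table if not exists complaints_new (date_received string, product string, sub_product string, issue string, sub_issue string, consumer_complaint_narrative string, company_public_response string, company varchar(50), zipcode int, tags string, consumer_consent_provided string, submitted_via string, date_sent_company string, company_response string, timely_response string, consumer_disputed string, complaint_id int) partitioned by (state varchar(20)) row format delimited fields terminated by ',' stored as textfile location 'hdfs://hostname:8020/complaints_new/';

SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- List the columns explicitly, in the same order as in complaints_new,
-- with the partition column (state) as the last item of the select list.
insert into table complaints_new partition(state)
select date_received, product, sub_product, issue, sub_issue,
       consumer_complaint_narrative, company_public_response, company,
       zipcode, tags, consumer_consent_provided, submitted_via,
       date_sent_company, company_response, timely_response,
       consumer_disputed, complaint_id,
       state
from complaints;
```

If the insert succeeds, show partitions complaints_new; should list one partition per distinct state value.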
