I have a source CSV file with data like the following:
"201814","39","0598824","yellow jacket trap w","piege guep.jau,ouest","act","7/20/2016","c/e"
","05","st","n","15","2484","985.3999999999999998","43.66","3762.36","53.05","n","5.83","7.9900","0000","0000","0000","3.82","3.8181","7162","sterling international","d","12","yjtd-db12-w-","12","32","0","0","0","0","3.68","0","3.8181","7162","sterling
To load the data, I am using the CREATE statement below with this SerDe:
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = "|",
"quoteChar" = '\"',
"escapeChar" = '\\')
The problem is that any data that comes after a "\" in the file becomes NULL.
Can you tell me how to handle this?
My full DDL:
CREATE EXTERNAL TABLE
excess_inventory
(
whole_record string,
yyyyww string,
excess_wks_num string,
product_num string,
eng_desc string,
fr_desc string,
status string,
corp_status_change_date string,
whse_region string,
whse_id string,
channel_cd string,
eap_ind string,
fwos string,
non_alloc_qty string,
excess_qty string,
excess_cube string,
excess_inventory_dollars string,
monthly_storage_cost string,
deal_600 string,
go_ind string,
next_5_deals string,
reg_adlr string,
reg_retail string,
r52_best_promo_adlr string,
r52_best_promo_retail string,
landed_cost string,
corp_cost string,
vendor_num string,
vendor_nm string,
vendor_origin string,
vendor_moq string,
vendor_part_num string,
vendor_lead_tm string,
total_lead_tm string,
ingate_qty string,
on_order_qty string,
dealer_restriction_cd string,
quote_cost string,
casting_charge string,
action_cd string,
action_yyyyww string,
action_qty string,
sugg_adlr string,
comments string,
create_yyyyww string,
user_nm string,
batch_ts timestamp
)
PARTITIONED BY (partition_batch_ts bigint)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = "|",
"quoteChar" = '\"',
"escapeChar" = '\\')
STORED AS TEXTFILE
LOCATION
'db/excess_inventory/table'
TBLPROPERTIES('skip.header.line.count'='1','serialization.null.format'='');
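Also, please let me know: does "separatorChar" = "|" mean that the data is stored in HDFS as pipe-delimited, or does it have to specify the delimiter used in the source file?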
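For reference, here is a minimal sketch of the SerDe clause under two assumptions that the post does not confirm: that the file at the table LOCATION is comma-separated like the sample rows above, and that backslashes in the data are meant as literal characters rather than escapes. OpenCSVSerde uses separatorChar, quoteChar and escapeChar to parse the files it reads, so separatorChar has to match the delimiter actually present in the source file. The "^" escape character is a hypothetical choice and only helps if that character never occurs in the data. This is an untested variant, not a confirmed fix.

```sql
-- Sketch only: replaces the ROW FORMAT clause of the DDL above.
-- Assumption: the source file is comma-separated, as in the sample rows.
-- Assumption: "^" never appears in the data, so it is safe to use as the
-- escape character; with it, a backslash is read as an ordinary character.
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = ",",
"quoteChar" = '\"',
"escapeChar" = "^")
```

With the default escape character, a backslash directly before a closing quote, as in "c/e\", is likely read as an escaped quote, so the field does not end there and the columns that follow come back as NULL.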