Hive CSV line delimiter configuration

2guxujil  posted on 2021-06-27  in  Hive

When creating an external table over a CSV file in Hive, you can use Hive's built-in CSV SerDe:

...
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '...'
TBLPROPERTIES('serialization.null.format'='')

or the OpenCSV SerDe:

ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ( "separatorChar" = " ", "quoteChar" = '"', "escapeChar" = "\\" )

My question is, if I have a CSV file like this:

foo,bar,hello\rworld\rbaz,1\n
foo,bar,bye\rworld\rbaz,2\n
foo,bar,hi\rworld\rbaz,3\n
foo,bar,goodbye\rworld\rbaz,4\n

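How can I configure the line terminator to be \n only and have \r ignored as a terminator, so that it stays inside the field?
Edit:
-> When I try to use LINES TERMINATED BY '\r\n' I get the following error: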

org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException 3:20 LINES TERMINATED BY only supports newline '\n' right now. Error encountered near token ''\r\n''

pbgvytdp1#

You can use LINES TERMINATED BY in your create table statement, as follows:

...
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '...'
TBLPROPERTIES('serialization.null.format'='')
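
For reference, here is how the fragment above might look as a complete statement. This is only a sketch: the table name csv_demo and the column names and types are hypothetical (the question does not give a schema), and LOCATION is left elided exactly as in the original. Whether an embedded \r actually survives inside a field also depends on how the underlying text reader splits records, so it is worth verifying against the sample file rather than assuming.

```sql
-- Hypothetical table and column names, chosen to match the four
-- comma-separated fields in the sample rows of the question.
CREATE EXTERNAL TABLE csv_demo (
  col_a STRING,
  col_b STRING,
  col_c STRING,  -- the field expected to contain the embedded \r characters
  col_d INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '...'   -- elided in the original; fill in your own path
TBLPROPERTIES('serialization.null.format'='');

-- Sanity check: the sample file has four \n-terminated rows, so a larger
-- count here would suggest that \r is still being treated as a line break.
SELECT COUNT(*) FROM csv_demo;
```

If the count comes back larger than the number of \n-terminated rows, the declaration alone is not enough for this file, and the record splitting needs to be looked at separately.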
