Hive query CSV text delimiter problem

nhaq1z21 posted on 2021-06-02 in Hadoop

I am trying to import the following data into Hive.
name, phone, address

    Arverne,(718) 634-4784,"*312 Beach 54 Street
    Arverne, NY 11692
    (40.59428994144626, -73.78442865540268)*"
    Astoria,(718) 278-2220,"*14 01 Astoria Boulevard
    Long Island City, NY 11102
    (40.77152402451418, -73.92643545073543)*"
    Auburndale,(718) 352-2027,"*25 55 Francis Lewis Boulevard
    Flushing, NY 11358
    (40.76035096822195, -73.79632645819947)*"

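But the address does not come through correctly, which corrupts the table data. I guess the problem is that lines are terminated by a newline (the default, while each address spans 3-4 lines), because when I run the sample data below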

    a,b,"e,f"
    x,y,"l,m"

with the following query

    create table test(c1 string, c2 string, c3 string)
    row format serde 'com.bizo.hive.serde.csv.CSVSerde'
    with serdeproperties(
    "separatorChar" = ",");

it works fine:

    test.c1  test.c2  test.c3
    a        b        e,f
    x        y        l,m

How can I achieve the same result for the multi-line address data?
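
The suspicion about the line terminator can be checked outside Hive. The following is a minimal Python sketch, not part of the original setup; `split_on_newlines` and `parse_as_csv` are hypothetical helper names, and the sample is the first two records from the data above. It compares splitting on every newline, which is roughly how a line-oriented text reader sees the file, with quote-aware parsing from Python's standard `csv` module.

```python
import csv
import io

# First two records from the question; each quoted address spans three lines.
RAW = (
    'Arverne,(718) 634-4784,"*312 Beach 54 Street\n'
    'Arverne, NY 11692\n'
    '(40.59428994144626, -73.78442865540268)*"\n'
    'Astoria,(718) 278-2220,"*14 01 Astoria Boulevard\n'
    'Long Island City, NY 11102\n'
    '(40.77152402451418, -73.92643545073543)*"\n'
)


def split_on_newlines(text):
    """Treat every physical line as a row and split it on commas."""
    return [line.split(",") for line in text.splitlines()]


def parse_as_csv(text):
    """Quote-aware parsing: newlines inside double quotes stay in the field."""
    return list(csv.reader(io.StringIO(text, newline="")))


naive_rows = split_on_newlines(RAW)
csv_rows = parse_as_csv(RAW)

print(len(naive_rows), "rows when splitting on newlines")
print(len(csv_rows), "records when parsing quote-aware")
```

If the row count from newline splitting is larger than the number of real records, the embedded newlines are what breaks the table, independent of the field separator.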

3xiyfsfu #1

This is how I worked it out.

    >>> CREATE TABLE Test(name string, phone string, address string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
    >>> load data inpath 'file.csv' into table Test;
    >>> select name from Test;
    +-------------+--+
    | name        |
    +-------------+--+
    | Arverne     |
    | Astoria     |
    | Auburndale  |
    +-------------+--+
    >>> select address from Test;
    +--------------------------------------------+--+
    | address                                    |
    +--------------------------------------------+--+
    | "312 Beach 54 Street Arverne               |
    | "14 01 Astoria Boulevard Long Island City  |
    | "25 55 Francis Lewis Boulevard Flushing    |
    +--------------------------------------------+--+

I hope this helps.
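
Note that in the output above each address sits on one line and is cut at its first comma, with the opening quote still attached. If the full address is wanted, one option is to pre-process the file before loading it. The following is a minimal Python sketch under stated assumptions, not something from the original answer: `flatten_records` is a hypothetical helper, the input is assumed to be standard double-quoted CSV, and tab is assumed to be a safe output separator for this data.

```python
import csv
import io

# One record from the question, with the address spread over three lines.
SAMPLE = (
    'Arverne,(718) 634-4784,"*312 Beach 54 Street\n'
    'Arverne, NY 11692\n'
    '(40.59428994144626, -73.78442865540268)*"\n'
    'Astoria,(718) 278-2220,"*14 01 Astoria Boulevard\n'
    'Long Island City, NY 11102\n'
    '(40.77152402451418, -73.92643545073543)*"\n'
)


def flatten_records(text):
    """Return tab-separated text with one physical line per CSV record."""
    lines = []
    for record in csv.reader(io.StringIO(text, newline="")):
        # str.split() with no argument splits on any run of whitespace,
        # so rejoining with one space removes embedded newlines and tabs.
        fields = [" ".join(field.split()) for field in record]
        lines.append("\t".join(fields))
    return "".join(line + "\n" for line in lines)


flat = flatten_records(SAMPLE)
print(flat, end="")
```

The flattened output could then be loaded into a table declared with `ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'`, so that commas inside the address no longer act as field separators. Whether collapsing the line breaks into spaces is acceptable depends on how the address is used later.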
