Loading a CSV file with Pig Latin

dgtucam1 posted on 2021-06-25 in Pig

I am trying to load a CSV file with Pig Latin. The records look like this:

"ABBOTT,DEEDEE W",GRADES 9-12 TEACHER,"52,122.10",0,LBOE,ATLANTA INDEPENDENT SCHOOL SYSTEM,2010

I tried the following code:

A = LOAD '/user/hduser/salaryTravel.csv' using PigStorage(',')  AS (name:chararray,job:chararray,salary:float,TA:float,type:chararray,org:chararray,year:int);

But the output is:

("ABBOTT,DEEDEE W",,,122.10",0,)

The name field is split into separate fields because it contains a comma (','). How can I read this record correctly?

vfwfrxfs #1

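I suggest loading the data with CSVExcelStorage or CSVLoader from Piggybank. Both understand quoted fields, so a comma inside double quotes stays part of the field. Either of the following works: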

REGISTER piggybank.jar;

A = LOAD '/user/hduser/salaryTravel.csv' using org.apache.pig.piggybank.storage.CSVExcelStorage()  AS (name:chararray,job:chararray,salary:float,TA:float,type:chararray,org:chararray,year:int);

REGISTER piggybank.jar;

A = LOAD '/user/hduser/salaryTravel.csv' using org.apache.pig.piggybank.storage.CSVLoader()  AS (name:chararray,job:chararray,salary:float,TA:float,type:chararray,org:chararray,year:int);
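One caveat with the schema above: the sample salary is "52,122.10", which includes a thousands separator. Once the quotes are handled, a direct cast of that text to float will most likely yield null. A minimal sketch of one way around this, assuming the same file and piggybank.jar as above, is to load the numeric columns as chararray and clean them in a second step. The relation name raw is only illustrative, and applying the same cleanup to TA is an assumption, since the sample row shows only 0 there.

```pig
REGISTER piggybank.jar;

-- Load salary and TA as text so that values such as "52,122.10" are kept intact.
raw = LOAD '/user/hduser/salaryTravel.csv'
      USING org.apache.pig.piggybank.storage.CSVExcelStorage()
      AS (name:chararray, job:chararray, salary:chararray, TA:chararray,
          type:chararray, org:chararray, year:int);

-- Remove the thousands separators with the built-in REPLACE, then cast to float.
A = FOREACH raw GENERATE
        name,
        job,
        (float) REPLACE(salary, ',', '') AS salary,
        (float) REPLACE(TA, ',', '') AS TA,
        type,
        org,
        year;
```

After this step A has the same field names as in the original LOAD statement, so later statements that refer to salary or TA should not need to change.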

See also: "REGEX_EXTRACT error in Pig", which has several shared code samples.
