HDFS: extra data after last expected column

9udxz4iz posted on 2021-05-29 in Hadoop

We have a source system and a target system. We are trying to import data from SQL Server 2012 into Pivotal Hadoop (PHD 3.0) using the Talend tool.
We get this error:

ERROR: extra data after last expected column  (seg0 slice1 datanode.domain.com:40000 pid=15035)
  Detail: External table pick_report_stg0, line 5472 of pxf://masternnode/path/to/hdfs?profile=HdfsTextSimple: "5472;2016-11-28 08:39:54.217;;2016-11-15 00:00:00.0;SAMPLES;0005525;MORGAN -EVENTS;254056;1;IHBL-NHO..."

What we tried:
We identified the bad line with: [hdfs@mdw ~]$ hdfs dfs -cat /path/to/hdfs | grep 3548

3548;2016-11-28 04:21:39.97;;2016-11-15 00:00:00.0;SAMPLES;0005525;MORGAN -EVENTS;254056;1;IHBL-NHO-13OZ-01;0;ROC NATION; NH;2016-11-15 00:00:00.0;2016-11-15 00:00:00.0;;2.0;11.99;SA;SC01;NH02;EA;1;F2;NEW PKG ONLY PLEASE!! BY NOON

Structure of the external table and its FORMAT clause:

CREATE EXTERNAL TABLE schemaname.tablename
(
"ID" bigint,
  "time" timestamp without time zone,
  "ShipAddress4" character(40),
  "EntrySystemDate" timestamp without time zone,
  "CorpAcctName" character(40),
  "Customer" character(7),
  "CustomerName" character(30),
  "SalesOrder" character(6),
  "OrderStatus" character(1),
  "MStockCode" character(30),
  "ShipPostalCode" character(9),
  "CustomerPoNumber" character(30),
  "OrderDate" timestamp without time zone,
  "ReqShipDate" timestamp without time zone,
  "DateValue" timestamp without time zone,
  "MOrderQty" numeric(9,0),
  "MPrice" numeric(9,0),
  "CustomerClass" character(2),
  "ProductClass" character(4),
  "ProductGroup" character(10),
  "StockUom" character(3),
  "DispatchCount" integer,
  "MWarehouse" character(2),
  "AlphaValue" varchar(100)
)
 LOCATION (
    'pxf://path/to/hdfs?profile=HdfsTextSimple'
)
 FORMAT 'csv' (delimiter ';' null '' quote ';')
ENCODING 'UTF8';
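
One way to confirm the mismatch is to compare the number of fields in a raw line with the 24 columns declared above. The following is a minimal sketch, not part of the original post: the helper names count_fields and find_bad_lines are made up for illustration, and it splits naively on ';' without honoring any quoting. Applied to the sample row shown earlier, it reports 25 fields, one more than the table expects; the extra field probably comes from the text "ROC NATION; NH", which contains the delimiter.

```python
EXPECTED_COLUMNS = 24  # columns declared in the external table definition
DELIMITER = ";"

# Sample row copied from the question.
SAMPLE_ROW = (
    "3548;2016-11-28 04:21:39.97;;2016-11-15 00:00:00.0;SAMPLES;0005525;"
    "MORGAN -EVENTS;254056;1;IHBL-NHO-13OZ-01;0;ROC NATION; NH;"
    "2016-11-15 00:00:00.0;2016-11-15 00:00:00.0;;2.0;11.99;SA;SC01;NH02;"
    "EA;1;F2;NEW PKG ONLY PLEASE!! BY NOON"
)


def count_fields(line, delimiter=DELIMITER):
    """Return the number of delimiter-separated fields in one raw line."""
    return len(line.rstrip("\r\n").split(delimiter))


def find_bad_lines(lines, expected=EXPECTED_COLUMNS, delimiter=DELIMITER):
    """Yield (line_number, field_count, text) for lines with an unexpected field count."""
    for number, line in enumerate(lines, start=1):
        fields = count_fields(line, delimiter)
        if fields != expected:
            yield number, fields, line.rstrip("\r\n")


if __name__ == "__main__":
    print(count_fields(SAMPLE_ROW), "fields found,", EXPECTED_COLUMNS, "expected")
```

To scan the whole file, the output of hdfs dfs -cat /path/to/hdfs could be piped into a small script that passes sys.stdin to find_bad_lines.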

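Finding: an extra semicolon appears in the data, which produces the extra column. However, I still cannot come up with a correct FORMAT clause. Please advise how to get rid of the "extra data after last expected column" error.
What FORMAT clause should I use?
Any help would be much appreciated!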
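
A possible direction, sketched here under assumptions rather than taken from the thread: if the export step can be changed to wrap fields that contain the delimiter in a quote character different from the delimiter (for example a double quote), a CSV parser can tell an embedded ';' from a field separator. The external table's FORMAT clause would then presumably need a matching quote option such as quote '"' instead of quote ';'. Whether Talend can be configured this way in your job is not confirmed here. The Python csv module illustrates the idea; write_quoted and read_quoted are hypothetical helper names.

```python
import csv
import io


def write_quoted(rows, delimiter=";", quotechar='"'):
    """Serialize rows, quoting only the fields that contain special characters."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quotechar=quotechar,
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()


def read_quoted(text, delimiter=";", quotechar='"'):
    """Parse text written by write_quoted back into lists of fields."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
    return list(reader)


if __name__ == "__main__":
    # Shortened, illustrative row: the middle field contains the delimiter.
    row = ["3548", "ROC NATION; NH", "F2"]
    text = write_quoted([row])
    print(text, end="")
    print(read_quoted(text))
```

With quoting in place the embedded semicolon stays inside one field, so the row keeps its expected field count.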

xcitsw88 #1

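If only a small number of rows fail because of this problem, appending the following to the external table definition, after the ENCODING clause, helps work around it. The rows that cannot be parsed are logged to the error table instead of aborting the whole query: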

LOG ERRORS INTO my_err_table SEGMENT REJECT LIMIT 1 PERCENT;

Reference for this syntax: http://gpdb.docs.pivotal.io/4320/ref_guide/sql_commands/create_external_table.html
