dump works but store does not (Pig) - where can I find details of the error?

o2gm4chl, posted 2021-06-21 in Pig

I am trying to load an Apache access log, split it into fields, and save the result to HCatalog.

apache_log = LOAD 'httpd-www01-access.log.2014-02-09-*' USING TextLoader AS (line:chararray);

apache_row = FOREACH apache_log GENERATE FLATTEN(
    REGEX_EXTRACT_ALL(line, '^"(\\S+)" \\[(\\d{2}\\/\\w+\\/\\d{4}:\\d{2}:\\d{2}:\\d{2} \\+\\d{4}]) (\\S+) (\\S+) "(.+?)" (\\S+) (\\S+) "([^"]*)" "([^"]*)" "([^"]*)"'))
    AS (ip: chararray, datetime: chararray, session_id: chararray, time_of_request: chararray, request: chararray, status: chararray, size: chararray, referer: chararray, cookie: chararray, user_agent: chararray);
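
As a sanity check (my own extra step, not part of the failing run) I also describe the relation first, since as far as I understand HCatStorer needs these field names and types to line up with the stage.apache_log table definition in HCatalog:

DESCRIBE apache_row;
-- expected, given the AS clause above (all fields chararray):
-- apache_row: {ip: chararray, datetime: chararray, session_id: chararray, time_of_request: chararray, request: chararray, status: chararray, size: chararray, referer: chararray, cookie: chararray, user_agent: chararray}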

If I run:

a = sample apache_row 0.001;
dump a;

it works.
But

store apache_row into 'stage.apache_log' using org.apache.hcatalog.pig.HCatStorer();

does not.
The error is:

2014-02-17 08:17:13,812 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2014-02-17 08:17:13,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201402120751_0117 has failed! Stop running all dependent jobs
2014-02-17 08:17:13,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2014-02-17 08:17:13,814 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2014-02-17 08:17:13,815 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

HadoopVersion   PigVersion      UserId  StartedAt       FinishedAt      Features
1.2.0.1.3.2.0-111       0.11.1.1.3.2.0-111      pig     2014-02-17 08:16:24     2014-02-17 08:17:13     UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_201402120751_0117   apache_log,apache_row   MAP_ONLY        Message: Job failed! Error - # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201402120751_0117_m_000000    stage.atg_apache_log,

Input(s):
Failed to read data from "hdfs://hadoop1:8020/user/pig/httpd-www01-access.log.2014-02-09-*"

Output(s):
Failed to produce result in "stage.apache_log"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_201402120751_0117
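
Before re-running the whole file, the narrowed-down test I intend to try (just my own sketch, with new aliases apache_ok and small_sample, not something from the failing run) is to drop the lines the regex does not match, since REGEX_EXTRACT_ALL returns null for those and the flattened record then no longer carries the ten declared fields, and to store only a small sample:

-- sketch: keep only fully parsed lines, then retry the store on a small sample
apache_ok = filter apache_row by ip is not null;
small_sample = sample apache_ok 0.001;
store small_sample into 'stage.apache_log' using org.apache.hcatalog.pig.HCatStorer();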

Where can I find any details about this problem? I can see a bit more at:
hadoop1:50030/jobdetails.jsp?jobid=job_201402120751_0117
but that page no longer works once the job has finished...
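
My working assumption (not verified) is that the failed map attempt also leaves its stdout/stderr/syslog on the TaskTracker node that ran it, so something along these lines from the Grunt shell on that node should recover the stack trace; the log directory below is the stock MR1 ${HADOOP_LOG_DIR}/userlogs location and very likely differs on this distribution, and the _0 attempt suffix is a guess (it may be _0 through _3 if the task was retried):

-- guess: run on the TaskTracker that executed the failed attempt; paths may differ
sh ls /var/log/hadoop/userlogs/job_201402120751_0117
sh cat /var/log/hadoop/userlogs/job_201402120751_0117/attempt_201402120751_0117_m_000000_0/syslog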
Regards

No answers yet.

