I am trying to load Apache access logs, split each line into fields, and store the result in HCatalog.
apache_log = LOAD 'httpd-www01-access.log.2014-02-09-*' USING TextLoader AS (line:chararray);
apache_row = FOREACH apache_log GENERATE FLATTEN (
REGEX_EXTRACT_ALL
(line,'^"(\\S+)" \\[(\\d{2}\\/\\w+\\/\\d{4}:\\d{2}:\\d{2}:\\d{2} \\+\\d{4}]) (\\S+) (\\S+) "(.+?)" (\\S+) (\\S+) "([^"]*)" "([^"]*)" "([^"]*)"'))
AS (ip: chararray, datetime: chararray, session_id: chararray, time_of_request:chararray, request: chararray, status: chararray, size: chararray, referer : chararray, cookie: chararray, user_agent: chararray);
If I do this:
a = sample apache_row 0.001;
dump a;
it works.
But this:
store apache_row into 'stage.apache_log' using org.apache.hcatalog.pig.HCatStorer();
does not.
Error:
2014-02-17 08:17:13,812 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2014-02-17 08:17:13,812 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201402120751_0117 has failed! Stop running all dependent jobs
2014-02-17 08:17:13,812 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2014-02-17 08:17:13,814 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2014-02-17 08:17:13,815 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
1.2.0.1.3.2.0-111 0.11.1.1.3.2.0-111 pig 2014-02-17 08:16:24 2014-02-17 08:17:13 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_201402120751_0117 apache_log,apache_row MAP_ONLY Message: Job failed! Error - # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201402120751_0117_m_000000 stage.atg_apache_log,
Input(s):
Failed to read data from "hdfs://hadoop1:8020/user/pig/httpd-www01-access.log.2014-02-09-*"
Output(s):
Failed to produce result in "stage.apache_log"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_201402120751_0117
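Where can I find more details about this failure? While the job is running, I can see more details at:
hadoop1:50030/jobdetails.jsp?jobid=job_201402120751_0117
But once the job has finished, that page no longer works...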
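A minimal diagnostic sketch that might help narrow this down from the Pig side, reusing the `apache_row` relation defined above. It assumes the cause could be either a mismatch between the relation's schema and the `stage.apache_log` table definition, or input lines that the regex does not match; the log above confirms neither. The aliases `unmatched`, `unmatched_grp` and `unmatched_cnt` are hypothetical names introduced only for this check.

```pig
-- Print the schema Pig has for the relation, so that it can be compared
-- by hand with the column names and types of the HCatalog table.
DESCRIBE apache_row;

-- Assumption: a line that does not match the regex leaves the first
-- field null. Count such rows to see whether bad input is involved.
unmatched = FILTER apache_row BY ip IS NULL;
unmatched_grp = GROUP unmatched ALL;
unmatched_cnt = FOREACH unmatched_grp GENERATE COUNT_STAR(unmatched);
DUMP unmatched_cnt;
```

If the count is non-zero, storing only the rows where `ip IS NOT NULL` would be one way to test whether the unmatched lines are what makes the map task fail.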
Regards