Error dumping a dataset after loading data from Hive into Pig

Posted by rjee0c15 on 2021-05-31 in Hadoop

The table retail_db.categories has 58 rows. When I run:

$ pig -useHCatalog
grunt> pcategories = LOAD 'retail_db.categories' USING org.apache.hive.hcatalog.pig.HCatLoader();
grunt> b = limit pcategories 100;
grunt> dump b;

I get all the records. But when I try to dump the original dataset:

grunt> dump pcategories;

I get an error. The log output is:
2018-04-15 16:27:46,444 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-04-15 16:27:46,723 [main] INFO org.apache.hadoop.hive.metastore.ObjectStore - ObjectStore, initialize called
2018-04-15 16:27:47,170 [main] INFO org.apache.hadoop.hive.metastore.MetaStoreDirectSql - Using direct SQL, underlying DB is MYSQL
2018-04-15 16:27:47,171 [main] INFO org.apache.hadoop.hive.metastore.ObjectStore - Initialized ObjectStore
2018-04-15 16:27:47,171 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_databases: NonExistentDatabaseUsedForHealthCheck
2018-04-15 16:27:47,171 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=jay ip=unknown-ip-addr cmd=get_databases: NonExistentDatabaseUsedForHealthCheck
2018-04-15 16:27:47,184 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_table : db=retail_db tbl=categories
2018-04-15 16:27:47,184 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=jay ip=unknown-ip-addr cmd=get_table : db=retail_db tbl=categories
2018-04-15 16:27:47,219 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-04-15 16:27:47,244 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_databases: NonExistentDatabaseUsedForHealthCheck
2018-04-15 16:27:47,244 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=jay ip=unknown-ip-addr cmd=get_databases: NonExistentDatabaseUsedForHealthCheck
2018-04-15 16:27:47,247 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_table : db=retail_db tbl=departments
2018-04-15 16:27:47,247 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=jay ip=unknown-ip-addr cmd=get_table : db=retail_db tbl=departments
2018-04-15 16:27:47,261 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-04-15 16:27:47,284 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_databases: NonExistentDatabaseUsedForHealthCheck
2018-04-15 16:27:47,284 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=jay ip=unknown-ip-addr cmd=get_databases: NonExistentDatabaseUsedForHealthCheck
2018-04-15 16:27:47,286 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_table : db=retail_db tbl=categories
2018-04-15 16:27:47,286 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=jay ip=unknown-ip-addr cmd=get_table : db=retail_db tbl=categories
2018-04-15 16:27:47,386 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-04-15 16:27:47,388 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2018-04-15 16:27:47,397 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-04-15 16:27:47,397 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2018-04-15 16:27:47,397 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2018-04-15 16:27:47,398 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2018-04-15 16:27:47,399 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2018-04-15 16:27:47,399 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2018-04-15 16:27:47,406 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-04-15 16:27:47,407 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2018-04-15 16:27:47,409 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2018-04-15 16:27:47,409 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2018-04-15 16:27:47,435 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_databases: NonExistentDatabaseUsedForHealthCheck
2018-04-15 16:27:47,435 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=jay ip=unknown-ip-addr cmd=get_databases: NonExistentDatabaseUsedForHealthCheck
2018-04-15 16:27:47,437 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_table : db=retail_db tbl=categories
2018-04-15 16:27:47,437 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=jay ip=unknown-ip-addr cmd=get_table : db=retail_db tbl=categories
2018-04-15 16:27:47,458 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-04-15 16:27:47,458 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2018-04-15 16:27:48,419 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/apache-hive-2.3.2-bin/lib/hive-metastore-2.3.2.jar to DistributedCache through /tmp/temp-1113251818/tmp122824794/hive-metastore-2.3.2.jar
2018-04-15 16:27:48,608 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/apache-hive-2.3.2-bin/lib/libthrift-0.9.3.jar to DistributedCache through /tmp/temp-1113251818/tmp160861906/libthrift-0.9.3.jar
2018-04-15 16:27:49,708 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/apache-hive-2.3.2-bin/lib/hive-exec-2.3.2.jar to DistributedCache through /tmp/temp-1113251818/tmp1023486409/hive-exec-2.3.2.jar
2018-04-15 16:27:50,352 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/apache-hive-2.3.2-bin/lib/libfb303-0.9.3.jar to DistributedCache through /tmp/temp-1113251818/tmp-20730388/libfb303-0.9.3.jar
2018-04-15 16:27:51,375 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/apache-hive-2.3.2-bin/lib/jdo-api-3.0.1.jar to DistributedCache through /tmp/temp-1113251818/tmp12057913/jdo-api-3.0.1.jar
2018-04-15 16:27:51,497 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/apache-hive-2.3.2-bin/lib/slf4j-api-1.7.25.jar to DistributedCache through /tmp/temp-1113251818/tmp1251741235/slf4j-api-1.7.25.jar
2018-04-15 16:27:51,786 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/apache-hive-2.3.2-bin/lib/hive-hbase-handler-2.3.2.jar to DistributedCache through /tmp/temp-1113251818/tmp1351750668/hive-hbase-handler-2.3.2.jar
2018-04-15 16:27:52,653 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig-0.17.0/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1113251818/tmp1548980484/pig-0.17.0-core-h2.jar
2018-04-15 16:27:53,042 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/apache-hive-2.3.2-bin/hcatalog/share/hcatalog/hive-hcatalog-pig-adapter-2.3.2.jar to DistributedCache through /tmp/temp-1113251818/tmp-2078279932/hive-hcatalog-pig-adapter-2.3.2.jar
2018-04-15 16:27:53,197 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig-0.17.0/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1113251818/tmp1231439146/automaton-1.11-8.jar
2018-04-15 16:27:53,875 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/apache-hive-2.3.2-bin/lib/antlr-runtime-3.5.2.jar to DistributedCache through /tmp/temp-1113251818/tmp970518288/antlr-runtime-3.5.2.jar
2018-04-15 16:27:53,900 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2018-04-15 16:27:53,920 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2018-04-15 16:27:53,922 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2018-04-15 16:27:54,152 [JobControl] INFO org.apache.hadoop.mapreduce.JobResourceUploader - Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/jay/.staging/job__0004
2018-04-15 16:27:54,197 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2018-04-15 16:27:54,232 [JobControl] INFO org.apache.hadoop.mapred.FileInputFormat - Total input files to process : 1
2018-04-15 16:27:54,232 [JobControl] INFO org.apache.pig.backend.hadoop.
