我当前正在使用hcatloader()从s3上的配置单元表中提取数据,并尝试使用hcatstorer()将数据写回s3上的配置单元表。我使用的是aws emr中的默认配置单元安装。
hcatloader工作正常,我可以在脚本中正确地转储数据(下面的dumpb工作正常)。
register '/home/hadoop/pig/lib/*.jar';
register '/home/hadoop/pig/*.jar';
A = LOAD '$db_name.$hive_table' USING org.apache.hcatalog.pig.HCatLoader();
PARTITION_FILTER = FILTER A BY date=='$dateVar' AND hour=='$hourVar';
B = LIMIT PARTITION_FILTER 10;
--DUMP B;
STORE B INTO '$output_db.$output_table' USING org.apache.hcatalog.pig.HCatStorer();
我还可以使用pigstorage和hive表分隔符将数据很好地写入s3 bucket,然后通过hive脚本添加分区—但是我们需要使用hcatalog动态分区。
我得到的错误意味着连接被拒绝,因为无法连接到任何元存储uri的提供。我的理解是hcatloader()和hcatstorer()使用同一个hive-site.xml文件,而我只有在包含hcatstorer()时才会出现错误。
hive-site.xml中的metastore uri参数
<property>
<name>hive.metastore.uris</name>
<value>thrift://localhost:9083</value>
</property>
以下是来自job tracker的日志:
Job commit failed: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitJob(PigOutputCommitter.java:281)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:253)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:216)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitJob(PigOutputCommitter.java:279)
... 5 more
Caused by: org.apache.hcatalog.common.HCatException : 2006 : Error adding partition to metastore. Cause : MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:336)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:214)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:154)
at org.apache.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.(HiveClientCache.java:246)
at org.apache.hcatalog.common.HiveClientCache$4.call(HiveClientCache.java:170)
at org.apache.hcatalog.common.HiveClientCache$4.call(HiveClientCache.java:167)
at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
at org.apache.hcatalog.common.HiveClientCache.getOrCreate(HiveClientCache.java:167)
at org.apache.hcatalog.common.HiveClientCache.get(HiveClientCache.java:143)
at org.apache.hcatalog.common.HCatUtil.getHiveClient(HCatUtil.java:548)
at org.apache.hcatalog.mapreduce.FileOutputCommitterContainer.registerPartitions(FileOutputCommitterContainer.java:616)
at org.apache.hcatalog.mapreduce.FileOutputCommitterContainer.commitJob(FileOutputCommitterContainer.java:206)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitJob(PigOutputCommitter.java:279)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:253)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:216)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
... 28 more
)
at org.apache.hcatalog.mapreduce.FileOutputCommitterContainer.registerPartitions(FileOutputCommitterContainer.java:704)
at org.apache.hcatalog.mapreduce.FileOutputCommitterContainer.commitJob(FileOutputCommitterContainer.java:206)
... 10 more
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:336)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:214)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:154)
at org.apache.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.(HiveClientCache.java:246)
at org.apache.hcatalog.common.HiveClientCache$4.call(HiveClientCache.java:170)
at org.apache.hcatalog.common.HiveClientCache$4.call(HiveClientCache.java:167)
at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
at org.apache.hcatalog.common.HiveClientCache.getOrCreate(HiveClientCache.java:167)
at org.apache.hcatalog.common.HiveClientCache.get(HiveClientCache.java:143)
at org.apache.hcatalog.common.HCatUtil.getHiveClient(HCatUtil.java:548)
at org.apache.hcatalog.mapreduce.FileOutputCommitterContainer.registerPartitions(FileOutputCommitterContainer.java:616)
at org.apache.hcatalog.mapreduce.FileOutputCommitterContainer.commitJob(FileOutputCommitterContainer.java:206)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitJob(PigOutputCommitter.java:279)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:253)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:216)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
... 28 more
)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:382)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:214)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:154)
at org.apache.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.(HiveClientCache.java:246)
at org.apache.hcatalog.common.HiveClientCache$4.call(HiveClientCache.java:170)
at org.apache.hcatalog.common.HiveClientCache$4.call(HiveClientCache.java:167)
at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
at org.apache.hcatalog.common.HiveClientCache.getOrCreate(HiveClientCache.java:167)
at org.apache.hcatalog.common.HiveClientCache.get(HiveClientCache.java:143)
at org.apache.hcatalog.common.HCatUtil.getHiveClient(HCatUtil.java:548)
at org.apache.hcatalog.mapreduce.FileOutputCommitterContainer.registerPartitions(FileOutputCommitterContainer.java:616)
任何洞察都是非常感激的。这让我难堪了好一阵子。我见过很多类似问题的旧jira门票,但都是2012年发行的。
暂无答案!
目前还没有任何答案,快来回答吧!