我正在用java开发hadoop中的一个项目。当我在本地集群上运行我的代码(jar)时,它工作正常,但当我在amazon多集群上运行它时,它会给出异常。。。
我的mapreduce作业代码。。。。
job.setJarByClass(ReadActivityDriver.class);
job.setMapperClass(ReadActivityLogMapper.class);
job.setReducerClass(ReadActivityLogReducer.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
job.setInputFormatClass(ColumnFamilyInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
ConfigHelper.setInputRpcPort(job.getConfiguration(), pro.getProperty("port"));
ConfigHelper.setInputInitialAddress(job.getConfiguration(), pro.getProperty("server"));
ConfigHelper.setInputPartitioner(job.getConfiguration(), "org.apache.cassandra.dht.Murmur3Partitioner");
ConfigHelper.setInputColumnFamily(job.getConfiguration(), keyspace, columnFamily);
SlicePredicate predicate = new SlicePredicate().setColumn_names(cn);
ConfigHelper.setInputSlicePredicate(job.getConfiguration(), predicate);
FileSystem.get(job.getConfiguration()).delete(new Path("ReadOutput"), true);
FileOutputFormat.setOutputPath(job, new Path("ReadOutput"));
job.waitForCompletion(true);
我得到的例外。。。
8020/home/ubuntu/hdfstmp/mapred/staging/ubuntu/.staging/job_201405080944_0010
java.lang.RuntimeException: org.apache.thrift.TApplicationException: Invalid method name: 'describe_local_ring'
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:337)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:125)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
at com.cassandra.readActivity.ReadActivityDriver.run(ReadActivityDriver.java:117)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at com.cassandra.readActivity.ReadActivityDriver.main(ReadActivityDriver.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: org.apache.thrift.TApplicationException: Invalid method name: 'describe_local_ring'
at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_local_ring(Cassandra.java:1277)
at org.apache.cassandra.thrift.Cassandra$Client.describe_local_ring(Cassandra.java:1264)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:329)
... 20 more
java.io.FileNotFoundException: File does not exist: /user/ubuntu/ReadOutput/part-r-00000;
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchLocatedBlocks(DFSClient.java:2006)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1975)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1967)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:735)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:165)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
at com.cassandra.readActivity.ReadActivityMySql.calculatePoint(ReadActivityMySql.java:65)
at com.cassandra.readActivity.ReadActivityDriver.main(ReadActivityDriver.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
java.io.FileNotFoundException: File does not exist: /user/ubuntu/ReadOutput/part-r-00000;
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchLocatedBlocks(DFSClient.java:2006)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1975)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1967)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:735)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:165)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
at com.cassandra.readActivity.MySqlSavePoint.setSavePoint(MySqlSavePoint.java:66)
at com.cassandra.readActivity.ReadActivityDriver.main(ReadActivityDriver.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
3条答案
按热度按时间2g32fytz1#
看起来输入/输出格式jar和集群使用的是不同的cassandra版本。您需要修复jar,或者升级aws cassandra节点。
lnxxn5zx2#
我想你的cassandra分区有问题,试试随机分区
sauutmhj3#
最后我得到了答案。。
使用随机分区
而不是穆默