Why do I need the Hadoop lib jars in HDFS?

djmepvbi · posted 2021-05-31 in Hadoop

I created a MapReduce job and am testing it in a multi-cluster environment, but I get the following error:

Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://bigcluster:9000/opt/hadoop/share/hadoop/common/hadoop-common-2.6.0.jar
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:269)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:390)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:483)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
    at com.company.hbase.mapreduce.message.maestro.threadIndex.fakecolum.MockTestThreadIndexData.run(MockTestThreadIndexData.java:47)
    at com.company.hbase.mapreduce.MaestroUpdateJob.main(MaestroUpdateJob.java:49)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

The job complains that hadoop-common-2.6.0.jar is missing under hdfs://bigcluster:9000/opt/hadoop/share/hadoop/common. The jar does exist at /opt/hadoop/share/hadoop/common on the local filesystem, but my job goes looking for it inside HDFS. If I copy all the jars (and there are a lot of them) into HDFS, the job works. But I want to understand: is this really necessary? Can someone explain why? And if I want to run this in production, do I really have to do this?
I have also seen the answers to "Why is it needed to keep hbase/lib folder in hdfs?". And yes, if I switch the MapReduce framework to YARN, it works too (see the sketch below). But I don't want to run on YARN; I just want to understand why I have to move all the Hadoop libs into HDFS to run a MapReduce job.
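
For reference, the YARN switch mentioned above is a single property. A minimal sketch, assuming it is set programmatically on the job's `Configuration` (the same effect as putting `mapreduce.framework.name` in mapred-site.xml):

```java
import org.apache.hadoop.conf.Configuration;

// Minimal sketch: submit the job through the YARN framework instead of
// the classic JobTracker runtime; equivalent to a mapred-site.xml entry.
Configuration config = new Configuration();
config.set("mapreduce.framework.name", "yarn");
```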

Update

Here is how the Job is instantiated:

    Job job = Job.getInstance(config, "MyJob");

    Scan scan = createScan();

    // createMyFilter() returns the FilterList applied to the scan
    FilterList filters = createMyFilter();
    scan.setFilter(filters);

    TableMapReduceUtil.initTableMapperJob(
        MY_TABLE,        // source HBase table
        scan,            // Scan that bounds what the mappers read
        MyMapper.class,
        null,            // mapper output key class (unused: map-only job)
        null,            // mapper output value class (unused)
        job
    );

    TableMapReduceUtil.initTableReducerJob(
        MY_TABLE,        // target table for the writes
        null,            // no reducer class
        job
    );
    job.setNumReduceTasks(0);   // map-only: mappers write straight to HBase
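
A note on where the HDFS lookups originate (my reading of the stack trace above, not something stated in the question): `initTableMapperJob` by default also calls `TableMapReduceUtil.addDependencyJars(job)`, which lists the HBase jars and their Hadoop dependencies on the job's `tmpjars` classpath; at submit time `ClientDistributedCacheManager` stats each entry, and paths without a scheme are resolved against the default filesystem. A hedged sketch of the overload that turns this off; only viable if every node already has the jars on its local classpath:

```java
// Hedged sketch: the trailing boolean controls whether HBase adds its
// jars (and their Hadoop dependencies) to the job's tmpjars, i.e.
// whether they are staged through the distributed cache at all.
TableMapReduceUtil.initTableMapperJob(
    MY_TABLE,
    scan,
    MyMapper.class,
    null,            // mapper output key class
    null,            // mapper output value class
    job,
    false            // addDependencyJars: do not stage jars via tmpjars
);
```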

Here is my mapred-site.xml:

```xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>myhost:9001</value>
  </property>
  <property>
    <name>hadoop.ssl.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hadoop.ssl.require.client.cert</name>
    <value>false</value>
    <final>true</final>
  </property>
  <property>
    <name>hadoop.ssl.hostname.verifier</name>
    <value>DEFAULT</value>
    <final>true</final>
  </property>
  <property>
    <name>hadoop.ssl.keystores.factory.class</name>
    <value>org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory</value>
    <final>true</final>
  </property>
  <property>
    <name>hadoop.ssl.server.conf</name>
    <value>ssl-server.xml</value>
    <final>true</final>
  </property>
  <property>
    <name>hadoop.ssl.client.conf</name>
    <value>ssl-client.xml</value>
    <final>true</final>
  </property>
</configuration>
```

And here is how I run it:

HADOOP_CLASSPATH=$(/opt/hbase/bin/hbase classpath) /opt/hadoop/bin/hadoop jar /tmp/mymapred-1.0-SNAPSHOT-jar-with-dependencies.jar


## Solution

In the end, I got the answer from this comment: https://stackoverflow.com/a/31950822/13305602
In core-site.xml there are two properties that configure the default filesystem in Hadoop: fs.defaultFS and its deprecated predecessor fs.default.name. When the default filesystem points at HDFS, the job client resolves any classpath entry that carries no explicit scheme against HDFS, which is why the submitter went looking for the jars under /opt/hadoop/share/hadoop/common inside hdfs://bigcluster:9000.
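
To make the mechanism concrete, here is a hedged sketch (the host, port, and jar path are taken from the error above; the class itself is only illustrative). With the default filesystem pointing at HDFS, a path without an explicit scheme resolves to an HDFS location, reproducing the same getFileStatus failure as in the stack trace:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DefaultFsDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // fs.defaultFS (formerly fs.default.name) names the filesystem used
        // for paths with no explicit scheme such as hdfs:// or file://
        conf.set("fs.defaultFS", "hdfs://bigcluster:9000");

        // A schemeless path, like the tmpjars entries the job submitter stats:
        Path jar = new Path("/opt/hadoop/share/hadoop/common/hadoop-common-2.6.0.jar");

        // Resolves to hdfs://bigcluster:9000/opt/..., so this throws the same
        // FileNotFoundException as above unless the jar was copied into HDFS.
        FileSystem fs = jar.getFileSystem(conf);
        System.out.println(fs.getFileStatus(jar));
    }
}
```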
