Hadoop path exception when running from inside the IDE

Posted by iqxoj9l9 on 2021-06-03 in Hadoop


Edit: IDE is Intellij IDEA
      OS: Mac OS X Lion
      Hadoop: 1.2.1

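Edit: This works if the file path exists under the current location on the local filesystem. So the question becomes how to make it work with HDFS when running from the IDE.
When I run the job from inside the IDE (IntelliJ IDEA) I get an exception, shown below.
In the program arguments I specify "input output".
Of course, "input" does exist in HDFS and contains a data file.
But the code tries to access the directory from the local project filesystem location instead of from HDFS.
HDFS command: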

James-MacBook-Pro:conf james$ hadoop fs -ls input
Found 1 items
-rw-r--r--   1 james supergroup         15 2013-11-01 07:31 /user/james/input/simple.txt

Java source code:

public class WordCount extends Configured implements Tool {
    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new WordCount(), args);
        System.exit(res);
    }
    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: hadoop jar mrjob-1.0-SNAPSHOT-job.jar"
                                       + " [generic options] <in> <out>");
            System.out.println();
            ToolRunner.printGenericCommandUsage(System.err);
            return 1;
        }
        Job job = new Job(getConf(), "WordCount");
        job.setJarByClass(getClass());
        job.setMapperClass(TokenizingMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        boolean success = job.waitForCompletion(true);
        return success ? 0 : 1;
    }
}

Configuration:
core-site.xml

<configuration>
 <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
 </property>
</configuration>

hdfs-site.xml

<configuration>
     <property>
        <name>dfs.replication</name>
        <value>1</value>
     </property>
</configuration>

mapred-site.xml

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
     </property>
</configuration>

Program arguments in the IDE:

input output

Exception:

Nov 03, 2013 9:46:00 AM org.apache.hadoop.security.UserGroupInformation doAs
SEVERE: PriviledgedActionException as:james cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/Users/james/work/projects/hadoop/mrjob/input
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/Users/james/work/projects/hadoop/mrjob/input

What am I doing wrong?

Java hadoop hdfs intellij-idea

Source: https://stackoverflow.com/questions/19736924/hadoop-path-exception-when-running-from-inside-ide

1 Answer


kzmpq1sx1#

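When running locally from an IDE such as Eclipse, I assume the Hadoop configuration files for your cluster (core-site.xml) are not on the classpath and are being shadowed by the default configuration files bundled inside the Hadoop jars.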
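To illustrate why the default filesystem matters, here is a minimal sketch using only `java.net.URI`. It is not Hadoop's own code: `PathResolutionSketch` and `qualify` are hypothetical names, and the sketch only mimics the idea that a relative path such as `input` is qualified against the default filesystem plus a working directory. The assumption is that the bundled defaults point at the local filesystem (`file:///`) with the project directory as working directory, while the cluster setting points at `hdfs://localhost:9000` with `/user/james` as the HDFS home directory.

```java
import java.net.URI;

final class PathResolutionSketch {

    // Simplified stand-in for path qualification:
    // default filesystem URI + working directory + relative name.
    static URI qualify(URI defaultFs, String workingDir, String relative) {
        String dir = workingDir.endsWith("/") ? workingDir : workingDir + "/";
        return defaultFs.resolve(dir).resolve(relative);
    }

    public static void main(String[] args) {
        // Local default filesystem: resolves under the IDE's working directory.
        System.out.println(qualify(URI.create("file:///"),
                "/Users/james/work/projects/hadoop/mrjob", "input"));

        // HDFS default filesystem: resolves under the user's HDFS home directory.
        System.out.println(qualify(URI.create("hdfs://localhost:9000"),
                "/user/james", "input"));
    }
}
```

The first result has the same shape as the path in the exception message, and the second matches the location reported by `hadoop fs -ls input`.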
You can fix this by manually setting the job configuration property "fs.default.name" in your code before you submit the job:

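// Job exposes its configuration through getConfiguration(); Java strings need double quotes.
job.getConfiguration().set("fs.default.name", "hdfs://localhost:9000");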

You will probably also want to configure the JobTracker so that the local job runner is not used:

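// "mapred.job.tracker" is the property name used in the mapred-site.xml from the question (Hadoop 1.x).
job.getConfiguration().set("mapred.job.tracker", "localhost:9001");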

Note that the hostnames, ports, and even the property names may differ for your environment or deployment.
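Because the WordCount class in the question is launched through ToolRunner, the same two properties can also be supplied as generic `-D property=value` options placed before the application arguments, without touching the job code. The following is a minimal sketch under that assumption; `ClusterArgs` and `withClusterOptions` are hypothetical helper names, and the property names and values are copied from the core-site.xml and mapred-site.xml in the question, so they may need adjusting for other Hadoop versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ClusterArgs {

    // Prepends generic options to the application arguments.
    // Generic options must come before the application's own arguments.
    static String[] withClusterOptions(String[] args) {
        List<String> all = new ArrayList<>(Arrays.asList(
                "-D", "fs.default.name=hdfs://localhost:9000",   // assumed NameNode address
                "-D", "mapred.job.tracker=localhost:9001"));     // assumed JobTracker address
        all.addAll(Arrays.asList(args));
        return all.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Prints the full argument line for the question's "input output" arguments.
        System.out.println(String.join(" ",
                withClusterOptions(new String[] {"input", "output"})));
    }
}
```

In `WordCount.main` the result would be passed as the last argument of `ToolRunner.run(...)`. The same effect should be achievable by typing the options directly into the IDE's program arguments field, ahead of `input output`.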
Alternatively, simply add the Hadoop conf folder to the classpath (and make sure it takes precedence over the Hadoop jars).
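To verify that the conf folder really ended up on the classpath of the IDE run configuration, a small diagnostic along these lines can help. It is a sketch using only the standard class loader API; `ClasspathCheck` and `locate` are hypothetical names, and the file names are the three configuration files shown in the question.

```java
import java.net.URL;

final class ClasspathCheck {

    // Returns the location of the named resource on the classpath,
    // or null if no such resource is visible to the class loader.
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathCheck.class.getClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        String[] names = {"core-site.xml", "hdfs-site.xml", "mapred-site.xml"};
        for (String name : names) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url == null ? "NOT on classpath" : url.toString()));
        }
    }
}
```

Run it with the same run configuration as the job. If core-site.xml is reported as missing, or is found somewhere other than your cluster's conf folder, the job is most likely still using the bundled defaults.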

