FileNotFoundException

np8igboo  posted on 2021-06-02  in Hadoop

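I'm having trouble running a Hadoop job: I get a FileNotFoundException when trying to retrieve a file from the distributed cache, even though the file exists. When I run the job against the local file system, it works.
The cluster is hosted on Amazon Web Services and runs Hadoop 1.0.4 with Java 1.7. I have no control over the cluster or over how it is set up.
In the main function I add the file to the distributed cache. This seems to work; at least it doesn't throw any exceptions.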

....
JobConf conf = new JobConf(Driver.class);
conf.setJobName("mean");
conf.set("lookupfile", args[2]);
Job job = new Job(conf);
DistributedCache.addCacheFile(new Path(args[2]).toUri(), conf);
...

In the setup function, which is called before map, I create a Path for the file and call a function that loads the file into a hash map.

Configuration conf = context.getConfiguration();
String inputPath = conf.get("lookupfile");                          
Path dataFile = new Path(inputPath);
loadHashMap(dataFile, context);

The exception occurs on the first line of the function that loads the hash map.

brReader = new BufferedReader(new FileReader(filePath.toString()));

This is how I start the job.

hadoop jar Driver.jar Driver /tmp/input output /tmp/DATA.csv

I get the following error:

Error: Found class org.apache.hadoop.mapreduce.Counter, but interface was expected
attempt_201410300715_0018_m_000000_0: java.io.FileNotFoundException: /tmp/DATA.csv (No such file or directory)
attempt_201410300715_0018_m_000000_0:   at java.io.FileInputStream.open(Native Method)
attempt_201410300715_0018_m_000000_0:   at java.io.FileInputStream.<init>(FileInputStream.java:146)
attempt_201410300715_0018_m_000000_0:   at java.io.FileInputStream.<init>(FileInputStream.java:101)
attempt_201410300715_0018_m_000000_0:   at java.io.FileReader.<init>(FileReader.java:58)
attempt_201410300715_0018_m_000000_0:   at Map.loadHashMap(Map.java:49)
attempt_201410300715_0018_m_000000_0:   at Map.setup(Map.java:98)
attempt_201410300715_0018_m_000000_0:   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
attempt_201410300715_0018_m_000000_0:   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
attempt_201410300715_0018_m_000000_0:   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
attempt_201410300715_0018_m_000000_0:   at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
attempt_201410300715_0018_m_000000_0:   at java.security.AccessController.doPrivileged(Native Method)
attempt_201410300715_0018_m_000000_0:   at javax.security.auth.Subject.doAs(Subject.java:415)
attempt_201410300715_0018_m_000000_0:   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1140)
attempt_201410300715_0018_m_000000_0:   at org.apache.hadoop.mapred.Child.main(Child.java:253)
14/11/01 02:12:49 INFO mapred.JobClient: Task Id : attempt_201410300715_0018_m_000001_0, Status : FAILED

I have verified that the file exists both in HDFS and on the local file system.

hadoop@hostname:~$ hadoop fs -ls /tmp
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2014-10-30 11:19 /tmp/input
-rw-r--r--   1 hadoop supergroup     428796 2014-10-30 11:19 /tmp/DATA.csv

hadoop@hostname:~$ ls -al /tmp/
-rw-r--r--  1 hadoop hadoop 428796 Oct 30 11:30 DATA.csv

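I really can't see what is going wrong here. The exception lists the correct path for the file, and I have verified that the file exists both in HDFS and on the local file system. Is there something I'm missing?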

5fjcxozz  #1

I ran into the same problem, and the following code worked for me:

Configuration conf = context.getConfiguration();
// URIs of the files that were registered in the distributed cache
URI[] uriList = DistributedCache.getCacheFiles(conf);
BufferedReader br = new BufferedReader(new FileReader(uriList[0].getPath()));

As you can see, I use the getCacheFiles method here, then take the file's path and read the file.
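
One thing worth keeping in mind with this approach: `URI.getPath()` returns only the path component of the URI, so `FileReader` still opens that path on the local file system of whichever node runs the task. A small Hadoop-free sketch of that behavior follows; the class name `CacheUriDemo`, the helper `pathOnly`, and the `namenode:9000` authority are made up for illustration and do not come from the original post.

```java
import java.net.URI;

public class CacheUriDemo {
    // Returns only the path component of a URI, dropping any scheme and authority.
    static String pathOnly(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        // "namenode:9000" is a hypothetical authority used only for this example.
        System.out.println(pathOnly("hdfs://namenode:9000/tmp/DATA.csv")); // prints /tmp/DATA.csv
        System.out.println(pathOnly("/tmp/DATA.csv"));                     // prints /tmp/DATA.csv
    }
}
```

Both calls yield the same string, so whether the read succeeds depends on that path existing locally on the task node.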

af7jpaap  #2

The input to the BufferedReader should come from the paths returned by DistributedCache.getLocalCacheFiles() in setup(). Something like:

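// Local paths of the cached files on the task node; null if none were registered
Path[] localFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
if (localFiles != null && localFiles.length > 0) {
   brReader = new BufferedReader(new FileReader(localFiles[0].toString()));
}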
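
To round out the snippet above, here is a minimal sketch of what the `loadHashMap` step might look like once it is handed a local path. The original post shows neither the file format nor the body of that function, so the two-column `key,value` layout, the class name `LookupLoader`, and the use of `java.nio.file.Path` (rather than Hadoop's `Path`, so the example runs without Hadoop on the classpath) are all assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class LookupLoader {
    // Reads "key,value" lines from a local file into a map.
    // Lines without a comma are skipped (an assumption about the data).
    static Map<String, String> loadHashMap(Path localFile) throws IOException {
        Map<String, String> lookup = new HashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(localFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",", 2);
                if (parts.length == 2) {
                    lookup.put(parts[0].trim(), parts[1].trim());
                }
            }
        }
        return lookup;
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary file so the sketch can be run as-is.
        Path tmp = Files.createTempFile("DATA", ".csv");
        Files.write(tmp, Arrays.asList("a,1", "b,2", "malformed"), StandardCharsets.UTF_8);
        System.out.println(loadHashMap(tmp).get("b")); // prints 2
        Files.delete(tmp);
    }
}
```

In the mapper, the string from `localFiles[0].toString()` could be turned into a `java.nio.file.Path` with `java.nio.file.Paths.get(...)` before being passed in; try-with-resources also closes the reader even when a read fails.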
