BufferedReader and BufferedWriter for reading and writing HDFS files

Posted by 67up9zun on 2021-06-03 in Hadoop


I am trying to read an HDFS file line by line, and then create an HDFS file and write to it line by line. The code I use looks like this:

Path FileToRead = new Path(inputPath);
FileSystem hdfs = FileToRead.getFileSystem(new Configuration());
FSDataInputStream fis = hdfs.open(FileToRead);
BufferedReader reader = new BufferedReader(new InputStreamReader(fis));

String line;
line = reader.readLine();
while (line != null) {

    String[] lineElem = line.split(",");
    for (int i = 0; i < 10; i++) {

        MyMatrix[i][Integer.valueOf(lineElem[0]) - 1] = Double.valueOf(lineElem[i + 1]);
    }

    line = reader.readLine();
}

reader.close();
fis.close();

Path FileToWrite = new Path(outputPath + "/V");
FileSystem fs = FileSystem.get(new Configuration());
FSDataOutputStream fileOut = fs.create(FileToWrite);
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(fileOut));
writer.write("check");
writer.close();
fileOut.close();

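When I run this code, the file V is not created in outputPath. However, if I swap the part that reads the file with the part that writes the file, the file is created and "check" is written into it. Can anyone help me understand how to use them correctly, so that I can first read the whole file and then write to a file line by line?
I also tried other code that reads from one file and writes to another; the file is created, but nothing is written to it!
I run it with something like this: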

hadoop jar main.jar program2.Main input output

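Then, in my first job, I use the MapReduce classes to read from args[0] and write to a file in args[1]+"/newv", and that works fine. In my other (non-MapReduce) class, I use args[1]+"/newv" as the input path and output+"/v_0" as the output path (I pass these strings to the constructor). Here is the code of that class: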

public class Init_V {

    String inputPath, outputPath;

    public Init_V(String inputPath, String outputPath) throws Exception {

        this.inputPath = inputPath;
        this.outputPath = outputPath;

        try {

            FileSystem fs = FileSystem.get(new Configuration());
            Path FileToWrite = new Path(outputPath + "/V.txt");
            Path FileToRead = new Path(inputPath);
            BufferedWriter output = new BufferedWriter(
                    new OutputStreamWriter(fs.create(FileToWrite, true)));

            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(fs.open(FileToRead)));
            String data;
            data = reader.readLine();
            while (data != null) {
                // readLine() strips the line terminator and none is written back here
                output.write(data);
                data = reader.readLine();
            }
            reader.close();
            output.close();
        } catch (Exception e) {
            // empty catch: any exception thrown above is silently swallowed
        }

    }

}

hadoop hdfs

Source: https://stackoverflow.com/questions/16510672/bufferreader-and-bufferwriter-for-reading-and-writing-hdfs-files

1 Answer


wqlqzqxt1#

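I think you need to understand how Hadoop normally works. In Hadoop, many things are done by the system: you only give the input and output paths, and if the paths are valid, Hadoop opens and creates them. Check the following example: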

public int run (String[] args) throws Exception{

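    if(args.length != 2){
        System.err.println("Usage: MapReduce <input path> <output path> ");
        ToolRunner.printGenericCommandUsage(System.err);
        return -1;  // stop here instead of failing later on a missing argument
    }
    // new Job() is deprecated; getConf() comes from the Tool/Configurable interface
    Job job = Job.getInstance(getConf());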
    job.setJarByClass(MyClass.class);
    job.setNumReduceTasks(5);
    job.setJobName("myclass");
    FileInputFormat.addInputPath(job, new Path(args[0]) );
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    job.setMapperClass(MyMapper.class);
    job.setReducerClass(MyReducer.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    return job.waitForCompletion(true) ? 0:1 ;
}

/* ----------------------main---------------------*/
public static void main(String[] args) throws Exception{    

    int exitCode = ToolRunner.run(new MyClass(), args);
    System.exit(exitCode);
}

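As you can see here, you only need to initialize the necessary variables; the reading and writing are done by Hadoop.
Also, in the Mapper class you call context.write(key, value) inside map, and likewise in the Reducer class you do the same thing, and it writes for you.
If you use BufferedWriter/BufferedReader, it will write to the local file system, not to HDFS. To see the files in HDFS you should run hadoop fs -ls <path> ; the files you are looking for with the ls command are in the local file system.
Edit: in order to use read/write you should know the following. Suppose you have a machine in your Hadoop network. When you want to read, you do not know which mapper is doing the reading, and the same goes for writing. So the paths must be valid for all mappers and reducers, so that none of them throws an exception.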
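I don't know whether you can use any other class, but for your specific purpose you can use two methods: setup and cleanup. These methods are called only once per map or reduce task. So if you want to read and write files, you can do it there. Reading and writing work the same as in normal Java code. For example, say you want to look at something for each key and write it to a txt file. You can do the following:

// in reducer
BufferedWriter bw ..;

// called once per task, before the first reduce() call
void setup(...){
     bw  = new ....;
}

void reduce(...){
    while(iter.hasNext()){ ....;
    }
    bw.write(key, ...);
}

// called once per task, after the last reduce() call
void cleanup(...){
    bw.close();
}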
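
Since reading and writing are described above as the same as in normal Java code, the line-by-line copy from the question can be sketched with java.io alone. This is only a minimal sketch under assumptions: the class name LineCopy and the method copyLines are hypothetical, and the in-memory streams in main merely stand in for the streams that fs.open(...) and fs.create(...) return in the question. Which file system those Hadoop calls actually hit depends on the Configuration in use, which this sketch does not address.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop or of the original post.
final class LineCopy {

    /**
     * Copies text from in to out one line at a time and returns the number
     * of lines written. Both streams are closed when the method returns.
     */
    static long copyLines(InputStream in, OutputStream out) throws IOException {
        long count = 0;
        // try-with-resources closes the writer (flushing it) and the reader,
        // even if an exception is thrown in the loop.
        try (BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8));
             BufferedWriter writer = new BufferedWriter(
                     new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                // readLine() drops the terminator, so write one back explicitly.
                writer.write('\n');
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for the input and output streams.
        InputStream in = new ByteArrayInputStream(
                "1,0.5\n2,0.7\n".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        long lines = copyLines(in, out);
        System.out.println(lines + " lines copied");
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Two choices here differ from the question's code: the line terminator is written back after every line, and exceptions propagate to the caller instead of being caught and ignored, so a failed read or write is visible.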
