How do I copy a file to Hadoop HDFS using Scala?

s71maibg, posted on 2021-06-03 in Hadoop

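I am trying to copy a file from my local machine to HDFS. However, I am not sure how to do this in Scala, because the script I am writing currently writes to a local CSV file. How can I move this file to HDFS using Scala?
Edit: here is what I have so far: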

val hiveServer = new HiveJDBC
val file = new File(TMP_DIR, fileName)
val firstRow = getFirstRow(tableName, hiveServer)
val restData = getRestData(tableName, hiveServer)
withPrintWriter(file) { printWriter =>
  printWriter.write(firstRow)
  printWriter.write("\n")
  printWriter.write(restData)
}

I now want to store "file" in HDFS.

Answer #1 (qxsslcnc)

Scala can call the Hadoop API directly, so you can write the data straight to HDFS instead of to a local file. For example:

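import java.io.PrintWriter
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Reads core-site.xml from the classpath, if present
val conf = new Configuration()
val fs = FileSystem.get(conf)
// create() opens an output stream to a new file at the given HDFS path
val output = fs.create(new Path("/your/path"))
val writer = new PrintWriter(output)
try {
  writer.write(firstRow)
  writer.write("\n")
  writer.write(restData)
} finally {
  writer.close() // also closes the underlying HDFS stream
}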
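
If the CSV already exists on local disk, as in the question, another option may be to copy it as-is rather than rewrite its contents. The following is a minimal sketch using `FileSystem.copyFromLocalFile`. The object name `HdfsCopy`, the helper `copyToHdfs`, and both example paths are hypothetical, and the configuration is assumed to pick up `fs.defaultFS` from a `core-site.xml` on the classpath.

```scala
import java.io.File

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HdfsCopy {
  // Hypothetical helper: copies localFile into destDir on the given
  // file system and returns the destination path.
  def copyToHdfs(fs: FileSystem, localFile: File, destDir: String): Path = {
    val src = new Path(localFile.toURI)
    val dst = new Path(destDir, localFile.getName)
    // delSrc = false keeps the local file; overwrite = true replaces an existing target
    fs.copyFromLocalFile(false, true, src, dst)
    dst
  }

  def main(args: Array[String]): Unit = {
    // Assumes fs.defaultFS points at the cluster; otherwise this is the local file system
    val fs = FileSystem.get(new Configuration())
    // Both paths are placeholders
    val dst = copyToHdfs(fs, new File("/tmp/export.csv"), "/your/path")
    println(s"Copied to $dst")
  }
}
```

Passing the `FileSystem` in as a parameter keeps the helper independent of how the configuration is obtained, so the same function can be exercised against the local file system.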

Answer #2 (tcomlyy6)

Add the following code to your run method.

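import java.io.{File, FileInputStream, FileNotFoundException, IOException, InputStream}
import org.apache.hadoop.fs.{FSDataOutputStream, FileSystem, Path}
import org.apache.hadoop.io.IOUtils

val conf = getConf() // configuration supplied by the enclosing class
val hdfs = FileSystem.get(conf)
val localInputFilePath = args(0) // args: the argument array passed to run
val inputFileName = new File(localInputFilePath).getName

val hdfsDestinationPath = args(1) // HDFS destination directory
// Path(parent, child) joins with "/", which HDFS expects on every platform
val hdfsDestFilePath = new Path(hdfsDestinationPath, inputFileName)

try {
  val inputStream: InputStream = new FileInputStream(localInputFilePath)
  val fsdos: FSDataOutputStream = hdfs.create(hdfsDestFilePath)
  // The final `true` closes both streams once the copy finishes
  IOUtils.copyBytes(inputStream, fsdos, conf, true)
} catch {
  case fnfe: FileNotFoundException => fnfe.printStackTrace()
  case ioe: IOException            => ioe.printStackTrace()
}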
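
The call to `getConf()` suggests the snippet lives in a class that extends Hadoop's `Configured` and implements `Tool`. This is an assumption, since the answer does not show the enclosing class. Under that assumption, a minimal wrapper might look like the sketch below, where `CopyToHdfs` is a hypothetical class name and the two arguments are taken to be the local file and the HDFS target directory.

```scala
import java.io.{File, FileInputStream}

import org.apache.hadoop.conf.{Configuration, Configured}
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.io.IOUtils
import org.apache.hadoop.util.{Tool, ToolRunner}

// Hypothetical wrapper class; the original answer does not show one.
class CopyToHdfs extends Configured with Tool {
  // args(0): local file path, args(1): HDFS destination directory
  override def run(args: Array[String]): Int =
    if (args.length < 2) {
      System.err.println("Usage: CopyToHdfs <local file> <hdfs dir>")
      1
    } else {
      val hdfs = FileSystem.get(getConf)
      val dst = new Path(args(1), new File(args(0)).getName)
      val in = new FileInputStream(args(0))
      try {
        // The final `true` closes both streams once the copy finishes
        IOUtils.copyBytes(in, hdfs.create(dst), getConf, true)
      } finally in.close() // harmless if already closed; covers a failed create()
      0
    }
}

object CopyToHdfs {
  def main(args: Array[String]): Unit =
    // ToolRunner handles generic Hadoop options (such as -D and -conf) before calling run
    System.exit(ToolRunner.run(new Configuration(), new CopyToHdfs, args))
}
```

Unlike the snippet above, this sketch lets I/O exceptions propagate out of `run` instead of printing the stack trace, so a failed copy ends the program with an error rather than exit code 0.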
