I am trying to use Spark to load data from HDFS into Hive. Below is my Spark code:
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class MigrationdatainoHive1 {

    public static void main(String[] args) {
        SparkSession spark = SparkSession
                .builder()
                .appName("Read csv File to DataSet")
                .config("spark.driver.bindAddress", "0.0.0.0")
                .config("spark.master", "local")
                .enableHiveSupport()
                .getOrCreate();
        spark.sparkContext().setLogLevel("ERROR");

        String files = "D:\\zomato-india-data-set\\Agra\\1-Agrahotels.csv";
        Dataset<Row> df = spark.read().format("csv")
                .option("header", "true")
                .option("multiLine", true)
                .option("sep", "|")
                .option("quote", "\"")
                .option("dateFormat", "M/d/y")
                .option("inferSchema", true)
                .csv(files);

        df.write().mode(SaveMode.Append).saveAsTable("hadoop.Sample");
        df.printSchema();
    }
}
When I add `.enableHiveSupport()` to the Spark configuration, I get the following error:
**Exception in thread "main" org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: D:/tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-;**
I also ran the command `c:\hadoop\hadoop-2.7.1\bin>winutils.exe chmod 777 c:\tmp\hive`, but I still face the same problem afterwards. Please help me resolve this issue.

If possible, please also share some sample code for loading data from HDFS into Hive using Spark.
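For reference, a minimal sketch of what such a load could look like, assuming a working Hive-enabled session; the HDFS path, database, and table names here are hypothetical placeholders, not taken from the setup above:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class HdfsCsvToHiveSketch {
    public static void main(String[] args) {
        // Hive support requires enableHiveSupport() on the session builder.
        SparkSession spark = SparkSession.builder()
                .appName("HDFS CSV to Hive")
                .enableHiveSupport()
                .getOrCreate();

        // Read the CSV directly from HDFS (hypothetical namenode and path).
        Dataset<Row> df = spark.read()
                .option("header", "true")
                .option("inferSchema", "true")
                .csv("hdfs://namenode:8020/data/hotels.csv");

        // Create the target database if needed, then append into the Hive table.
        spark.sql("CREATE DATABASE IF NOT EXISTS hadoop");
        df.write().mode(SaveMode.Append).saveAsTable("hadoop.sample");

        spark.stop();
    }
}
```

This sketch requires a Spark installation with Hive support on the classpath, so it will not compile or run standalone.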