When loading data from JDBC (Oracle) into Spark, there appears to be precision loss in a decimal field. As far as I know, Spark supports DECIMAL(38,18). The field in Oracle is DECIMAL(38,14), but Spark rounds off the last four digits, reducing the precision to DECIMAL(38,10). This happens to only one field in the DataFrame; another field in the same query is populated with the correct schema.
# The Oracle JDBC driver jar must be on the Spark classpath (e.g. via --jars)
jdbcDF = spark.read \
    .format("jdbc") \
    .option("url", "ORACLE") \
    .option("dbtable", "QUERY") \
    .option("user", "USERNAME") \
    .option("password", "PASSWORD") \
    .option("driver", "oracle.jdbc.OracleDriver") \
    .load()

jdbcDF.write.mode("overwrite") \
    .parquet("/path")
Error stack trace:
WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 10...***, executor 0): java.lang.IllegalArgumentException: requirement failed: Decimal precision 39 exceeds max precision 38
    at scala.Predef$.require(Predef.scala:224)
    at org.apache.spark.sql.types.Decimal.set(Decimal.scala:113)
    at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:426)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3$$anonfun$9.apply(JdbcUtils.scala:337)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3$$anonfun$9.apply(JdbcUtils.scala:337)
    at
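A possible workaround, not verified against this exact setup, is Spark's customSchema JDBC read option (available since Spark 2.3), which overrides the decimal type Spark infers from the Oracle metadata instead of letting it pick a narrower scale. The column name AMOUNT below is a hypothetical placeholder for the affected field:

# Sketch: force the problematic column to the expected decimal type on read.
# "AMOUNT" is a hypothetical column name; replace it with the actual field.
jdbcDF = spark.read \
    .format("jdbc") \
    .option("url", "ORACLE") \
    .option("dbtable", "QUERY") \
    .option("user", "USERNAME") \
    .option("password", "PASSWORD") \
    .option("driver", "oracle.jdbc.OracleDriver") \
    .option("customSchema", "AMOUNT DECIMAL(38,14)") \
    .load()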