spark从另一个表更新delta中的多列

fslejnso  于 2021-05-27  发布在  Spark
关注(0)|答案(1)|浏览(454)

我正在尝试根据从另一个增量表获取的值更新一个增量表中的多个列。下面的更新sql在oracle中工作,但在spark delta中不工作,您能帮忙吗?

deptDf = sqlContext.createDataFrame(
    [(10, "IT", "Seattle"), (20, "Accounting", "Renton"), (30, "Finance", "Bellevue"), (40, "Manufacturing", "Tacoma"), (50, "Inventory", "Bothell")],
    ("dno", "dname", "location"))

updateddeptlocDf = sqlContext.createDataFrame(
    [(20, "Accounting and Finance", "SODO"), (10, "Technology", "SODO")], ("dno", "updated_name", "updated_location"))

deptDf.write.format("delta").mode("Overwrite").save("/mnt/delta/dept")
updateddeptlocDf.write.mode("Overwrite").format("delta").save("/mnt/delta/updatedDept")
spark.sql("DROP TABLE IF EXISTS deptdelta")
spark.sql("DROP TABLE IF EXISTS updated_dept_location")
spark.sql("CREATE TABLE deptdelta USING DELTA LOCATION '/mnt/delta/dept'")
spark.sql("CREATE TABLE updated_dept_location USING DELTA LOCATION '/mnt/delta/updatedDept'")

我要发布的update语句失败了:

UPDATE deptdelta d
SET (d.dname, d.location) = (SELECT ud.updated_name, ud.updated_location FROM updated_dept_location u WHERE d.dno = u.dno )
WHERE EXISTS (SELECT 'a' from updated_dept_location u1 WHERE d.dno = u1.dno )

错误:
sql语句中的错误:parseexception:输入不匹配','应为eq(第2行,位置11)
==sql==update deptdelta d set d.dname,d.location=(选择ud.updated\u name,ud.updated\u location from updated\u dept\u location u where d.dno=u.dno)----^^^ where exists(从updated\u dept\u location u1 where d.dno=u1.dno选择“a”)

oyxsuwqo

oyxsuwqo1#

合并成功了

MERGE INTO deptdelta AS maindept
USING updated_dept_location AS upddept
ON upddept.dno = maindept.dno
WHEN MATCHED THEN UPDATE SET maindept.dname = upddept.updated_name, maindept.location = upddept.updated_location

相关问题