Removing extra "" from JSON values when using Scala

nwsw7zdq posted on 2021-05-29 in Spark

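I have been trying to clean up a JSON object using Scala, but I cannot remove the extra quotation marks from a JSON value, for example "LAST_NM":"SMITH "LIBBY" MARY".
The extra quotation marks inside the string are causing the problem.
This is the code I am using to clean the JSON file: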

    val readjson = sparkSession.sparkContext.textFile("dev.json")
    // Collapse the doubled quotes around keys, values, and the enclosing braces
    val json = readjson.map(element => element.replace("\"\":\"\"", "\":\"")
      .replace("\"\",\"\"", "\",\"")
      .replace("\"\":", "\":")
      .replace(",\"\"", ",\"")
      .replace("\"{\"\"", "{\"")
      .replace("\"\"}\"", "\"}")
      .replaceAll("\\u0009", " "))  // replace tab characters with spaces
      .saveAsTextFile("JSON")

Here is the JSON string I am trying to clean (whitespace added for readability):

    {
      "SEQ_NO":597216,
      "PROV_DEMOG_SK":597216,
      "PROV_ID":"QMP000003371283",
      "FRST_NM":"",
      "LAST_NM":"SMITH "LIBBY" MARY",
      "FUL_NM":"",
      "GENDR_CD":"",
      "PROV_NPI":"",
      "PROV_STAT":"Incomplete",
      "PROV_TY":"03",
      "DT_OF_BRTH":"",
      "PROFPROFL_DESGTN":"",
      "ETL_LAST_UPDT_DT_TM":"2020-04-28 11:43:31.000000",
      "PROV_CLSFTN_CD":"A",
      "SRC_DATA_KEY":50,
      "OPRN_CD":"I",
      "REC_SET":"F"
    }

What should I add to my code to remove the extra "" from the LAST_NM value of the JSON string?

Answer #1 by lg40wkob

Try the following code:

  1. df.map(_.replaceAll(" \""," ").replaceAll("\" "," ")).show(false)
  2. +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  3. |value |
  4. +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  5. |{"SEQ_NO":597216,"PROV_DEMOG_SK":597216,"PROV_ID":"QMP000003371283","FRST_NM":"","LAST_NM":"SMITH LIBBY MARY","FUL_NM":"","GENDR_CD":"","PROV_NPI":"","PROV_STAT":"Incomplete","PROV_TY":"03","DT_OF_BRTH":"","PROFPROFL_DESGTN":"","ETL_LAST_UPDT_DT_TM":"2020-04-28 11:43:31.000000","PROV_CLSFTN_CD":"A","SRC_DATA_KEY":50,"OPRN_CD":"I","REC_SET":"F"}|
  6. +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
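
The two replaceAll calls above only catch a quote that sits next to a space, so a value such as "SMITH"LIBBY"MARY" would slip through. As a rough alternative sketch (not part of the original answer), one regex can target every quote that is not next to a JSON structural character. It assumes each record is compact single-line JSON with no whitespace around `:` and `,`, and that no value contains a quote directly after `{ [ , :` or directly before `: , } ]`. The names `JsonQuoteCleaner`, `removeStrayQuotes`, and `escapeStrayQuotes` are hypothetical and chosen only for illustration.

```scala
import scala.util.matching.Regex

object JsonQuoteCleaner {
  // A quote is treated as structural when it directly follows one of { [ , :
  // or directly precedes one of : , } ]. Any other quote is assumed to be a
  // stray quote inside a string value.
  private val strayQuote: Regex = """(?<![{\[,:])"(?![:,}\]])""".r

  // Drop the stray quotes entirely.
  def removeStrayQuotes(line: String): String =
    strayQuote.replaceAllIn(line, "")

  // Alternative: keep the quotes but backslash-escape them.
  def escapeStrayQuotes(line: String): String =
    strayQuote.replaceAllIn(line, Regex.quoteReplacement("\\\""))

  def main(args: Array[String]): Unit = {
    val line = """{"FRST_NM":"","LAST_NM":"SMITH "LIBBY" MARY","SRC_DATA_KEY":50}"""
    println(removeStrayQuotes(line))
    println(escapeStrayQuotes(line))
  }
}
```

With the RDD from the question, this could presumably be plugged in as `readjson.map(JsonQuoteCleaner.removeStrayQuotes)` after the existing replace chain. Escaping instead of removing keeps the nickname quotes in the data, which may or may not be what downstream consumers expect.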
