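I have been trying to clean up a JSON object using Scala, but I am unable to remove the extra quotes from a JSON value, for example "LAST_NM":"SMITH "LIBBY" MARY".
The extra quotes inside the string are causing the problem.
This is the code I am using to clean the JSON file: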
val readjson = sparkSession.sparkContext.textFile("dev.json")
val json=readjson.map(element=>element.replace("\"\":\"\"","\":\"")
.replace("\"\",\"\"","\",\"")
.replace("\"\":","\":")
.replace(",\"\"",",\"")
.replace("\"{\"\"","{\"")
.replace("\"\"}\"","\"}")
.replaceAll("\\u0009"," "))
.saveAsTextFile("JSON")
Below is the JSON string I am trying to clean (whitespace added for readability):
{
"SEQ_NO":597216,
"PROV_DEMOG_SK":597216,
"PROV_ID":"QMP000003371283",
"FRST_NM":"",
"LAST_NM":"SMITH "LIBBY" MARY",
"FUL_NM":"",
"GENDR_CD":"",
"PROV_NPI":"",
"PROV_STAT":"Incomplete",
"PROV_TY":"03",
"DT_OF_BRTH":"",
"PROFPROFL_DESGTN":"",
"ETL_LAST_UPDT_DT_TM":"2020-04-28 11:43:31.000000",
"PROV_CLSFTN_CD":"A",
"SRC_DATA_KEY":50,
"OPRN_CD":"I",
"REC_SET":"F"
}
What should I add to my code to remove the extra quotes ("") from the LAST_NM value of the JSON string?
1 Answer
lg40wkob #1
Check the following code:
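As a minimal sketch of one possible approach (not necessarily the code this answer originally contained), the stray quotes can be escaped with a regular expression instead of being removed. It assumes each record is a single line with no whitespace around the structural characters, and that a stray quote never sits directly next to `{`, `}`, `,` or `:`. The names `EscapeInnerQuotes`, `strayQuote` and `escapeInnerQuotes` are made up for illustration.

```scala
import java.util.regex.Matcher

object EscapeInnerQuotes {
  // A quote is treated as structural when it directly follows '{', ',' or ':'
  // or directly precedes ':', ',' or '}'. Any other quote is assumed to be a
  // stray quote inside a string value (this is a heuristic, not a JSON parser).
  private val strayQuote = """(?<![{,:])"(?![:,}])""".r

  // Replace every stray quote with an escaped quote (backslash + quote).
  def escapeInnerQuotes(line: String): String =
    strayQuote.replaceAllIn(line, Matcher.quoteReplacement("\\\""))

  def main(args: Array[String]): Unit = {
    // Shortened version of the record from the question, on a single line.
    val raw =
      """{"PROV_ID":"QMP000003371283","FRST_NM":"","LAST_NM":"SMITH "LIBBY" MARY","SRC_DATA_KEY":50,"REC_SET":"F"}"""
    println(escapeInnerQuotes(raw))
  }
}
```

With the question's RDD this could be applied as `readjson.map(EscapeInnerQuotes.escapeInnerQuotes)` after the existing `replace` calls. A value such as `"A,"B""`, where a stray quote touches a comma or colon, would not be handled by this heuristic.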