I have a requirement where I need to combine maps with different value types into a single map.
Sample data:
+----------+--------------------+-------+----------+-----------+---------------+---------------+-------------+----+----+
|ADDRESS_ID|              CAS_ID|VERSION|RG_ADDRESS|IS_VERIFIED|GEOFENCE_RADIUS|ADDRESS_RECENCY|ADDRESS_SCORE| LAT| LNG|
+----------+--------------------+-------+----------+-----------+---------------+---------------+-------------+----+----+
|  75199688|           tmdsfggds|      6|          |      false|           1000|              1|           85|null|null|
+----------+--------------------+-------+----------+-----------+---------------+---------------+-------------+----+----+
I tried the following, but it does not work because of a type mismatch:
from pyspark.sql import functions as F
from pyspark.sql.types import (MapType, StringType, IntegerType,
                               DoubleType, BooleanType)

column_list_string = ['ADDRESS_ID', 'CAS_ID', 'RG_ADDRESS']
column_list_int = ['VERSION', 'GEOFENCE_RADIUS', 'ADDRESS_RECENCY', 'ADDRESS_SCORE']
column_list_double = ['LAT', 'LNG']
column_list_bool = ['IS_VERIFIED']

def convert(s):
    # s is the Row built by F.struct(...); map each field name to its value
    return s.asDict()

# One UDF per value type, because a MapType has a single value type
convert_udf_str = F.udf(convert, MapType(StringType(), StringType()))
convert_udf_int = F.udf(convert, MapType(StringType(), IntegerType()))
convert_udf_double = F.udf(convert, MapType(StringType(), DoubleType()))
convert_udf_bool = F.udf(convert, MapType(StringType(), BooleanType()))

dfs = dfs.withColumn('value_dict_string', convert_udf_str(F.struct(column_list_string)))
dfs = dfs.withColumn('value_dict_int', convert_udf_int(F.struct(column_list_int)))
dfs = dfs.withColumn('value_dict_double', convert_udf_double(F.struct(column_list_double)))
dfs = dfs.withColumn('value_dict_booleean', convert_udf_bool(F.struct(column_list_bool)))

# This is the call that fails: the four inputs have four different map types
dfs = dfs.withColumn('value_dict', F.map_concat(dfs['value_dict_string'], dfs['value_dict_int'], dfs['value_dict_double'], dfs['value_dict_booleean']))
The error I see is:
Py4JJavaError: An error occurred while calling o1179.withColumn.
: org.apache.spark.sql.AnalysisException: cannot resolve 'map_concat(`value_dict_string`, `value_dict_int`, `value_dict_double`, `value_dict_booleean`)' due to data type mismatch: input to function map_concat should all be the same type, but it's [map<string,string>, map<string,int>, map<string,double>, map<string,boolean>];;
'Project [ADDRESS_ID#994, CAS_ID#995, VERSION#996, ... value_dict_string#1423, value_dict_int#1442, value_dict_double#1461, value_dict_booleean#1480 ...]
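The error message says that `map_concat` only accepts maps of one identical type, and a Spark map column has a single value type. One possible direction, sketched here in plain Python under the assumption that string values are acceptable downstream, is to convert every value to a string before building one map. The helper name `to_string_map` and the lowercase rendering of booleans are illustrative choices, not something from the original post.

```python
from typing import Any, Dict, Optional


def to_string_map(row: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Convert every value to a string so the result fits map<string,string>."""
    out: Dict[str, Optional[str]] = {}
    for key, value in row.items():
        if value is None:
            out[key] = None  # keep nulls as nulls instead of the text "None"
        elif isinstance(value, bool):
            out[key] = str(value).lower()  # assumption: render True/False as "true"/"false"
        else:
            out[key] = str(value)
    return out


# The sample row from the question, written as a plain dict
sample = {
    "ADDRESS_ID": 75199688, "CAS_ID": "tmdsfggds", "VERSION": 6,
    "RG_ADDRESS": "", "IS_VERIFIED": False, "GEOFENCE_RADIUS": 1000,
    "ADDRESS_RECENCY": 1, "ADDRESS_SCORE": 85, "LAT": None, "LNG": None,
}
result = to_string_map(sample)
```

In PySpark, a function like this could presumably be wrapped as `F.udf(lambda s: to_string_map(s.asDict()), MapType(StringType(), StringType()))` and applied to one `F.struct` over all ten columns, which would avoid `map_concat` entirely. That wiring has not been run against the data above, and the original numeric and boolean types are lost in the resulting map.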