Creating a PySpark DataFrame with special characters on the Python command line

y53ybaqx  posted on 2021-05-27 in Spark

I have a list of Python dictionaries, as shown below:

data = [{"cust_decision": "buy", "cust_details": "Easy to use"}, {"cust_decision": "buy", "cust_details": "econoimical"}, {"cust_decision":"no buy", "cust_details": "Didn’t like Product"}]

I am creating a PySpark DataFrame and a temp view from it, as follows:

from pyspark.sql import SparkSession, Row
spark.createDataFrame([Row(**i) for i in data]).createOrReplaceTempView("cust")

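Now, when I look at the data in this temp view, the special character ’ (this is not the plain single quote as in 'it's') is changed to a different character, â. Here is the result: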

spark.table("cust").show(10,False)
+-------------+---------------------+                                           
|cust_decision|cust_details         |
+-------------+---------------------+
|buy          |Easy to use          |
|buy          |econoimical          |
|no buy       |Didn’t like Product|
+-------------+---------------------+

But I want to keep the original character in every value. How can I do that? The expected result is:

+-------------+---------------------+                                           
|cust_decision|cust_details         |
+-------------+---------------------+
|buy          |Easy to use          |
|buy          |econoimical          |
|no buy       |Didn’t like Product  |
+-------------+---------------------+

Thanks.

xdyibdwo1#

Try decoding the values in your data dictionaries as UTF-8. This applies to Python 2, where plain string literals are byte strings; in Python 3, `str` has no `decode` method and the literals are already Unicode.

```
data = [{"cust_decision": "buy", "cust_details": "Easy to use"}, {"cust_decision": "buy", "cust_details": "econoimical"}, {"cust_decision":"no buy", "cust_details": "Didn’t like Product"}]

# Decode every byte-string value to unicode (Python 2 only).
decode_data = [{k: v.decode("utf-8") for k, v in i.items()} for i in data]

from pyspark.sql import SparkSession, Row
spark.createDataFrame([Row(**i) for i in decode_data]).createOrReplaceTempView("cust")

spark.table("cust").show(10, False)

+-------------+-------------------+
|cust_decision|cust_details       |
+-------------+-------------------+
|buy          |Easy to use        |
|buy          |econoimical        |
|no buy       |Didn’t like Product|
+-------------+-------------------+
```
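
To see why decoding helps, here is a small sketch that runs without Spark. It assumes the apostrophe is U+2019 and that the garbling comes from its UTF-8 bytes being read as Windows-1252, which is the usual cause of the `’` pattern but is not confirmed for this setup. The helper name `decode_values` is made up for illustration and is not part of PySpark.

```python
# -*- coding: utf-8 -*-
# Sketch only: reproduces the garbling with plain Python, no Spark needed.

original = u"Didn\u2019t like Product"   # U+2019 is the curly apostrophe

# A UTF-8 terminal hands the apostrophe over as three bytes: E2 80 99.
raw = original.encode("utf-8")

# Assumption: something reads those bytes as Windows-1252, one byte per
# character, which turns the apostrophe into three separate characters.
garbled = raw.decode("cp1252")           # u"Didn\u00e2\u20ac\u2122t like Product"

# Decoding the same bytes as UTF-8 recovers the intended text.
repaired = raw.decode("utf-8")


def decode_values(rows, encoding="utf-8"):
    """Hypothetical helper: decode byte-string values, leave the rest alone.

    Values that are already text (or not strings at all) pass through
    unchanged, so the same call is safe on Python 2 and Python 3.
    """
    return [
        {k: (v.decode(encoding) if isinstance(v, bytes) else v)
         for k, v in row.items()}
        for row in rows
    ]


rows = [{"cust_decision": u"no buy", "cust_details": raw}]
clean_rows = decode_values(rows)
```

`clean_rows` can then be passed to `Row(**i)` in the same way as `decode_data` above.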
