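When given a list of column names, the createDataFrame method expects an RDD of row-like records (tuples, lists, or Row objects), but textFile returns an RDD of plain strings. Before creating the DataFrame, you need to convert each line to a tuple or a similar structure.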

# Convert each line to a tuple (or a structure of your choice)
sequences_rdd = sequences_rdd.map(lambda x: (x,))

# Create a DataFrame with a column named "text"
df = spark.createDataFrame(sequences_rdd, ["text"])
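
The same two steps can be wrapped in a small helper, shown here as a minimal sketch. The names `line_to_row` and `lines_to_dataframe` are hypothetical and not part of PySpark, and `spark` is assumed to be an existing SparkSession:

```python
def line_to_row(line):
    """Wrap one line in a 1-tuple so Spark treats it as a single-column row."""
    return (line,)


def lines_to_dataframe(spark, path, column="text"):
    """Read a text file and return a one-column DataFrame.

    Assumption: `spark` is an already-created SparkSession and `path`
    points to a readable text file.
    """
    lines_rdd = spark.sparkContext.textFile(path)  # RDD of plain strings
    rows_rdd = lines_rdd.map(line_to_row)          # RDD of 1-tuples
    return spark.createDataFrame(rows_rdd, [column])
```

Calling `lines_to_dataframe(spark, "sequences.txt")` (the file name is only an example) should give a DataFrame with one string column named "text". If a single string column is all you need, `spark.read.text(path)` also returns a DataFrame directly, with the column named "value".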
