I am trying to select `_corrupt_record` but cannot find it. In my case, every column should be corrupt, yet the column does not show up. I have tried filtering on it, but with no success.
from pyspark.sql.types import StructType, StructField, DecimalType, StringType

schema = StructType([
    StructField("TestID", DecimalType(25, 10), True),
    StructField("Key", DecimalType(25, 10), True),
    StructField("Company", DecimalType(25, 10), True),
    StructField("Client", DecimalType(25, 10), True),
    StructField("Project", DecimalType(25, 10), True),
    StructField("ingestdatetime", DecimalType(25, 10), True),
    StructField("_corrupt_record", StringType(), True)
])

# Read the CSV with the schema above; malformed rows should land in _corrupt_record.
df = spark.read.csv(
    '/mnt/jakichan/isgood/ingestdatetime=20210202231912',
    schema=schema,
    header=True,
    sep=",",
    mode="PERMISSIVE",
    columnNameOfCorruptRecord="_corrupt_record",
).cache()

df.select("_corrupt_record")
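The filtering attempt looked roughly like the sketch below; the isNotNull condition and the show() call are added here only to make the attempt concrete, they are not part of the original job:

from pyspark.sql.functions import col

# Keep only the rows Spark could not parse with the declared schema.
# _corrupt_record holds the raw input line for malformed rows and null otherwise.
bad_rows = df.filter(col("_corrupt_record").isNotNull())

# Trigger an action so the cached DataFrame is materialized and the column is populated.
bad_rows.select("_corrupt_record").show(truncate=False)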