I am trying to select `_corrupt_record` but cannot find it. In my case, every column should be corrupt, yet the column does not show up. I have tried filtering on it, but with no success.
from pyspark.sql.types import StructType, StructField, DecimalType, StringType

schema = StructType([
    StructField("TestID", DecimalType(25, 10), True),
    StructField("Key", DecimalType(25, 10), True),
    StructField("Company", DecimalType(25, 10), True),
    StructField("Client", DecimalType(25, 10), True),
    StructField("Project", DecimalType(25, 10), True),
    StructField("ingestdatetime", DecimalType(25, 10), True),
    StructField("_corrupt_record", StringType(), True)
])

# Read the CSV with the schema above; malformed rows should land in _corrupt_record.
df = spark.read.csv(
    '/mnt/jakichan/isgood/ingestdatetime=20210202231912',
    schema=schema,
    header=True,
    sep=",",
    mode="PERMISSIVE",
    columnNameOfCorruptRecord="_corrupt_record",
).cache()

df.select("_corrupt_record")
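The filtering attempt looked roughly like the sketch below; the isNotNull condition and the show() call are added here only to make the attempt concrete, they are not part of the original job:

from pyspark.sql.functions import col

# Keep only the rows Spark could not parse with the declared schema.
# _corrupt_record holds the raw input line for malformed rows and null otherwise.
bad_rows = df.filter(col("_corrupt_record").isNotNull())

# Trigger an action so the cached DataFrame is materialized and the column is populated.
bad_rows.select("_corrupt_record").show(truncate=False)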