Filtering for non-matching values in PySpark with where(array_contains())

whhtz7ly  posted on 2021-07-14  in Spark

I have the following PySpark code:

    from pyspark.sql.functions import array_contains, to_date

    condition_no_hypertension = (
        condition
        # Keep active, confirmed conditions whose code array contains 38341003
        .where(array_contains('clinicalStatus.coding.code', 'active'))
        .where(array_contains('verificationStatus.coding.code', 'confirmed'))
        .where(array_contains('code.coding.code', '38341003'))
        .where(condition.onsetDateTime > '1900-01-01')
        # Flatten the first element of each coding array into its own column
        .withColumn('condition_status', condition['clinicalStatus.coding.code'].getItem(0))
        .withColumn('verification_status', condition['verificationStatus.coding.code'].getItem(0))
        .withColumn('snomed_code', condition['code.coding.code'].getItem(0))
        .withColumn('snomed_name', condition['code.coding.display'].getItem(0))
        .select(
            # Up to 40 characters of subject.reference, starting at position 10
            condition['subject.reference'].substr(10, 40).alias('patient_id'),
            'condition_status',
            'verification_status',
            'snomed_code',
            'snomed_name',
            to_date(condition['onsetDateTime']).alias('first_observation_date'),
        )
    )

How can I change this code to get every row except those with code 38341003?
I tried:

    where(array_contains('code.coding.code', !='38341003')).\

but it does not work.

tct7dpnv  (answer #1)

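You can use the ~ operator (logical NOT). array_contains returns a boolean column, and ~ inverts it, so only rows whose array does not contain the code are kept:

    where(~array_contains('code.coding.code', '38341003'))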
