I am converting an R script to PySpark, but I am stuck at one point and need some help. Here is the R code:
## Transposing the stacked trt section to wider set for Outcome ID set
## c("Trial.ID", "Arm.ID", "Re.randomized.arm.id", "Phase.ID", "Period.ID")
TRT <- select(trt_stacked, -Planned.Treatment.ID) %>%
  renameCol(c("Treatment.Administration.days.x", "Treatment.Administration.days.y"),
            c("Treatment.Administration.days.plan", "Treatment.Administration.days.act")) %>%
  reshape(direction = "wide",
          idvar = c("Trial.ID", "Arm.ID", "Re.randomized.arm.id", "Phase.ID", "Period.ID",
                    "Phase", "Phase.Duration", "Phase.Duration.unit", "Phase.Description",
                    "Period", "Period.Duration", "Period.Dur.Unit", "Period.Description"),
          timevar = "Treatment.ID")
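I need to convert this code to PySpark. Spark has a pivot function for transposing, but I don't fully understand what this reshape function does. As far as I can tell, its output spreads every column that is not listed in idvar into a separate set of columns for each distinct value of treatment_id. It also appends the treatment_id value to the name of every transposed column, like this: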
Titration.1
Titration.Duration.1
Titration.Duration.Unit.1
Titration.Target.1
Titration.Value.1
Titration.unit.1
Treatment.name.1
Treatment.Class.1
Treatment.Description.1
Treatment.Start.Time.1
Treatment.End.Time.1
Titration.2
Titration.Duration.2
Titration.Duration.Unit.2
Titration.Target.2
Titration.Value.2
Titration.unit.2
Treatment.name.2
Treatment.Class.2
Treatment.Description.2
Treatment.Start.Time.2
Treatment.End.Time.2
Does the reshape function in R also remove null values? Can someone help me find a similar function in Spark or Python?
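To make the question concrete, here is a minimal pure-Python sketch of what reshape(direction = "wide") appears to do, based on my reading of the base R documentation rather than on the script above. It assumes the data is a list of dicts with identical keys; the function name reshape_wide and its parameters are made up for illustration. As far as I understand, R does not drop NA values in the value columns: a missing id/time combination simply shows up as NA in the wide columns, and when several rows share the same id and time value only the first is kept (R warns about this).

```python
def reshape_wide(rows, idvar, timevar, sep="."):
    """Rough pure-Python analogue of R's reshape(direction="wide").

    rows is a list of dicts that all share the same keys (hypothetical input format).
    """
    if not rows:
        return []
    # Every column that is neither an id nor the time variable gets spread out.
    value_cols = [c for c in rows[0] if c not in idvar and c != timevar]
    # Distinct time values in order of first appearance, ignoring missing ones.
    times = []
    for row in rows:
        t = row[timevar]
        if t is not None and t not in times:
            times.append(t)

    wide_rows = {}  # id tuple -> output row
    seen = set()    # (id tuple, time) pairs that were already filled
    for row in rows:
        t = row[timevar]
        if t is None:
            continue  # simplification: rows with a missing time value are skipped
        key = tuple(row[c] for c in idvar)
        if key not in wide_rows:
            out = {c: row[c] for c in idvar}
            # Pre-fill every wide column with None so absent combinations stay missing.
            for tv in times:
                for c in value_cols:
                    out[f"{c}{sep}{tv}"] = None
            wide_rows[key] = out
        if (key, t) in seen:
            continue  # duplicate id/time pair: keep the first row only
        seen.add((key, t))
        for c in value_cols:
            # None values are copied as-is; nothing is removed.
            wide_rows[key][f"{c}{sep}{t}"] = row[c]
    return list(wide_rows.values())
```

Calling it with sep="_" gives column names such as titration_1 and titration_2, matching the naming in the expected output further down.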
Input:
|treatment_id|arm_id|re_randomized_arm_id|trial_id|phase_id| phase|phase_duration|phase_duration_unit|phase_description|titration|titration_duration|titration_duration_unit|titration_target|titration_value|titration_unit|
1|1|-999|16|1|Active|NA|NA|NA|titration|NA|NA|NA|NA|NA
2|1|-999|16|1|Active|NA|NA|NA|titration|NA|NA|NA|NA|NA
2|1|-999|16|1|Active|NA|NA|NA|No titration|NA|NA|NA|NA|NA
Expected output after transposing:
|treatment_id|arm_id|re_randomized_arm_id|trial_id|phase_id| phase|phase_duration|phase_duration_unit|phase_description|titration_1|titration_duration_1|titration_duration_unit_1|titration_target_1|titration_value_1|titration_unit_1|titration_2|titration_duration_2|titration_duration_unit_2|titration_target_2|titration_value_2|titration_unit_2|
1|1|-999|16|1|Active|NA|NA|NA|titration|NA|NA|NA|NA|NA
2|1|-999|16|1|Active|NA|NA|NA|titration|NA|NA|NA|NA|NA|No titration|NA|NA|NA|NA|NA
Can anyone help? I can't work out the logic or find a library that does this.
Thanks
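One possible direction in PySpark, offered as an untested sketch rather than a confirmed equivalent: group by the id columns, pivot on the time column, and aggregate every remaining column with first. The helper names wide_column_names and reshape_wide_spark are hypothetical. The sketch assumes snake_case column names without dots (as in the sample data) and pivot values that print the same in Python and Spark, such as integers or strings. Note that first in Spark has no guaranteed row order, so when an id/time pair occurs more than once the value picked may differ from the one R keeps.

```python
def wide_column_names(value_cols, times, sep="_"):
    """Names of the wide columns: one block of value columns per time value."""
    return [f"{c}{sep}{t}" for t in times for c in value_cols]


def reshape_wide_spark(df, idvar, timevar, sep="_"):
    """Sketch of a wide reshape for a PySpark DataFrame (hypothetical helper)."""
    # Imported here so that wide_column_names can be used without a Spark install.
    from pyspark.sql import functions as F

    value_cols = [c for c in df.columns if c not in idvar and c != timevar]
    # Collect the distinct pivot values up front; null time values are left out.
    times = sorted(
        r[0] for r in df.select(timevar).distinct().collect() if r[0] is not None
    )
    wide = (
        df.groupBy(*idvar)
        .pivot(timevar, times)
        .agg(*[F.first(c).alias(c) for c in value_cols])
    )
    # With several aggregations Spark names the columns "<time>_<alias>";
    # with a single aggregation the column is named after the time value only.
    for t in times:
        for c in value_cols:
            old = str(t) if len(value_cols) == 1 else f"{t}_{c}"
            wide = wide.withColumnRenamed(old, f"{c}{sep}{t}")
    return wide
```

With the sample data, a call might look like reshape_wide_spark(df, idvar=["trial_id", "arm_id", "re_randomized_arm_id", "phase_id"], timevar="treatment_id"). Null values inside the value columns are kept, since first does not ignore nulls by default.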