R中使用多个条件的子集变量名

sc4hvdpw  于 2024-01-03  发布在  其他
关注(0)|答案(2)|浏览(198)

我有以下代码:

  1. df <- data.frame(
  2. check.names = FALSE,
  3. `Att-Bissen P [mm]` = c(57937.8),
  4. `Att-Bissen PET [mm]` = c(39472.9),
  5. `Att-Bissen Q [mm]` = c(26501.2),
  6. `Rau. Merl P [mm]` = c(53443.6),
  7. `Rau. Merl PET [mm]` = c(40535.45),
  8. `Rau. Merl Q [mm]` = c(15489.5),
  9. `Syre Felsmuhle/Mertert P [mm]` = c(46020.3),
  10. `Syre Felsmuhle/Mertert PET [mm]` = c(42196.4),
  11. `Syre Felsmuhle/Mertert Q [mm]` = c(16210.69079),
  12. `Wiltz-Winseler P [mm]` = c(63389.7),
  13. `Wiltz-Winseler PET [mm]` = c(42703.3),
  14. `Wiltz-Winseler Q [mm]` = c(33576.8),
  15. `Our-Gemund/Vianden P [mm]` = c(63389.7),
  16. `Our-Gemund/Vianden PET [mm]` = c(42834.5),
  17. `Our-Gemund/Vianden Q [mm]` = c(12588.9))
  18. # Define the formula as a function
  19. calc_formula <- function(P, PET, Q, n) {
  20. 1 - (1 + (P / PET)^n) ^ -((n + 1) / (n + Q))
  21. }
  22. # Define the n value
  23. n <- 2.5
  24. # Extract the site names from the column names
  25. site_names <- sub(" .*", "", names(df)[-1])
  26. # Loop through each site and calculate the formula
  27. results <- list()
  28. for (site in site_names) {
  29. site_data <- df[, grepl(site, names(df))]
  30. results[[site]] <- calc_formula(site_data[[paste0(site, " P [mm]")]],
  31. site_data[[paste0(site, " PET [mm]")]],
  32. site_data[[paste0(site, " Q [mm]")]], n)
  33. }
  34. # Combine the results into a data frame
  35. results_df <- data.frame(Site = names(results), Result = unlist(results))

字符串
以下错误:

  1. Error in data.frame(Site = names(results), Result = unlist(results)) :
  2. arguments imply differing number of rows: 5, 3


我认为这是因为我不能在“site_names”中指定它也应该考虑“.”和““(空格)。但是我不能确定R如何识别空格,例如,在某些名称处,而不是在P,PET或Q之前。因此,它在“Rau. Merl”和“Syre Felsmuhle/Mertert”的站点中返回NA。
我可以很容易地更改.csv文件中的名称,但当我有一个强大的数据集是相当繁琐的。
我如何修复这部分代码?
任何帮助将不胜感激。谢谢!!

r6vfmomb

r6vfmomb1#

你有两个问题。首先,让我们看看你的网站名称:

  1. site_names
  2. # [1] "Att-Bissen" "Att-Bissen" "Rau." "Rau."
  3. # [5] "Rau." "Syre" "Syre" "Syre"
  4. # [9] "Wiltz-Winseler" "Wiltz-Winseler" "Wiltz-Winseler" "Our-Gemund/Vianden"
  5. # [13] "Our-Gemund/Vianden" "Our-Gemund/Vianden"
  6. # Loop through each site and calculate the formula

字符串
这里有两个问题:1)重复是不好的,因为你只想每个网站一个结果。我们可以在最后放一个... |> unique()来删除重复的内容。2)你已经通过删除第一个空格后的所有内容来提取网站名称。正如你所说,你的一些网站名称中有空格,比如"Rau. Merl"
让我们通过删除字符串中你真正想要删除的部分来解决这个问题。我们将使用正则表达式或:|和合适的方括号转义来完成这个操作。

  1. site_names <- sub(" P \\[mm\\]| PET \\[mm\\]| Q \\[mm\\]", "", names(df)[-1]) |>
  2. unique()
  3. site_names
  4. # [1] "Att-Bissen" "Rau. Merl" "Syre Felsmuhle/Mertert" "Wiltz-Winseler"
  5. # [5] "Our-Gemund/Vianden"


运行剩下的代码,它现在可以工作了:

  1. # ...
  2. results_df
  3. # Site Result
  4. # Att-Bissen Att-Bissen 0.0001695120
  5. # Rau. Merl Rau. Merl 0.0002478667
  6. # Syre Felsmuhle/Mertert Syre Felsmuhle/Mertert 0.0001742918
  7. # Wiltz-Winseler Wiltz-Winseler 0.0001359271
  8. # Our-Gemund/Vianden Our-Gemund/Vianden 0.0003609042

展开查看全部
1cklez4t

1cklez4t2#

我的感觉是,以长格式保存数据可能是有利的。
请注意,reshape()或来自{tidyr}的函数(如pivot_longer())当然能够以更健壮的方式做到这一点。

重塑

  1. new = data.frame(
  2. unique(sub(" P \\[mm\\]| PET \\[mm\\]| Q \\[mm\\]", "", names(df))),
  3. matrix(as.numeric(df[1L, ]), ncol = 3L, byrow = TRUE)) |>
  4. setNames(object = _, nm = c("stations", "P", "PET", "Q"))

字符串
sub()-方法是从@GregorThomas的answer中偷来的。然后我们逐行应用calc_formula()的修改版本。

申请

  1. # slightly modified
  2. calc_formula <- function(row, n = 2.5) {
  3. 1L - (1L + (row[["P"]] / row[["PET"]])^n) ^ -((n + 1L) / (n + row[["Q"]]))
  4. }
  5. new$result = apply(X = new[-1L], MARGIN = 1L, FUN = calc_formula)

结果

  1. > new
  2. stations P PET Q result
  3. 1 Att-Bissen 57937.8 39472.90 26501.20 0.0001695120
  4. 2 Rau. Merl 53443.6 40535.45 15489.50 0.0002478667
  5. 3 Syre Felsmuhle/Mertert 46020.3 42196.40 16210.69 0.0001742918
  6. 4 Wiltz-Winseler 63389.7 42703.30 33576.80 0.0001359271
  7. 5 Our-Gemund/Vianden 63389.7 42834.50 12588.90 0.0003609042

数据

  1. df <- data.frame(
  2. check.names = FALSE,
  3. `Att-Bissen P [mm]` = c(57937.8),
  4. `Att-Bissen PET [mm]` = c(39472.9),
  5. `Att-Bissen Q [mm]` = c(26501.2),
  6. `Rau. Merl P [mm]` = c(53443.6),
  7. `Rau. Merl PET [mm]` = c(40535.45),
  8. `Rau. Merl Q [mm]` = c(15489.5),
  9. `Syre Felsmuhle/Mertert P [mm]` = c(46020.3),
  10. `Syre Felsmuhle/Mertert PET [mm]` = c(42196.4),
  11. `Syre Felsmuhle/Mertert Q [mm]` = c(16210.69079),
  12. `Wiltz-Winseler P [mm]` = c(63389.7),
  13. `Wiltz-Winseler PET [mm]` = c(42703.3),
  14. `Wiltz-Winseler Q [mm]` = c(33576.8),
  15. `Our-Gemund/Vianden P [mm]` = c(63389.7),
  16. `Our-Gemund/Vianden PET [mm]` = c(42834.5),
  17. `Our-Gemund/Vianden Q [mm]` = c(12588.9))

展开查看全部

相关问题