R: How can I sort a data frame so that it increases numerically?

Posted by lndjwyie on 2023-05-26 in Other

I have the table below. I want it sorted in ascending numeric order, from Wave 2 to Wave 31.

   Wave                     mean_B
   <chr>                     <dbl>
 1 Wave 10 - December 2020    5.49 
 2 Wave 11 - January 2021     5.52
 3 Wave 12 - February 2021    5.52
 4 Wave 13 - March 2021       5.45
 5 Wave 14 - April 2021       5.53
 6 Wave 15 - May 2021         5.46
 7 Wave 16 - June 2021        5.51
 8 Wave 17 - July 2021        5.63
 9 Wave 18 - August 2021      5.54
10 Wave 19 - September 2021   5.49
11 Wave 2 - April 2020        5.67
12 Wave 20 - October 2021     5.55
13 Wave 21 - November 2021    5.43
14 Wave 22 - December 2021    5.35
15 Wave 23 - January 2022     5.46
16 Wave 24 - February 2022    5.41
17 Wave 25 - March 2022       5.30
18 Wave 26 - April 2022       5.38
19 Wave 27 - May 2022         5.39
20 Wave 28 - June 2022        5.55
21 Wave 29 - July 2022        5.51
22 Wave 3 - May 2020          5.72
23 Wave 30 - August 2022      5.52
24 Wave 31 - September 2022   5.54
25 Wave 4 - June 2020         5.62
26 Wave 5 - July 2020         5.54
27 Wave 6 - August 2020       5.61
28 Wave 7 - September 2020    5.60
29 Wave 8 - October 2020      5.54
30 Wave 9 - November 2020     5.60

I created the table with the following code, hoping it would be sorted numerically automatically:

mean_values <- Trial %>% 
  group_by(Wave) %>%
  summarise(mean_B = mean(Weight_Answer, na.rm = TRUE))

Then I tried this code:

mean_values <- Trial %>%
  group_by(Wave) %>%
  summarise(mean_B = mean(Weight_Answer, na.rm = TRUE)) %>%
  arrange(as.numeric(gsub("Wave ", "", Wave)))

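However, it gave a warning:

Warning message:
There was 1 warning in `arrange()`.
ℹ In argument: `..1 = as.numeric(gsub("Wave ", "", Wave))`.
Caused by warning:
! NAs introduced by coercion

I am quite new to R, so I don't know what this means or how to fix it.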

4urapxun #1

You can use mixedsort from the gtools package:

library(gtools)

df[match(mixedsort(df$Wave), df$Wave),]

Or extract the wave number and pass it to arrange:

library(dplyr)

df |> 
  arrange(as.integer(sub("Wave (\\d+).*", "\\1", Wave)))

Output

                       Wave mean_B
11      Wave 2 - April 2020   5.67
22        Wave 3 - May 2020   5.72
25       Wave 4 - June 2020   5.62
26       Wave 5 - July 2020   5.54
27     Wave 6 - August 2020   5.61
28  Wave 7 - September 2020   5.60
29    Wave 8 - October 2020   5.54
30   Wave 9 - November 2020   5.60
1   Wave 10 - December 2020   5.49
2    Wave 11 - January 2021   5.52
3   Wave 12 - February 2021   5.52
4      Wave 13 - March 2021   5.45
5      Wave 14 - April 2021   5.53
6        Wave 15 - May 2021   5.46
7       Wave 16 - June 2021   5.51
8       Wave 17 - July 2021   5.63
9     Wave 18 - August 2021   5.54
10 Wave 19 - September 2021   5.49
12   Wave 20 - October 2021   5.55
13  Wave 21 - November 2021   5.43
14  Wave 22 - December 2021   5.35
15   Wave 23 - January 2022   5.46
16  Wave 24 - February 2022   5.41
17     Wave 25 - March 2022   5.30
18     Wave 26 - April 2022   5.38
19       Wave 27 - May 2022   5.39
20      Wave 28 - June 2022   5.55
21      Wave 29 - July 2022   5.51
23    Wave 30 - August 2022   5.52
24 Wave 31 - September 2022   5.54
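
For context on the warning in the question: gsub("Wave ", "", Wave) removes only the prefix, so a string such as "10 - December 2020" is left over, and as.numeric() turns that into NA. A minimal base R sketch of the difference, using a hypothetical three-element vector `waves` rather than the full data:

```r
# Hypothetical sample of labels in the same format as the question's Wave column
waves <- c("Wave 10 - December 2020", "Wave 2 - April 2020", "Wave 9 - November 2020")

# The original attempt strips only the "Wave " prefix; month and year remain
stripped <- gsub("Wave ", "", waves)

# as.numeric() cannot parse "10 - December 2020", so every element becomes NA
# (the warning is suppressed here only to keep the sketch quiet)
attempt <- suppressWarnings(as.numeric(stripped))

# Capturing the leading digits and discarding the rest gives clean integers
wave_no <- as.integer(sub("Wave (\\d+).*", "\\1", waves))

# Reordering by the extracted number puts the labels in numeric order
sorted_waves <- waves[order(wave_no)]
```

Because every value in `attempt` is NA, arrange() had nothing meaningful to sort on, which is why the table stayed in alphabetical order.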

Data

df <- structure(list(Wave = c("Wave 10 - December 2020", "Wave 11 - January 2021", 
"Wave 12 - February 2021", "Wave 13 - March 2021", "Wave 14 - April 2021", 
"Wave 15 - May 2021", "Wave 16 - June 2021", "Wave 17 - July 2021", 
"Wave 18 - August 2021", "Wave 19 - September 2021", "Wave 2 - April 2020", 
"Wave 20 - October 2021", "Wave 21 - November 2021", "Wave 22 - December 2021", 
"Wave 23 - January 2022", "Wave 24 - February 2022", "Wave 25 - March 2022", 
"Wave 26 - April 2022", "Wave 27 - May 2022", "Wave 28 - June 2022", 
"Wave 29 - July 2022", "Wave 3 - May 2020", "Wave 30 - August 2022", 
"Wave 31 - September 2022", "Wave 4 - June 2020", "Wave 5 - July 2020", 
"Wave 6 - August 2020", "Wave 7 - September 2020", "Wave 8 - October 2020", 
"Wave 9 - November 2020"), mean_B = c(5.49, 5.52, 5.52, 5.45, 
5.53, 5.46, 5.51, 5.63, 5.54, 5.49, 5.67, 5.55, 5.43, 5.35, 5.46, 
5.41, 5.3, 5.38, 5.39, 5.55, 5.51, 5.72, 5.52, 5.54, 5.62, 5.54, 
5.61, 5.6, 5.54, 5.6)), class = "data.frame", row.names = c(NA, 
-30L))

bzzcjhmw #2

@benson23 gave a good answer. Here is a base R answer.

# extract wave numbers
waveNumbers <- sub("Wave (\\d+) \\- \\w+ \\d+", "\\1", dat$Wave) |> as.numeric()
# permute accordingly
dat2 <- dat[order(waveNumbers), ]

Note that the row names (1, 2, ...) are permuted as well. You can use rownames(dat2) <- NULL to reset them.
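
A small sketch of the row-name behaviour, using a hypothetical three-row stand-in for `dat` (the shorter pattern "Wave (\\d+).*" is used here only for brevity):

```r
# Hypothetical three-row stand-in for `dat`
dat <- data.frame(
  Wave = c("Wave 10 - December 2020", "Wave 2 - April 2020", "Wave 3 - May 2020"),
  mean_B = c(5.49, 5.67, 5.72)
)

# Extract the wave numbers and reorder the rows
waveNumbers <- as.numeric(sub("Wave (\\d+).*", "\\1", dat$Wave))
dat2 <- dat[order(waveNumbers), ]

# The row names travel with the rows, so they are now out of sequence
before <- rownames(dat2)

# Resetting them renumbers the rows from 1
rownames(dat2) <- NULL
after <- rownames(dat2)
```

Here `before` holds "2", "3", "1" and `after` holds "1", "2", "3".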

cbjzeqam #3

Since you are already in the tidyverse, we can use readr::parse_number, which extracts the first number it finds in a string:

readr::parse_number("Wave 9 - November 2020     5.60")
# [1] 9

This is easy to add to your dplyr pipeline:

library(dplyr)
df %>%
  arrange(readr::parse_number(Wave))
#                        Wave mean_B
# 1       Wave 2 - April 2020   5.67
# 2         Wave 3 - May 2020   5.72
# 3        Wave 4 - June 2020   5.62
# 4        Wave 5 - July 2020   5.54
# 5      Wave 6 - August 2020   5.61
# 6   Wave 7 - September 2020   5.60
# 7     Wave 8 - October 2020   5.54
# 8    Wave 9 - November 2020   5.60
# 9   Wave 10 - December 2020   5.49
# 10   Wave 11 - January 2021   5.52
# 11  Wave 12 - February 2021   5.52
# 12     Wave 13 - March 2021   5.45
# 13     Wave 14 - April 2021   5.53
# 14       Wave 15 - May 2021   5.46
# 15      Wave 16 - June 2021   5.51
# 16      Wave 17 - July 2021   5.63
# 17    Wave 18 - August 2021   5.54
# 18 Wave 19 - September 2021   5.49
# 19   Wave 20 - October 2021   5.55
# 20  Wave 21 - November 2021   5.43
# 21  Wave 22 - December 2021   5.35
# 22   Wave 23 - January 2022   5.46
# 23  Wave 24 - February 2022   5.41
# 24     Wave 25 - March 2022   5.30
# 25     Wave 26 - April 2022   5.38
# 26       Wave 27 - May 2022   5.39
# 27      Wave 28 - June 2022   5.55
# 28      Wave 29 - July 2022   5.51
# 29    Wave 30 - August 2022   5.52
# 30 Wave 31 - September 2022   5.54
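
As a sketch of how this could slot into the pipeline from the question, assuming dplyr and readr are installed. The `Trial` data frame below is a made-up stand-in, since the real one is not shown:

```r
library(dplyr)

# Hypothetical stand-in for the asker's `Trial` data (values are invented)
Trial <- data.frame(
  Wave = c("Wave 10 - December 2020", "Wave 2 - April 2020",
           "Wave 10 - December 2020", "Wave 2 - April 2020",
           "Wave 9 - November 2020"),
  Weight_Answer = c(5, 6, 6, NA, 4)
)

# Same summary as in the question, then sort by the number embedded in Wave
mean_values <- Trial %>%
  group_by(Wave) %>%
  summarise(mean_B = mean(Weight_Answer, na.rm = TRUE)) %>%
  arrange(readr::parse_number(Wave))
```

With this toy input the rows come out as Wave 2, Wave 9, Wave 10.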
