R: left_join fails with "Join columns must be present in data"

Posted by 6jygbczu on 2023-01-10 in Other

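I am trying to summarise the frequency data f for the "middle" positions, i.e. the positions between the first and the last position within each rowid. My approach is to filter out those rows, run summarise on them, and then join the new data back to the dataframe they were filtered from. This works fine on the training data: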

library(tidyverse)
df %>%
  group_by(rowid) %>%
  # keep only the middle positions (neither first nor last) within each rowid:
  filter(position != first(position) & position != last(position)) %>%
  # summarise:
  summarize(across(position),
            middle_position = mean(f, na.rm = TRUE),
            word = str_c(word, collapse=" ")
            ) %>%
  left_join(df, ., by = c("rowid", "position"))

However, when I apply the same code to my actual data, I get the following error message:

Error in `left_join()`:
! Join columns must be present in data.
✖ Problem with `position`.
Run `rlang::last_error()` to see where the error occurred.
> rlang::last_error()
<error/rlang_error>
Error in `left_join()`:
! Join columns must be present in data.
✖ Problem with `position`.
---
Backtrace:
 1. ... %>% left_join(bnc_X, ., by = c("rowid", "position"))
 3. dplyr:::left_join.data.frame(bnc_X, ., by = c("rowid", "position"))
Run `rlang::last_trace()` to see the full context.
> rlang::last_trace()
<error/rlang_error>
Error in `left_join()`:
! Join columns must be present in data.
✖ Problem with `position`.
---
Backtrace:
    ▆
 1. ├─... %>% left_join(bnc_X, ., by = c("rowid", "position"))
 2. ├─dplyr::left_join(bnc_X, ., by = c("rowid", "position"))
 3. └─dplyr:::left_join.data.frame(bnc_X, ., by = c("rowid", "position"))
 4.   └─dplyr:::join_mutate(...)
 5.     └─dplyr:::join_cols(...)
 6.       └─dplyr:::standardise_join_by(...)
 7.         └─dplyr:::check_join_vars(by$x, x_names, error_call = error_call)
 8.           └─rlang::abort(bullets, call = error_call)
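
One possible reading of this backtrace, offered as a hint rather than a diagnosis: the failing frame is check_join_vars(by$x, x_names, ...), so the check is run against the column names of the left-hand table, here bnc_X. That suggests bnc_X itself has no column named exactly position (for example because of different capitalisation or a stray space). The sketch below shows one way to check this. The table bnc_like and its Position column are hypothetical stand-ins, since the real bnc_X is not shown.

```r
# Hypothetical stand-in for the real table `bnc_X` (its columns are not shown
# in the question); the key column is ASSUMED here to be called `Position`.
bnc_like <- data.frame(
  rowid    = c(1, 1, 1),
  Position = c(1, 2, 3),
  f        = c(400, 300, 250)
)

# Which of the intended join keys are absent from the left-hand table?
join_keys    <- c("rowid", "position")
missing_keys <- setdiff(join_keys, names(bnc_like))
print(missing_keys)

# If the column exists under a different name, renaming it restores the key
bnc_fixed <- bnc_like
names(bnc_fixed)[names(bnc_fixed) == "Position"] <- "position"
print(setdiff(join_keys, names(bnc_fixed)))
```

If missing_keys comes back empty for the real table, the cause lies elsewhere and this check can be ruled out.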

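The main problem seems to be the variable position: why is it not recognised? I have spent hours trying to solve this without success and would be grateful for help!
Data: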

df <- data.frame(
  size = c(3,3,3,
              3,3,3,
              4,4,4,4,
              5,5,5,5,5,
              3,3,3),
  rowid = c(1,1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5),
  turn = c(rep("How are you?",3),
           rep("I'm fine.",3),
           rep("How's the weather?",4),
           rep("It's really very cold.",5),
           rep("I love you",3)),
  word = c("how","are","you",
           "i","'m","fine",
           "how","'s","the","weather",
           "it","'s","really", "very","cold",
           "i","love","you"),
  f = c(400,300,250,
        600,555,1,
        400,500,700,20,
        390,500,177,200,35,
        600,199,400),
  position = c(1,2,3,
               1,2,3,
               1,2,3,4,
               1,2,3,4,5,
               1,2,3)
)
mwg9r5ms #1

This works for me with data.table. No join is needed.

library(data.table)
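# convert df to a data.table in place (by reference)
setDT(df)
# row numbers of the middle rows (all but the first and last) within each rowid
idx = df[, .(idx = .I[-c(1L, .N)]), by = .(rowid)]$idx
# add the summary columns to these middle rows only; all other rows stay NA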
df[idx, `:=`(middle_position = mean(f),
             word_midddel = paste0(word, collapse = " ")),
   by = .(rowid)]

    size rowid                   turn    word   f position middle_position   word_midddel
 1:    3     1           How are you?     how 400        1              NA           <NA>
 2:    3     1           How are you?     are 300        2        300.0000            are
 3:    3     1           How are you?     you 250        3              NA           <NA>
 4:    3     2              I'm fine.       i 600        1              NA           <NA>
 5:    3     2              I'm fine.      'm 555        2        555.0000             'm
 6:    3     2              I'm fine.    fine   1        3              NA           <NA>
 7:    4     3     How's the weather?     how 400        1              NA           <NA>
 8:    4     3     How's the weather?      's 500        2        600.0000         's the
 9:    4     3     How's the weather?     the 700        3        600.0000         's the
10:    4     3     How's the weather? weather  20        4              NA           <NA>
11:    5     4 It's really very cold.      it 390        1              NA           <NA>
12:    5     4 It's really very cold.      's 500        2        292.3333 's really very
13:    5     4 It's really very cold.  really 177        3        292.3333 's really very
14:    5     4 It's really very cold.    very 200        4        292.3333 's really very
15:    5     4 It's really very cold.    cold  35        5              NA           <NA>
16:    3     5             I love you       i 600        1              NA           <NA>
17:    3     5             I love you    love 199        2        199.0000           love
18:    3     5             I love you     you 400        3              NA           <NA>
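
For readers who prefer to stay in dplyr, the same join-free idea can be sketched with a grouped mutate. This is a minimal sketch, not the answerer's method: df_small is a cut-down copy of the question's example data (rowid 1 and 3 only), and the column names is_middle and word_middle are made up for illustration.

```r
library(dplyr)

# Cut-down copy of the question's example data (rowid 1 and 3 only)
df_small <- data.frame(
  rowid    = c(1, 1, 1, 3, 3, 3, 3),
  word     = c("how", "are", "you", "how", "'s", "the", "weather"),
  f        = c(400, 300, 250, 400, 500, 700, 20),
  position = c(1, 2, 3, 1, 2, 3, 4)
)

result <- df_small %>%
  group_by(rowid) %>%
  mutate(
    # TRUE for rows strictly between the first and last position of the group
    is_middle = position != first(position) & position != last(position),
    # mean of f over the middle rows, written to the middle rows only
    middle_position = if_else(is_middle,
                              mean(f[is_middle], na.rm = TRUE),
                              NA_real_),
    # middle words pasted together, written to the middle rows only
    word_middle = if_else(is_middle,
                          paste(word[is_middle], collapse = " "),
                          NA_character_)
  ) %>%
  ungroup()

print(result)
```

Because nothing is joined back, no join keys are involved, so the "Join columns must be present" check is never triggered. The code still needs rowid, position, f and word to exist under exactly those names in the real data.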
