Extract values matching a string across a range of columns in R

hgc7kmma  posted on 2023-11-14  in Other

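I'm writing some code in R. I have a data frame (called realData) with 10 variables named "RealAttribute_X", where X is a number from 1 to 10 (inclusive). Each column contains one of the following ten attributes: "intelligence", "attractiveness", "charisma", "ambition", "laziness", "generosity", "joyfulness", "friendliness", "arrogance", "calmness". The attributes are randomly assigned to one of the 10 "RealAttribute_X" columns.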

dt<-structure(list(session_id = c("17472631", "17472632", "17472633", 
"17472635", "17472636", "17472638"), RealAttribute_1 = c("Moderately ugly", 
"Very dull", "Very distant", "Very joyful", "Moderately joyful", 
"Very distant"), RealAttribute_2 = c("Very nervous", "Very gloomy", 
"Very generous", "Moderately charismatic", "Moderately hard working", 
"Moderately modest"), RealAttribute_3 = c("Slightly generous", 
"Moderately ugly", "Moderately arrogant", "Moderately calm", 
"Moderately charismatic", "Moderately charismatic"), RealAttribute_4 = c("Moderately arrogant", 
"Slightly generous", "Very dull", "Moderately distant", "Slightly distant", 
"Slightly ambitious"), RealAttribute_5 = c("Slightly unambitious", 
"Moderately calm", "Moderately unambitious", "Moderately lazy", 
"Very modest", "Moderately intelligent"), RealAttribute_6 = c("Slightly dull", 
"Slightly ambitious", "Very calm", "Moderately ambitious", "Moderately generous", 
"Moderately generous"), RealAttribute_7 = c("Very intelligent", 
"Slightly distant", "Very intelligent", "Slightly ugly", "Very good-looking", 
"Very lazy"), RealAttribute_8 = c("Very joyful", "Slightly modest", 
"Slightly joyful", "Very arrogant", "Very ambitious", "Slightly good-looking"
), RealAttribute_9 = c("Very distant", "Very lazy", "Slightly good-looking", 
"Very generous", "Moderately intelligent", "Moderately gloomy"
), RealAttribute_10 = c("Slightly lazy", "Moderately intelligent", 
"Slightly hard working", "Moderately intelligent", "Very calm", 
"Moderately nervous")), class = "data.frame", row.names = c(NA, 
-6L))

head(dt)

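Each attribute can take one of six values. The attributes and their possible values are:
Intelligence = "Very unintelligent", "Moderately unintelligent", "Slightly unintelligent", "Slightly intelligent", "Moderately intelligent", "Very intelligent"
Attractiveness = "Very ugly", "Moderately ugly", "Slightly ugly", "Slightly good-looking", "Moderately good-looking", "Very good-looking"
Laziness = "Very lazy", "Moderately lazy", "Slightly lazy", "Slightly hard working", "Moderately hard working", "Very hard working"
Friendliness = "Very distant", "Moderately distant", "Slightly distant", "Slightly friendly", "Moderately friendly", "Very friendly"
Charisma = "Very dull", "Moderately dull", "Slightly dull", "Slightly charismatic", "Moderately charismatic", "Very charismatic"
Calmness = "Very nervous", "Moderately nervous", "Slightly nervous", "Slightly calm", "Moderately calm", "Very calm"
Generosity = "Very stingy", "Moderately stingy", "Slightly stingy", "Slightly generous", "Moderately generous", "Very generous"
Joyfulness = "Very gloomy", "Moderately gloomy", "Slightly gloomy", "Slightly joyful", "Moderately joyful", "Very joyful"
Arrogance = "Very arrogant", "Moderately arrogant", "Slightly arrogant", "Slightly modest", "Moderately modest", "Very modest"
Ambition = "Very unambitious", "Moderately unambitious", "Slightly unambitious", "Slightly ambitious", "Moderately ambitious", "Very ambitious"
I am trying to write R code that creates 10 new columns, one per attribute ("intelligence", "attractiveness", "charisma", "ambition", "laziness", "generosity", "joyfulness", "friendliness", "arrogance", "calmness"), finds the variable in the existing data frame that holds the relevant attribute, and assigns that attribute's value to the correct new variable.
For example, here are the first four rows of the data frame:
RealAttribute_1: "Moderately ugly", "Very dull", "Very distant", "Very joyful"
RealAttribute_2: "Very nervous", "Very gloomy", "Very generous", "Moderately charismatic"
RealAttribute_3: "Slightly generous", "Moderately ugly", "Moderately arrogant", "Moderately calm"
RealAttribute_4: "Moderately arrogant", "Slightly generous", "Very dull", "Moderately distant"
RealAttribute_5: "Slightly unambitious", "Moderately calm", "Moderately unambitious", "Moderately lazy"
RealAttribute_6: "Slightly dull", "Slightly ambitious", "Very calm", "Moderately ambitious"
RealAttribute_7: "Very intelligent", "Slightly distant", "Very intelligent", "Slightly ugly"
RealAttribute_8: "Very joyful", "Slightly modest", "Slightly joyful", "Very arrogant"
RealAttribute_9: "Very distant", "Very lazy", "Slightly good-looking", "Very generous"
RealAttribute_10: "Slightly lazy", "Moderately intelligent", "Slightly hard working", "Moderately intelligent"
So the final result for the first four rows should look like this:
Intelligence: "Very intelligent", "Moderately intelligent", "Very intelligent", "Moderately intelligent"
Attractiveness: "Moderately ugly", "Moderately ugly", "Slightly good-looking", "Slightly ugly"
Laziness: "Slightly lazy", "Very lazy", "Slightly hard working", "Slightly lazy"
Friendliness: "Very distant", "Slightly distant", "Very distant", "Moderately distant"
Joyfulness: "Very joyful", "Very gloomy", "Slightly joyful", "Very joyful"
I defined all the attribute values: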

attribute_values <- list(
  intelligence = c("Very unintelligent", "Moderately unintelligent", "Slightly unintelligent", 
                   "Slightly intelligent", "Moderately intelligent", "Very intelligent"),
  attractiveness = c("Very ugly", "Moderately ugly", "Slightly ugly", 
                     "Slightly good-looking", "Moderately good-looking", "Very good-looking"),
  laziness = c("Very lazy", "Moderately lazy", "Slightly lazy", 
               "Slightly hard working", "Moderately hard working", "Very hard working"),
  friendliness = c("Very distant", "Moderately distant", "Slightly distant", 
                   "Slightly friendly", "Moderately friendly", "Very friendly"),
  charisma = c("Very dull", "Moderately dull", "Slightly dull", 
               "Slightly charismatic", "Moderately charismatic", "Very charismatic"),
  calmness = c("Very nervous", "Moderately nervous", "Slightly nervous", 
               "Slightly calm", "Moderately calm", "Very calm"),
  generosity = c("Very stingy", "Moderately stingy", "Slightly stingy", 
                 "Slightly generous", "Moderately generous", "Very generous"),
  joyfullness = c("Very gloomy", "Moderately gloomy", "Slightly gloomy", 
                  "Slightly joyful", "Moderately joyful", "Very joyful"),
  arrogance = c("Very arrogant", "Moderately arrogant", "Slightly arrogant", 
                "Slightly modest", "Moderately modest", "Very modest"),
  ambition = c("Very unambitious", "Moderately unambitious", "Slightly unambitious", 
               "Slightly ambitious", "Moderately ambitious", "Very ambitious")
)


but started by testing my code on the first variable (intelligence):

intelligence_values <- c("Very unintelligent", "Moderately unintelligent", "Slightly unintelligent", 
                         "Slightly intelligent", "Moderately intelligent", "Very intelligent")

realData$Intelligence <- apply(dt[, grep("RealAttribute_", colnames(realData), value = TRUE)], 1, function(row) {
  match_value <- which(row %in% intelligence_values)[1]
  if (is.na(match_value)) {
    return(NA)
  } else {
    return(intelligence_values[match_value])
  }
})


This code only returns NA for the "Intelligence" column. I also tried:

realData <- realData %>%
  rowwise() %>%
  mutate(Intelligence = intelligence_values[match(c_across(starts_with("RealAttribute")), intelligence_values)])


but got the following error:

Error in `c_across()`:
! Must only be used inside data-masking verbs like `mutate()`, `filter()`, and `group_by()`.

And running rlang::last_trace() gives:

<error/rlang_error>
Error in `c_across()`:
! Must only be used inside data-masking verbs like `mutate()`, `filter()`, and `group_by()`.
---
Backtrace:
    x
 1. +-realData %>% rowwise() %>% ...
 2. \-plyr::mutate(...)
 3.   \-base::eval(cols[[col]], .data, parent.frame())
 4.     \-base::eval(cols[[col]], .data, parent.frame())
 5.       +-base::match(c_across(starts_with("RealAttribute")), intelligence_values)
 6.       \-dplyr::c_across(starts_with("RealAttribute"))
Run rlang::last_trace(drop = FALSE) to see 4 hidden frames.

Any ideas what went wrong, or how I should write this instead? Thanks in advance!

c9qzyr3d  #1

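I think it's simpler to build a table that links each attribute word we might encounter (it looks like there are 1 or 2 per category) to its category. Then we can reshape the data to long format, extract the attribute word, join it to the lookup table, and use the matched category to decide which column each entry belongs in.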

library(tidyverse)

attribute_table <- data.frame(
  attribute = c("ugly", "nervous", "generous", "arrogant", "unambitious", "dull", 
                "intelligent", "joyful", "distant", "lazy", "gloomy", "calm", 
                "ambitious", "modest", "good-looking", "hard working", "charismatic"),
  category = c("attractiveness", "calmness", "generosity", "arrogance", "ambition", "charisma",
               "intelligence", "joyfulness", "friendliness", "laziness", "joyfulness", "calmness",
               "ambition", "arrogance", "attractiveness", "laziness", "charisma")
)

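dt |>
  # one row per (session_id, RealAttribute_X) pair
  pivot_longer(-session_id) |>
  # split e.g. "Moderately hard working" into degree = "Moderately", attribute = "hard working"
  separate(value, c("degree", "attribute"), sep = " ", extra = "merge", remove = FALSE) |>
  # look up the category each attribute word belongs to
  left_join(attribute_table, by = "attribute") |>
  select(session_id, value, category) |>
  # one column per category, holding the full original value
  pivot_wider(names_from = category, values_from = value)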

Result

# A tibble: 6 × 11
  session_id attractiveness        calmness           generosity          arrogance           ambition             charisma intelligence joyfulness friendliness laziness
  <chr>      <chr>                 <chr>              <chr>               <chr>               <chr>                <chr>    <chr>        <chr>      <chr>        <chr>   
1 17472631   Moderately ugly       Very nervous       Slightly generous   Moderately arrogant Slightly unambitious Slightl… Very intell… Very joyf… Very distant Slightl…
2 17472632   Moderately ugly       Moderately calm    Slightly generous   Slightly modest     Slightly ambitious   Very du… Moderately … Very gloo… Slightly di… Very la…
3 17472633   Slightly good-looking Very calm          Very generous       Moderately arrogant Moderately unambiti… Very du… Very intell… Slightly … Very distant Slightl…
4 17472635   Slightly ugly         Moderately calm    Very generous       Very arrogant       Moderately ambitious Moderat… Moderately … Very joyf… Moderately … Moderat…
5 17472636   Very good-looking     Very calm          Moderately generous Very modest         Very ambitious       Moderat… Moderately … Moderatel… Slightly di… Moderat…
6 17472638   Slightly good-looking Moderately nervous Moderately generous Moderately modest   Slightly ambitious   Moderat… Moderately … Moderatel… Very distant Very la…
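
For comparison, the `apply()` attempt from the question can probably be repaired in base R as well. A likely cause of the NA results is that `which(row %in% intelligence_values)[1]` returns a position within the row (1 to 10), which is then used to index the six-element `intelligence_values`; the attempt also mixes `dt` and `realData`. The sketch below indexes the row itself instead and loops over all attributes. It is only an illustration under assumptions: `dt_small`, `attribute_values_small` and `pick_value` are hypothetical names, and the data is a made-up three-row subset in the style of the question's data.

```r
# Hypothetical toy data: three rows, three attribute columns.
dt_small <- data.frame(
  session_id      = c("17472631", "17472632", "17472633"),
  RealAttribute_1 = c("Moderately ugly", "Very dull", "Very intelligent"),
  RealAttribute_2 = c("Slightly dull", "Moderately intelligent", "Slightly good-looking"),
  RealAttribute_3 = c("Very intelligent", "Moderately ugly", "Very dull"),
  stringsAsFactors = FALSE
)

# Subset of the question's attribute_values list (three of the ten attributes).
attribute_values_small <- list(
  intelligence   = c("Very unintelligent", "Moderately unintelligent", "Slightly unintelligent",
                     "Slightly intelligent", "Moderately intelligent", "Very intelligent"),
  attractiveness = c("Very ugly", "Moderately ugly", "Slightly ugly",
                     "Slightly good-looking", "Moderately good-looking", "Very good-looking"),
  charisma       = c("Very dull", "Moderately dull", "Slightly dull",
                     "Slightly charismatic", "Moderately charismatic", "Very charismatic")
)

attr_cols <- grep("^RealAttribute_", names(dt_small), value = TRUE)

# For one row, return the first cell whose value is one of `levels`,
# or NA if no cell matches. Note that we index the row, not `levels`.
pick_value <- function(row, levels) {
  hit <- row[row %in% levels]
  if (length(hit) == 0) NA_character_ else unname(hit[1])
}

# Create one new column per attribute.
for (nm in names(attribute_values_small)) {
  dt_small[[nm]] <- unname(apply(dt_small[, attr_cols], 1, pick_value,
                                 levels = attribute_values_small[[nm]]))
}

print(dt_small[, c("session_id", names(attribute_values_small))])
```

The `c_across()` error seems to have a separate cause: the backtrace shows `plyr::mutate(...)` being called, which suggests plyr was attached after dplyr and masks `dplyr::mutate()`. Calling `dplyr::mutate()` explicitly would likely avoid that particular error, although the `match()` expression would still need to be reduced to a single value per row.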
