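Writing the relevant code in R
I have a data frame (called realData) with 10 variables named "RealAttribute_X", where X is a number from 1 to 10 (inclusive). Each column contains one of the following ten attributes: "intelligence", "attractiveness", "charisma", "ambition", "laziness", "generosity", "joyfulness", "friendliness", "arrogance", "calmness". The attributes are randomly assigned to one of the 10 "RealAttribute_X" columns.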
dt<-structure(list(session_id = c("17472631", "17472632", "17472633",
"17472635", "17472636", "17472638"), RealAttribute_1 = c("Moderately ugly",
"Very dull", "Very distant", "Very joyful", "Moderately joyful",
"Very distant"), RealAttribute_2 = c("Very nervous", "Very gloomy",
"Very generous", "Moderately charismatic", "Moderately hard working",
"Moderately modest"), RealAttribute_3 = c("Slightly generous",
"Moderately ugly", "Moderately arrogant", "Moderately calm",
"Moderately charismatic", "Moderately charismatic"), RealAttribute_4 = c("Moderately arrogant",
"Slightly generous", "Very dull", "Moderately distant", "Slightly distant",
"Slightly ambitious"), RealAttribute_5 = c("Slightly unambitious",
"Moderately calm", "Moderately unambitious", "Moderately lazy",
"Very modest", "Moderately intelligent"), RealAttribute_6 = c("Slightly dull",
"Slightly ambitious", "Very calm", "Moderately ambitious", "Moderately generous",
"Moderately generous"), RealAttribute_7 = c("Very intelligent",
"Slightly distant", "Very intelligent", "Slightly ugly", "Very good-looking",
"Very lazy"), RealAttribute_8 = c("Very joyful", "Slightly modest",
"Slightly joyful", "Very arrogant", "Very ambitious", "Slightly good-looking"
), RealAttribute_9 = c("Very distant", "Very lazy", "Slightly good-looking",
"Very generous", "Moderately intelligent", "Moderately gloomy"
), RealAttribute_10 = c("Slightly lazy", "Moderately intelligent",
"Slightly hard working", "Moderately intelligent", "Very calm",
"Moderately nervous")), class = "data.frame", row.names = c(NA,
-6L))
head(dt)
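Each attribute can take one of six values. The attributes and their possible values are:
Intelligence = "Very unintelligent", "Moderately unintelligent", "Slightly unintelligent", "Slightly intelligent", "Moderately intelligent", "Very intelligent"
Attractiveness = "Very ugly", "Moderately ugly", "Slightly ugly", "Slightly good-looking", "Moderately good-looking", "Very good-looking"
Laziness = "Very lazy", "Moderately lazy", "Slightly lazy", "Slightly hard working", "Moderately hard working", "Very hard working"
Friendliness = "Very distant", "Moderately distant", "Slightly distant", "Slightly friendly", "Moderately friendly", "Very friendly"
Charisma = "Very dull", "Moderately dull", "Slightly dull", "Slightly charismatic", "Moderately charismatic", "Very charismatic"
Calmness = "Very nervous", "Moderately nervous", "Slightly nervous", "Slightly calm", "Moderately calm", "Very calm"
Generosity = "Very stingy", "Moderately stingy", "Slightly stingy", "Slightly generous", "Moderately generous", "Very generous"
Joyfulness = "Very gloomy", "Moderately gloomy", "Slightly gloomy", "Slightly joyful", "Moderately joyful", "Very joyful"
Arrogance = "Very arrogant", "Moderately arrogant", "Slightly arrogant", "Slightly modest", "Moderately modest", "Very modest"
Ambition = "Very unambitious", "Moderately unambitious", "Slightly unambitious", "Slightly ambitious", "Moderately ambitious", "Very ambitious"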
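I am trying to write R code that creates 10 new columns, one per attribute ("intelligence", "attractiveness", "charisma", "ambition", "laziness", "generosity", "joyfulness", "friendliness", "arrogance", "calmness"), finds the variable in the existing data frame that holds the relevant attribute, and assigns that attribute's value to the matching new variable.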
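For example, these are the first four rows of the data frame:
RealAttribute_1: "Moderately ugly", "Very dull", "Very distant", "Very joyful"
RealAttribute_2: "Very nervous", "Very gloomy", "Very generous", "Moderately charismatic"
RealAttribute_3: "Slightly generous", "Moderately ugly", "Moderately arrogant", "Moderately calm"
RealAttribute_4: "Moderately arrogant", "Slightly generous", "Very dull", "Moderately distant"
RealAttribute_5: "Slightly unambitious", "Moderately calm", "Moderately unambitious", "Moderately lazy"
RealAttribute_6: "Slightly dull", "Slightly ambitious", "Very calm", "Moderately ambitious"
RealAttribute_7: "Very intelligent", "Slightly distant", "Very intelligent", "Slightly ugly"
RealAttribute_8: "Very joyful", "Slightly modest", "Slightly joyful", "Very arrogant"
RealAttribute_9: "Very distant", "Very lazy", "Slightly good-looking", "Very generous"
RealAttribute_10: "Slightly lazy", "Moderately intelligent", "Slightly hard working", "Moderately intelligent"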
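So the final result for the first four rows should look like this:
Intelligence: "Very intelligent", "Moderately intelligent", "Very intelligent", "Moderately intelligent"
Attractiveness: "Moderately ugly", "Moderately ugly", "Slightly good-looking", "Slightly ugly"
Laziness: "Slightly lazy", "Very lazy", "Slightly hard working", "Moderately lazy"
Friendliness: "Very distant", "Slightly distant", "Very distant", "Moderately distant"
Joyfulness: "Very joyful", "Very gloomy", "Slightly joyful", "Very joyful"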
I defined all the attribute values:
attribute_values <- list(
  intelligence = c("Very unintelligent", "Moderately unintelligent", "Slightly unintelligent",
                   "Slightly intelligent", "Moderately intelligent", "Very intelligent"),
  attractiveness = c("Very ugly", "Moderately ugly", "Slightly ugly",
                     "Slightly good-looking", "Moderately good-looking", "Very good-looking"),
  laziness = c("Very lazy", "Moderately lazy", "Slightly lazy",
               "Slightly hard working", "Moderately hard working", "Very hard working"),
  friendliness = c("Very distant", "Moderately distant", "Slightly distant",
                   "Slightly friendly", "Moderately friendly", "Very friendly"),
  charisma = c("Very dull", "Moderately dull", "Slightly dull",
               "Slightly charismatic", "Moderately charismatic", "Very charismatic"),
  calmness = c("Very nervous", "Moderately nervous", "Slightly nervous",
               "Slightly calm", "Moderately calm", "Very calm"),
  generosity = c("Very stingy", "Moderately stingy", "Slightly stingy",
                 "Slightly generous", "Moderately generous", "Very generous"),
  joyfullness = c("Very gloomy", "Moderately gloomy", "Slightly gloomy",
                  "Slightly joyful", "Moderately joyful", "Very joyful"),
  arrogance = c("Very arrogant", "Moderately arrogant", "Slightly arrogant",
                "Slightly modest", "Moderately modest", "Very modest"),
  ambition = c("Very unambitious", "Moderately unambitious", "Slightly unambitious",
               "Slightly ambitious", "Moderately ambitious", "Very ambitious")
)
But I started by checking only the first variable (intelligence) in my code:
intelligence_values <- c("Very unintelligent", "Moderately unintelligent", "Slightly unintelligent",
                         "Slightly intelligent", "Moderately intelligent", "Very intelligent")
realData$Intelligence <- apply(dt[, grep("RealAttribute_", colnames(realData), value = TRUE)], 1, function(row) {
  match_value <- which(row %in% intelligence_values)[1]
  if (is.na(match_value)) {
    return(NA)
  } else {
    return(intelligence_values[match_value])
  }
})
This code returns only NA for the "Intelligence" column. I also tried:
realData <- realData %>%
  rowwise() %>%
  mutate(Intelligence = intelligence_values[match(c_across(starts_with("RealAttribute")), intelligence_values)])
But I got the following error:
Error in `c_across()`:
! Must only be used inside data-masking verbs like `mutate()`, `filter()`, and `group_by()`.
And when running rlang::last_trace():
<error/rlang_error>
Error in `c_across()`:
! Must only be used inside data-masking verbs like `mutate()`, `filter()`, and `group_by()`.
---
Backtrace:
  x
  +-realData %>% rowwise() %>% ...
  \-plyr::mutate(...)
    \-base::eval(cols[[col]], .data, parent.frame())
      \-base::eval(cols[[col]], .data, parent.frame())
        +-base::match(c_across(starts_with("RealAttribute")), intelligence_values)
        \-dplyr::c_across(starts_with("RealAttribute"))
Run rlang::last_trace(drop = FALSE) to see 4 hidden frames.
Any ideas what went wrong, or how I should write this? Thanks in advance!
1 answer
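I think it is simpler to build a lookup table that joins the attribute words we might encounter (it looks like there are 1 or 2 per category) to their respective categories. We can then reshape the data to long format, extract the attribute word, join it to the lookup table, and use the result to decide which column each entry should go in.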
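The steps described above could be sketched roughly as follows. This is only an illustration under a few assumptions: it uses dplyr and tidyr, works on the first two rows of the question's `dt`, and the names `lookup`, `source_col`, `word` and `wide` are made up for the example. The category names follow the spelling used in the question's `attribute_values` list (including `joyfullness`).

```r
library(dplyr)
library(tidyr)

# First two rows of the question's `dt`, to keep the example short.
dt <- data.frame(
  session_id       = c("17472631", "17472632"),
  RealAttribute_1  = c("Moderately ugly", "Very dull"),
  RealAttribute_2  = c("Very nervous", "Very gloomy"),
  RealAttribute_3  = c("Slightly generous", "Moderately ugly"),
  RealAttribute_4  = c("Moderately arrogant", "Slightly generous"),
  RealAttribute_5  = c("Slightly unambitious", "Moderately calm"),
  RealAttribute_6  = c("Slightly dull", "Slightly ambitious"),
  RealAttribute_7  = c("Very intelligent", "Slightly distant"),
  RealAttribute_8  = c("Very joyful", "Slightly modest"),
  RealAttribute_9  = c("Very distant", "Very lazy"),
  RealAttribute_10 = c("Slightly lazy", "Moderately intelligent")
)

# Lookup table (hypothetical name): two attribute words per category.
lookup <- data.frame(
  word = c("unintelligent", "intelligent", "ugly", "good-looking",
           "lazy", "hard working", "distant", "friendly",
           "dull", "charismatic", "nervous", "calm",
           "stingy", "generous", "gloomy", "joyful",
           "arrogant", "modest", "unambitious", "ambitious"),
  category = rep(c("intelligence", "attractiveness", "laziness",
                   "friendliness", "charisma", "calmness", "generosity",
                   "joyfullness", "arrogance", "ambition"), each = 2)
)

wide <- dt %>%
  # Long format: one row per session and original column.
  pivot_longer(starts_with("RealAttribute_"),
               names_to = "source_col", values_to = "value") %>%
  # Strip the intensity prefix to get the bare attribute word.
  mutate(word = sub("^(Very|Moderately|Slightly) ", "", value)) %>%
  # Attach the category each word belongs to.
  left_join(lookup, by = "word") %>%
  select(session_id, category, value) %>%
  # Back to wide format: one column per category.
  pivot_wider(names_from = category, values_from = value)
```

Because the full label is kept in `value` and only `word` is used for the join, the new columns hold the original strings such as "Very intelligent". A label whose word is missing from `lookup` would end up in an `NA` column, which makes gaps in the lookup table easy to spot.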
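As for the two attempts in the question, here is a hedged sketch of how they might be repaired. In the `apply()` version, `which(row %in% intelligence_values)[1]` is a position within the row (1 to 10), but it is then used to index the six-element `intelligence_values`, so it cannot return the matched value reliably; the code also mixes `dt` and `realData`. In the dplyr version, the backtrace lists `plyr::mutate(...)`, which suggests plyr was attached after dplyr and masks `mutate()`; in addition, the expression yields ten values per row where `rowwise()` needs one. The reduced `realData` below and the names `attr_cols`, `base_result` and `dplyr_result` are assumptions for illustration only.

```r
library(dplyr)

# Reduced stand-in for the question's realData (two rows, three columns).
realData <- data.frame(
  session_id       = c("17472631", "17472632"),
  RealAttribute_1  = c("Moderately ugly", "Very dull"),
  RealAttribute_7  = c("Very intelligent", "Slightly distant"),
  RealAttribute_10 = c("Slightly lazy", "Moderately intelligent")
)

intelligence_values <- c("Very unintelligent", "Moderately unintelligent",
                         "Slightly unintelligent", "Slightly intelligent",
                         "Moderately intelligent", "Very intelligent")

# Base R: return the matching value from the row itself,
# not the element of intelligence_values at the row position.
attr_cols <- grep("^RealAttribute_", colnames(realData), value = TRUE)
base_result <- apply(realData[, attr_cols], 1, function(row) {
  hit <- row[row %in% intelligence_values]
  if (length(hit) == 0) NA_character_ else unname(hit[1])
})

# dplyr: call dplyr::mutate() explicitly so a masking plyr::mutate()
# is not picked up, and reduce the ten values per row to a single one.
dplyr_result <- realData %>%
  rowwise() %>%
  dplyr::mutate(
    Intelligence = intersect(c_across(starts_with("RealAttribute_")),
                             intelligence_values)[1]
  ) %>%
  ungroup()
```

Both variants would still need to be repeated for each of the ten attributes, which is why the reshape-and-join approach is less repetitive.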