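I store the result of a PostgreSQL query as a data.frame in R. One of the "columns" is of the integer[] array type. In R, the RPostgreSQL package imports it as a string.
How can I convert this string column in the data.frame into a column of numeric lists (or into separate numeric columns)?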
Connect and fetch the data
require(RPostgreSQL)
drv = dbDriver("PostgreSQL")
con = dbConnect(drv, host = ..., port = ..., dbname = ..., user = ..., password = ...)
df = dbGetQuery(con, query_string)
dbDisconnect(con)
Minimal working example
library(dplyr)
# randomized arrays of 10 numbers
set.seed(10)
df = data.frame(id = c(1:10)) %>%
  mutate(arrcol = lapply(id, function(X) sample(1:99, 10, replace = TRUE)),
         arrcol = gsub("c(", "{{", arrcol, fixed = TRUE),
         arrcol = gsub(")", "}}", arrcol, fixed = TRUE))
Remove the braces
df$arrcol = gsub(fixed=T, "{", "", df$arrcol)
df$arrcol = gsub(fixed=T, "}", "", df$arrcol)
Convert to a list of numbers
# Attempt 1:
df$arrcol = as.numeric(df$arrcol)
# Error: (list) object cannot be coerced to type 'double'
# Attempt 2:
df$arrcol = lapply(df$arrcol,
                   function(x) strsplit(x, ",", fixed = TRUE))
# no error, but now the data appears to be stored as a list of character lists:
# arrcol[1]: list(c("1", "2", "3", "4", "5",...
# Attempt 3:
df$arrcol = lapply(df$arrcol,
                   function(x) as.numeric(unlist(strsplit(x, ",", fixed = TRUE))))
# this one seems to work
1 Answer
My own best answer:
[Updated to perform all steps in one go]
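A minimal sketch of such a one-step conversion, assuming the column holds strings like "{{1, 2, 3}}" as in the question's example. The three-row sample data here is hypothetical and only illustrates the input shape; the brace removal, splitting, and numeric conversion from the question are combined into a single lapply call.

```r
# Hypothetical sample data mimicking the string-encoded arrays from the question
df <- data.frame(id = 1:3,
                 arrcol = c("{{1, 2, 3}}", "{{4, 5, 6}}", "{{7, 8, 9}}"),
                 stringsAsFactors = FALSE)

# For each string: strip the braces, split on commas, convert the pieces to numeric.
# as.numeric() ignores the leading spaces left behind after the split.
df$arrcol <- lapply(df$arrcol, function(x) {
  as.numeric(unlist(strsplit(gsub("[{}]", "", x), ",", fixed = TRUE)))
})

str(df$arrcol)
```

The result is a list column: each row of df$arrcol holds one numeric vector.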
Or, equivalently:
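One possible equivalent formulation, sketched under the same assumption about the input format ("{{1, 2, 3}}"-style strings, with hypothetical sample data): because gsub and strsplit are both vectorised over a character vector, the per-row anonymous function can be dropped and as.numeric applied to each split result.

```r
# Hypothetical sample data mimicking the string-encoded arrays from the question
df <- data.frame(id = 1:3,
                 arrcol = c("{{1, 2, 3}}", "{{4, 5, 6}}", "{{7, 8, 9}}"),
                 stringsAsFactors = FALSE)

# strsplit() returns one character vector per input string,
# so the whole column can be split in a single call
parts <- strsplit(gsub("[{}]", "", df$arrcol), ",", fixed = TRUE)

# Convert each character vector to numeric and store it as a list column
df$arrcol <- lapply(parts, as.numeric)

str(df$arrcol)
```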
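Or (as long as every array has the same length) you can use this trick (Splitting a dataframe string column into multiple different columns) to parse the strings into separate columns; note that read.table is smart enough to recognize each new variable as an integer.
In addition, you can easily append these columns to the original data.frame as their own columns. The same can be done when the number of new columns is known in advance.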
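The read.table approach can be sketched as follows, assuming every array has the same length and the strings look like "{{1, 2, 3}}". The sample data, the name df_wide, and the arr1..arr3 column names are hypothetical choices for illustration.

```r
# Hypothetical sample data: three arrays of equal length, stored as strings
df <- data.frame(id = 1:3,
                 arrcol = c("{{1, 2, 3}}", "{{4, 5, 6}}", "{{7, 8, 9}}"),
                 stringsAsFactors = FALSE)

# Strip the braces so each string is a plain comma-separated record
cleaned <- gsub("[{}]", "", df$arrcol)

# Each element of `text` is read as one line; columns get default names V1, V2, ...
newcols <- read.table(text = cleaned, sep = ",", strip.white = TRUE)

# Append the parsed columns to a copy of the original data.frame
df_wide <- cbind(df, newcols)

# Alternatively, when the number of new columns is known, name them on assignment
n <- 3
df[paste0("arr", seq_len(n))] <- read.table(text = cleaned, sep = ",",
                                            strip.white = TRUE)

str(df_wide)
str(df)
```

If the arrays differ in length, read.table will fail or pad rows depending on its fill argument, so the list-column approach is the safer choice in that case.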