I have a fairly large dataset that is organized as a list, like this:
set.seed(0)
v <- rnorm(5000)
names(v) <- seq(1001, 6000, 1)
dates <- seq.Date(as.Date('2023-01-01'), by='day', length.out=365)
ls <- list()
ls <- sapply(dates, function(d) {ls[[length(ls) + 1]] <- v; ls})
names(ls) <- dates
str(ls[1:5])
List of 5
$ 2023-01-01: Named num [1:5000] 1.263 -0.326 1.33 1.272 0.415 ...
..- attr(*, "names")= chr [1:5000] "1001" "1002" "1003" "1004" ...
$ 2023-01-02: Named num [1:5000] 1.263 -0.326 1.33 1.272 0.415 ...
..- attr(*, "names")= chr [1:5000] "1001" "1002" "1003" "1004" ...
$ 2023-01-03: Named num [1:5000] 1.263 -0.326 1.33 1.272 0.415 ...
..- attr(*, "names")= chr [1:5000] "1001" "1002" "1003" "1004" ...
$ 2023-01-04: Named num [1:5000] 1.263 -0.326 1.33 1.272 0.415 ...
..- attr(*, "names")= chr [1:5000] "1001" "1002" "1003" "1004" ...
$ 2023-01-05: Named num [1:5000] 1.263 -0.326 1.33 1.272 0.415 ...
..- attr(*, "names")= chr [1:5000] "1001" "1002" "1003" "1004" ...
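As you can see, this is 5000 data points for each day of a year, 1,825,000 data points in total. Let's call each of them "x". For every data point x I want to compute max(c(0.5 - x, 0)). The following code works, but it takes a very long time.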
new <- sapply(names(ls), function(d)
lapply(names(v), function(n) max(c(0.5 - ls[[d]][n], 0))))
rownames(new) <- names(v)
new[1:5, 1:5]
     2023-01-01 2023-01-02 2023-01-03 2023-01-04 2023-01-05
1001 0          0          0          0          0
1002 0.8262334  0.8262334  0.8262334  0.8262334  0.8262334
1003 0          0          0          0          0
1004 0          0          0          0          0
1005 0.08535857 0.08535857 0.08535857 0.08535857 0.08535857
Is there a faster way?
2 Answers
svmlkihl #1
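Use pmax instead of lapply. pmax is vectorized, so a single call returns the element-wise maximum for a whole vector instead of one value at a time.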
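A minimal sketch of what this could look like, assuming the ls and v objects from the question (rebuilt here with setNames and rep so the block runs on its own). It is an illustration, not necessarily the exact code this answer had in mind:

```r
# Rebuild the question's data: one named numeric vector per date.
set.seed(0)
v <- rnorm(5000)
names(v) <- seq(1001, 6000, 1)
dates <- seq.Date(as.Date("2023-01-01"), by = "day", length.out = 365)
ls <- setNames(rep(list(v), length(dates)), as.character(dates))

# pmax() is vectorized: one call per day handles all 5000 values at once,
# replacing the inner lapply() over names(v).
new <- sapply(ls, function(x) pmax(0.5 - x, 0))

# Rows are the IDs (names of v), columns are the dates.
new[1:5, 1:5]
```

Unlike the version in the question, this returns a plain numeric matrix rather than a matrix of list elements, which is usually easier to work with afterwards.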
hlswsv35 #2
You can try:
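One possible way to do this, sketched under the assumption that every daily vector has the same length and the same names, is to bind the list into a single numeric matrix and call pmax once on the whole thing. The names m and res are made up for this illustration:

```r
# Rebuild the question's data: one named numeric vector per date.
set.seed(0)
v <- rnorm(5000)
names(v) <- seq(1001, 6000, 1)
dates <- seq.Date(as.Date("2023-01-01"), by = "day", length.out = 365)
ls <- setNames(rep(list(v), length(dates)), as.character(dates))

# Bind the daily vectors into one 5000 x 365 numeric matrix
# (list names become column names, vector names become row names).
m <- do.call(cbind, ls)

# Apply the formula to every cell in a single vectorized call.
res <- pmax(0.5 - m, 0)
res[1:5, 1:5]
```

Because pmax copies the dim and dimnames of its first argument, res keeps the IDs as row names and the dates as column names.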
This gives:
Benchmark:
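No timings are quoted here. As a rough sketch of how such a comparison could be run, the block below times the element-by-element approach against a pmax-based one using base R's system.time. The problem size is deliberately reduced (500 values, 30 days) so the slow version finishes quickly, and slow, fast, r_slow, r_fast, t_slow and t_fast are names invented for this example:

```r
# Reduced version of the question's data (assumption: 500 values x 30 days).
set.seed(0)
v <- rnorm(500)
names(v) <- seq(1001, 1500, 1)
dates <- seq.Date(as.Date("2023-01-01"), by = "day", length.out = 30)
ls <- setNames(rep(list(v), length(dates)), as.character(dates))

# Element-by-element approach, as in the question; the inner lapply() is
# swapped for sapply() so that the result is a numeric matrix.
slow <- function() sapply(names(ls), function(d)
  sapply(names(v), function(n) max(c(0.5 - ls[[d]][n], 0))))

# Vectorized approach: one pmax() call per day.
fast <- function() sapply(ls, function(x) pmax(0.5 - x, 0))

t_slow <- system.time(r_slow <- slow())
t_fast <- system.time(r_fast <- fast())

# Both versions should agree before their timings are compared.
stopifnot(isTRUE(all.equal(r_slow, r_fast)))

# Elapsed seconds for each approach; the values depend on the machine.
c(slow = t_slow[["elapsed"]], fast = t_fast[["elapsed"]])
```

For more stable numbers, each function could be run several times and the timings averaged, since a single system.time call can be noisy.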