Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/13224553/

Time: 2020-09-11 01:55:23  Source: igfitidea

How to convert a huge list-of-vector to a matrix more efficiently?

Tags: r, list, matrix, performance

Asked by user1787675

I have a list of length 130,000 where each element is a character vector of length 110. I would like to convert this list to a matrix with dimensions 1,430,000*10. How can I do it more efficiently? My code is:

output=NULL
for(i in 1:length(z)) {
 output=rbind(output,
              matrix(z[[i]],ncol=10,byrow=TRUE))
}

Answered by flodel

This should be equivalent to your current code, only a lot faster:

output <- matrix(unlist(z), ncol = 10, byrow = TRUE)
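
As a quick sanity check, the one-liner can be compared against the original loop on a small list. This is only a sketch: the toy data and the names loop_out and fast_out are made up for illustration and are not the asker's data.

```r
# Toy stand-in for the real data: 3 character vectors of length 110
# (the real list has 130,000 of them).
z <- lapply(1:3, function(i) paste0("v", i, "_", 1:110))

# Original approach: grow the result one 11 x 10 block at a time.
loop_out <- NULL
for (i in seq_along(z)) {
  loop_out <- rbind(loop_out, matrix(z[[i]], ncol = 10, byrow = TRUE))
}

# Vectorised approach: flatten once, then fill the matrix row by row.
fast_out <- matrix(unlist(z), ncol = 10, byrow = TRUE)

dim(fast_out)                  # 33 10
identical(loop_out, fast_out)  # TRUE
```

The two results agree because each vector's length (110) is a multiple of the row width (10), so no row straddles two list elements.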

Answered by Ben Bolker

I think you want

output <- do.call(rbind,lapply(z,matrix,ncol=10,byrow=TRUE))

i.e. combining @BlueMagister's use of do.call(rbind, ...) with an lapply statement to convert the individual list elements into 11*10 matrices ...
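
The two steps can be looked at separately on a small list. This is a minimal sketch on made-up data; the intermediate name pieces is introduced here for illustration only.

```r
# Toy list: 4 character vectors of length 110 (illustrative data only).
z <- replicate(4, as.character(1:110), simplify = FALSE)

# Step 1: reshape every element into its own 11 x 10 matrix.
pieces <- lapply(z, matrix, ncol = 10, byrow = TRUE)
dim(pieces[[1]])   # 11 10

# Step 2: stack all pieces with a single rbind() call.
output <- do.call(rbind, pieces)
dim(output)        # 44 10
```

Calling rbind() once on all pieces avoids re-copying an ever-growing result on every iteration, which is what the loop version does.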

Benchmarks (showing @flodel's unlist solution is 5x faster than mine, and 230x faster than the original approach ...)

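n <- 1000
z <- replicate(n, matrix(1:110, ncol = 10, byrow = TRUE), simplify = FALSE)
library(rbenchmark)

# Original approach: grow the result with rbind() inside a loop.
origfn <- function(z) {
    output <- NULL
    for (i in seq_along(z)) {
        output <- rbind(output, matrix(z[[i]], ncol = 10, byrow = TRUE))
    }
    output  # return the result (a bare for loop returns NULL)
}
# Reshape each element, then stack everything with one rbind() call.
rbindfn <- function(z) do.call(rbind, lapply(z, matrix, ncol = 10, byrow = TRUE))
# Flatten once, then reshape.
unlistfn <- function(z) matrix(unlist(z), ncol = 10, byrow = TRUE)

# Timing call (an assumed reconstruction; the original call is not shown).
benchmark(origfn(z), rbindfn(z), unlistfn(z), replications = 100)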

##          test replications elapsed relative user.self sys.self 
## 1   origfn(z)          100  36.467  230.804    34.834    1.540  
## 2  rbindfn(z)          100   0.713    4.513     0.708    0.012 
## 3 unlistfn(z)          100   0.158    1.000     0.144    0.008 

If this scales appropriately (i.e. you don't run into memory problems), the full problem is 130 times larger, so 100 replications would take about 130*0.2 seconds = 26 seconds on a comparable machine, i.e. a fraction of a second for a single conversion (I did this on a 2-year-old MacBook Pro).
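
The arithmetic behind this estimate can be sketched as follows. It assumes linear scaling in the list length and that rbenchmark's elapsed column is the total over all 100 replications; the variable names are illustrative.

```r
# Figures taken from the benchmark table (n = 1000 list elements).
elapsed_100_reps <- 0.158   # seconds, unlistfn(z), 100 replications
n_bench <- 1000
n_full  <- 130000

scale <- n_full / n_bench                   # 130

# Assumes linear scaling and that `elapsed` sums over all replications.
full_100_reps <- scale * elapsed_100_reps   # about 20.5 s for 100 runs
full_one_run  <- full_100_reps / 100        # about 0.2 s for one run
```

Rounding 0.158 up to 0.2 before multiplying gives the 26-second figure quoted in the answer.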

Answered by Blue Magister

It would help to have sample information about your output. Recursively using rbind on bigger and bigger things is not recommended. My first guess at something that would help you:

z <- list(1:3,4:6,7:9)
do.call(rbind,z)
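
For reference, this call produces one row per list element. The sketch below uses the same toy list; the names by_element and reshaped are illustrative.

```r
z <- list(1:3, 4:6, 7:9)

# One row per list element: 3 elements of length 3 give a 3 x 3 matrix.
by_element <- do.call(rbind, z)
dim(by_element)    # 3 3

# Same result as flattening and filling row by row.
reshaped <- matrix(unlist(z), nrow = length(z), byrow = TRUE)
identical(by_element, reshaped)   # TRUE
```

Applied directly to the list in the question, this would give a 130,000 x 110 matrix rather than 1,430,000 x 10, so each element still has to be reshaped to 10 columns.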

See a related question for more efficiency, if needed.

Answered by csta

You can also use:

output <- as.matrix(as.data.frame(z))

The memory usage is very similar to that of

output <- matrix(unlist(z), ncol = 10, byrow = TRUE)

This can be verified with mem_change() from library(pryr).
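
One thing worth checking before swapping the two calls is the shape of the result. The sketch below uses a small made-up list and base R's object.size() in place of pryr, so it is a rough comparison rather than the measurement described above; the names via_df and via_unlist are illustrative.

```r
# Toy list: 5 character vectors of length 110 (illustrative data only).
z <- lapply(1:5, function(i) paste0("v", i, "_", 1:110))

via_df     <- as.matrix(as.data.frame(z))
via_unlist <- matrix(unlist(z), ncol = 10, byrow = TRUE)

dim(via_df)       # 110 5: one column per list element
dim(via_unlist)   # 55 10: rows of width 10, as in the question

# Rough size comparison with base R (utils::object.size).
object.size(via_df)
object.size(via_unlist)
```

The data frame route keeps each list element as a column, so it holds the same values but is not a drop-in replacement for the 10-column layout.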

Answered by Ahmed Gehad

You can use as.matrix as below:

output <- as.matrix(z)
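
It may help to inspect what this returns when z is a plain list of vectors. A minimal sketch on toy data (the names m and flat are illustrative):

```r
z <- list(1:3, 4:6, 7:9)

m <- as.matrix(z)
dim(m)       # 3 1
typeof(m)    # "list": each cell still holds a whole vector

# Flattening is still needed to get one value per cell.
flat <- matrix(unlist(z), ncol = 3, byrow = TRUE)
dim(flat)    # 3 3
```

On the list from the question this would therefore give a 130,000 x 1 list-matrix, not a 1,430,000 x 10 character matrix.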