Note: this page is derived from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/34624289/

Time: 2020-09-08 16:29:51  Source: igfitidea

Convert a list of lists to a character vector

Tags: r, string, list, character, sapply

Asked by dan

I have a list of lists of characters. For example:

l <- list(list("A"),list("B"),list("C","D"))

As you can see, some elements are lists of length > 1.

I want to convert this list of lists to a character vector, but I'd like the lists with length > 1 to appear as a single element in the character vector.

The unlist function does not achieve that; instead it returns:

> unlist(l)
[1] "A" "B" "C" "D"

Is there anything faster than:

sapply(l,function(x) paste(unlist(x),collapse=""))

To get my desired result:

"A"  "B"  "CD"

Answered by IRTFM

You can skip the unlist step. You already figured out that paste0 needs a collapse argument (here collapse = "") to "bind" sequential elements of a vector together:

> sapply( l, paste0, collapse="")
[1] "A"  "B"  "CD"
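
The same idea can be sketched with vapply, which additionally checks that every element collapses to exactly one string. Using vapply here is an illustrative variation rather than part of this answer; it relies on paste0 coercing each inner list to a character vector:

```r
l <- list(list("A"), list("B"), list("C", "D"))

# paste0() coerces each inner list to a character vector, so no unlist() is needed.
# character(1L) makes vapply() fail loudly if any result is not a single string.
res <- vapply(l, paste0, character(1L), collapse = "")
res
# [1] "A"  "B"  "CD"
```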

Answered by A5C1D2H2I1M1N2O1R2T1

Here's a variation of @thela's suggestion, if you don't mind a multi-line approach:

x <- lengths(l)                                     ## Get the lengths of each list
l[x > 1] <- lapply(l[x > 1], paste0, collapse = "") ## Paste only those together
unlist(l, use.names = FALSE)                        ## Unlist the result
# [1] "A"  "B"  "CD"

Alternatively, if you don't mind using a package, look at the "stringi" package, specifically stri_flatten, as suggested by @Jota.

Here's a performance comparison:

l <- list(list("A"), list("B"), list("B"), list("B"), list("B"),
          list("C","D"), list("E","F", "G", "H"), 
          as.list(rep(letters,10)), as.list(rep(letters,2)))
l <- unlist(replicate(1000, l, FALSE), recursive = FALSE)

funop <- function() sapply(l,function(x) paste(unlist(x),collapse=""))
fun42 <- function() sapply(l, paste0, collapse="")
funv  <- function() vapply(l, paste0, character(1L), collapse = "")
funam <- function() {
  x <- lengths(l)
  l[x > 1] <- lapply(l[x > 1], paste0, collapse = "")
  unlist(l, use.names = FALSE)
}
funj <- function() sapply(l, stri_flatten)
funamj <- function() {
  x <- lengths(l)
  l[x > 1] <- lapply(l[x > 1], stri_flatten)
  unlist(l, use.names = FALSE)
}

library(stringi)         ## Provides stri_flatten(), used by funj() and funamj()
library(microbenchmark)
microbenchmark(funop(), fun42(), funv(), funam(), funj(), funamj(), times = 20)
# Unit: milliseconds
#      expr      min       lq     mean   median       uq      max neval   cld
#   funop() 78.21822 84.79588 85.30055 85.36399 86.90540 90.48321    20     e
#   fun42() 56.16938 57.35735 61.60008 58.04969 65.82836 81.46482    20    d 
#    funv() 54.64101 56.23245 60.07896 57.26049 63.96815 78.58043    20    d 
#   funam() 45.89760 46.89890 48.99810 47.29617 48.28764 56.92544    20   c  
#    funj() 28.73405 29.94041 32.00676 30.56711 31.11448 39.93765    20  b   
#  funamj() 18.64829 19.01328 21.05989 19.12468 19.52516 32.87569    20 a 


Note: The relative efficiency of this approach would depend on how many list items are going to have length(x) > 1. If most of them are going to be > 1 anyway, then just go with @42-'s approach. stri_flatten only improves performance if you have long character vectors to paste together, as in the sample list used for the above benchmark; otherwise, it doesn't help.
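
As a rough way to judge which case applies, the sketch below measures the share of elements with length > 1 and confirms that the subset-and-paste approach returns the same result as the plain sapply call. The names share_long, by_sapply, l2 and by_subset are illustrative only, and the small list is a toy example, not the benchmark data:

```r
l <- list(list("A"), list("B"), list("C", "D"), list("E", "F", "G"))

# Share of elements that actually need pasting
share_long <- mean(lengths(l) > 1)

# Approach 1: paste every element
by_sapply <- sapply(l, paste0, collapse = "")

# Approach 2: paste only the elements with length > 1, then unlist
x <- lengths(l)
l2 <- l
l2[x > 1] <- lapply(l2[x > 1], paste0, collapse = "")
by_subset <- unlist(l2, use.names = FALSE)

identical(by_sapply, by_subset)
```

When share_long is close to 1, the subsetting step saves little, which is the situation the note above describes.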