How to avoid loops in R: selecting items from a list

Question

Asked by JD Long

I could solve this using loops, but I am trying to think in vectors so that my code will be more R-esque.

I have a list of names. The format is firstname_lastname. I want to get out of this list a separate list with only the first names. I can't seem to get my mind around how to do this. Here's some example data:

t <- c("bob_smith","mary_jane","jose_chung","michael_marx","charlie_ivan")
tsplit <- strsplit(t,"_")

which looks like this:

> tsplit
[[1]]
[1] "bob"   "smith"

[[2]]
[1] "mary" "jane"

[[3]]
[1] "jose"  "chung"

[[4]]
[1] "michael" "marx"   

[[5]]
[1] "charlie" "ivan"

I could get out what I want using loops like this:

for (i in 1:length(tsplit)){
    if (i==1) {t_out <- tsplit[[i]][1]} else{t_out <- append(t_out, tsplit[[i]][1])} 
}

which would give me this:

t_out
[1] "bob"     "mary"    "jose"    "michael" "charlie"

So how can I do this without loops?

Answer 1

Accepted answer by liebke

You can use apply (or sapply):

t <- c("bob_smith","mary_jane","jose_chung","michael_marx","charlie_ivan")
f <- function(s) strsplit(s, "_")[[1]][1]
sapply(t, f)

   bob_smith    mary_jane   jose_chung michael_marx charlie_ivan 
       "bob"       "mary"       "jose"    "michael"    "charlie" 

See: A brief introduction to “apply” in R

Answer 2

Answer by hadley

And one more approach:

t <- c("bob_smith","mary_jane","jose_chung","michael_marx","charlie_ivan")
pieces <- strsplit(t,"_")
sapply(pieces, "[", 1)

In words, the last line extracts the first element of each component of the list and then simplifies it into a vector.

How does this work? Well, you need to realise that an alternative way of writing x[1] is "["(x, 1), i.e. there is a function called [ that does subsetting. The sapply call calls this function once for each element of the original list, passing in two arguments: the list element and 1.

The advantage of this approach over the others is that you can extract multiple elements from the list without having to recompute the splits. For example, the last name would be sapply(pieces, "[", 2). Once you get used to this idiom, it's pretty easy to read.
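
To make the equivalence concrete, here is a minimal sketch. The variable names x, names_in, first and last are illustrative and not part of the answer; the point is only that [ can be called like any other function, and that a single split can serve both name parts.

```r
x <- c("bob", "smith")

# Ordinary subsetting and the function-call form give the same result
identical(x[1], `[`(x, 1))   # TRUE

names_in <- c("bob_smith", "mary_jane", "jose_chung")
pieces <- strsplit(names_in, "_")

# sapply() passes the extra argument (1 or 2) on to `[` for each list element
first <- sapply(pieces, `[`, 1)
last  <- sapply(pieces, `[`, 2)
```

Backticks and double quotes both work for naming the function here; the split itself is computed only once.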

Answer 3

Answer by William Doane

How about:

tlist <- c("bob_smith","mary_jane","jose_chung","michael_marx","charlie_ivan")
fnames <- gsub("(_.*)$", "", tlist)
# _.* matches the underscore followed by a string of characters
# the $ anchors the search at the end of the input string
# so, underscore followed by a string of characters followed by the end of the input string

for the RegEx approach?
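
As a rough illustration of how this pattern behaves on less tidy input, the sketch below adds two made-up names (one with no underscore, one with several) that are not in the original data. It uses sub() rather than gsub(), which should be equivalent here because the match always runs to the end of the string.

```r
tlist <- c("bob_smith", "mary_jane", "cher", "jean_claude_van_damme")

# The match starts at the first underscore and ".*" consumes the rest,
# so everything from the first underscore onward is dropped.
# A string with no underscore does not match and is returned unchanged.
fnames <- sub("_.*$", "", tlist)
fnames
# "bob" "mary" "cher" "jean"
```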

Answer 4

Answer by Karsten

What about:

t <- c("bob_smith","mary_jane","jose_chung","michael_marx","charlie_ivan")

sub("_.*", "", t)

Answer 5

Answer by Matt Parker

I doubt this is the most elegant solution, but it beats looping:

t.df <- data.frame(tsplit)
t.df[1, ]

Converting lists to data frames is about the only way I can get them to do what I want. I'm looking forward to reading answers by people who actually understand how to handle lists.

Answer 6

Answer by Dirk Eddelbuettel

You almost had it. It really is just a matter of

using one of the *apply functions to loop over your existing list (I often start with lapply and sometimes switch to sapply)
adding an anonymous function that operates on one of the list elements at a time
you already knew it was strsplit(string, splitterm) and that you need the odd [[1]][1] to pick off the first term of the answer
just put it all together, starting with a preferred variable name (as we stay clear of t or c and friends)

which gives

> tlist <- c("bob_smith","mary_jane","jose_chung","michael_marx","charlie_ivan") 
> fnames <- sapply(tlist, function(x) strsplit(x, "_")[[1]][1]) 
> fnames 
  bob_smith    mary_jane   jose_chung michael_marx charlie_ivan   
      "bob"       "mary"       "jose"    "michael"    "charlie" 
>
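
The difference between lapply and sapply mentioned above can be sketched as follows. The helper name first_name and the shortened input are assumptions made for illustration; USE.NAMES is a documented argument of sapply, used here only to drop the input strings that otherwise appear as labels in the output.

```r
tlist <- c("bob_smith", "mary_jane", "jose_chung")

# Hypothetical helper: split one string and keep the first piece
first_name <- function(x) strsplit(x, "_")[[1]][1]

# lapply() always returns a list, one element per input
as_list <- lapply(tlist, first_name)

# sapply() simplifies to a vector; USE.NAMES = FALSE stops it from
# labelling each result with the corresponding input string
fnames <- sapply(tlist, first_name, USE.NAMES = FALSE)
```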

Answer 7

Answer by brentonk

You could use unlist():

> tsplit <- unlist(strsplit(t,"_"))
> tsplit
 [1] "bob"     "smith"   "mary"    "jane"    "jose"    "chung"   "michael"
 [8] "marx"    "charlie" "ivan"   
> t_out <- tsplit[seq(1, length(tsplit), by = 2)]
> t_out
[1] "bob"     "mary"    "jose"    "michael" "charlie"

There might be a better way to pull out only the odd-indexed entries, but in any case you won't have a loop.
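
One possible alternative for picking the odd-indexed entries is logical recycling. This is a sketch, not part of the answer, and like the seq() version it assumes every name splits into exactly two pieces; the names t_in, flat, first and last are illustrative.

```r
t_in <- c("bob_smith", "mary_jane", "jose_chung")
flat <- unlist(strsplit(t_in, "_"))

# c(TRUE, FALSE) is recycled along the vector, keeping positions 1, 3, 5, ...
first <- flat[c(TRUE, FALSE)]

# c(FALSE, TRUE) keeps positions 2, 4, 6, ...
last <- flat[c(FALSE, TRUE)]
```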

Answer 8

Answer by William Doane

And one other approach, based on brentonk's unlist example...

tlist <- c("bob_smith","mary_jane","jose_chung","michael_marx","charlie_ivan")
tsplit <- unlist(strsplit(tlist,"_"))
fnames <- tsplit[seq_along(tsplit) %% 2 == 1]

Answer 9

Answer by jmc200

I would use the following unlist()-based method:

> t <- c("bob_smith","mary_jane","jose_chung","michael_marx","charlie_ivan")
> tsplit <- strsplit(t,"_")
> 
> x <- matrix(unlist(tsplit), 2)
> x[1,]
[1] "bob"     "mary"    "jose"    "michael" "charlie"

The big advantage of this method is that it solves the equivalent problem for surnames at the same time:

> x[2,]
[1] "smith" "jane"  "chung" "marx"  "ivan"

The downside is that you'll need to be certain that all of the names conform to the firstname_lastname structure; if any don't, this method will break.
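
One way to guard against that, sketched here under the assumption that non-conforming names should simply be set aside, is to check the number of pieces before reshaping. The input with "cher" is made up for illustration, and lengths() needs R 3.2.0 or later.

```r
t_in <- c("bob_smith", "cher", "jose_chung")
pieces <- strsplit(t_in, "_")

# lengths() reports how many pieces each name produced
ok <- lengths(pieces) == 2
ok
# TRUE FALSE TRUE

# Reshape only the entries that conform, and keep the rest for inspection
x <- matrix(unlist(pieces[ok]), nrow = 2)
first <- x[1, ]
bad <- t_in[!ok]
```

Without such a check, matrix() would recycle the odd number of pieces with a warning and the rows would no longer line up with first and last names.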

Answer 10

Answer by Virginie

From the original tsplit list object given at the beginning, this command will do:

unlist(lapply(tsplit,function(x) x[1]))

It extracts the first element of each list element, then converts the resulting list to a vector. Unlisting into a matrix first and then extracting the first column also works, but then you depend on all list elements having the same length. Here is the output:

> tsplit

[[1]]
[1] "bob"   "smith"

[[2]]
[1] "mary" "jane"

[[3]]
[1] "jose"  "chung"

[[4]]
[1] "michael" "marx"   

[[5]]
[1] "charlie" "ivan"   

> lapply(tsplit,function(x) x[1])

[[1]]
[1] "bob"

[[2]]
[1] "mary"

[[3]]
[1] "jose"

[[4]]
[1] "michael"

[[5]]
[1] "charlie"

> unlist(lapply(tsplit,function(x) x[1]))

[1] "bob"     "mary"    "jose"    "michael" "charlie"
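
A closely related variant, offered only as a sketch, is vapply(), which does the extraction and the conversion to a vector in one step and also checks the type of each result. The shortened input below is illustrative.

```r
tsplit <- strsplit(c("bob_smith", "mary_jane", "jose_chung"), "_")

# vapply() works like sapply(), but character(1) declares that each call
# must return exactly one string; anything else raises an error
first <- vapply(tsplit, function(x) x[1], character(1))
```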
