Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/20176656/

Date: 2020-09-11 02:07:13  Source: igfitidea

Group variables in a dataframe R using a specific list

Tags: r, list, variables, grouping, aggregate

Asked by user2904120

I have the following lists:

  group1<-c("A", "B", "D")
  group2<-c("C", "E")
  group3<-c("F")

and a dataframe with values and corresponding names:

  df <- data.frame (name=c("A","B","C","D","E","F"),value=c(1,2,3,4,5,6))
  df
    name value
  1    A     1
  2    B     2
  3    C     3
  4    D     4
  5    E     5
  6    F     6

I'd like to group the data based on the lists, using the name column:

  df
    name value    group
  1    A     1   group1
  2    B     2   group1
  3    C     3   group2
  4    D     4   group1
  5    E     5   group2
  6    F     6   group3

and sum the values for each group.

  df
       group sum
  1   group1   7
  2   group2   8
  3   group3   6

I've searched for similar posts, but could not adapt them to my problem.

Accepted answer by Jilber Urbina

Here's an approach. First, use ifelse to assign a group to each name, then use aggregate to get the sum for each group.

> df$group <- with(df, ifelse(name %in% group1, "group1",
                              ifelse(name %in% group2, "group2", "group3" )))
> aggregate(value ~ group, sum, data=df)
   group value
1 group1     7
2 group2     8
3 group3     6
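
Nested ifelse calls grow by one level per group, and the final branch labels every unmatched name "group3". As a rough sketch of the same idea with a lookup vector instead (the `groups` list and `lookup` vector are illustrative names, not part of the original answer):

```r
# Data and groups copied from the question
group1 <- c("A", "B", "D")
group2 <- c("C", "E")
group3 <- c("F")
df <- data.frame(name = c("A", "B", "C", "D", "E", "F"),
                 value = c(1, 2, 3, 4, 5, 6))

# Assumption: the groups are collected by hand into a named list
groups <- list(group1 = group1, group2 = group2, group3 = group3)

# Flatten the list into a named character vector: member name -> group label
lookup <- setNames(rep(names(groups), lengths(groups)),
                   unlist(groups, use.names = FALSE))

# Index the lookup by name; a name found in no group would become NA
df$group <- unname(lookup[as.character(df$name)])

result <- aggregate(value ~ group, data = df, FUN = sum)
print(result)
```

With this layout, adding a fourth group only means adding one element to `groups`.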

Answer by alexis_laz

Another idea:

df$X <- factor(df$name)
levels(df$X) <- list(group1 = group1, group2 = group2, group3 = group3)
aggregate(df$value, list(group = df$X), sum)
#   group x
#1 group1 7
#2 group2 8
#3 group3 6

EDIT

As @thelatemail noted in the comments, you can use mget to collect into a list all the objects in your workspace named "group" followed by digits, like this:

mget(ls(pattern="group\\d+"))

However, if you have loaded, say, a function called "group4", ls() will select it too. One way to avoid this is to use something like:

.ls <- ls(pattern="group\\d+")
mget(.ls[!.ls %in% apropos("group", mode = "function")])  # `mget` only non-functions.
# You can, of course, exclude any other `mode` besides "function".

The list returned by mget can then be used as levels(df$X).
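
Putting the pieces together, a minimal end-to-end sketch might look like the following. The `^` and `$` anchors in the pattern and the `Filter(is.character, ...)` step are assumptions added here as a simpler stand-in for the `apropos` check; they are not part of the original answer.

```r
# Data and groups copied from the question
group1 <- c("A", "B", "D")
group2 <- c("C", "E")
group3 <- c("F")
df <- data.frame(name = c("A", "B", "C", "D", "E", "F"),
                 value = c(1, 2, 3, 4, 5, 6))

# Collect every object whose name is exactly "group" followed by digits
groups <- mget(ls(pattern = "^group\\d+$"))

# Assumption: keep only character vectors, which drops functions and other objects
groups <- Filter(is.character, groups)

# Assigning a named list to levels() merges the old levels into the new ones
df$X <- factor(df$name)
levels(df$X) <- groups

res <- aggregate(df$value, list(group = df$X), sum)
print(res)
```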

Answer by TheComeOnMan

I would suggest keeping your grouping in a data.frame, something along these lines:

grouping <- data.frame(name=c("A","B","C","D","E","F"),groupno=c(1,1,2,1,2,3))  # A, B, D -> 1; C, E -> 2; F -> 3, as in the question
df2 <- merge(df,grouping, by = 'name')
aggregate(value ~ groupno, sum, data=df2)
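
The grouping table does not have to be typed by hand. As a sketch, assuming the three vectors from the question are available, `stack()` can turn a named list of them into a two-column data.frame that is then merged as above (the column names `name` and `group` are chosen here for illustration):

```r
# Data and groups copied from the question
group1 <- c("A", "B", "D")
group2 <- c("C", "E")
group3 <- c("F")
df <- data.frame(name = c("A", "B", "C", "D", "E", "F"),
                 value = c(1, 2, 3, 4, 5, 6))

# stack() returns columns `values` (the members) and `ind` (the list names)
grouping <- stack(list(group1 = group1, group2 = group2, group3 = group3))
names(grouping) <- c("name", "group")

# Join the group labels onto the data, then sum per group
df2 <- merge(df, grouping, by = "name")
res <- aggregate(value ~ group, data = df2, FUN = sum)
print(res)
```

Because `merge` performs an inner join by default, any name that appears in no group would be dropped from `df2`; passing `all.x = TRUE` would keep it with an NA group instead.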