Original source: http://stackoverflow.com/questions/20176656/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Group variables in a dataframe R using a specific list
Asked by user2904120
I have the following lists:
group1<-c("A", "B", "D")
group2<-c("C", "E")
group3<-c("F")
and a dataframe with values and corresponding names:
df <- data.frame (name=c("A","B","C","D","E","F"),value=c(1,2,3,4,5,6))
df
  name value
1    A     1
2    B     2
3    C     3
4    D     4
5    E     5
6    F     6
I'd like to group the data based on the lists, using the name column;
df
  name value  group
1    A     1 group1
2    B     2 group1
3    C     3 group2
4    D     4 group1
5    E     5 group2
6    F     6 group3
and sum the values for each group.
df
   group sum
1 group1   7
2 group2   8
3 group3   6
I've searched for similar posts, but couldn't adapt them to my problem.
Accepted answer by Jilber Urbina
Here's an approach. First, use ifelse to assign groups to each name, then use aggregate to get the sum for each group.
> df$group <- with(df, ifelse(name %in% group1, "group1",
+                      ifelse(name %in% group2, "group2", "group3")))
> aggregate(value ~ group, sum, data=df)
   group value
1 group1     7
2 group2     8
3 group3     6
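The nested ifelse above sends every name that is not in group1 or group2 to "group3", and it needs one more level of nesting for each extra group. A minimal sketch of an alternative, assuming the groups are collected in a named list (the names groups and lookup are illustrative, not from the original answer), builds a name-to-group lookup vector instead:

```r
# Data from the question
df <- data.frame(name = c("A", "B", "C", "D", "E", "F"),
                 value = c(1, 2, 3, 4, 5, 6),
                 stringsAsFactors = FALSE)

# Assumption: the groups are kept together in one named list
groups <- list(group1 = c("A", "B", "D"),
               group2 = c("C", "E"),
               group3 = c("F"))

# Named character vector: element names are the item names,
# element values are the group labels
lookup <- setNames(rep(names(groups), lengths(groups)),
                   unlist(groups, use.names = FALSE))

# Index the lookup by name; names found in no group become NA
df$group <- unname(lookup[as.character(df$name)])

res <- aggregate(value ~ group, data = df, FUN = sum)
print(res)
```

With this lookup, a name that appears in no group gets NA rather than silently falling into the last group.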
Answer by alexis_laz
Another idea:
df$X <- factor(df$name)
levels(df$X) <- list(group1 = group1, group2 = group2, group3 = group3)
aggregate(df$value, list(group = df$X), sum)
# group x
#1 group1 7
#2 group2 8
#3 group3 6
EDIT
As noted by @thelatemail in the comments below, you can use mget to collect, in a list, all the objects in your workspace named "group" followed by a number, like this:
mget(ls(pattern="group\\d+"))
If, however, you have also loaded, say, a function called "group4", ls() will select that function too. One way to avoid this is to use something like:
.ls <- ls(pattern="group\\d+")
# `mget` only non-functions. You can, of course, exclude any
# other `mode` besides "function" in the same way.
mget(.ls[!.ls %in% apropos("group", mode = "function")])
The list returned by mget can then be used as the levels(df$X).
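Putting the pieces of this answer together, the following sketch collects the group vectors with mget, drops functions with Filter rather than apropos (a swapped-in technique, not the author's), and uses the result as the factor levels. The function group4 and the names env, candidates, and groups are hypothetical, added only for illustration.

```r
group1 <- c("A", "B", "D")
group2 <- c("C", "E")
group3 <- c("F")
group4 <- function() NULL  # hypothetical function that must be skipped

df <- data.frame(name = c("A", "B", "C", "D", "E", "F"),
                 value = c(1, 2, 3, 4, 5, 6))

# Look the objects up in the environment where they were defined
env <- environment()

# Anchored pattern: only names that are exactly "group" plus digits
candidates <- mget(ls(envir = env, pattern = "^group[0-9]+$"), envir = env)

# Keep everything that is not a function
groups <- Filter(Negate(is.function), candidates)

# Assigning a named list to levels() merges the old levels into groups
df$X <- factor(df$name)
levels(df$X) <- groups

res <- aggregate(df$value, list(group = df$X), sum)
print(res)
```

Filtering on the retrieved objects themselves avoids depending on which names apropos happens to find on the search path.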
Answer by TheComeOnMan
I would suggest having your grouping as a data.frame, something along these lines:
# groupno follows the question's lists: A, B, D -> 1; C, E -> 2; F -> 3
grouping <- data.frame(name=c("A","B","C","D","E","F"), groupno=c(1,1,2,1,2,3))
df2 <- merge(df, grouping, by = 'name')
aggregate(value ~ groupno, sum, data=df2)
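If the groups already exist as vectors, the grouping data.frame does not have to be typed by hand. As a rough sketch, assuming the groups are held in a named list (groups is an illustrative name, not part of the original answer), stack can turn that list into a two-column table that merge accepts:

```r
df <- data.frame(name = c("A", "B", "C", "D", "E", "F"),
                 value = c(1, 2, 3, 4, 5, 6),
                 stringsAsFactors = FALSE)

# Assumption: the groups from the question, gathered in a named list
groups <- list(group1 = c("A", "B", "D"),
               group2 = c("C", "E"),
               group3 = c("F"))

# stack() returns a data.frame with columns `values` and `ind`;
# rename them so the key column matches df
grouping <- stack(groups)
names(grouping) <- c("name", "group")

# Join the group label onto each row, then sum per group
df2 <- merge(df, grouping, by = "name")
res <- aggregate(value ~ group, data = df2, FUN = sum)
print(res)
```

Note that merge keeps only matching rows by default, so a name missing from every group would be dropped unless all.x = TRUE is passed.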