Saving the output of a Pandas groupby operation as CSV

Question

Asked by Tom_Hanks

I have a question about Pandas groupby. I am using IPython Notebook (Python 3).

For example, suppose there is a DataFrame like this:

import pandas as pd
df1 = pd.DataFrame({"Score": ["A", "B", "C", "A", "B", "A"],
                    "Class": ["Physics", "Science", "Chemistry", "Biology", "History", "English"]})

Then I want to group by Score.

df1.groupby("Score")

I need an output file of this, so I tried:

df1.groupby("Score").to_csv("Score.txt",sep="\t")

but this does not work. Does anyone know how to write the result to an output file?

Answer 1

Answered by piRSquared

What you're asking makes no sense, though you may not realize it. groupby creates a staging area in which to perform aggregations or transformations across groups of data. For example, if we wanted to count the number of observations in each group, that would be an aggregation.
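To make the "staging area" idea concrete, here is a minimal sketch using the DataFrame from the question. The names grouped and counts are mine, and the output goes to an in-memory buffer rather than a real file; treat it as an illustration, not the answerer's code.

```python
import io
import pandas as pd

df1 = pd.DataFrame({
    "Score": ["A", "B", "C", "A", "B", "A"],
    "Class": ["Physics", "Science", "Chemistry", "Biology", "History", "English"],
})

# groupby only records how the rows are split; the result is not a DataFrame
grouped = df1.groupby("Score")
print(type(grouped).__name__)

# An aggregation (here: number of rows per group) turns it back into a table
counts = grouped.size().reset_index(name="count")
print(counts)

# The aggregated table can be written out like any other DataFrame
buffer = io.StringIO()
counts.to_csv(buffer, sep="\t", index=False)
```

Replacing the buffer with a path such as "Score.txt" would write the same content to disk.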

Because you expected to be able to output the result as a table, I'm going to guess that you thought groupby actually grouped the rows together. That isn't a bad interpretation of the term if you had never seen it used before, even though it is incorrect. The way to do that is to sort with the sort_values method.

df1.sort_values('Score')

       Class Score
0    Physics     A
3    Biology     A
5    English     A
1    Science     B
4    History     B
2  Chemistry     C

If Score were something that isn't already ordered lexicographically, we could use the categorical type to handle the ordering for us.

# astype('category', categories=...) was removed from pandas; build the ordered categorical directly
score = pd.Categorical(df1.Score, categories=list('ABCDF'), ordered=True)
df1.assign(Score=score).sort_values('Score')

       Class Score
0    Physics     A
3    Biology     A
5    English     A
1    Science     B
4    History     B
2  Chemistry     C
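Since A, B, C already sort alphabetically, the effect of the categorical is invisible in the output above. The following sketch uses made-up data (the Level and Task columns and the Low/Medium/High ordering are my own assumptions, not from the question) to show a case where the custom order differs from the lexicographic one.

```python
import pandas as pd

# Hypothetical data: alphabetical order would give High, Low, Medium
df = pd.DataFrame({
    "Level": ["Medium", "High", "Low", "Medium"],
    "Task": ["a", "b", "c", "d"],
})

# Declare the intended order explicitly
level_order = pd.CategoricalDtype(categories=["Low", "Medium", "High"], ordered=True)

# sort_values follows the category order instead of the alphabet
sorted_df = df.assign(Level=df["Level"].astype(level_order)).sort_values("Level")
print(sorted_df)
```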

Finally, you write the data to the file as you expected:

df1.sort_values('Score').to_csv("Score.txt", sep="\t")
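Note that to_csv also writes the row index as a first column by default. If that column is not wanted, index=False drops it. The sketch below is my own variation on the line above: it writes to an in-memory buffer so the result can be inspected, and uses a stable sort so rows with the same Score keep their original order.

```python
import io
import pandas as pd

df1 = pd.DataFrame({
    "Score": ["A", "B", "C", "A", "B", "A"],
    "Class": ["Physics", "Science", "Chemistry", "Biology", "History", "English"],
})

# Stable sort keeps the original row order within each Score
sorted_df = df1.sort_values("Score", kind="mergesort")

# index=False leaves out the 0, 3, 5, ... row labels
buffer = io.StringIO()
sorted_df.to_csv(buffer, sep="\t", index=False)

lines = buffer.getvalue().splitlines()
print("\n".join(lines))
```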

Answer 2

Answered by YOBEN_S

Here is a solution that I think is close to what you want:

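df1 = df1.reset_index()  # turn the row index into a column named 'index'
# each (Score, index) pair holds exactly one row, so first() simply returns it
df1 = df1.groupby(['Score', 'index']).Class.first().to_frame()
df1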

Out[102]: 
                 Class
Score index           
A     0        Physics
      3        Biology
      5        English
B     1        Science
      4        History
C     2      Chemistry
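The same nested layout can probably be reached without groupby at all. The sketch below is an alternative of my own (not the answerer's code): it moves Score and the old row index into a two-level index, sorts it, and writes the result to an in-memory CSV buffer to show that both index levels end up as columns in the file.

```python
import io
import pandas as pd

df1 = pd.DataFrame({
    "Score": ["A", "B", "C", "A", "B", "A"],
    "Class": ["Physics", "Science", "Chemistry", "Biology", "History", "English"],
})

# Build a (Score, index) MultiIndex and sort by it
nested = df1.reset_index().set_index(["Score", "index"]).sort_index()
print(nested)

# Both index levels are written as leading columns
buffer = io.StringIO()
nested.to_csv(buffer)
lines = buffer.getvalue().splitlines()
```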

Answer 3

Answered by u7102456

You need to say how you want to aggregate the groups: counts, means, or something else.

df1.groupby("Score").count().to_csv('d.csv')
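As a sketch of an aggregation other than a plain count, the example below collects the class names of each Score into one string next to the count. The column names n_classes and classes are hypothetical, and the output is written to an in-memory buffer instead of d.csv.

```python
import io
import pandas as pd

df1 = pd.DataFrame({
    "Score": ["A", "B", "C", "A", "B", "A"],
    "Class": ["Physics", "Science", "Chemistry", "Biology", "History", "English"],
})

# One row per Score: how many classes, and which ones
summary = df1.groupby("Score")["Class"].agg(
    n_classes="count",
    classes=lambda s: ", ".join(s),
)
print(summary)

# The aggregated result is a DataFrame, so to_csv is available
buffer = io.StringIO()
summary.to_csv(buffer, sep="\t")
```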
