Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/39607540/
Count the number of occurrences of values based on another column
Asked by Niche.P
I have a question about creating a pandas DataFrame based on the sum of another column.
For example, I have this DataFrame:
Country  | Accident
England  | Car
England  | Car
England  | Car
USA      | Car
USA      | Bike
USA      | Plane
Germany  | Car
Thailand | Plane
I want to make another DataFrame that holds the total number of accidents for each country. The type of accident is disregarded; all accidents are simply summed per country.
My desired DataFrame would look like this:
Country  | Sum of Accidents
England  | 3
USA      | 3
Germany  | 1
Thailand | 1
Answered by piRSquared
Answered by Kamehameha
You can use the groupby method.
Example:
In [36]: df.groupby(["country"]).count().sort_values(["accident"], ascending=False).rename(columns={"accident" : "Sum of accidents"}).reset_index()
Out[36]:
    country  Sum of accidents
0   England                 3
1       USA                 3
2   Germany                 1
3  Thailand                 1
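To try the example above end to end, here is a minimal sketch that rebuilds the question's data first. It assumes lowercase column names `country` and `accident`, matching the answer's code rather than the capitalized headers shown in the question; the variable name `result` is only for illustration.

```python
import pandas as pd

# Sample data from the question (column names lowercased to match the answer's code)
df = pd.DataFrame({
    "country": ["England", "England", "England", "USA", "USA", "USA", "Germany", "Thailand"],
    "accident": ["Car", "Car", "Car", "Car", "Bike", "Plane", "Car", "Plane"],
})

result = (
    df.groupby(["country"])
    .count()
    .sort_values(["accident"], ascending=False)
    .rename(columns={"accident": "Sum of accidents"})
    .reset_index()
)
print(result)
```

Countries with the same count (here England and USA, or Germany and Thailand) may come out in either order, since the sort only looks at the count.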
Explanation:
(
    df.groupby(["country"])  # Group the rows by country
    .count()  # Aggregation: counts the non-null values in each remaining column, per country
    .sort_values(["accident"], ascending=False)  # Sort by the count, largest first
    .rename(columns={"accident": "Sum of accidents"})  # Rename the count column
    .reset_index()  # Without this, country stays as the index instead of a regular column
)
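One detail about the count() step: it counts non-null values, so a row whose accident value is missing would not be included in that country's total. If every row should count, groupby(...).size() is a possible variant. The sketch below is not part of the original answer; the small DataFrame with a missing value and the names `by_count`, `by_size` and `counts` are hypothetical, chosen only to show the difference.

```python
import pandas as pd

# Hypothetical data with one missing accident type
df = pd.DataFrame({
    "country": ["England", "England", "USA"],
    "accident": ["Car", None, "Bike"],
})

# count() skips the missing value; size() counts every row in the group
by_count = df.groupby("country")["accident"].count()
by_size = df.groupby("country").size()

# Same shape as the answer's result: one row per country, largest total first
counts = by_size.sort_values(ascending=False).reset_index(name="Sum of accidents")
print(counts)
```

With the question's data, which has no missing values, both approaches give the same totals.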