pandas: count the occurrences of values based on another column

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow. Original question: http://stackoverflow.com/questions/39607540/

Time: 2020-09-14 02:03:34  Source: igfitidea

Count the number of occurrences of values based on another column

python pandas

Asked by Niche.P

I have a question about creating a pandas dataframe from the number of times each value occurs in another column.

For example, I have this dataframe:

 Country   |  Accident
 England      Car
 England      Car
 England      Car
 USA          Car
 USA          Bike
 USA          Plane
 Germany      Car
 Thailand     Plane
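
To try the answers below, the table above can be rebuilt as a dataframe. This is only a minimal sketch: the variable name df and the exact column spellings "Country" and "Accident" are assumptions taken from the table header.

```python
import pandas as pd

# Rebuild the example table; one row per recorded accident.
df = pd.DataFrame({
    "Country": ["England", "England", "England", "USA", "USA", "USA",
                "Germany", "Thailand"],
    "Accident": ["Car", "Car", "Car", "Car", "Bike", "Plane", "Car", "Plane"],
})

print(df)
```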

I want to build another dataframe that holds the total number of accidents per country. The type of accident should be ignored; only the number of rows for each country matters.

My desired dataframe would look like this:

 Country   |  Sum of Accidents
 England      3
 USA          3
 Germany      1
 Thailand     1

Answered by piRSquared

Option 1
Use value_counts

df.Country.value_counts().reset_index(name='Sum of Accidents')
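
A self-contained sketch of Option 1 on the question's data follows. One caveat worth hedging: depending on the pandas version, the index returned by value_counts may be unnamed, in which case reset_index produces a column called "index" instead of "Country". Adding rename_axis("Country") is an addition here, not part of the original answer, and is meant to give the same column names either way.

```python
import pandas as pd

# Example data from the question (column names assumed from the table header).
df = pd.DataFrame({
    "Country": ["England", "England", "England", "USA", "USA", "USA",
                "Germany", "Thailand"],
    "Accident": ["Car", "Car", "Car", "Car", "Bike", "Plane", "Car", "Plane"],
})

counts = (
    df["Country"]
    .value_counts()                        # rows per country, largest count first
    .rename_axis("Country")                # make sure the index is named "Country"
    .reset_index(name="Sum of Accidents")  # index -> column, counts -> named column
)

print(counts)
```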

Option 2
Use groupby then size

df.groupby('Country').size().sort_values(ascending=False) \
  .reset_index(name='Sum of Accidents')
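
As a variation on Option 2 (a sketch, not the original answer's code), passing sort=False to groupby keeps the countries in the order they first appear in the data. For this particular dataset that order happens to match the desired output, so no sort_values step is needed; with other data the counts would not necessarily come out in descending order.

```python
import pandas as pd

# Example data from the question (column names assumed from the table header).
df = pd.DataFrame({
    "Country": ["England", "England", "England", "USA", "USA", "USA",
                "Germany", "Thailand"],
    "Accident": ["Car", "Car", "Car", "Car", "Bike", "Plane", "Car", "Plane"],
})

result = (
    df.groupby("Country", sort=False)      # keep groups in order of first appearance
    .size()                                # number of rows in each group
    .reset_index(name="Sum of Accidents")  # Series -> two-column dataframe
)

print(result)
```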

Answered by Kamehameha

You can use the groupby method.

Example -

In [36]: df.groupby(["country"]).count().sort_values(["accident"], ascending=False).rename(columns={"accident" : "Sum of accidents"}).reset_index()
Out[36]:
    country  Sum of accidents
0   England                 3
1       USA                 3
2   Germany                 1
3  Thailand                 1

Explanation -

(
    df.groupby(["country"])                             # Group the rows by country
    .count()                                            # Count the non-null values in each remaining column, per group
    .sort_values(["accident"], ascending=False)         # Sort by the count, largest first
    .rename(columns={"accident": "Sum of accidents"})   # Rename the count column
    .reset_index()                                      # Move country out of the index; without this it stays as the index
)
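
One difference between this answer's count() and the size() used in the other answer is how missing values are treated: count() skips them, while size() counts every row. The sketch below uses a small hypothetical dataset (not the question's data) with one missing accident type to show the effect.

```python
import pandas as pd

# Hypothetical data: the second England row has no accident type recorded.
df = pd.DataFrame({
    "country": ["England", "England", "USA"],
    "accident": ["Car", None, "Bike"],
})

grouped = df.groupby("country")

non_null = grouped["accident"].count()  # missing values are skipped
rows = grouped.size()                   # every row is counted

print(non_null)
print(rows)
```

If the goal is the number of rows per country regardless of missing values, size() is the safer choice.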