Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/50218532/

Time: 2020-09-14 05:32:27  Source: igfitidea

Sum of specific rows in a dataframe (Pandas)

python, python-3.x, pandas

Asked by Fares K. A.

I'm given a set of the following data:

week  A      B      C      D      E
1     243    857    393    621    194
2     644    576    534    792    207
3     946    252    453    547    436
4     560    100    864    663    949
5     712    734    308    385    303

I'm asked to find the sum of each column for specified rows/a specified number of weeks, and then plot those numbers onto a bar chart to compare A-E.

Assuming I have the rows I need (e.g. df.iloc[2:4,:]), what should I do next? My assumption is that I need to create a mask with a single row that includes the sum of each column, but I'm not sure how I go about doing that.

I know how to do the final step (i.e. .plot(kind='bar')); I just need to know the middle step that produces the sums I need.

Accepted answer by jezrael

You can select rows by position with iloc, then chain sum and Series.plot.bar:

df.iloc[2:4].sum().plot.bar()

graph1
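
For readers who want to reproduce this, here is a minimal sketch that rebuilds the question's table and computes the sums for the rows at positions 2 and 3. It assumes week is the index (as the printed output further down suggests); the variable name sums is only illustrative. The plotting call is left as a comment because it needs matplotlib installed.

```python
import pandas as pd

# Sample data from the question; week is assumed to be the index.
df = pd.DataFrame(
    {
        "A": [243, 644, 946, 560, 712],
        "B": [857, 576, 252, 100, 734],
        "C": [393, 534, 453, 864, 308],
        "D": [621, 792, 547, 663, 385],
        "E": [194, 207, 436, 949, 303],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="week"),
)

# Rows at positions 2 and 3 (weeks 3 and 4), summed column by column.
sums = df.iloc[2:4].sum()
print(sums)

# sums.plot.bar()  # draws the bar chart; requires matplotlib
```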

Or, if you want to select by index labels (here, weeks), use loc:

df.loc[2:4].sum().plot.bar()

graph2

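The difference is that iloc slices by zero-based position and excludes the end position, while loc slices by label and includes the end label: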

print (df.loc[2:4])
        A    B    C    D    E
week                         
2     644  576  534  792  207
3     946  252  453  547  436
4     560  100  864  663  949

print (df.iloc[2:4])
        A    B    C    D    E
week                         
3     946  252  453  547  436
4     560  100  864  663  949
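
The two printed frames can be checked programmatically. The following is a small sketch, again assuming week is the index; by_label, by_position and same_rows are illustrative names, not part of the original answer.

```python
import pandas as pd

# Sample data from the question; week is assumed to be the index.
df = pd.DataFrame(
    {
        "A": [243, 644, 946, 560, 712],
        "B": [857, 576, 252, 100, 734],
        "C": [393, 534, 453, 864, 308],
        "D": [621, 792, 547, 663, 385],
        "E": [194, 207, 436, 949, 303],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="week"),
)

by_label = df.loc[2:4]      # labels 2, 3 and 4: the end label is included
by_position = df.iloc[2:4]  # positions 2 and 3: the end position is excluded

print(list(by_label.index))
print(list(by_position.index))

# Shifting the positional slice down by one selects the same rows as loc[2:4].
same_rows = df.iloc[1:4].equals(df.loc[2:4])
print(same_rows)
```

Because the week labels start at 1 while positions start at 0, the same slice bounds pick different rows in the two accessors.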


If you also need to filter columns by position:

df.iloc[2:4, :4].sum().plot.bar()  

And by labels (week labels for the rows, names for the columns):

df.loc[2:4, list('ABCD')].sum().plot.bar()
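
If the same selection is needed for several week ranges, the label-based form could be wrapped in a small helper. This is only a sketch: sum_weeks is a hypothetical function (not a pandas API), and it assumes week is a sorted index so that a label slice is well defined.

```python
import pandas as pd

# Sample data from the question; week is assumed to be the index.
df = pd.DataFrame(
    {
        "A": [243, 644, 946, 560, 712],
        "B": [857, 576, 252, 100, 734],
        "C": [393, 534, 453, 864, 308],
        "D": [621, 792, 547, 663, 385],
        "E": [194, 207, 436, 949, 303],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="week"),
)


def sum_weeks(frame, first_week, last_week, columns=None):
    """Hypothetical helper: sum each column over first_week..last_week, inclusive."""
    subset = frame.loc[first_week:last_week]  # label slice, both ends included
    if columns is not None:
        subset = subset[list(columns)]        # optional column filter by name
    return subset.sum()


all_columns = sum_weeks(df, 3, 4)
some_columns = sum_weeks(df, 2, 4, columns=["A", "B"])
print(all_columns)
print(some_columns)
```

The result is a Series indexed by column name, so calling .plot.bar() on it (with matplotlib installed) gives one bar per column.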

Answer by sacuL

All you need to do is call .sum() on your subset of the data:

df.iloc[2:4,:].sum()

Returns:

week       7
A       1506
B        352
C       1317
D       1210
E       1385
dtype: int64

Furthermore, for plotting, I think you can probably drop the week column (the sum of week numbers is unlikely to mean anything):

df.iloc[2:4,1:].sum().plot(kind='bar')
# or
df[list('ABCDE')].iloc[2:4].sum().plot(kind='bar')
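
Another option, sketched here as an assumption rather than as part of the original answer, is to move week into the index with set_index so that it never enters the sum. The example treats week as an ordinary column, which is what the output above (week 7) implies; with_week and without_week are illustrative names.

```python
import pandas as pd

# Sample data from the question, with week kept as a regular column.
df = pd.DataFrame(
    {
        "week": [1, 2, 3, 4, 5],
        "A": [243, 644, 946, 560, 712],
        "B": [857, 576, 252, 100, 734],
        "C": [393, 534, 453, 864, 308],
        "D": [621, 792, 547, 663, 385],
        "E": [194, 207, 436, 949, 303],
    }
)

# Summing the raw slice also adds up the week numbers (3 + 4 = 7).
with_week = df.iloc[2:4].sum()

# With week as the index, only the data columns A to E are summed.
without_week = df.set_index("week").iloc[2:4].sum()

print(with_week)
print(without_week)
```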

plot
