Sum of specific rows in a dataframe (Pandas)
Original question: http://stackoverflow.com/questions/50218532/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors on StackOverflow.
Asked by Fares K. A.
I'm given a set of the following data:
week A B C D E
1 243 857 393 621 194
2 644 576 534 792 207
3 946 252 453 547 436
4 560 100 864 663 949
5 712 734 308 385 303
I'm asked to find the sum of each column for specified rows/a specified number of weeks, and then plot those numbers onto a bar chart to compare A-E.
Assuming I have the rows I need (e.g. df.iloc[2:4,:]), what should I do next? My assumption is that I need to create a mask with a single row that includes the sum of each column, but I'm not sure how to go about doing that.
I know how to do the final step (i.e. .plot(kind='bar')); I just need to know what the middle step is to obtain the sums I need.
Accepted answer by jezrael
You can select rows by position with iloc, then chain sum and Series.plot.bar:
df.iloc[2:4].sum().plot.bar()
Or, if you want to select by index labels (here, the weeks), use loc:
df.loc[2:4].sum().plot.bar()
The difference is that iloc selects by zero-based position and excludes the end position, while loc selects by label and includes the end label:
print (df.loc[2:4])
A B C D E
week
2 644 576 534 792 207
3 946 252 453 547 436
4 560 100 864 663 949
print (df.iloc[2:4])
A B C D E
week
3 946 252 453 547 436
4 560 100 864 663 949
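The comparison above can be reproduced with a minimal sketch. The way df is built here is an assumption (the original post does not show it); it simply loads the question's table with week as the index.

```python
import pandas as pd

# Assumed construction of the question's data, with `week` as the index.
df = pd.DataFrame(
    {
        "A": [243, 644, 946, 560, 712],
        "B": [857, 576, 252, 100, 734],
        "C": [393, 534, 453, 864, 308],
        "D": [621, 792, 547, 663, 385],
        "E": [194, 207, 436, 949, 303],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="week"),
)

by_label = df.loc[2:4]      # labels 2 through 4, end label included
by_position = df.iloc[2:4]  # positions 2 and 3, end position excluded

print(by_label.index.tolist())     # [2, 3, 4]
print(by_position.index.tolist())  # [3, 4]
```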
And if you also need to filter columns by position:
df.iloc[2:4, :4].sum().plot.bar()
And by labels (weeks for the rows, names for the columns):
df.loc[2:4, list('ABCD')].sum().plot.bar()
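Note that the two one-liners above do not pick the same rows: loc[2:4] covers weeks 2 to 4, while iloc[2:4] covers only weeks 3 and 4. As a rough sketch (again assuming df is built from the question's table with week as the index, which the post does not show), the positional slice that matches loc[2:4] is iloc[1:4]:

```python
import pandas as pd

# Assumed construction of the question's data, with `week` as the index.
df = pd.DataFrame(
    {
        "A": [243, 644, 946, 560, 712],
        "B": [857, 576, 252, 100, 734],
        "C": [393, 534, 453, 864, 308],
        "D": [621, 792, 547, 663, 385],
        "E": [194, 207, 436, 949, 303],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="week"),
)

# Weeks 2..4, columns A..D, selected by label.
sums_by_label = df.loc[2:4, list("ABCD")].sum()

# The same rows and columns selected by position:
# rows 1..3 (end excluded) and the first four columns.
sums_by_position = df.iloc[1:4, :4].sum()

# Both selections yield the same per-column totals.
print(sums_by_label.equals(sums_by_position))
```

Calling .plot.bar() on either Series then draws one bar per column; that step needs matplotlib installed.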
Answer by sacuL
All you need to do is call .sum() on your subset of the data:
df.iloc[2:4,:].sum()
Returns:
week 7
A 1506
B 352
C 1317
D 1210
E 1385
dtype: int64
Furthermore, for plotting, you can probably drop the week column (the sum of week numbers is unlikely to mean anything):
df.iloc[2:4,1:].sum().plot(kind='bar')
# or
df[list('ABCDE')].iloc[2:4].sum().plot(kind='bar')
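This answer treats week as an ordinary column, which is why it appears in the summed output. A minimal sketch of that situation follows; the DataFrame construction is an assumption, and set_index is shown as one alternative to the two options above, not something the original answer uses.

```python
import pandas as pd

# Assumed construction of the question's data, with `week` as a regular column.
df = pd.DataFrame(
    {
        "week": [1, 2, 3, 4, 5],
        "A": [243, 644, 946, 560, 712],
        "B": [857, 576, 252, 100, 734],
        "C": [393, 534, 453, 864, 308],
        "D": [621, 792, 547, 663, 385],
        "E": [194, 207, 436, 949, 303],
    }
)

# Summing the raw slice also sums the week numbers (3 + 4 = 7).
sums_with_week = df.iloc[2:4, :].sum()

# Alternative (not from the original answer): move `week` into the index
# so that it is no longer part of the summed columns.
sums = df.set_index("week").iloc[2:4].sum()

print(sums_with_week)
print(sums)
```

With week in the index, sums.plot(kind='bar') would show only the A to E bars (matplotlib required).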