pandas: sum two rows of a dataframe without rearranging the dataframe?

Question

Asked by ale19

I have a dataframe and I'm trying to sum two rows without messing up the order of the rows.

> import pandas as pd

> test = {'counts' : pd.Series([10541,4143,736,18,45690], index=['Daylight','Dawn','Other / unknown','Uncoded & errors','Total']), 'percents' : pd.Series([23.07,9.07,1.61,0.04,100], index=['Daylight','Dawn','Other / unknown','Uncoded & errors','Total'])}

> testdf = pd.DataFrame(test)

                  counts  percents
Daylight           10541     23.07
Dawn                4143      9.07
Other / unknown      736      1.61
Uncoded & errors      18      0.04
Total              45690    100.00

I want this output:

                  counts  percents
Daylight           10541     23.07
Dawn                4143      9.07
Other / unknown      754      1.65   <-- sum of 'other/unknown' and 'uncoded & errors'
Total              45690    100.00

This is as close as I've been able to get:

> sum_ = testdf.loc[['Other / unknown', 'Uncoded & errors']].sum().to_frame().transpose()

     counts   percents
0    754.00   1.65       

> sum_ = sum_.rename(index={0: 'Other / unknown'})

                counts   percents
Other / unknown 754.00   1.65   

> testdf.drop(['Other / unknown', 'Uncoded & errors'], inplace=True)
> testdf = pd.concat([testdf, sum_])  # DataFrame.append was removed in pandas 2.0

Daylight         10541  23.07
Dawn             4143   9.07
Total            45690  100
Other / unknown  754    1.65

But this does not preserve the order of the original rows: the summed row ends up at the bottom, after 'Total'.

I could insert the row by slicing the dataframe and inserting the sum_ row between 'Dawn' and 'Total', but that will not work if the row labels ever change, or if the order of the rows changes, etc. (this is an annual brochure, so the table design might change from year to year), so I'm trying to do this robustly.

Answer 1

Answered by MaxU

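Relabel 'Uncoded & errors' as 'Other / unknown', then use groupby(..., sort=False).sum(); sort=False keeps the groups in order of first appearance: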

In [84]: (testdf.reset_index()
   ....:        .replace({'index': {'Uncoded & errors':'Other / unknown'}})
   ....:        .groupby('index', sort=False).sum()
   ....: )
Out[84]:
                 counts  percents
index
Daylight          10541     23.07
Dawn               4143      9.07
Other / unknown     754      1.65
Total             45690    100.00
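A self-contained variation on the same idea, offered as a sketch rather than as MaxU's exact code. It relabels the index with rename and groups on the index level, so no reset_index step is needed and the result has no 'index' header. The merge_into mapping is a name introduced here only for illustration; it assumes each label to fold away is listed with the label that should absorb it.

```python
import pandas as pd

labels = ['Daylight', 'Dawn', 'Other / unknown', 'Uncoded & errors', 'Total']
testdf = pd.DataFrame(
    {'counts': [10541, 4143, 736, 18, 45690],
     'percents': [23.07, 9.07, 1.61, 0.04, 100]},
    index=labels,
)

# Hypothetical mapping: label to fold away -> label that absorbs it.
merge_into = {'Uncoded & errors': 'Other / unknown'}

# rename() leaves labels that are not in the mapping untouched, so the
# frame temporarily has two rows labelled 'Other / unknown'.
relabelled = testdf.rename(index=merge_into)

# Group on the index itself; sort=False keeps each group at the position
# where its label first appears, so the original row order survives.
merged = relabelled.groupby(level=0, sort=False).sum()
print(merged)
```

Because the rows are matched by label rather than by position, the same mapping should keep working if rows are added or reordered in a later edition of the table, as long as the two labels themselves do not change.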

Answer 2

Answered by peterfields

Although I prefer MaxU's answer, you can also try summing in-place:

testdf.loc['Other / unknown'] += testdf.loc['Uncoded & errors']

Then delete the leftover row by its index label:

testdf.drop(['Uncoded & errors'], inplace=True)

In [28]: testdf
Out[28]: 
                 counts  percents
Daylight          10541     23.07
Dawn               4143      9.07
Other / unknown     754      1.65
Total             45690    100.00
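The two steps above can be wrapped in a small helper that works on a copy, so the original frame is left untouched. This is only a sketch: fold_row and its source/target parameters are hypothetical names, not part of pandas, and it assumes both labels exist in the index (a missing label raises KeyError from .loc).

```python
import pandas as pd


def fold_row(df, source, target):
    """Return a copy of df with row `source` added into row `target`, then dropped."""
    out = df.copy()
    # Add the two rows column by column and store the sum under `target`.
    out.loc[target] = out.loc[target] + out.loc[source]
    # drop() removes only the named row; the remaining rows keep their order.
    return out.drop(index=source)


labels = ['Daylight', 'Dawn', 'Other / unknown', 'Uncoded & errors', 'Total']
testdf = pd.DataFrame(
    {'counts': [10541, 4143, 736, 18, 45690],
     'percents': [23.07, 9.07, 1.61, 0.04, 100]},
    index=labels,
)

result = fold_row(testdf, source='Uncoded & errors', target='Other / unknown')
print(result)
```

Working on a copy avoids the inplace=True mutation, which makes the helper safe to call more than once on the same source table.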
