Modifying the output of Python Pandas describe

Question

Asked by KHibma

Is there a way to omit some of the output from the pandas describe? This command gives me exactly what I want with a table output (count and mean of executeTime's by a simpleDate)

df.groupby('simpleDate').executeTime.describe().unstack(1)

However that's all I want, count and mean. I want to drop std, min, max, etc... So far I've only read how to modify column size.

I'm guessing the answer is going to be to rewrite the line without describe, but I haven't had any luck grouping by simpleDate and getting the count together with a mean on executeTime.

I can do count by date:

df.groupby(['simpleDate']).size()

or executeTime by date:

df.groupby(['simpleDate']).mean()['executeTime'].reset_index()

But can't figure out the syntax to combine them.

My desired output:

            count  mean  
09-10-2013      8  20.523   
09-11-2013      4  21.112  
09-12-2013      3  18.531
...            ..  ...

Answer 1

Accepted answer by Jeff

describe returns a Series, so you can select just the entries you want:

In [6]: s = Series(np.random.rand(10))

In [7]: s
Out[7]: 
0    0.302041
1    0.353838
2    0.421416
3    0.174497
4    0.600932
5    0.871461
6    0.116874
7    0.233738
8    0.859147
9    0.145515
dtype: float64

In [8]: s.describe()
Out[8]: 
count    10.000000
mean      0.407946
std       0.280562
min       0.116874
25%       0.189307
50%       0.327940
75%       0.556053
max       0.871461
dtype: float64

In [9]: s.describe()[['count','mean']]
Out[9]: 
count    10.000000
mean      0.407946
dtype: float64
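
The same selection can be reproduced with fixed values so the result is easy to check. This is a minimal sketch; the sample numbers are made up for illustration and do not come from the original post.

```python
import pandas as pd

# Hypothetical sample values, chosen so the statistics are easy to verify.
s = pd.Series([1.0, 2.0, 3.0, 4.0])

# describe() on a Series returns another Series indexed by statistic name,
# so a list of labels picks out just those entries.
selected = s.describe()[["count", "mean"]]
print(selected)
```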

Answer 2

Answer by Rafa

.describe() generates a DataFrame in which count, std, max, ... are values of the index, so according to the documentation you should select them with .loc, for example:

df.describe().loc[['count','max']]
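
A self-contained version of this row selection might look like the sketch below. The frame and its column names (a, b) are hypothetical and only serve to make the example runnable.

```python
import pandas as pd

# Hypothetical frame; the column names are illustrative only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10.0, 20.0, 60.0]})

# The rows of describe() are labelled by statistic name,
# so .loc with a list of labels keeps only those rows.
stats = df.describe().loc[["count", "max"]]
print(stats)
```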

Answer 3

Answer by st19297

The solution @Jeff provided works only for a Series.

@Rafa is on point: df.describe().info() reveals that the resulting DataFrame has Index: 8 entries, count to max

df.describe().loc[['count','max']] does work, but df.groupby('simpleDate').describe().loc[['count','max']], which is what the OP asked about, does not.

I think a solution may be this:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Y': ['A', 'B', 'B', 'A', 'B'],
                   'Z': [10, 5, 6, 11, 12]})

Grouping df by Y:

df_grouped = df.groupby(by='Y')

In [207]: df_grouped.agg([np.mean, len])
Out[207]: 
        Z    
     mean len
Y            
A  10.500   2
B   7.667   3
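
Applied to the column names from the question, the same idea might look like the sketch below. It assumes a frame with simpleDate and executeTime columns (the values here are invented) and passes the aggregations as the strings 'count' and 'mean', which agg also accepts.

```python
import pandas as pd

# Hypothetical data using the column names from the question.
df = pd.DataFrame({
    "simpleDate": ["09-10-2013", "09-10-2013", "09-11-2013"],
    "executeTime": [20.0, 22.0, 21.0],
})

# One row per date, with only the two requested statistics as columns.
summary = df.groupby("simpleDate")["executeTime"].agg(["count", "mean"])
print(summary)
```

Note that 'count' skips missing values, whereas len counts every row in the group, so the two can differ when executeTime contains NaN.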

Answer 4

Answer by Geoff Counihan

Sticking with describe, you can also unstack the index and then slice as usual:

df.describe().unstack()[['count','max']]
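
For a grouped column, a hedged alternative that still uses describe is sketched below. It assumes a recent pandas release, where describe on a grouped column returns one row per group and one column per statistic; older releases may return a different shape. The sample data reuses the Y and Z columns from the st19297 answer.

```python
import pandas as pd

df = pd.DataFrame({"Y": ["A", "B", "B", "A", "B"],
                   "Z": [10, 5, 6, 11, 12]})

# Assumption: describe() here yields a DataFrame indexed by group key
# with the statistic names (count, mean, std, ...) as columns.
described = df.groupby("Y")["Z"].describe()

# Keep only the statistics of interest by selecting columns.
subset = described[["count", "mean"]]
print(subset)
```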
