pandas: calculating percentiles for specific groups

Question

Asked by Anu

I have 3 columns: Product Id, Price, and Group (values A, B, C, D).

I want to get the price percentile for each group, and I am running the following code.

for group, price in df.groupby(['group']):
    df['percentile'] = np.percentile(df['price'],60)

The percentile column has only one value, 3.44, for every group. The expected values for each group were 2.12, 3.43, 3.65, 4.76, 8.99.

What is going wrong here? Please let me know.

Answer 1

Answered by jezrael

I think that inside the loop you should use not the whole DataFrame df with column price, but the per-group DataFrame price with column price:

import pandas as pd
import numpy as np

np.random.seed(1)
df = pd.DataFrame(np.random.randint(10, size=(5,3)))
df.columns = ['Product Id','group','price']
print(df)
   Product Id  group  price
0           5      8      9
1           5      0      0
2           1      7      6
3           9      2      4
4           5      2      4

# Wrong: this takes the percentile of the whole column on every iteration
for group, price in df.groupby(['group']):
    print(np.percentile(df['price'], 60))
4.8
4.8
4.8
4.8

# Right: this takes the percentile of each group's own prices
for group, price in df.groupby(['group']):
    print(np.percentile(price['price'], 60))
0.0
4.0
6.0
9.0
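
To make the difference between the two loops easier to see, here is a minimal sketch using hypothetical data (the group labels and prices below are made up for illustration and do not come from the question). It groups by a plain string key, so each loop variable is the group label itself.

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen so the two groups have clearly different prices
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B"],
    "price": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})

# Percentile of the whole column: one number, regardless of group
whole = np.percentile(df["price"], 60)

# Percentile of each group's own prices: one number per group
per_group = {}
for name, sub in df.groupby("group"):
    per_group[name] = np.percentile(sub["price"], 60)

print(whole)
print(per_group)
```

Note that the loop in the question also assigns to df['percentile'] on every pass, so the whole column is overwritten each time with the same whole-column value.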

Another solution with np.percentile, where the output is a Series:

print(df.groupby(['group'])['price'].apply(lambda x: np.percentile(x, 60)))
group
0    0.0
2    4.0
7    6.0
8    9.0
Name: price, dtype: float64

Solution with DataFrameGroupBy.quantile:


print(df.groupby(['group'])['price'].quantile(.6))
group
0    0.0
2    4.0
7    6.0
8    9.0
Name: price, dtype: float64
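
As a quick check that the two approaches agree, the sketch below rebuilds the 5-row frame from the printed table above (typed in by hand rather than regenerated from the random seed, which is an assumption of this example) and compares np.percentile(x, 60) with quantile(.6). Both use linear interpolation by default, so the results should match.

```python
import numpy as np
import pandas as pd

# Same rows as the 5-row example table, entered literally
df = pd.DataFrame({
    "Product Id": [5, 5, 1, 9, 5],
    "group": [8, 0, 7, 2, 2],
    "price": [9, 0, 6, 4, 4],
})

grouped = df.groupby("group")["price"]

# 60th percentile per group via NumPy (percent scale, 0-100)
via_numpy = grouped.apply(lambda x: np.percentile(x, 60))

# 60th percentile per group via pandas (fraction scale, 0-1)
via_pandas = grouped.quantile(0.6)

print(np.allclose(via_numpy, via_pandas))
```

The only thing to watch is the scale: np.percentile takes 60, while quantile takes 0.6.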

EDIT (from a comment):

If you need a new column, use transform (see the docs):

>>> np.random.seed(1)
>>> df = pd.DataFrame(np.random.randint(10,size=(20,3)))
>>> df.columns = ['Product Id','group','price']
>>> df
    Product Id  group  price
0            5      8      9
1            5      0      0
2            1      7      6
3            9      2      4
4            5      2      4
5            2      4      7
6            7      9      1
7            7      0      6
8            9      9      7
9            6      9      1
10           0      1      8
11           8      3      9
12           8      7      3
13           6      5      1
14           9      3      4
15           8      1      4
16           0      3      9
17           2      0      4
18           9      2      7
19           7      9      8
>>> df['percentil'] = df.groupby(['group'])['price'].transform(lambda x: x.quantile(.6))

>>> df
    Product Id  group  price  percentil
0            5      8      9        9.0
1            5      0      0        4.4
2            1      7      6        4.8
3            9      2      4        4.6
4            5      2      4        4.6
5            2      4      7        7.0
6            7      9      1        5.8
7            7      0      6        4.4
8            9      9      7        5.8
9            6      9      1        5.8
10           0      1      8        6.4
11           8      3      9        9.0
12           8      7      3        4.8
13           6      5      1        1.0
14           9      3      4        9.0
15           8      1      4        6.4
16           0      3      9        9.0
17           2      0      4        4.4
18           9      2      7        4.6
19           7      9      8        5.8
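
An alternative sketch that should produce the same column: compute one value per group first, then broadcast it back onto the rows with Series.map. The column names percentil_map and percentil_transform are hypothetical names used only for this comparison, and the data is the first five rows of the table above, entered by hand.

```python
import pandas as pd

df = pd.DataFrame({
    "group": [8, 0, 7, 2, 2],
    "price": [9, 0, 6, 4, 4],
})

# One 60% quantile per group, indexed by group label
per_group = df.groupby("group")["price"].quantile(0.6)

# Look up each row's group in that Series to get a row-aligned column
df["percentil_map"] = df["group"].map(per_group)

# The transform approach from the answer, for comparison
df["percentil_transform"] = (
    df.groupby("group")["price"].transform(lambda x: x.quantile(0.6))
)

print(df)
```

The map version keeps the per-group result around as its own Series, which can be handy if the same values are needed elsewhere.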

Answer 2

Answered by Sam

You could try pandas quantile:

df[['group', 'price']].groupby('group').quantile(.6)

Return values at the given quantile over requested axis, a la numpy.percentile.
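
If several quantiles are needed at once, the groupby quantile method also accepts a list. The sketch below uses hypothetical data (labels and prices invented for illustration) and unstacks the result so that each requested quantile becomes a column.

```python
import pandas as pd

# Hypothetical data, not from the question
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B"],
    "price": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})

# A list of quantiles gives a Series indexed by (group, quantile)
stacked = df.groupby("group")["price"].quantile([0.25, 0.6])

# Pivot the quantile level into columns: one row per group
table = stacked.unstack()

print(table)
```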
