Calculating Percentile in Python Pandas Dataframe
Note: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/44611347/
Asked by mattblack
I'm trying to calculate the percentile of each number within a dataframe and add it to a new column called 'percentile'.
This is my attempt:
import pandas as pd
from scipy import stats
data = {'symbol':'FB','date':['2012-05-18','2012-05-21','2012-05-22','2012-05-23'],'close':[38.23,34.03,31.00,32.00]}
df = pd.DataFrame(data)
close = df['close']
for i in df:
    df['percentile'] = stats.percentileofscore(close,df['close'])
The column is not being filled and results in 'NaN'. This should be fairly easy, but I'm not sure where I'm going wrong.
Thanks in advance for the help.
Accepted answer by Scott Boston
df.close.apply(lambda x: stats.percentileofscore(df.close.sort_values(),x))
or
df.close.rank(pct=True)
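Output of the second form (the first form returns the same values on a 0-100 scale, i.e. 100.0, 75.0, 25.0, 50.0):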
0    1.00
1    0.75
2    0.25
3    0.50
Name: close, dtype: float64
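Neither expression above assigns the result back to the frame, which is what the question asked for. A minimal sketch of that last step, reusing the question's data and assuming the `rank(pct=True)` form is acceptable; the `percentile_pct` column name is hypothetical and only there to show the 0-100 scale:

```python
import pandas as pd

# Same sample data as in the question; the scalar 'symbol' is broadcast to every row.
data = {
    'symbol': 'FB',
    'date': ['2012-05-18', '2012-05-21', '2012-05-22', '2012-05-23'],
    'close': [38.23, 34.03, 31.00, 32.00],
}
df = pd.DataFrame(data)

# rank(pct=True) divides each row's rank by the number of non-null values,
# so the result is a fraction between 0 and 1.
df['percentile'] = df['close'].rank(pct=True)

# Hypothetical extra column: the same ranking on a 0-100 scale,
# which for this data matches what stats.percentileofscore returns per value.
df['percentile_pct'] = df['percentile'] * 100

print(df[['date', 'close', 'percentile', 'percentile_pct']])
```

No loop is needed: `rank` is vectorized over the whole column, so a single assignment fills every row.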