Python Pandas DataFrame: replacing values below a threshold

Question

Asked by J-H

How can I apply a function element-wise to a pandas DataFrame and pass a column-wise calculated value (e.g. quantile of column)? For example, what if I want to replace all elements in a DataFrame (with NaN) where the value is lower than the 80th percentile of the column?

def _deletevalues(x, quantile):
    if x < quantile:
        return np.nan
    else:
        return x

df.applymap(lambda x: _deletevalues(x, x.quantile(0.8)))

Using applymap only gives access to each value individually, so this (of course) raises AttributeError: 'float' object has no attribute 'quantile'.

Thank you in advance.

Answer 1

Accepted answer by MaxU

In [139]: df
Out[139]:
   a  b  c
0  1  7  3
1  1  2  6
2  3  0  5
3  8  2  1
4  7  3  5
5  6  7  2
6  0  2  1
7  8  4  1
8  5  0  6
9  7  7  6

For all columns (quantile() with no argument defaults to q=0.5, the median; pass 0.8 for the 80th percentile):

In [145]: df.apply(lambda x: np.where(x < x.quantile(),np.nan,x))
Out[145]:
     a    b    c
0  NaN  7.0  NaN
1  NaN  NaN  6.0
2  NaN  NaN  5.0
3  8.0  NaN  NaN
4  7.0  3.0  5.0
5  6.0  7.0  NaN
6  NaN  NaN  NaN
7  8.0  4.0  NaN
8  NaN  NaN  6.0
9  7.0  7.0  6.0
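
The transcript above uses the default quantile (the median). As a minimal sketch of the same idea at the 80th percentile asked about in the question: apply hands each column to the function as a Series, so quantile is available there, which is not the case for the scalars that applymap passes. The helper name keep_top and its q parameter are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Sample data copied from the answer's DataFrame.
df = pd.DataFrame({
    "a": [1, 1, 3, 8, 7, 6, 0, 8, 5, 7],
    "b": [7, 2, 0, 2, 3, 7, 2, 4, 0, 7],
    "c": [3, 6, 5, 1, 5, 2, 1, 1, 6, 6],
})

def keep_top(col, q=0.8):  # hypothetical helper, not part of pandas
    """Keep values at or above the column's q-th quantile; others become NaN."""
    return col.where(col >= col.quantile(q))

# apply calls keep_top once per column, passing the column as a Series.
result = df.apply(keep_top)
print(result)
```

Series.where keeps the original index, so the result lines up with df without any extra work.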

Or, assigning in place with a boolean mask:

In [149]: df[df < df.quantile()] = np.nan

In [150]: df
Out[150]:
     a    b    c
0  NaN  7.0  NaN
1  NaN  NaN  6.0
2  NaN  NaN  5.0
3  8.0  NaN  NaN
4  7.0  3.0  5.0
5  6.0  7.0  NaN
6  NaN  NaN  NaN
7  8.0  4.0  NaN
8  NaN  NaN  6.0
9  7.0  7.0  6.0
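
The comparison df < df.quantile() works column by column because pandas aligns the Series of thresholds with the DataFrame's column labels. The sketch below spells that alignment out with DataFrame.lt and uses the 80th percentile from the question. It assumes you want to leave the original frame untouched, so it works on a float copy (NaN cannot be stored in an integer column). The names thresholds, below and trimmed are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data copied from the answer's DataFrame.
df = pd.DataFrame({
    "a": [1, 1, 3, 8, 7, 6, 0, 8, 5, 7],
    "b": [7, 2, 0, 2, 3, 7, 2, 4, 0, 7],
    "c": [3, 6, 5, 1, 5, 2, 1, 1, 6, 6],
})

# One threshold per column, returned as a Series indexed by column label.
thresholds = df.quantile(0.8)

# Compare every column with its own threshold; axis="columns" makes the
# alignment on column labels explicit.
below = df.lt(thresholds, axis="columns")

# Work on a float copy so the original integer frame is not modified.
trimmed = df.astype(float)
trimmed[below] = np.nan
print(trimmed)
```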

Answer 2

Answer by jezrael

Use DataFrame.mask:

df = df.mask(df < df.quantile())
print(df)
     a    b    c
0  NaN  7.0  NaN
1  NaN  NaN  6.0
2  NaN  NaN  5.0
3  8.0  NaN  NaN
4  7.0  3.0  5.0
5  6.0  7.0  NaN
6  NaN  NaN  NaN
7  8.0  4.0  NaN
8  NaN  NaN  6.0
9  7.0  7.0  6.0
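
As a rough sketch of how mask relates to its counterpart where, again using the 80th percentile from the question rather than the default median: mask hides the values for which the condition is True, where keeps them. The second argument of mask supplies a replacement other than NaN. The variable names below are illustrative only.

```python
import pandas as pd

# Sample data copied from the accepted answer's DataFrame.
df = pd.DataFrame({
    "a": [1, 1, 3, 8, 7, 6, 0, 8, 5, 7],
    "b": [7, 2, 0, 2, 3, 7, 2, 4, 0, 7],
    "c": [3, 6, 5, 1, 5, 2, 1, 1, 6, 6],
})

thresholds = df.quantile(0.8)

# mask replaces values where the condition holds (NaN by default).
masked = df.mask(df < thresholds)

# where is the complement: it keeps values where the condition holds.
kept = df.where(df >= thresholds)

# For input without NaN, both spellings give the same result.
print(masked.equals(kept))

# A replacement other than NaN can be passed as the second argument.
zeroed = df.mask(df < thresholds, 0)
print(zeroed)
```

Both calls return a new DataFrame, so df itself is unchanged unless the result is assigned back to it.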
