How to correctly apply a lambda function to a Pandas data frame column

Question

Asked by Amani

I have a pandas data frame, sample, with a column called PR, to which I am applying a lambda function as follows:

sample['PR'] = sample['PR'].apply(lambda x: NaN if x < 90)

I then get the following syntax error message:

sample['PR'] = sample['PR'].apply(lambda x: NaN if x < 90)
                                                         ^
SyntaxError: invalid syntax

What am I doing wrong?

Answer 1

Answered by jezrael

You need mask:

sample['PR'] = sample['PR'].mask(sample['PR'] < 90, np.nan)

Another solution with loc and boolean indexing:

sample.loc[sample['PR'] < 90, 'PR'] = np.nan

Sample:

import pandas as pd
import numpy as np

sample = pd.DataFrame({'PR':[10,100,40] })
print (sample)
    PR
0   10
1  100
2   40

sample['PR'] = sample['PR'].mask(sample['PR'] < 90, np.nan)
print (sample)
      PR
0    NaN
1  100.0
2    NaN

sample.loc[sample['PR'] < 90, 'PR'] = np.nan
print (sample)
      PR
0    NaN
1  100.0
2    NaN
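
Note that PR changes from integers to floats in the outputs above. NaN is a floating-point value, so pandas upcasts the column. A minimal sketch, reusing the sample frame from this answer, that checks the dtype change and confirms that mask and loc give the same result. The names original_dtype, masked and assigned are only for illustration, and casting to float before the loc assignment is an assumption made here to keep that assignment dtype-compatible, since newer pandas versions may warn about or reject writing NaN into an integer column:

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({'PR': [10, 100, 40]})
original_dtype = sample['PR'].dtype  # an integer dtype

# mask returns a new Series; values where the condition is True become NaN
masked = sample['PR'].mask(sample['PR'] < 90, np.nan)

# loc assigns in place, so work on a float copy (assumption: explicit cast
# first, so that NaN fits the column dtype) and leave `sample` untouched
assigned = sample.astype({'PR': 'float64'})
assigned.loc[assigned['PR'] < 90, 'PR'] = np.nan

print(original_dtype, '->', masked.dtype)
print(masked.equals(assigned['PR']))
```

If the column has to stay integer-typed, this approach will not work as is, because a plain integer dtype cannot hold NaN.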

EDIT:

Solution with apply:

sample['PR'] = sample['PR'].apply(lambda x: np.nan if x < 90 else x)

Timings with len(df)=300k:

sample = pd.concat([sample]*100000).reset_index(drop=True)

In [853]: %timeit sample['PR'].apply(lambda x: np.nan if x < 90 else x)
10 loops, best of 3: 102 ms per loop

In [854]: %timeit sample['PR'].mask(sample['PR'] < 90, np.nan)
The slowest run took 4.28 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 3.71 ms per loop
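
The %timeit lines above only run inside IPython. As a rough sketch, the same comparison can be run as a plain script with the standard timeit module. The names big, with_apply and with_mask are hypothetical helpers, not part of the original answer, and the absolute numbers will differ by machine and pandas version:

```python
import timeit

import numpy as np
import pandas as pd

sample = pd.DataFrame({'PR': [10, 100, 40]})
# Repeat the three rows 100000 times to get 300k rows, as in the answer
big = pd.concat([sample] * 100000).reset_index(drop=True)


def with_apply():
    # Calls the Python lambda once per element
    return big['PR'].apply(lambda x: np.nan if x < 90 else x)


def with_mask():
    # Vectorized: the comparison and replacement operate on whole arrays
    return big['PR'].mask(big['PR'] < 90, np.nan)


# Best of 3 single runs for each approach, in seconds
apply_s = min(timeit.repeat(with_apply, number=1, repeat=3))
mask_s = min(timeit.repeat(with_mask, number=1, repeat=3))
print(f"apply: {apply_s * 1000:.1f} ms, mask: {mask_s * 1000:.1f} ms")

# Sanity check that both approaches produce the same values
same_values = np.allclose(
    with_apply().to_numpy(dtype=float),
    with_mask().to_numpy(dtype=float),
    equal_nan=True,
)
print(same_values)
```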

Answer 2

Answered by kali prasad deverasetti

You need to add else to your lambda function. A Python conditional expression always requires an else branch: you are saying what to return when the condition (here x < 90) is met, but not what to return when it is not met.

sample['PR'] = sample['PR'].apply(lambda x: np.nan if x < 90 else x)
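
To make the point about else concrete, here is a small sketch. It first compiles the lambda from the question as a string to show that Python rejects it at parse time, then applies a version with an else branch. The sketch uses np.nan (the float missing value) rather than a string so that isna() recognizes the replaced entries. The names broken_source, syntax_error_raised, replace_low and result are illustrative only:

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({'PR': [10, 100, 40]})

# The lambda from the question, kept as a string so this script still runs.
# A conditional expression must have the form: A if condition else B.
broken_source = "lambda x: NaN if x < 90"
try:
    compile(broken_source, "<demo>", "eval")
    syntax_error_raised = False
except SyntaxError:
    syntax_error_raised = True
print(syntax_error_raised)  # True

# With an else branch, values that fail the condition are returned unchanged
replace_low = lambda x: np.nan if x < 90 else x
result = sample['PR'].apply(replace_low)
print(result.isna().tolist())  # [True, False, True]
```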
