Exception handling in the Python Pandas .apply() function

Question

Asked by RukTech

If I have a DataFrame:

from pandas import DataFrame

myDF = DataFrame(data=[[11,11],[22,'2A'],[33,33]], columns = ['A','B'])

This gives the following DataFrame (I'm just starting out on Stack Overflow and don't have enough reputation to post an image of the DataFrame):

   | A  | B  |
0  | 11 | 11 |
1  | 22 | 2A |
2  | 33 | 33 |

If I want to convert column B to int values and drop the values that can't be converted, I have to do:

def convertToInt(cell):
    # Return the cell as an int, or None if it cannot be converted.
    try:
        return int(cell)
    except (ValueError, TypeError):
        return None

myDF['B'] = myDF['B'].apply(convertToInt)

If I only do:

myDF['B'].apply(int)

the error obviously is:

C:\WinPython-32bit-2.7.5.3\python-2.7.5\lib\site-packages\pandas\lib.pyd in pandas.lib.map_infer (pandas\lib.c:42840)()
ValueError: invalid literal for int() with base 10: '2A'

Is there a way to add exception handling to myDF['B'].apply()?

Thank you in advance!

Answer 1

Accepted answer by Jeff

It is much better/faster to do:

In [1]: myDF = DataFrame(data=[[11,11],[22,'2A'],[33,33]], columns = ['A','B'])

In [2]: myDF.convert_objects(convert_numeric=True)
Out[2]: 
    A   B
0  11  11
1  22 NaN
2  33  33

[3 rows x 2 columns]

In [3]: myDF.convert_objects(convert_numeric=True).dtypes
Out[3]: 
A      int64
B    float64
dtype: object

This is a vectorized method of doing just this. The coerce flag says to mark as nan anything that cannot be converted to numeric.

You can of course do this to a single column if you'd like.
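
A note for readers on newer pandas: convert_objects was deprecated in later releases and is no longer available in current versions. Assuming a reasonably recent pandas, pd.to_numeric with errors='coerce' is the closest equivalent for a single column. A minimal sketch, reusing the question's data (the variable name clean is only illustrative):

```python
import pandas as pd

# Same toy data as in the question.
myDF = pd.DataFrame(data=[[11, 11], [22, '2A'], [33, 33]], columns=['A', 'B'])

# errors='coerce' turns anything that cannot be parsed as a number into NaN.
myDF['B'] = pd.to_numeric(myDF['B'], errors='coerce')

# Optionally drop the rows that failed to convert and go back to integers.
clean = myDF.dropna(subset=['B']).astype({'B': 'int64'})
```

Column B becomes float after coercion because NaN cannot be stored in a plain integer column, which is why the cast back to int64 happens only after the NaN rows are dropped.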

Answer 2

Answer by Amit Verma

A way to achieve that with lambda:

myDF['B'].apply(lambda x: int(x) if str(x).isdigit() else None)

For your input:

>>> myDF
    A   B
0  11  11
1  22  2A
2  33  33

[3 rows x 2 columns]

>>> myDF['B'].apply(lambda x: int(x) if str(x).isdigit() else None)
0    11
1   NaN
2    33
Name: B, dtype: float64
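
One caveat worth keeping in mind: isdigit is a stricter test than int() itself. Strings such as '-5' or ' 7' fail isdigit even though int() accepts them. The sketch below illustrates the difference on a made-up Series; the sample values and the helper name to_int_or_none are illustrative assumptions, not part of the question.

```python
import pandas as pd

# Illustrative values: a plain int, a bad string, a negative, a padded number.
s = pd.Series([11, '2A', '-5', ' 7'], dtype=object)

# isdigit-based check: only plain runs of digits pass.
via_isdigit = s.apply(lambda x: int(x) if str(x).isdigit() else None)

def to_int_or_none(cell):
    """Hypothetical helper: let int() itself decide what is convertible."""
    try:
        return int(cell)
    except (ValueError, TypeError):
        return None

# try/except-based check: '-5' and ' 7' are converted, only '2A' is rejected.
via_try = s.apply(to_int_or_none)
```

If the data can only contain non-negative, unpadded integers, the lambda is fine; otherwise the try/except form is the safer choice.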

Answer 3

Answer by atkat12

I had the same question, but for a more general case where it was hard to tell if the function would generate an exception (i.e. you couldn't explicitly check this condition with something as straightforward as isdigit).

After thinking about it for a while, I came up with the solution of embedding the try/except syntax in a separate function. I'm posting a toy example in case it helps anyone.

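import pandas as pd
import numpy as np

# np.array coerces the mixed values to strings, so row 1 holds '1' and '2'.
df = pd.DataFrame(np.array([['a', 'a'], [1, 2]]))

def augment(value):
    # Add 1 to values that int() can parse; otherwise return an error marker.
    try:
        return int(value) + 1
    except (ValueError, TypeError):
        return 'error:' + str(value)

df[0].apply(augment)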
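
The same idea can be made reusable by wrapping any function in a small helper that catches exceptions for it. This is only a sketch: the name safe and its default and exceptions parameters are hypothetical and not part of pandas.

```python
import pandas as pd

def safe(func, default=None, exceptions=(Exception,)):
    """Return a version of func that yields `default` instead of raising."""
    def wrapper(value):
        try:
            return func(value)
        except exceptions:
            return default
    return wrapper

s = pd.Series(['a', '1', '2'], dtype=object)

# Add 1 to every value that int() can parse; failures become None.
result = s.apply(safe(lambda v: int(v) + 1, exceptions=(ValueError, TypeError)))
```

Passing a narrow tuple of exception types, rather than relying on the catch-all default, keeps unrelated bugs in the wrapped function from being silently swallowed.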
